US 12,223,447 B2
Drone taxi system based on multi-agent reinforcement learning and drone taxi operation using the same
Joongheon Kim, Seoul (KR); Won Joon Yun, Seoul (KR); Jae-Hyun Kim, Seoul (KR); and Soyi Jung, Suwon-si (KR)
Assigned to Korea University Research and Business Foundation, Seoul (KR); and AJOU University Industry-Academic Cooperation Foundation, Suwon-si (KR)
Filed by Korea University Research and Business Foundation, Seoul (KR); and AJOU University Industry-Academic Cooperation Foundation, Suwon-si (KR)
Filed on Mar. 9, 2022, as Appl. No. 17/690,231.
Claims priority of application No. 10-2021-0034692 (KR), filed on Mar. 17, 2021.
Prior Publication US 2022/0300870 A1, Sep. 22, 2022
Int. Cl. G06Q 10/0631 (2023.01); B64U 101/61 (2023.01); G01C 21/20 (2006.01); G05D 1/00 (2024.01)

CPC G06Q 10/0631 (2013.01) [G01C 21/20 (2013.01); B64U 2101/61 (2023.01); B64U 2201/10 (2023.01); G05D 1/101 (2013.01)]

9 Claims

1. A processor-implemented drone taxi system comprising:

a plurality of drone taxies configured to receive call information including a departure location point and a destination location point from passenger terminals present within a certain range; and

a control server configured to receive call information of passengers from each drone taxi, select a candidate passenger depending on whether a passenger is present, generate travel route information of each drone taxi from drone state information of the plurality of drone taxies using an optimization model trained through multi-agent reinforcement learning, transmit the travel route information to each drone taxi, and control each drone taxi to travel, according to the generated travel route information and under a control of the control server, to a departure location point of the selected candidate passenger, the controlling further including controlling each drone taxi to update a route of the traveling to be set with the generated travel route information and controlling each drone taxi to transmit assignment information and boarding information to a terminal of the selected candidate passenger,

wherein the control server comprises:

a passenger selector configured to receive the call information of the passengers from each drone taxi and select the candidate passenger depending on whether a passenger is present; and

a route optimizer configured to, every time the drone state information of each of the plurality of drone taxies is changed and updated, re-generate the travel route information of each drone taxi from the drone state information of the plurality of drone taxies using the optimization model trained through the multi-agent reinforcement learning and transmit the travel route information to each drone taxi, and

the drone state information includes current location information, onboard passenger information, candidate passenger information, and vacant seat information,

wherein the optimization model is trained using a two-stage attention mechanism trained through multi-agent reinforcement learning.
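
For illustration only, the data flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names Call, DroneState, select_candidate, placeholder_policy, and ControlServer are hypothetical, the nearest-caller selection rule is an assumption, and placeholder_policy merely stands in for the optimization model trained through multi-agent reinforcement learning with a two-stage attention mechanism, which the claim does not specify in detail. The sketch shows only the four fields of the drone state information, the selection of a candidate passenger depending on whether a passenger is present, and the re-generation of travel route information every time a drone's state is updated.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Call:
    """Call information: departure and destination location points."""
    passenger_id: str
    departure: Point
    destination: Point


@dataclass
class DroneState:
    """The four fields named in the claim as drone state information."""
    current_location: Point
    onboard_passengers: List[str] = field(default_factory=list)
    candidate_passenger: Optional[Call] = None
    vacant_seats: int = 1


def select_candidate(state: DroneState, calls: List[Call]) -> Optional[Call]:
    """Assumed rule: pick the nearest caller if a call is present and a seat is free."""
    if not calls or state.vacant_seats <= 0:
        return None
    return min(calls, key=lambda c: dist(state.current_location, c.departure))


# A policy maps the states of all drones to one route (waypoint list) per drone.
Policy = Callable[[Dict[str, DroneState]], Dict[str, List[Point]]]


def placeholder_policy(states: Dict[str, DroneState]) -> Dict[str, List[Point]]:
    """Stand-in for the trained model: fly straight to the candidate's departure point."""
    routes: Dict[str, List[Point]] = {}
    for drone_id, s in states.items():
        if s.candidate_passenger is not None:
            routes[drone_id] = [s.current_location, s.candidate_passenger.departure]
        else:
            routes[drone_id] = [s.current_location]
    return routes


class ControlServer:
    """Passenger selector plus route optimizer, reduced to a single method."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.states: Dict[str, DroneState] = {}
        self.routes: Dict[str, List[Point]] = {}

    def report(self, drone_id: str, state: DroneState, calls: List[Call]) -> List[Point]:
        # Passenger selector: choose a candidate from the calls relayed by this drone.
        state.candidate_passenger = select_candidate(state, calls)
        self.states[drone_id] = state
        # Route optimizer: re-generate routes for all drones on every state update.
        self.routes = self.policy(self.states)
        return self.routes[drone_id]


server = ControlServer(placeholder_policy)
calls = [Call("p1", (5.0, 5.0), (9.0, 9.0)), Call("p2", (1.0, 0.0), (3.0, 3.0))]
route = server.report("d1", DroneState(current_location=(0.0, 0.0), vacant_seats=2), calls)
print(route)  # [(0.0, 0.0), (1.0, 0.0)]
```

In this sketch the policy is injected into ControlServer so that a trained model could replace placeholder_policy without changing the selection or update logic; that separation is a design choice of the sketch, not a feature recited in the claim.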