CPC G06Q 10/0631 (2013.01) [G01C 21/20 (2013.01); B64U 2101/61 (2023.01); B64U 2201/10 (2023.01); G05D 1/101 (2013.01)]          9 Claims

1. A processor-implemented drone taxi system comprising:
a plurality of drone taxis configured to receive call information including a departure location point and a destination location point from passenger terminals present within a certain range; and
a control server configured to receive call information of passengers from each drone taxi, select a candidate passenger depending on whether a passenger is present, generate travel route information of each drone taxi from drone state information of the plurality of drone taxis using an optimization model trained through multi-agent reinforcement learning, transmit the travel route information to each drone taxi, and control each drone taxi to travel, according to the generated travel route information and under control of the control server, to a departure location point of the selected candidate passenger, the controlling further including controlling each drone taxi to update its travel route to the generated travel route information and controlling each drone taxi to transmit assignment information and boarding information to a terminal of the selected candidate passenger,
wherein the control server comprises:
a passenger selector configured to receive the call information of the passengers from each drone taxi and select the candidate passenger depending on whether a passenger is present; and
a route optimizer configured to, every time the drone state information of each of the plurality of drone taxis is changed and updated, re-generate the travel route information of each drone taxi from the drone state information of the plurality of drone taxis using the optimization model trained through the multi-agent reinforcement learning and transmit the travel route information to each drone taxi, and
the drone state information includes current location information, onboard passenger information, candidate passenger information, and vacant seat information,
wherein the optimization model includes a two-stage attention mechanism and is trained through the multi-agent reinforcement learning.
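The claim above recites the data that flows through the system (call information, drone state information) and the split of the control server into a passenger selector and a route optimizer, but it does not disclose concrete data formats or interfaces. The following Python sketch is purely illustrative of one way those recited elements could be organized; every name in it (CallInfo, DroneState, ControlServer, generate_routes, and so on) is a hypothetical placeholder introduced for explanation, not part of the claimed invention.

```python
# Illustrative sketch only: the claim does not disclose concrete data formats,
# so every name below (CallInfo, DroneState, ControlServer, generate_routes)
# is a hypothetical placeholder.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CallInfo:
    """Call information relayed from a passenger terminal via a drone taxi."""
    passenger_id: str
    departure: Tuple[float, float]    # departure location point (lat, lon)
    destination: Tuple[float, float]  # destination location point (lat, lon)


@dataclass
class DroneState:
    """Drone state information enumerated in the claim."""
    drone_id: str
    current_location: Tuple[float, float]
    onboard_passengers: List[str] = field(default_factory=list)
    candidate_passengers: List[str] = field(default_factory=list)
    vacant_seats: int = 0


class ControlServer:
    """Hypothetical control server combining a passenger selector and a route optimizer."""

    def __init__(self, optimizer):
        # `optimizer` is assumed to expose generate_routes(states) -> routes per drone,
        # standing in for the MARL-trained optimization model.
        self.optimizer = optimizer
        self.drone_states: Dict[str, DroneState] = {}

    def select_candidate(self, calls: List[CallInfo]) -> Optional[CallInfo]:
        # Passenger selector: pick a candidate depending on whether a passenger is present.
        return calls[0] if calls else None

    def on_state_update(self, state: DroneState) -> Dict[str, list]:
        # Route optimizer: every time drone state information is changed and updated,
        # re-generate travel route information for every drone taxi.
        self.drone_states[state.drone_id] = state
        return self.optimizer.generate_routes(list(self.drone_states.values()))
```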
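Claim 1 names a two-stage attention mechanism trained through multi-agent reinforcement learning without specifying the stages. The sketch below assumes one plausible staging (per-drone attention over candidate passengers, followed by cross-drone attention for coordination); the class name, feature dimensions, and the assignment-probability output are assumptions, and a real implementation would be trained with a multi-agent RL objective that the claim does not detail.

```python
# Minimal, assumed sketch of a "two-stage attention" policy for multi-agent routing.
import torch
import torch.nn as nn


class TwoStageAttentionPolicy(nn.Module):
    def __init__(self, state_dim: int, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.drone_enc = nn.Linear(state_dim, embed_dim)
        self.passenger_enc = nn.Linear(4, embed_dim)  # departure + destination (x, y each)
        # Stage 1: each drone attends over the candidate passengers.
        self.passenger_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Stage 2: drones attend over one another to coordinate assignments.
        self.agent_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, embed_dim)

    def forward(self, drone_states: torch.Tensor, passengers: torch.Tensor) -> torch.Tensor:
        # drone_states: (batch, n_drones, state_dim); passengers: (batch, n_passengers, 4)
        d = self.drone_enc(drone_states)
        p = self.passenger_enc(passengers)
        # Stage 1: passenger-level attention (queries = drones, keys/values = passengers).
        d1, _ = self.passenger_attn(d, p, p)
        # Stage 2: agent-level self-attention for multi-agent coordination.
        d2, _ = self.agent_attn(d1, d1, d1)
        # Score each drone-passenger pairing; a per-drone softmax yields assignment
        # probabilities from which travel route information could be decoded.
        logits = torch.einsum("bnd,bmd->bnm", self.head(d2), p)
        return logits.softmax(dim=-1)
```

Attending first over passengers and then across agents mirrors a common pattern in learned routing policies, but the actual staging used by the claimed optimization model is not specified in the claim.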