| CPC B60W 30/0956 (2013.01) [B60W 30/09 (2013.01); B60W 60/0015 (2020.02); B60W 2554/4041 (2020.02)] | 20 Claims |

|
1. A method comprising:
applying, to at least one machine learning model (MLM), a dataset corresponding to sensor data obtained using one or more sensors of a plurality of machines, the dataset representing examples of collision-free trajectories using joint states of entities navigating through an environment;
based at least on the applying, training the at least one MLM to learn a control policy defining a control action space for a machine of the plurality of the machines and to predict, using the control policy and for a joint state between agents, one or more control actions of the control action space for at least one agent of the agents to perform in response to the joint state, the control policy being learned using a function that assigns values to states resulting from control actions of the control action space, the values defining whether the states are within an unsafe region of a state space or a safe region of the state space based at least on a threshold value that defines a boundary between the safe region and the unsafe region, the safe region corresponding to the collision-free trajectories;
determining one or more functions that compute, based at least on the joint state and for the one or more control actions, output indicating a likelihood of collision if the at least one agent were to use the one or more control actions predicted by the at least one MLM using the control policy; and
performing one or more control operations for the machine using the output indicating the likelihood of collision.
|
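The relationship recited in claim 1 between a value function, a threshold separating safe from unsafe states, and a collision-likelihood output can be illustrated with a minimal sketch. This is not the claimed implementation: `JointState`, `state_value`, `step`, `collision_likelihood`, `choose_control`, the distance-based value function, the logistic mapping, and all numeric constants are hypothetical stand-ins chosen for illustration, and a hand-written function takes the place of the trained MLM.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class JointState:
    """Hypothetical joint state: another agent's position relative to the machine (metres)."""
    rel_x: float
    rel_y: float


SAFE_THRESHOLD = 0.0   # assumed boundary: value >= threshold means "safe region"
STOP = (0.0, 0.0)      # assumed fallback control action (zero velocity)


def state_value(state: JointState, safe_radius: float = 2.0) -> float:
    """Stand-in for a learned value function: distance margin beyond safe_radius."""
    return math.hypot(state.rel_x, state.rel_y) - safe_radius


def is_safe(state: JointState) -> bool:
    """A state is in the safe region when its value reaches the threshold."""
    return state_value(state) >= SAFE_THRESHOLD


def step(state: JointState, action: tuple, dt: float = 0.5) -> JointState:
    """Toy dynamics: the action is the machine's velocity; the other agent is static."""
    vx, vy = action
    return JointState(state.rel_x - vx * dt, state.rel_y - vy * dt)


def collision_likelihood(state: JointState, action: tuple, sharpness: float = 2.0) -> float:
    """Map the value of the resulting state to (0, 1); higher means collision is more likely."""
    value = state_value(step(state, action))
    return 1.0 / (1.0 + math.exp(sharpness * (value - SAFE_THRESHOLD)))


def choose_control(state: JointState, candidate_actions: list, max_likelihood: float = 0.5):
    """Pick the candidate with the lowest collision likelihood, or stop if none is acceptable."""
    scored = [(collision_likelihood(state, a), a) for a in candidate_actions]
    best_likelihood, best_action = min(scored, key=lambda pair: pair[0])
    if best_likelihood > max_likelihood:
        return STOP, collision_likelihood(state, STOP)
    return best_action, best_likelihood


if __name__ == "__main__":
    state = JointState(rel_x=3.0, rel_y=0.0)
    # Candidates a policy might propose: drive straight at the agent, or sidestep.
    action, likelihood = choose_control(state, [(4.0, 0.0), (0.0, 4.0)])
    print(action, round(likelihood, 3))
```

In this sketch the control operation is simply the selection of the lowest-likelihood candidate, with a stop action as the fallback when every candidate exceeds `max_likelihood`. A real system would replace `state_value` and the candidate list with outputs of the trained model.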