US 12,485,890 B2
Learning autonomous vehicle safety concepts from demonstrations
Karen Yan Ming Leung, Los Altos, CA (US); Sushant Veer, Sunnyvale, CA (US); Edward Fu Schmerling, Los Altos, CA (US); and Marco Pavone, Stanford, CA (US)
Assigned to NVIDIA Corporation, Santa Clara, CA (US)
Filed by NVIDIA Corporation, Santa Clara, CA (US)
Filed on Mar. 14, 2023, as Appl. No. 18/183,566.
Claims priority of provisional application 63/359,414, filed on Jul. 8, 2022.
Prior Publication US 2024/0010196 A1, Jan. 11, 2024
Int. Cl. B60W 30/095 (2012.01); B60W 30/09 (2012.01); B60W 60/00 (2020.01)

CPC B60W 30/0956 (2013.01) [B60W 30/09 (2013.01); B60W 60/0015 (2020.02); B60W 2554/4041 (2020.02)]

20 Claims

1. A method comprising:

applying, to at least one machine learning model (MLM), a dataset corresponding to sensor data obtained using one or more sensors of a plurality of machines, the dataset representing examples of collision-free trajectories using joint states of entities navigating through an environment;

based at least on the applying, training the at least one MLM to learn a control policy defining a control action space for a machine of the plurality of the machines and to predict, using the control policy and for a joint state between agents, one or more control actions of the control action space for at least one agent of the agents to perform in response to the joint state, the control policy being learned using a function that assigns values to states resulting from control actions of the control action space, the values defining whether the states are within an unsafe region of a state space or a safe region of the state space based at least on a threshold value that defines a boundary between the safe region and the unsafe region, the safe region corresponding to the collision-free trajectories;

determining one or more functions that compute, based at least on the joint state and for the one or more control actions, output indicating a likelihood of collision if the at least one agent were to use the one or more control actions predicted by the at least one MLM using the control policy; and

performing one or more control operations for the machine using the output indicating the likelihood of collision.
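
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim recites a value function learned by at least one MLM from collision-free demonstrations; the sketch substitutes a hand-written stopping-distance formula so that it runs on its own. All names and constants (JointState, THRESHOLD, DT, MAX_BRAKE, ACTIONS, value, collision_likelihood, choose_action) are hypothetical, and the one-dimensional car-following setting is an assumption made for brevity.

```python
import math
from dataclasses import dataclass

# Every constant below is an illustrative assumption, not a value from the patent.
THRESHOLD = 0.0                    # boundary value between safe and unsafe regions
DT = 0.5                           # time step, seconds
MAX_BRAKE = 4.0                    # strongest ego deceleration, m/s^2
ACTIONS = (-4.0, -2.0, 0.0, 2.0)   # control action space: ego accelerations, m/s^2


@dataclass(frozen=True)
class JointState:
    """Hypothetical joint state between the ego machine and one agent ahead."""
    gap: float      # distance to the agent ahead, metres
    closing: float  # closing speed, m/s (positive means the gap is shrinking)


def step(state: JointState, accel: float) -> JointState:
    """State resulting from one control action; the other agent holds its speed."""
    gap = state.gap - (state.closing * DT + 0.5 * accel * DT * DT)
    return JointState(gap, state.closing + accel * DT)


def value(state: JointState) -> float:
    """Stand-in for the learned value function (an assumption).

    Returns the gap left over if the ego brakes at MAX_BRAKE until the
    closing speed reaches zero.
    """
    stopping = max(state.closing, 0.0) ** 2 / (2.0 * MAX_BRAKE)
    return state.gap - stopping


def is_safe(state: JointState) -> bool:
    """A state is in the safe region when its value exceeds the threshold."""
    return value(state) > THRESHOLD


def collision_likelihood(state: JointState, accel: float, scale: float = 1.0) -> float:
    """Map the value margin of the resulting state to a score in (0, 1).

    The logistic mapping is an assumption; the claim only requires an
    output indicating a likelihood of collision.
    """
    margin = (value(step(state, accel)) - THRESHOLD) / scale
    margin = max(-60.0, min(60.0, margin))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(margin))


def choose_action(state: JointState, nominal: float, max_risk: float = 0.5) -> float:
    """Control operation: keep the nominal action unless it looks too risky."""
    if collision_likelihood(state, nominal) <= max_risk:
        return nominal
    return min(ACTIONS, key=lambda a: collision_likelihood(state, a))


if __name__ == "__main__":
    tight = JointState(gap=10.0, closing=8.0)
    print(is_safe(tight), choose_action(tight, nominal=2.0))
```

In this sketch the nominal action is overridden only when its estimated likelihood of collision exceeds max_risk. This is one possible way to use the output in a control operation; the claim does not limit the control operation to this form.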