| CPC B60W 60/0011 (2020.02) [B60W 40/04 (2013.01); B60W 50/0097 (2013.01); B60W 60/0027 (2020.02); B60W 2050/0008 (2013.01); B60W 2050/0028 (2013.01); B60W 2556/10 (2020.02)] | 18 Claims |

|
1. A method of operation of an autonomous vehicle in an environment, comprising:
determining a set of inputs using a sensor suite of the autonomous vehicle, the set of inputs comprising an environmental agent instance identifier and a state history associated with the environmental agent instance identifier;
based on the set of inputs, determining a set of multiple environmental policies for the environmental agent instance identifier;
for each environmental policy of the set of multiple environmental policies:
determining a historical score by comparing the state history associated with the environmental agent instance identifier to a reference trajectory associated with the environmental policy; and
determining a feasibility score by a forward simulation of the environmental policy, wherein the forward simulation of each environmental policy comprises a closed-loop simulation for a deterministic controller associated with the environmental policy, wherein the feasibility score is determined based on a time-derivative of an accumulation of lateral error between the reference trajectory and the forward simulation; and
aggregating the historical score and the feasibility score for each environmental policy of the set of multiple environmental policies into a respective aggregate score, wherein producing the respective aggregate score comprises multiplying the respective historical score and the respective feasibility score;
determining an ego policy by evaluating a set of ego policies for the autonomous vehicle relative to the set of multiple environmental policies, based on the feasibility scores and the historical scores, wherein the evaluation of the set of ego policies relative to the set of multiple environmental policies is weighted based on the aggregate score of each environmental policy of the set of multiple environmental policies; and
controlling driving of the autonomous vehicle based on the ego policy.
|
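The scoring and weighting steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes a one-dimensional lateral-offset model, a rate-limited proportional controller standing in for the deterministic controller, an exponential mapping from error to score, and a hand-written cost table for the ego policies. All function names, parameters, and numeric values are hypothetical and do not come from the claim.

```python
import math


def historical_score(state_history, reference, scale=1.0):
    """Compare observed lateral offsets to a policy's reference trajectory.

    Assumption: mean squared error mapped into (0, 1] with an exponential.
    """
    n = min(len(state_history), len(reference))
    mse = sum((s - r) ** 2 for s, r in zip(state_history[:n], reference[:n])) / n
    return math.exp(-mse / scale)


def feasibility_score(y0, reference, dt=0.1, gain=2.0, max_rate=1.0, scale=1.0):
    """Closed-loop forward simulation with a deterministic controller.

    The score uses the finite-difference time-derivative of the accumulated
    lateral error at the end of the horizon (assumed scoring rule).
    """
    y, accumulated, rate = y0, 0.0, 0.0
    for ref in reference:
        # Deterministic proportional controller with a lateral-rate limit.
        command = max(-max_rate, min(max_rate, gain * (ref - y)))
        y += command * dt
        previous = accumulated
        accumulated += abs(ref - y) * dt
        rate = (accumulated - previous) / dt  # d/dt of accumulated error
    return math.exp(-rate / scale)


def aggregate_score(historical, feasibility):
    """Aggregate by multiplication, as recited in the claim."""
    return historical * feasibility


def select_ego_policy(ego_costs, aggregate_scores):
    """Pick the ego policy with the lowest cost, weighted by aggregate score.

    ego_costs[ego][env] is a hypothetical cost of running ego policy `ego`
    while the environmental agent follows policy `env`.
    """
    total = sum(aggregate_scores.values())
    if total <= 0:
        raise ValueError("aggregate scores must sum to a positive value")
    weights = {env: score / total for env, score in aggregate_scores.items()}
    expected = {
        ego: sum(weights[env] * cost for env, cost in costs.items())
        for ego, costs in ego_costs.items()
    }
    return min(expected, key=expected.get)


# Hypothetical scenario: the agent has stayed centered in its lane.
history = [0.0, 0.0, 0.0]
references = {"keep_lane": [0.0] * 20, "lane_change": [3.5] * 20}

scores = {
    name: aggregate_score(
        historical_score(history, ref),
        feasibility_score(history[-1], ref),
    )
    for name, ref in references.items()
}

ego_costs = {
    "proceed": {"keep_lane": 1.0, "lane_change": 10.0},
    "yield": {"keep_lane": 2.0, "lane_change": 2.0},
}
chosen = select_ego_policy(ego_costs, scores)
```

In this sketch the lane-change policy scores low on both factors, so its high cost for "proceed" carries almost no weight in the ego evaluation. With equal weights on the two environmental policies, the same cost table would favor "yield" instead.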