US 12,311,981 B2
Machine-learned cost estimation in tree search trajectory generation for vehicle control
Yan Chang, Mountain View, CA (US); Aaron Huang, San Francisco, CA (US); Peter Scott Schleede, El Dorado Hills, CA (US); Gary Linscott, Seattle, WA (US); Marin Kobilarov, Baltimore, MD (US); Ethan Miller Pronovost, Redwood City, CA (US); Ke Sun, Foster City, CA (US); and Xiangyu Xie, Baltimore, MD (US)
Assigned to Zoox, Inc., Foster City, CA (US)
Filed by Zoox, Inc., Foster City, CA (US)
Filed on Dec. 19, 2022, as Appl. No. 18/084,419.
Prior Publication US 2024/0199083 A1, Jun. 20, 2024
Int. Cl. B60W 60/00 (2020.01); B60W 30/12 (2020.01); B60W 30/18 (2012.01); G06N 20/00 (2019.01)

CPC B60W 60/0027 (2020.02) [G06N 20/00 (2019.01); B60W 30/12 (2013.01); B60W 30/18163 (2013.01); B60W 2554/20 (2020.02); B60W 2554/40 (2020.02)]

22 Claims

1. A system comprising:

one or more processors; and

a memory storing processor-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising:

receiving environment state data indicating static characteristics of an environment associated with a vehicle;

receiving dynamic object data indicating at least one of a historical or a current state of at least one of a dynamic object or the vehicle;

receiving prediction node data indicating one or more actions performed by the vehicle to reach a future state indicated by a prediction node of a tree search, the future state comprising a future vehicle state and at least one of a predicted environment state, or a predicted dynamic object state;

determining, by a first machine-learned model and based at least in part on the environment state data, environment features;

determining, by a second machine-learned model and based at least in part on the dynamic object data, dynamic features;

initializing a third machine-learned model based at least in part on the dynamic features, wherein the first machine-learned model, the second machine-learned model, and the third machine-learned model are distinct and have different types of inputs;

determining, by the third machine-learned model and based at least in part on the environment features and the prediction node data, a first output;

aggregating, by an encoder, the first output and the dynamic features as an encoded output;

determining, by a decoder and based at least in part on the encoded output, an estimated cost; and

determining a trajectory for controlling the vehicle based at least in part on the estimated cost.
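
The data flow recited in claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names (`CostEstimator`, `estimate`, `select_actions`), the single tanh layers with random, untrained weights standing in for the machine-learned models, mean pooling over object states, a simple recurrent cell as the third model (so that "initializing" means seeding its state with the dynamic features), a softplus on the decoder output, and picking the lowest-cost candidate as the way a trajectory is determined from the estimated cost.

```python
import math
import random


def init_layer(rng, n_out, n_in):
    """Hypothetical helper: random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Dense layer followed by tanh; returns one value per weight row."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


class CostEstimator:
    """Toy stand-in for the claimed pipeline (untrained, illustrative only)."""

    def __init__(self, env_dim, obj_dim, action_dim, hidden=4, seed=0):
        rng = random.Random(seed)
        self.env_model = init_layer(rng, hidden, env_dim)   # first model
        self.obj_model = init_layer(rng, hidden, obj_dim)   # second model
        # Third model: recurrent cell over [state, environment features, action].
        self.node_model = init_layer(rng, hidden, 2 * hidden + action_dim)
        self.encoder = init_layer(rng, hidden, 2 * hidden)
        self.decoder = init_layer(rng, 1, hidden)

    def estimate(self, env_state, object_states, actions):
        """Return a positive estimated cost for one prediction node.

        object_states must be non-empty; actions is the sequence of vehicle
        actions leading to the node (the prediction node data).
        """
        # First model: static environment state -> environment features.
        env_features = linear(*self.env_model, env_state)
        # Second model: embed each object state, then mean-pool (assumption).
        embedded = [linear(*self.obj_model, s) for s in object_states]
        dynamic_features = [sum(col) / len(embedded) for col in zip(*embedded)]
        # Third model: state initialized from the dynamic features, then
        # updated once per action using the environment features.
        state = list(dynamic_features)
        for action in actions:
            state = linear(*self.node_model, state + env_features + list(action))
        # Encoder: aggregate the third model's output with the dynamic features.
        encoded = linear(*self.encoder, state + dynamic_features)
        # Decoder: scalar output; softplus keeps the cost positive (assumption).
        weights, bias = self.decoder
        raw = sum(w * v for w, v in zip(weights[0], encoded)) + bias[0]
        return math.log1p(math.exp(raw))


def select_actions(estimator, env_state, object_states, candidates):
    """Pick the candidate action sequence with the lowest estimated cost."""
    return min(candidates,
               key=lambda acts: estimator.estimate(env_state, object_states, acts))
```

In a tree search, `estimate` would be called once per prediction node, and the environment and dynamic features, which do not depend on the node, could be computed once and reused across nodes. The sketch recomputes them on every call only to keep the example short.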