US 12,434,737 B2
Trajectory prediction for autonomous vehicles using attention mechanism
Ethan Miller Pronovost, Redwood City, CA (US)
Assigned to Zoox, Inc., Foster City, CA (US)
Filed by Zoox, Inc., Foster City, CA (US)
Filed on Jan. 4, 2023, as Appl. No. 18/093,256.
Prior Publication US 2024/0217548 A1, Jul. 4, 2024
Int. Cl. B60W 60/00 (2020.01); B60W 50/00 (2006.01); G06N 20/00 (2019.01)
CPC B60W 60/0011 (2020.02) [B60W 50/0097 (2013.01); B60W 60/00274 (2020.02); G06N 20/00 (2019.01); B60W 2554/20 (2020.02); B60W 2554/4041 (2020.02); B60W 2554/4045 (2020.02)]
20 Claims
1. A system comprising:
one or more processors; and
one or more non-transitory computer-readable media storing computer-executable instructions that, when executed, cause the one or more processors to perform operations comprising:
receiving sensor data of an environment captured by a sensor of an autonomous vehicle, wherein the environment includes a first agent and a plurality of additional objects;
determining a maximum object subset size based at least in part on a dimension of a fixed-size data structure;
determining, from the plurality of additional objects, a first subset of objects of the maximum object subset size, wherein the first subset of objects is relevant to a future state of the first agent, the first subset of objects including a first object and a second object;
determining a first feature vector associated with the first agent at a first time in the environment;
determining a second vector of relative state data associated with the first agent at the first time, based at least in part on first state data associated with the first object and second state data associated with the first object;
determining an object interaction vector of the maximum object subset size and associated with the first agent, based at least in part on a dot product of:
the first feature vector; and
the second vector of relative state data,
wherein the object interaction vector includes a first attention score associated with the first object and a second attention score associated with the second object;
encoding the object interaction vector into the fixed-size data structure, wherein a number of attention scores in the object interaction vector is limited to the maximum object subset size;
determining, based at least in part on using a fixed-cost attention mechanism to perform a tensor operation on the fixed-size data structure, a predicted trajectory for the first agent within the environment; and
controlling the autonomous vehicle based at least in part on the predicted trajectory for the first agent.
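
The data flow recited in claim 1 can be illustrated with a minimal sketch, offered only as one possible reading and not as the claimed implementation. It derives the maximum object subset size from the width of a fixed-size structure, selects that many objects, scores each with a dot product between the agent feature vector and a relative-state vector, and writes the scores into the fixed-size row. Everything named here is a hypothetical assumption made for illustration: the dictionary layout of agents and objects, the use of Euclidean distance as the relevance criterion, the four-element relative state (position and velocity offsets), the hand-written feature vector standing in for a learned one, and zero padding of unused slots. The trajectory prediction and vehicle control steps are not modeled.

```python
import math

def max_subset_size(fixed_structure):
    """Derive the object limit from the row width of the fixed-size structure."""
    return len(fixed_structure[0])

def select_relevant_objects(agent, objects, limit):
    """Keep at most `limit` objects. Nearest-first ordering is an assumed relevance proxy."""
    ranked = sorted(objects, key=lambda o: math.dist(agent["xy"], o["xy"]))
    return ranked[:limit]

def relative_state(agent, obj):
    """Position and velocity of `obj` relative to `agent` (assumed state layout)."""
    return [
        obj["xy"][0] - agent["xy"][0],
        obj["xy"][1] - agent["xy"][1],
        obj["v"][0] - agent["v"][0],
        obj["v"][1] - agent["v"][1],
    ]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def object_interaction_vector(agent_feature, agent, subset, limit):
    """One raw attention score per selected object, zero-padded to exactly `limit` entries."""
    scores = [dot(agent_feature, relative_state(agent, o)) for o in subset]
    return scores + [0.0] * (limit - len(scores))

# Hypothetical scene: one agent and three surrounding objects.
agent = {"xy": (0.0, 0.0), "v": (1.0, 0.0)}
objects = [
    {"id": "a", "xy": (2.0, 0.0), "v": (0.0, 0.0)},
    {"id": "b", "xy": (50.0, 5.0), "v": (0.0, 0.0)},
    {"id": "c", "xy": (0.0, 3.0), "v": (1.0, 1.0)},
]

# Fixed-size structure: one row per agent, two object slots per row.
fixed = [[0.0, 0.0]]
limit = max_subset_size(fixed)

# Stand-in for a learned agent feature vector (hand-picked values).
agent_feature = [1.0, 0.5, 0.0, -1.0]

subset = select_relevant_objects(agent, objects, limit)
fixed[0] = object_interaction_vector(agent_feature, agent, subset, limit)

print([o["id"] for o in subset])  # ['a', 'c']
print(fixed[0])                   # [2.0, 0.5]
```

Because every row has the same width regardless of how many objects are in the scene, a downstream operation over `fixed` does the same amount of work per agent, which is one way to read the "fixed-cost" language of the claim.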