US 12,462,573 B1
Detection and classification based on scene embeddings and image embeddings
Samir Joshi, Los Altos, CA (US); Shengnan Liang, Pittsburgh, PA (US); Kelly Leece McGuire, San Francisco, CA (US); William Harland Montgomery, IV, Burlingame, CA (US); and Peter Scott Schleede, El Dorado Hills, CA (US)
Assigned to Zoox, Inc., Foster City, CA (US)
Filed by Zoox, Inc., Foster City, CA (US)
Filed on Sep. 30, 2024, as Appl. No. 18/901,242.
Int. Cl. G06V 20/56 (2022.01); B60W 60/00 (2020.01); G06V 10/25 (2022.01); G06V 10/764 (2022.01)
CPC G06V 20/56 (2022.01) [B60W 60/001 (2020.02); G06V 10/25 (2022.01); G06V 10/764 (2022.01)]
20 Claims
1. A system comprising:
one or more processors; and
one or more non-transitory computer-readable media storing computer-executable instructions that, when executed, cause the system to perform operations comprising:
receiving, from a sensor device associated with an autonomous vehicle, sensor data of an environment;
determining, based at least in part on the sensor data, perception data;
generating a scene representation associated with the perception data;
receiving, from an image capturing device of the autonomous vehicle, image data of the environment;
generating an image representation associated with the image data;
generating, based at least in part on the scene representation and the image representation, an aggregated representation;
inputting the aggregated representation into a machine learned model;
receiving, from the machine learned model, data associated with the environment; and
controlling the autonomous vehicle based at least in part on the data.
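
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and every name in it (`embed_scene`, `embed_image`, `aggregate`, `toy_model`, `control`, `run_pipeline`) is hypothetical. The claim does not say how the representations are computed or combined, so the sketch assumes mean pooling for the scene representation, simple intensity statistics for the image representation, and concatenation for the aggregated representation. A fixed linear scorer stands in for the machine learned model.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def embed_scene(perception: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical scene encoder: mean-pool per-object feature rows."""
    count = len(perception)
    dim = len(perception[0])
    return [sum(row[i] for row in perception) / count for i in range(dim)]


def embed_image(pixels: Sequence[float]) -> Vector:
    """Hypothetical image encoder: mean and maximum pixel intensity."""
    return [sum(pixels) / len(pixels), max(pixels)]


def aggregate(scene: Vector, image: Vector) -> Vector:
    """Assumed aggregation: concatenate the two representations."""
    return list(scene) + list(image)


def toy_model(features: Vector) -> str:
    """Stand-in for the machine learned model: linear score and threshold."""
    weights = [0.5, -0.25, 1.0, 0.5]  # illustrative values, not learned
    score = sum(w * x for w, x in zip(weights, features))
    return "obstacle" if score > 1.0 else "clear"


def control(label: str) -> str:
    """Map the model output to a placeholder vehicle command."""
    return "brake" if label == "obstacle" else "proceed"


def run_pipeline(
    perception: Sequence[Sequence[float]],
    pixels: Sequence[float],
    model: Callable[[Vector], str] = toy_model,
) -> str:
    """Follow the order of operations recited in the claim."""
    scene_repr = embed_scene(perception)           # scene representation
    image_repr = embed_image(pixels)               # image representation
    aggregated = aggregate(scene_repr, image_repr) # aggregated representation
    output = model(aggregated)                     # data from the model
    return control(output)                         # control based on the data


if __name__ == "__main__":
    print(run_pipeline([[1.0, 2.0], [3.0, 4.0]], [0.2, 0.8]))
```

Concatenation is used here only because it is the simplest way to combine two vectors. Any other combination that yields a single input for the model would fit the same sequence of steps.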