US 11,810,225 B2
Top-down scene generation
Gerrit Bagschik, Foster City, CA (US); Andrew Scott Crego, Foster City, CA (US); Gowtham Garimella, Burlingame, CA (US); Michael Haggblade, El Dorado Hills, CA (US); Andraz Kavalar, San Francisco, CA (US); and Kai Zhenyu Wang, Foster City, CA (US)
Assigned to Zoox, Inc., Foster City, CA (US)
Filed by Zoox, Inc., Foster City, CA (US)
Filed on Mar. 30, 2021, as Appl. No. 17/218,010.
Prior Publication US 2022/0319057 A1, Oct. 6, 2022
Int. Cl. G06T 11/00 (2006.01); G06N 3/088 (2023.01); G06F 30/27 (2020.01); G06N 3/045 (2023.01)
CPC G06T 11/00 (2013.01) [G06F 30/27 (2020.01); G06N 3/045 (2023.01); G06N 3/088 (2013.01)]
20 Claims
 
1. A system comprising:
one or more processors; and
one or more non-transitory computer-readable media storing instructions executable by the one or more processors, wherein the instructions, when executed, cause the system to perform operations comprising:
inputting, to a first convolutional neural network (CNN), multi-channel image data and map data of an environment;
generating, using the first CNN and based at least in part on the multi-channel image data and the map data, a generated top-down scene including first occupancy information and first attribute information for generated objects within the generated top-down scene, wherein the generated objects are absent in the multi-channel image data and the map data;
inputting, to a second CNN, scene data comprising the generated top-down scene and a real top-down scene including second occupancy information and second attribute information for one or more real objects within the real top-down scene;
receiving, from the second CNN, binary classification data indicative of whether an individual scene in the scene data is classified as generated or not generated; and
providing the binary classification data as a loss to the first CNN and the second CNN.
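The arrangement recited in claim 1 resembles a generative adversarial setup: the first CNN produces scenes, the second CNN labels each scene as generated or not generated, and that binary classification is fed back as a loss to both networks. The following is a minimal sketch of the loss step only, under stated assumptions. The claim does not specify a loss function, so binary cross-entropy is assumed here. The names `TopDownScene`, `bce`, `discriminator_loss`, and `generator_loss` are hypothetical, the grids are plain nested lists rather than tensors, and the second CNN's outputs are stand-in probabilities rather than the output of an actual network.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class TopDownScene:
    """Hypothetical container for one top-down scene (not from the patent)."""
    occupancy: List[List[float]]   # per-cell occupancy information
    attributes: List[List[float]]  # per-cell attribute information (illustrative)
    is_generated: bool             # True if produced by the first CNN


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one prediction; prob is P(scene is generated)."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(probs: List[float], scenes: List[TopDownScene]) -> float:
    """Loss for the second CNN: mean BCE against true labels (1 = generated)."""
    losses = [bce(p, 1.0 if s.is_generated else 0.0)
              for p, s in zip(probs, scenes)]
    return sum(losses) / len(losses)


def generator_loss(probs: List[float], scenes: List[TopDownScene]) -> float:
    """Loss for the first CNN: mean BCE over generated scenes only, with the
    target flipped to 'not generated' so that fooling the classifier is rewarded."""
    losses = [bce(p, 0.0) for p, s in zip(probs, scenes) if s.is_generated]
    return sum(losses) / len(losses)


# Scene data comprising one generated scene and one real scene (toy 2x2 grids).
scenes = [
    TopDownScene([[0.0, 1.0], [0.0, 0.0]], [[0.0, 3.5], [0.0, 0.0]], True),
    TopDownScene([[1.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]], False),
]
# Stand-in outputs of the second CNN: P(generated) for each scene.
probs = [0.9, 0.2]

d_loss = discriminator_loss(probs, scenes)
g_loss = generator_loss(probs, scenes)
print(f"discriminator loss: {d_loss:.4f}, generator loss: {g_loss:.4f}")
```

In this sketch the same classification output drives both losses: the second CNN is penalized for mislabeling scenes, while the first CNN is penalized when its scenes are confidently recognized as generated. A real implementation would presumably compute these quantities on tensors in a deep learning framework and backpropagate them through both networks.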