US 12,272,120 B2
Enriching feature maps using multiple pluralities of windows to generate bounding boxes
Jongwoo Park, Long Island City, NY (US); Apoorv Sing, Pittsburgh, PA (US); and Varun Kumar Reddy Bankiti, Bellevue, WA (US)
Assigned to Motional AD LLC, Boston, MA (US)
Filed by Motional AD LLC, Boston, MA (US)
Filed on Aug. 19, 2022, as Appl. No. 17/821,154.
Prior Publication US 2024/0062520 A1, Feb. 22, 2024
Int. Cl. G06V 10/77 (2022.01); B60R 1/28 (2022.01); G06T 3/00 (2024.01); G06T 3/16 (2024.01); G06V 10/26 (2022.01); G06V 20/56 (2022.01); H04N 5/262 (2006.01)
CPC G06V 10/7715 (2022.01) [B60R 1/28 (2022.01); G06T 3/16 (2024.01); G06V 10/26 (2022.01); G06V 20/56 (2022.01); H04N 5/2628 (2013.01); B60R 2300/607 (2013.01)]
20 Claims
 
1. A method, comprising:
receiving a plurality of images from a plurality of image sensors, the plurality of images corresponding to a plurality of views of a scene of a vehicle;
generating a plurality of feature maps based on the plurality of images;
determining a first plurality of windows for the plurality of feature maps, wherein a first window of the first plurality of windows includes a first grid cell from a first feature map of the plurality of feature maps and a second grid cell from a second feature map of the plurality of feature maps;
enriching a set of semantic data associated with the plurality of feature maps based on the first plurality of windows to provide a set of first enriched semantic data, wherein enriching the set of semantic data associated with the plurality of feature maps based on the first plurality of windows comprises determining first semantic data for the first grid cell using second semantic data associated with the second grid cell based on the first grid cell and the second grid cell being included in the first window;
determining a second plurality of windows for the plurality of feature maps, wherein a third window of the second plurality of windows includes the first grid cell and a third grid cell and a fourth window of the second plurality of windows includes the second grid cell;
enriching the set of first enriched semantic data associated with the plurality of feature maps based on the second plurality of windows to provide a set of second enriched semantic data associated with the plurality of feature maps, wherein enriching the set of first enriched semantic data associated with the plurality of feature maps based on the second plurality of windows comprises determining third semantic data for the first grid cell using fourth semantic data associated with the third grid cell based on the first grid cell and the third grid cell being included in the third window;
generating at least one bounding box for an object in the scene of the vehicle based on the set of second enriched semantic data; and
causing the vehicle to be controlled based on the at least one bounding box.
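
Illustrative note (not part of the claim): the two rounds of window-based enrichment recited above can be sketched as follows. This is a minimal toy under stated assumptions, not the claimed implementation. Each grid cell holds a single scalar in place of its semantic data. The per-camera feature maps are stitched side by side. Windows are groups of columns that wrap around the rig. "Enriching" is a simple blend of a cell with its window mean, standing in for whatever attention-style operation an actual system would use. The bounding-box step is a plain threshold. All function names (stitch, make_windows, enrich, bounding_box) and all values are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical simplification: one scalar of "semantic data" per grid cell.
Grid = List[List[float]]


def stitch(feature_maps: List[Grid]) -> Grid:
    """Place per-camera feature maps side by side so a window can span two views."""
    rows = len(feature_maps[0])
    return [[v for fm in feature_maps for v in fm[r]] for r in range(rows)]


def make_windows(width: int, size: int, offset: int) -> List[List[int]]:
    """Partition column indices into windows of `size` columns, starting at `offset`.

    Assumes width is a multiple of size; columns wrap around (assumed surround rig).
    """
    cols = [(offset + i) % width for i in range(width)]
    return [cols[i:i + size] for i in range(0, width, size)]


def enrich(grid: Grid, windows: List[List[int]]) -> Grid:
    """Blend each cell with the mean of its window (a stand-in for attention)."""
    out = [row[:] for row in grid]
    for win in windows:
        cells = [(r, c) for r in range(len(grid)) for c in win]
        mean = sum(grid[r][c] for r, c in cells) / len(cells)
        for r, c in cells:
            out[r][c] = 0.5 * grid[r][c] + 0.5 * mean
    return out


def bounding_box(grid: Grid, threshold: float) -> Optional[Tuple[int, int, int, int]]:
    """Toy detection head: box (min_row, min_col, max_row, max_col) around cells above threshold."""
    hits = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v > threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Two cameras, each with a 1x2 feature map (made-up values).
cam0: Grid = [[0.0, 0.0]]
cam1: Grid = [[4.0, 8.0]]
stitched = stitch([cam0, cam1])  # columns 0-1 from cam0, columns 2-3 from cam1

# First plurality of windows: offset by one column, so window [1, 2]
# holds a cell from cam0 (column 1) and a cell from cam1 (column 2).
first_windows = make_windows(width=4, size=2, offset=1)
first = enrich(stitched, first_windows)

# Second plurality of windows: no offset, so column 1 now shares a window
# with column 0, while column 2 falls in a different window.
second_windows = make_windows(width=4, size=2, offset=0)
second = enrich(first, second_windows)

box = bounding_box(second, threshold=3.0)
```

In this toy, column 1 plays the role of the first grid cell, column 2 the second grid cell, and column 0 the third grid cell. After the first round, column 1 carries information from the neighboring camera. After the second round, that information has also reached column 0, which never shared a window with a cell from the other camera's feature map in the second partition.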