US 11,899,099 B2
Early fusion of camera and radar frames
Radhika Dilip Gowaikar, San Diego, CA (US); Ravi Teja Sukhavasi, La Jolla, CA (US); Daniel Hendricus Franciscus Fontijne, Haarlem (NL); Bence Major, Amsterdam (NL); Amin Ansari, San Diego, CA (US); Teck Yian Lim, Urbana, IL (US); Sundar Subramanian, San Diego, CA (US); and Xinzhou Wu, San Diego, CA (US)
Assigned to QUALCOMM Incorporated, San Diego, CA (US)
Filed by QUALCOMM Incorporated, San Diego, CA (US)
Filed on Nov. 27, 2019, as Appl. No. 16/698,601.
Claims priority of provisional application 62/774,020, filed on Nov. 30, 2018.
Prior Publication US 2020/0175315 A1, Jun. 4, 2020
Int. Cl. G01S 13/931 (2020.01); G01S 7/41 (2006.01); G01S 13/86 (2006.01); G05D 1/00 (2006.01); G05D 1/02 (2020.01); G06T 7/60 (2017.01); G06V 20/56 (2022.01); G06F 18/25 (2023.01); G06F 18/22 (2023.01); G06F 18/213 (2023.01); G06V 10/80 (2022.01)
CPC G01S 13/931 (2013.01) [G01S 7/417 (2013.01); G01S 13/867 (2013.01); G05D 1/0088 (2013.01); G05D 1/0231 (2013.01); G05D 1/0257 (2013.01); G06F 18/213 (2023.01); G06F 18/22 (2023.01); G06F 18/253 (2023.01); G06T 7/60 (2013.01); G06V 10/80 (2022.01); G06V 20/56 (2022.01); G01S 2013/9318 (2020.01); G01S 2013/9319 (2020.01); G01S 2013/9321 (2013.01); G01S 2013/93185 (2020.01); G01S 2013/93276 (2020.01); G05D 2201/0213 (2013.01); G06T 2207/10044 (2013.01); G06T 2207/30252 (2013.01)] 28 Claims
OG exemplary drawing
 
1. A method, performed by an on-board computer of a host vehicle, of performing early fusion of camera and radar frames for object detection in one or more spatial domains, comprising:
receiving, from a camera sensor of the host vehicle, a plurality of camera frames;
receiving, from a radar sensor of the host vehicle, a plurality of radar frames;
performing a camera feature extraction process on a first camera frame of the plurality of camera frames to generate a first camera feature map;
performing a radar feature extraction process on a first radar frame of the plurality of radar frames to generate a first radar feature map, wherein the first radar frame corresponds in time to the first camera frame;
converting the first camera feature map, the first radar feature map, or both to a common spatial domain;
concatenating the first radar feature map and the first camera feature map to generate a first concatenated feature map in the common spatial domain;
performing object detection on the first concatenated feature map to detect one or more objects in the first concatenated feature map without performing object detection on the first camera feature map or the first radar feature map; and
estimating a width, length, or both of the one or more objects in a bird's eye view, after inverse perspective mapping, based on a bounding box in the first camera frame encapsulating each of the one or more objects.
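Claim 1 recites feature extraction on time-aligned camera and radar frames, conversion to a common spatial domain, channel-wise concatenation, and object detection only on the fused feature map. The following is a minimal, hypothetical PyTorch sketch of that early-fusion structure, not the claimed or commercial implementation: the module name EarlyFusionDetector, the channel counts, the two-layer convolutional branches, and the assumption that both inputs have already been resampled to the same bird's-eye-view grid (the common spatial domain) are illustrative choices not taken from the patent.

```python
# Illustrative early-fusion sketch (assumptions noted above), not the patented design.
import torch
import torch.nn as nn

class EarlyFusionDetector(nn.Module):
    def __init__(self, cam_channels=3, radar_channels=2, feat_channels=64, num_anchors=4):
        super().__init__()
        # Camera feature extraction branch (toy backbone for illustration).
        self.cam_backbone = nn.Sequential(
            nn.Conv2d(cam_channels, feat_channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_channels, feat_channels, 3, padding=1), nn.ReLU(),
        )
        # Radar feature extraction branch (e.g. a rasterized radar frame as input).
        self.radar_backbone = nn.Sequential(
            nn.Conv2d(radar_channels, feat_channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_channels, feat_channels, 3, padding=1), nn.ReLU(),
        )
        # Detection head runs only on the concatenated (fused) feature map,
        # never on the individual camera or radar feature maps.
        self.det_head = nn.Conv2d(2 * feat_channels, num_anchors * 6, 1)  # per anchor: score, x, y, w, l, yaw

    def forward(self, cam_bev, radar_bev):
        # Both inputs are assumed already converted to the common spatial domain (same H x W grid).
        cam_feat = self.cam_backbone(cam_bev)        # first camera feature map
        radar_feat = self.radar_backbone(radar_bev)  # first radar feature map
        fused = torch.cat([cam_feat, radar_feat], dim=1)  # concatenated feature map
        return self.det_head(fused)                  # detections from the fused map only
```

Concatenating along the channel dimension before any detection head is what distinguishes this "early" fusion from late fusion schemes that detect objects per sensor and merge the resulting boxes.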
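The final step of claim 1 estimates an object's width and/or length in a bird's-eye view after inverse perspective mapping of the camera-frame bounding box. Below is a hedged NumPy sketch of one way such a mapping could be computed; the function name bbox_to_bev_extent, the use of a single image-to-ground homography H from camera calibration, and the flat-ground assumption are all illustrative assumptions, not details taken from the patent.

```python
# Hypothetical inverse-perspective-mapping sketch under a flat-ground assumption.
import numpy as np

def bbox_to_bev_extent(bbox_xyxy, H):
    """bbox_xyxy: (x1, y1, x2, y2) in pixels; H: 3x3 image-to-ground homography (assumed known)."""
    x1, y1, x2, y2 = bbox_xyxy
    # Bottom corners of the bounding box, assumed to lie on the ground plane.
    corners_img = np.array([[x1, y2, 1.0], [x2, y2, 1.0]]).T   # 3 x 2 homogeneous points
    corners_gnd = H @ corners_img                               # map into the bird's-eye-view plane
    corners_gnd = corners_gnd[:2] / corners_gnd[2]              # normalize homogeneous coordinates
    # The lateral span between the mapped bottom corners gives a width estimate;
    # a length estimate would additionally need depth cues such as the radar return.
    return float(np.linalg.norm(corners_gnd[:, 0] - corners_gnd[:, 1]))
```

This only illustrates the geometric idea; the patent leaves open how the bounding box, the mapping, and the width/length estimate are combined in practice.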