US 12,249,159 B2
Systems and methods for detecting objects based on lidar data
Prasanna Sivakumar, Pittsburgh, PA (US); Kris Kitani, Pittsburgh, PA (US); Matthew Patrick O'Toole, Pittsburgh, PA (US); Xinshuo Weng, Pittsburgh, PA (US); and Shawn Hunt, Bethel Park, PA (US)
Assigned to DENSO CORPORATION, Kariya (JP)
Filed by DENSO CORPORATION, Aichi (JP)
Filed on Mar. 30, 2022, as Appl. No. 17/708,745.
Claims priority of provisional application 63/262,211, filed on Oct. 7, 2021.
Prior Publication US 2023/0112664 A1, Apr. 13, 2023
Int. Cl. G06V 20/58 (2022.01); G01S 7/4914 (2020.01); G01S 17/894 (2020.01); G06V 10/44 (2022.01); G06V 10/764 (2022.01)
CPC G06V 20/58 (2022.01) [G01S 17/894 (2020.01); G06V 10/454 (2022.01); G06V 10/764 (2022.01); G01S 7/4914 (2013.01)]
20 Claims
 
15. A system for detecting one or more objects based on lidar data obtained from a lidar sensor of a vehicle, wherein the lidar data corresponds to an ambient environment of the vehicle, the system comprising:
one or more processors and one or more nontransitory computer-readable mediums storing instructions that are executable by the one or more processors, wherein the instructions comprise:
generating a plurality of lidar inputs based on the lidar data, wherein:
each lidar input from among the plurality of lidar inputs comprises an image-based portion and a geometric-based portion;
each lidar input from among the plurality of lidar inputs defines a position coordinate of the one or more objects; and
the image-based portion defines a pixel value associated with the position coordinate and at least one of a light intensity value and a surface reflectance value associated with the position coordinate;
performing, for each lidar input from among the plurality of lidar inputs, a convolutional neural network (CNN) routine based on the image-based portion to generate one or more image-based outputs;
assigning the plurality of lidar inputs to a plurality of echo groups based on the geometric-based portion;
concatenating the one or more image-based outputs and the plurality of echo groups to generate a plurality of fused outputs; and
identifying the one or more objects based on the plurality of fused outputs.
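
The data flow recited in claim 15 can be illustrated with a minimal sketch. It is not the patented implementation. The names LidarInput, conv3x3, assign_echo_group, fuse, and identify are hypothetical. The sketch assumes that the geometric-based portion carries an echo index, that echo groups separate first returns from later returns, that a single 3x3 filter with ReLU stands in for the CNN routine, and that a threshold stands in for the identification step. The claim specifies none of these details.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LidarInput:
    """One lidar input (hypothetical layout, not taken from the patent)."""
    position: Tuple[int, int]  # position coordinate as a pixel (row, col)
    intensity: float           # image-based portion: light intensity value
    echo_index: int            # geometric-based portion (assumed): 0 = first return


def conv3x3(image: List[List[float]], kernel: List[List[float]]) -> List[List[float]]:
    """Zero-padded 3x3 cross-correlation followed by ReLU.

    A single-filter stand-in for the CNN routine; a real network would
    stack many learned filters.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += image[rr][cc] * kernel[dr + 1][dc + 1]
            out[r][c] = max(acc, 0.0)
    return out


def assign_echo_group(inp: LidarInput) -> int:
    """Assumed grouping rule: group 0 = first return, group 1 = later returns."""
    return 0 if inp.echo_index == 0 else 1


def fuse(inputs: List[LidarInput], height: int, width: int,
         kernel: List[List[float]]) -> List[List[float]]:
    """Concatenate each input's image-based output with a one-hot echo group."""
    # Rasterize the image-based portion into an intensity image.
    image = [[0.0] * width for _ in range(height)]
    for inp in inputs:
        r, c = inp.position
        image[r][c] = max(image[r][c], inp.intensity)
    features = conv3x3(image, kernel)
    fused = []
    for inp in inputs:
        r, c = inp.position
        group = assign_echo_group(inp)
        one_hot = [1.0 if g == group else 0.0 for g in range(2)]
        fused.append([features[r][c]] + one_hot)
    return fused


def identify(fused: List[List[float]], threshold: float) -> List[int]:
    """Toy identification: first-return inputs whose feature meets the threshold."""
    return [i for i, f in enumerate(fused) if f[0] >= threshold and f[1] == 1.0]


inputs = [
    LidarInput((1, 1), 1.0, 0),
    LidarInput((1, 2), 1.0, 0),
    LidarInput((0, 0), 0.1, 1),
]
box_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # assumed averaging filter
fused = fuse(inputs, height=3, width=3, kernel=box_kernel)
detected = identify(fused, threshold=0.2)
```

Each fused output here is a three-element vector: one image-based feature followed by a two-element one-hot echo-group code. In this toy data the two bright first-return inputs are selected and the dim later return is not.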