US 12,080,045 B2
Obstacle recognition method and apparatus, computer device, and storage medium
Shixi Chen, Guangzhou (CN); Yuchen Zhou, Guangzhou (CN); Hua Zhong, Guangzhou (CN); and Xu Han, Guangzhou (CN)
Assigned to GUANGZHOU WERIDE TECHNOLOGY CO., LTD., (CN)
Appl. No. 17/601,005
Filed by GUANGZHOU WERIDE TECHNOLOGY CO., LTD., Guangdong (CN)
PCT Filed Apr. 17, 2019, PCT No. PCT/CN2019/083083
§ 371(c)(1), (2) Date Oct. 1, 2021.
PCT Pub. No. WO2020/206708, PCT Pub. Date Oct. 15, 2020.
Claims priority of application No. 201910278795.5 (CN), filed on Apr. 9, 2019.
Prior Publication US 2022/0198808 A1, Jun. 23, 2022
Int. Cl. G01S 17/89 (2020.01); G06T 3/02 (2024.01); G06T 7/73 (2017.01); G06T 11/00 (2006.01); G06V 10/44 (2022.01); G06V 10/80 (2022.01); G06V 10/82 (2022.01); G06V 20/58 (2022.01)
CPC G06V 10/454 (2022.01) [G01S 17/89 (2013.01); G06T 3/02 (2024.01); G06T 7/74 (2017.01); G06T 11/00 (2013.01); G06V 10/806 (2022.01); G06V 10/82 (2022.01); G06V 20/58 (2022.01); G06T 2207/10028 (2013.01); G06T 2207/20068 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30261 (2013.01)] 10 Claims
OG exemplary drawing
 
1. An obstacle recognition method, wherein the method comprises:
acquiring point cloud data scanned by a Light Detection and Ranging (LiDAR) and time-sequence pose information of a vehicle;
determining a spliced image of a bird's eye view according to the point cloud data, the time-sequence pose information, and a historical frame embedded image, comprising:
determining a grid embedded image of the bird's eye view according to the point cloud data;
determining a conversion image of the historical frame embedded image according to the time-sequence pose information and the historical frame embedded image, comprising:
calculating an affine transformation parameter from a historical frame to a current frame according to the time-sequence pose information; and
transforming the historical frame embedded image by translation and rotation according to the affine transformation parameter to obtain the conversion image of the historical frame embedded image;
splicing the grid embedded image of the bird's eye view and the conversion image of the historical frame embedded image to obtain the spliced image of the bird's eye view;
inputting the spliced image into a preset first CNN model to obtain a current frame embedded image and pixel-level information of the bird's eye view; and
determining recognition information of at least one obstacle according to the current frame embedded image and the pixel-level information, comprising:
determining attribute information of the at least one obstacle according to the pixel-level information, the attribute information comprising position information and size information of the obstacle;
determining pixel-level embedding of each obstacle from the current frame embedded image according to the attribute information of each obstacle; and
inputting the pixel-level embedding of each obstacle into a preset neural network model to obtain recognition information of each obstacle, wherein the neural network model comprises a third CNN model and a second FC network model; and
the inputting the pixel-level embedding of each obstacle into a preset neural network model to obtain recognition information of each obstacle comprises:
inputting the pixel-level embedding of each obstacle into the third CNN model to obtain object-level embedding of each obstacle; and
inputting the object-level embedding into the second FC network model to obtain the recognition information of the at least one obstacle.
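The affine-transform-and-splice steps recited in the claim can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the time-sequence poses are (x, y, yaw) tuples already expressed in bird's-eye-view grid cells, uses nearest-neighbour sampling with zero fill for the warp, and models "splicing" as channel-wise concatenation. All function names and shapes are assumptions for illustration.

```python
import numpy as np

def affine_params(pose_hist, pose_curr):
    """Compute a 2x3 affine parameter (rotation + translation) mapping
    the historical frame into the current frame from (x, y, yaw) poses."""
    dyaw = pose_curr[2] - pose_hist[2]
    c, s = np.cos(dyaw), np.sin(dyaw)
    R = np.array([[c, -s], [s, c]])
    t = np.array(pose_curr[:2]) - R @ np.array(pose_hist[:2])
    return np.hstack([R, t[:, None]])  # 2x3 matrix [R | t]

def warp_embedded_image(img, M):
    """Translate/rotate an (H, W, C) embedded image with nearest-neighbour
    sampling; output pixels that map outside the source stay zero."""
    H, W, C = img.shape
    out = np.zeros_like(img)
    A, t = M[:, :2], M[:, 2]
    inv = np.linalg.inv(A)
    ys, xs = np.mgrid[0:H, 0:W]
    # Map every output pixel back into the historical frame.
    src = np.einsum('ij,jhw->ihw', inv, np.stack([xs - t[0], ys - t[1]]))
    sx = np.round(src[0]).astype(int)
    sy = np.round(src[1]).astype(int)
    ok = (sx >= 0) & (sx < W) & (sy >= 0) & (sy < H)
    out[ys[ok], xs[ok]] = img[sy[ok], sx[ok]]
    return out

def splice(grid_embed, warped_hist):
    """Concatenate the current BEV grid embedded image with the warped
    historical embedded image along the channel axis; the result is the
    spliced image fed to the first CNN model."""
    return np.concatenate([grid_embed, warped_hist], axis=-1)
```

The spliced image would then be passed to the first CNN model (not shown here) to produce the current frame embedded image and the pixel-level information.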
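The per-obstacle recognition steps (cropping pixel-level embedding by attribute box, the third CNN model, and the second FC network model) can likewise be sketched. The two networks are replaced here by trivial stand-ins (global average pooling and a random linear layer) purely to show the data flow; the box format (row, col, height, width) is an assumption derived from the position and size attributes.

```python
import numpy as np

rng = np.random.default_rng(0)

def global_avg_pool(x):
    """Stand-in for the third CNN model: collapse an (h, w, C) pixel-level
    embedding into a C-dimensional object-level embedding."""
    return x.mean(axis=(0, 1))

def make_fc(n_in, n_classes):
    """Stand-in for the second FC network model: one random linear layer
    followed by argmax over class scores."""
    W = rng.standard_normal((n_classes, n_in))
    return lambda v: int(np.argmax(W @ v))

def recognize_obstacles(frame_embed, boxes, cnn, fc):
    """For each obstacle box (row, col, height, width) taken from the
    pixel-level attribute information, crop its pixel-level embedding
    from the current frame embedded image, reduce it to an object-level
    embedding, and classify it."""
    results = []
    for r, c, h, w in boxes:
        pixel_embed = frame_embed[r:r + h, c:c + w, :]
        results.append(fc(cnn(pixel_embed)))
    return results
```

In practice both stand-ins would be trained models; the sketch only mirrors the claimed two-stage structure of pixel-level embedding → object-level embedding → recognition information.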