US 12,307,712 B1
Method and apparatus for localizing intelligent vehicle in dynamic scene
Zhiwei Li, Beijing (CN); Jingwei Wang, Beijing (CN); Ruosen Hao, Beijing (CN); Yang Zhou, Beijing (CN); Wei Zhang, Beijing (CN); Kunfeng Wang, Beijing (CN); Li Wang, Beijing (CN); Qifan Tan, Beijing (CN); and Tianyu Shen, Beijing (CN)
Assigned to Beijing University of Chemical Technology, Beijing (CN)
Filed by Beijing University of Chemical Technology, Beijing (CN)
Filed on Jan. 17, 2025, as Appl. No. 19/026,644.
Claims priority of application No. 202410435956.8 (CN), filed on Apr. 11, 2024.
Int. Cl. G06T 7/73 (2017.01); G06T 7/20 (2017.01); G06T 7/521 (2017.01); G06V 10/75 (2022.01); G06V 20/58 (2022.01)
CPC G06T 7/74 (2017.01) [G06T 7/20 (2013.01); G06T 7/521 (2017.01); G06V 10/751 (2022.01); G06V 20/58 (2022.01); G06T 2207/10016 (2013.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/10044 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/30252 (2013.01); G06V 2201/07 (2022.01)] 16 Claims
OG exemplary drawing
 
1. A method for localizing an intelligent vehicle in a dynamic scene, comprising:
acquiring first point cloud data collected by a 4D millimeter wave radar, second point cloud data collected by a lidar, and a red green blue (RGB) image collected by an RGB camera, for a target area at a current time;
processing the RGB image using a target detection model to determine a rectangular box of a movable object;
converting the first point cloud data and the second point cloud data to a pixel coordinate system, and dividing the movable object into a static object and a dynamic object based on a velocity of the first point cloud data in the rectangular box of the movable object;
converting the second point cloud data of the movable object in the pixel coordinate system to a map coordinate system to obtain a semantic point cloud map;
converting the second point cloud data of a connected area outside the rectangular box of the movable object, and of a rectangular box of the static object, in the pixel coordinate system to the map coordinate system to obtain a static point cloud map;
determining observation weights for objects in the second point cloud data based on the static point cloud map and the semantic point cloud map; and
determining pose information of the intelligent vehicle at the current time using the observation weights of the objects in the second point cloud data;
wherein the method further comprises:
acquiring a training sample set comprising consecutive sequence frame data of the intelligent vehicle, wherein each frame of the consecutive sequence frame data comprises a first point cloud data sample, a second point cloud data sample, an RGB image sample, and an actual pose of the intelligent vehicle;
processing the RGB image sample using the target detection model to determine the rectangular box of the movable object;
converting the first point cloud data sample and the second point cloud data sample to the pixel coordinate system, and dividing the movable object into the static object and the dynamic object based on a velocity of the first point cloud data sample in the rectangular box of the movable object;
converting the second point cloud data sample of the movable object in the pixel coordinate system to the map coordinate system to obtain a semantic point cloud map sample;
converting the second point cloud data sample of the connected area outside the rectangular box of the movable object, and of the rectangular box of the static object, in the pixel coordinate system to the map coordinate system to obtain a static point cloud map sample;
determining the observation weights for the objects in the second point cloud data sample based on the static point cloud map sample and the semantic point cloud map sample;
acquiring a distance value from the lidar for each point in the second point cloud data sample;
calculating a third distance value in the static point cloud map sample for the distance value from the lidar for each point, and a fourth distance value in the semantic point cloud map sample for the distance value from the lidar for each point, and determining a first difference between the third distance value and the fourth distance value;
determining an observation weight corresponding to the first difference between the third distance value and the fourth distance value according to an equation for a relation between the observation weight and the first difference: W=f(δ, Δd); wherein f(δ, Δd) is an inverse exponential function, W is the observation weight, Δd is the first difference, δ is a parameter to be learned, and W takes values in the range [0, 1];
determining predicted pose information of the intelligent vehicle at a current frame using the observation weights of the objects in the second point cloud data sample;
calculating a position error using predicted position information and actual position information of the intelligent vehicle at the current frame; and
updating the parameter δ using the position error.
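The claimed steps of converting point cloud data to the pixel coordinate system (and from there to the map coordinate system) are standard rigid-body and pinhole-camera transforms. A minimal Python sketch follows, assuming a known 4×4 lidar-to-camera extrinsic matrix T_cam_lidar and a 3×3 camera intrinsic matrix K; the function and parameter names are illustrative, not taken from the patent.

```python
import numpy as np

def lidar_to_pixel(points_xyz, T_cam_lidar, K):
    """Project lidar points (N, 3) into the pixel coordinate system.

    T_cam_lidar: 4x4 extrinsic transform from the lidar frame to the
    camera frame; K: 3x3 camera intrinsic matrix. Points behind the
    camera are discarded before the perspective division.
    """
    homo = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])
    cam = (T_cam_lidar @ homo.T).T[:, :3]
    cam = cam[cam[:, 2] > 0]            # keep points in front of the camera
    uvw = (K @ cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]     # (u, v) pixel coordinates
```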
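The division of the movable object into a static object and a dynamic object relies on the velocity channel of the 4D millimeter wave radar points that fall inside each detected rectangular box. The sketch below, in the same spirit, labels a box dynamic when the mean absolute radial velocity of the radar points inside it exceeds a threshold; the 0.5 m/s threshold and the mean-velocity decision rule are assumptions, since the claim does not fix a particular test.

```python
import numpy as np

def split_static_dynamic(radar_uv, radar_vel, boxes, vel_thresh=0.5):
    """Label each movable-object box as 'static' or 'dynamic'.

    radar_uv: (N, 2) radar points in pixel coordinates; radar_vel: (N,)
    radial velocities from the 4D radar; boxes: iterable of (x1, y1, x2, y2).
    """
    labels = []
    for x1, y1, x2, y2 in boxes:
        in_box = ((radar_uv[:, 0] >= x1) & (radar_uv[:, 0] <= x2) &
                  (radar_uv[:, 1] >= y1) & (radar_uv[:, 1] <= y2))
        moving = in_box.any() and np.abs(radar_vel[in_box]).mean() > vel_thresh
        labels.append("dynamic" if moving else "static")
    return labels
```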
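The claim constrains the weighting function f(δ, Δd) only to be an inverse exponential with outputs in [0, 1]; it does not give a closed form. One function satisfying those properties is a Gaussian-style kernel, sketched below; the exact form used in the patent may differ.

```python
import numpy as np

def observation_weight(dd, delta):
    """W = f(delta, dd): one inverse exponential choice with W in [0, 1].

    dd: first difference between the static-map and semantic-map distance
    values for a point; delta: the learnable parameter. W decays toward 0
    as |dd| grows, so points on which the two maps disagree (likely
    dynamic) contribute less to the pose estimate.
    """
    return np.exp(-(dd ** 2) / (delta ** 2))
```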
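Finally, the parameter δ is updated from the position error between the predicted and actual pose at the current frame. The claim does not name an optimizer; a minimal sketch using a central-difference numerical gradient and plain gradient descent is shown below, with `position_error` standing in for one run of the weighted localization on a training frame (a hypothetical callable, not an API from the patent).

```python
def update_delta(delta, position_error, lr=1e-2, eps=1e-4):
    """One gradient-descent step on the learnable parameter delta.

    position_error: callable mapping a candidate delta to the distance
    between the predicted and actual vehicle positions for one frame.
    """
    grad = (position_error(delta + eps) - position_error(delta - eps)) / (2 * eps)
    return delta - lr * grad
```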