US 11,748,449 B2
Data processing method, data processing apparatus, electronic device and storage medium
Yao Zhou, Beijing (CN); Guowei Wan, Beijing (CN); Shenhua Hou, Beijing (CN); and Shiyu Song, Beijing (CN)
Assigned to Beijing Baidu Netcom Science and Technology Co., Ltd., Beijing (CN)
Filed by Beijing Baidu Netcom Science and Technology Co., Ltd., Beijing (CN)
Filed on Nov. 25, 2020, as Appl. No. 17/105,027.
Prior Publication US 2022/0164603 A1, May 26, 2022
Int. Cl. G06V 10/22 (2022.01); G06F 18/214 (2023.01); G06N 3/08 (2023.01); G06F 18/22 (2023.01); G06F 18/2415 (2023.01)
CPC G06F 18/2148 (2023.01) [G06F 18/22 (2023.01); G06F 18/2415 (2023.01); G06N 3/08 (2013.01); G06V 10/22 (2022.01)]
20 Claims
 
1. A data processing method, comprising:
inputting a reference image and a captured image into a feature extraction model, respectively, to obtain a first descriptor map and a second descriptor map, the captured image being obtained by capturing an external environment from a vehicle when the vehicle is in a real pose, the reference image being obtained by pre-capturing the external environment by a capturing device;
obtaining, based on the first descriptor map, a set of reference descriptors corresponding to a set of keypoints in the reference image;
determining a plurality of sets of training descriptors corresponding to a set of spatial coordinates when the vehicle is in a plurality of training poses, respectively, the plurality of sets of training descriptors belonging to the second descriptor map, the set of spatial coordinates being determined based on the set of keypoints, the plurality of training poses being obtained by offsetting a known pose based on the real pose;
obtaining a predicted pose of the vehicle by inputting the plurality of training poses and a plurality of similarities into a pose prediction model, the plurality of similarities being between the plurality of sets of training descriptors and the set of reference descriptors; and
training the feature extraction model and the pose prediction model based on a metric representing a difference between the predicted pose and the real pose, in order to apply the trained feature extraction model and the trained pose prediction model to vehicle localization,
wherein one of the following:
the pose prediction model provides, based on the plurality of similarities, probabilities that the plurality of training poses are real poses, respectively, and the metric comprises a concentration of distribution of the probabilities; or
the pose prediction model generates a plurality of regularized similarities based on the plurality of similarities, and the metric is determined based on the plurality of regularized similarities.
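The data flow recited in claim 1 can be illustrated with a small numerical sketch. This is not the patented implementation. It assumes a pose is an (x, y, yaw) tuple, that training poses lie on a regular grid of offsets around the known pose, that similarity is a negative mean squared descriptor distance, that the pose prediction model is a softmax followed by a probability-weighted average, and that "concentration" is measured as the probability-weighted absolute deviation from the real pose. All function names and parameters (offset_poses, similarity, predict_pose, training_metric, steps, radius, weight) are hypothetical. Only the first alternative of the final "wherein" clause is sketched, and the gradient-based training of the two models is omitted.

```python
import math
from itertools import product


def offset_poses(known_pose, steps, radius=1):
    """Build candidate ("training") poses on a grid of offsets around known_pose."""
    grids = [[k * s for k in range(-radius, radius + 1)] for s in steps]
    return [tuple(p + d for p, d in zip(known_pose, delta))
            for delta in product(*grids)]


def similarity(training_descriptors, reference_descriptors):
    """Negative mean squared distance between paired descriptors (assumed measure)."""
    total = 0.0
    for t, r in zip(training_descriptors, reference_descriptors):
        total += sum((a - b) ** 2 for a, b in zip(t, r))
    return -total / len(reference_descriptors)


def softmax(scores):
    """Numerically stable softmax: turns similarities into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def predict_pose(poses, similarities):
    """Assumed pose prediction model: expectation of the poses under softmax weights."""
    probs = softmax(similarities)
    dims = len(poses[0])
    predicted = tuple(sum(p * pose[i] for p, pose in zip(probs, poses))
                      for i in range(dims))
    return predicted, probs


def training_metric(poses, probs, predicted, real_pose, weight=1.0):
    """Squared pose error plus an assumed concentration term around the real pose."""
    pose_error = sum((a - b) ** 2 for a, b in zip(predicted, real_pose))
    spread = sum(p * sum(abs(a - b) for a, b in zip(pose, real_pose))
                 for p, pose in zip(probs, poses))
    return pose_error + weight * spread


# Toy usage: the known pose is the real pose with a deliberate offset.
real_pose = (0.5, 0.0, 0.1)
known_pose = (0.0, 0.0, 0.0)
poses = offset_poses(known_pose, steps=(0.5, 0.5, 0.1))

# Stand-in similarities; in the claim they come from comparing descriptors.
sims = [-10.0 * sum((a - b) ** 2 for a, b in zip(p, real_pose)) for p in poses]

predicted, probs = predict_pose(poses, sims)
loss = training_metric(poses, probs, predicted, real_pose)
print(predicted, loss)
```

In a trainable version, both the descriptors and the softmax stage would need to be differentiable so that the metric can update the feature extraction model and the pose prediction model together. The pure-Python form here only shows how the claimed quantities relate to one another.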