US 11,720,110 B2
Dynamic obstacle avoidance method based on real-time local grid map construction
Changyin Sun, Hefei (CN); Qiaoyun Wu, Hefei (CN); and Lu Ren, Hefei (CN)
Assigned to Anhui University, Hefei (CN)
Filed by Anhui University, Hefei (CN)
Filed on Dec. 28, 2022, as Appl. No. 18/147,484.
Claims priority of application No. 202210284948.9 (CN), filed on Mar. 23, 2022.
Prior Publication US 2023/0161352 A1, May 25, 2023
Int. Cl. G05D 1/00 (2006.01); G05D 1/02 (2020.01)
CPC G05D 1/0214 (2013.01) [G05D 1/0088 (2013.01); G05D 1/024 (2013.01); G05D 1/0221 (2013.01); G05D 1/0248 (2013.01); G05D 1/0274 (2013.01)] 4 Claims
 
1. A dynamic obstacle avoidance method based on real-time local grid map construction, comprising:
(a) acquiring Red-Green-Blue-Depth (RGBD) image data of a real indoor scene by using a RGBD sensor;
(b) inputting the RGBD image data into a trained obstacle detection and semantic segmentation network to extract obstacles of different types and semantic segmentation results in the real indoor scene;
(c) combining the obstacles and semantic segmentation results with an internal parameter of the RGBD sensor and depth information acquired by the RGBD sensor; and generating three-dimensional (3D) point cloud data with semantic information of the real indoor scene by using a spatial mapping relationship between two-dimensional (2D) and 3D spaces;
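By way of illustration only, a minimal sketch of the 2D-to-3D mapping recited in step (c), assuming a pinhole camera model; the function name, the intrinsic parameters fx, fy, cx, cy, and the array layout are assumptions rather than limitations of the claim:

```python
import numpy as np

def depth_to_semantic_point_cloud(depth, labels, fx, fy, cx, cy):
    """Back-project a depth image into a 3D point cloud, attaching the
    per-pixel semantic label from the segmentation result (step (c)).

    depth  : (H, W) depth map in meters from the RGBD sensor
    labels : (H, W) integer semantic class per pixel
    fx, fy, cx, cy : assumed pinhole intrinsics of the RGBD sensor
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinate grid
    z = depth
    x = (u - cx) * z / fx                            # standard pinhole back-projection
    y = (v - cy) * z / fy
    valid = z > 0                                    # drop pixels with no depth reading
    points = np.stack([x[valid], y[valid], z[valid]], axis=1)
    semantics = labels[valid]
    return points, semantics                         # (N, 3) points with per-point labels
```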
(d) according to the 3D point cloud data generated in step (c), extracting state information of a dynamic obstacle and inputting the state information into a trained dynamic obstacle trajectory prediction model, and predicting a trajectory of the dynamic obstacle in the real indoor scene;
wherein the step (d) comprises:
(d1) determining the state information of the dynamic obstacle; and designing a dynamic obstacle trajectory prediction model and a loss function of the dynamic obstacle trajectory prediction model;
wherein the step (d1) is performed through steps of:
(d11) for a set of dynamic obstacles ℛ={ri}i=1 . . . N currently detected, extracting a trajectory 𝒯i={(x1, y1), (x2, y2), . . . , (xτ, yτ)} corresponding to an i-th dynamic obstacle in first τ frames of data; and obtaining state information Ωi:=[𝒯i Δ𝒯i c l]T of the i-th dynamic obstacle;
wherein N represents the number of dynamic obstacles in the set of dynamic obstacles; i represents an index of individual dynamic obstacles; (xτ, yτ) represents a spatial position of the i-th dynamic obstacle in a τ-th frame; Δ represents a derivative operator; := represents a define symbol; c represents a vector composed of dynamic densities of the i-th dynamic obstacle in individual frames of data, expressed as c:=[c(x1, y1), . . . , c(xτ, yτ)]T; c(x,y) represents a dynamic density of the i-th dynamic obstacle at a spatial position (x,y); and l represents a size of the i-th dynamic obstacle;
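A minimal sketch of how the state information Ωi of step (d11) could be assembled; the finite-difference stand-in for the derivative operator Δ and the helper names are assumptions:

```python
import numpy as np

def build_obstacle_state(traj, density, size):
    """Assemble the state vector of one dynamic obstacle (step (d11)).

    traj    : (tau, 2) array of positions (x_1, y_1) ... (x_tau, y_tau)
    density : (tau,) dynamic densities c(x_t, y_t) along the trajectory
    size    : scalar size l of the obstacle
    """
    traj = np.asarray(traj, dtype=float)
    delta = np.diff(traj, axis=0, prepend=traj[:1])   # finite-difference stand-in for the derivative
    return np.concatenate([traj.ravel(),              # trajectory T_i
                           delta.ravel(),             # its temporal difference (for ΔT_i)
                           np.asarray(density),       # density vector c
                           [size]])                   # obstacle size l
```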
(d12) designing the dynamic obstacle trajectory prediction model, wherein the dynamic obstacle trajectory prediction model is formed by three one-dimensional (1D) convolutional layers and three fully-connected (FC) layers connected sequentially; the three 1D convolutional layers have 64, 32, and 32 channels, respectively; each of the three 1D convolutional layers has a kernel size of 3, and a stride of 1; the three FC layers have 256, 128, and 20 channels, respectively; and the last layer of the three FC layers outputs spatial positions {(xτ+1, yτ+1), (xτ+2, yτ+2), . . . , (xτ+10, yτ+10)} of the i-th dynamic obstacle in frames (τ+1) to (τ+10); and
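A sketch of the trajectory prediction model of step (d12) using the recited layer sizes (1D convolutions with 64/32/32 channels, kernel 3, stride 1; FC layers of 256/128/20 units); the input channel count, padding, activation functions, and flattening step are assumptions:

```python
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    """Trajectory prediction model sketched from step (d12); the final 20
    units are read out as the 10 predicted (x, y) positions."""

    def __init__(self, seq_len, in_channels=1):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
            nn.Conv1d(64, 32, kernel_size=3, stride=1, padding=1), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=3, stride=1, padding=1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(32 * seq_len, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 20),                     # 10 future (x, y) positions
        )

    def forward(self, state):                       # state: (batch, in_channels, seq_len)
        features = self.conv(state)
        return self.fc(features.flatten(1)).view(-1, 10, 2)
```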
(d13) based on a real trajectory 𝒯gti of the i-th dynamic obstacle, setting the loss function I of the dynamic obstacle trajectory prediction model, expressed as:

I=Σi=1N∥ϕ(wTΩi)−𝒯gti∥2
wherein w represents parameters of the dynamic obstacle trajectory prediction model; ϕ(wTΩi) represents a prediction trajectory of the i-th dynamic obstacle; and 𝒯gti represents a real trajectory of the i-th dynamic obstacle;
(d2) inputting the state information of the dynamic obstacle determined in step (d1) into the dynamic obstacle trajectory prediction model for training, until the loss function I of the dynamic obstacle trajectory prediction model converges, so as to obtain the trained dynamic obstacle trajectory prediction model; and
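A sketch of the training loop of step (d2) with the squared-error form of the loss I given above; the optimizer, learning rate, batching, and convergence test are assumptions:

```python
import torch

def train_until_converged(model, loader, lr=1e-3, tol=1e-4, max_epochs=500):
    """Train the trajectory predictor until the loss I stops improving (step (d2))."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    prev = float("inf")
    for epoch in range(max_epochs):
        total = 0.0
        for state, gt_traj in loader:               # Omega_i and the real trajectory T_gt^i
            pred = model(state)                     # phi(w^T Omega_i)
            loss = ((pred - gt_traj) ** 2).sum(dim=(1, 2)).mean()   # squared-error loss I
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        if abs(prev - total) < tol:                 # treat a small change as convergence
            break
        prev = total
    return model
```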
(d3) based on the 3D point cloud data, inputting the state information of the dynamic obstacle determined in step (d1) into the trained dynamic obstacle trajectory prediction model to predict the trajectory of the dynamic obstacle in the real indoor scene;
(e) according to the 3D point cloud data generated in step (c) and the trajectory of the dynamic obstacle predicted in step (d), building a real-time local grid map with a radius of 3 m centered on a mobile robot through steps of:
(e1) allowing the mobile robot to always face toward an east direction of the real-time local grid map; and setting a resolution of the real-time local grid map to 240×240, wherein each grid in the real-time local grid map represents a probability of the existence of an obstacle;
(e2) for a static obstacle in the real indoor scene, filling 1 in a corresponding grid on the real-time local grid map;
(e3) for the dynamic obstacle, filling 1 in a current grid; filling a probability in individual grids covered by a predicted trajectory of the dynamic obstacle based on a distance between individual grids and a position of the dynamic obstacle, wherein the probability of individual grids covered by the predicted trajectory is one minus a ratio of a distance from a grid to a center of the dynamic obstacle to a length of the predicted trajectory;
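A sketch of the local grid map construction of steps (e1)-(e3): a 240×240 grid of 3 m radius, 1 for static obstacles and for the current cell of a dynamic obstacle, and one minus the distance-to-trajectory-length ratio for cells covered by the predicted trajectory; the data structures and the path-length measure are assumptions:

```python
import numpy as np

RESOLUTION = 240                 # grid cells per side (step (e1))
RADIUS = 3.0                     # map half-width in meters, robot at the center
CELL = 2 * RADIUS / RESOLUTION

def to_cell(p):
    """Convert a robot-centered metric position (x east, y north) to grid indices."""
    col = int((p[0] + RADIUS) / CELL)
    row = int((p[1] + RADIUS) / CELL)
    return row, col

def build_local_grid(static_points, dynamic_obstacles):
    """Fill the 240x240 occupancy-probability grid of step (e).

    static_points    : iterable of (x, y) positions of static obstacle points
    dynamic_obstacles: iterable of (center, predicted_traj) pairs, where
                       predicted_traj is the (10, 2) output of step (d)
    """
    grid = np.zeros((RESOLUTION, RESOLUTION))
    for p in static_points:                         # step (e2): static obstacles -> 1
        r, c = to_cell(p)
        if 0 <= r < RESOLUTION and 0 <= c < RESOLUTION:
            grid[r, c] = 1.0
    for center, traj in dynamic_obstacles:          # step (e3): dynamic obstacles
        center = np.asarray(center, dtype=float)
        traj = np.asarray(traj, dtype=float)
        r, c = to_cell(center)
        if 0 <= r < RESOLUTION and 0 <= c < RESOLUTION:
            grid[r, c] = 1.0                        # current cell -> 1
        length = np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()  # predicted path length
        for q in traj:                              # cells covered by the predicted trajectory
            r, c = to_cell(q)
            if 0 <= r < RESOLUTION and 0 <= c < RESOLUTION:
                prob = 1.0 - np.linalg.norm(q - center) / max(length, 1e-6)
                grid[r, c] = max(grid[r, c], float(np.clip(prob, 0.0, 1.0)))
    return grid
```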
(f) based on the real-time local grid map, building a dynamic obstacle avoidance model based on a deep reinforcement learning framework; and sending a speed instruction to the mobile robot in real time to avoid the obstacles of different types during a navigation process;
wherein the step (f) is performed through steps of:
(f1) determining a state input, an action space and a reward function of the dynamic obstacle avoidance model through steps of:
(f11) determining the state input of the dynamic obstacle avoidance model at moment t, wherein the state input comprises lidar data lt, the real-time local grid map, a position pt of a navigation target relative to the mobile robot, and an angular velocity wtgt and a linear velocity vtgt of the mobile robot; wherein the lidar data lt is obtained by a lidar sensor of the mobile robot, lt∈R3×360, and R3×360 represents a matrix of 3×360;
(f12) determining the action space at the moment t, wherein the action space comprises an angular velocity wt+1 and a linear velocity vt+1 of the mobile robot at moment (t+1), wherein wt+1∈(−0.5,0.5), and vt+1∈(0.0,1.0); and
(f13) building the reward function rt=rtp+rtc+rtn+rtw of the dynamic obstacle avoidance model at the moment t, wherein rtp represents a reward of the mobile robot when continuously approaching the navigation target; if the mobile robot reaches the navigation target, rtp=10.0, otherwise, rtp=∥xt−1−xp∥−∥xt−xp∥; xt−1 represents a position of the mobile robot in a global coordinate system at moment (t−1); xp represents a position of the navigation target in the global coordinate system; xt represents a position of the mobile robot in the global coordinate system at the moment t; rtc represents a penalty when the mobile robot encounters a collision, wherein if the collision occurs, rtc=−10.0, otherwise, rtc=0.0; rtn represents a penalty when the mobile robot approaches an obstacle, wherein if dth>dob, rtn=−0.1*∥dob−dth∥, otherwise, rtn=0.0; dob represents a current distance between the mobile robot and the obstacle; dth represents a distance threshold; rtw represents a penalty when a rotation speed of the mobile robot exceeds a threshold, wherein if |Δwt|>0.3, rtw=−0.1* ∥Δwt∥, otherwise, rtw=0.0; and Δwt represents a difference between the angular velocity wt+1 and the angular velocity wtgt, expressed as Δwt=wt+1−wtgt;
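A sketch of the reward rt=rtp+rtc+rtn+rtw of step (f13) with the recited constants; the argument names are illustrative only:

```python
import numpy as np

def reward(x_prev, x_now, x_goal, reached, collided, d_ob, d_th, w_next, w_now):
    """Per-step reward r_t = r_t^p + r_t^c + r_t^n + r_t^w of step (f13)."""
    # progress toward the navigation target, or +10 on arrival
    r_p = 10.0 if reached else np.linalg.norm(x_prev - x_goal) - np.linalg.norm(x_now - x_goal)
    r_c = -10.0 if collided else 0.0                        # collision penalty
    r_n = -0.1 * abs(d_ob - d_th) if d_th > d_ob else 0.0   # penalty for approaching an obstacle
    dw = w_next - w_now                                     # change in angular velocity
    r_w = -0.1 * abs(dw) if abs(dw) > 0.3 else 0.0          # excessive rotation penalty
    return r_p + r_c + r_n + r_w
```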
(f2) building and training the dynamic obstacle avoidance model until the reward function converges;
(f3) inputting the state input determined in step (f11) into a trained dynamic obstacle avoidance model, and outputting the speed instruction to the mobile robot to avoid the obstacles during the navigation process;
wherein the dynamic obstacle avoidance model based on the deep reinforcement learning framework comprises a first module for extracting features from the lidar data, a second module for extracting features from the real-time local grid map, a third module for extracting features from a relative position of the navigation target to the mobile robot, a fourth module for extracting features from a current speed of the mobile robot, a three-layer perceptron, and a FC layer with three branches;
an output end of the first module, an output end of the second module, an output end of the third module, and an output end of the fourth module are connected to an input end of the three-layer perceptron; and an output end of the three-layer perceptron is connected to the FC layer with three branches;
a first branch of the FC layer is configured to output an average velocity of the mobile robot in two dimensions, and the average velocity comprises an average angular velocity and an average linear velocity; a second branch of the FC layer is configured to output a velocity variance of the mobile robot in two dimensions, and the velocity variance comprises an angular velocity variance and a linear velocity variance; and a third branch of the FC layer is configured to output a dynamic obstacle avoidance evaluation value of one dimension;
the first module is composed of three 1D convolutional layers and one 256-dimensional FC layer connected sequentially; the three 1D convolutional layers have 32, 16 and 16 channels, respectively; each of the three 1D convolutional layers has a kernel size of 3 and a stride of 1;
the second module is composed of three 2D convolutional layers, one 512-dimensional FC layer, and one 256-dimensional FC layer connected sequentially; the three 2D convolutional layers have 64, 64, and 32 channels, respectively; the three 2D convolutional layers have a kernel size of 5×5, 5×5, and 3×3, respectively, and a stride of each of the three 2D convolutional layers is 2; and
the third module and the fourth module are each formed by one 32-dimensional FC layer.
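A sketch of the dynamic obstacle avoidance network of step (f) with the recited module sizes (lidar branch: 1D convolutions of 32/16/16 channels plus a 256-dimensional FC layer; grid-map branch: 2D convolutions of 64/64/32 channels with 5×5, 5×5, 3×3 kernels and stride 2, then 512- and 256-dimensional FC layers; 32-dimensional FC layers for the goal position and current velocity; a three-layer perceptron; and a three-branch output). The activations, padding, grid-map channel count, perceptron width, and flatten sizes are assumptions:

```python
import torch
import torch.nn as nn

class AvoidanceNet(nn.Module):
    """Actor-critic style network sketched from the modules recited in step (f)."""

    def __init__(self, hidden=256):
        super().__init__()
        # first module: lidar data l_t in R^{3x360}
        self.lidar = nn.Sequential(
            nn.Conv1d(3, 32, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv1d(32, 16, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv1d(16, 16, 3, stride=1, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * 360, 256), nn.ReLU(),
        )
        # second module: 240x240 real-time local grid map
        self.grid = nn.Sequential(
            nn.Conv2d(1, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 30 * 30, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
        )
        # third / fourth modules: relative goal position and current velocity
        self.goal = nn.Sequential(nn.Linear(2, 32), nn.ReLU())
        self.vel = nn.Sequential(nn.Linear(2, 32), nn.ReLU())
        # three-layer perceptron over the concatenated features
        self.mlp = nn.Sequential(
            nn.Linear(256 + 256 + 32 + 32, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # FC layer with three branches: average velocity, velocity variance, evaluation value
        self.mean = nn.Linear(hidden, 2)
        self.var = nn.Linear(hidden, 2)
        self.value = nn.Linear(hidden, 1)

    def forward(self, lidar, grid_map, goal, vel):
        f = torch.cat([self.lidar(lidar), self.grid(grid_map),
                       self.goal(goal), self.vel(vel)], dim=1)
        h = self.mlp(f)
        return self.mean(h), self.var(h), self.value(h)
```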