US 11,704,825 B2
Method for acquiring distance from moving body to at least one object located in any direction of moving body by utilizing camera-view depth map and image processing device using the same
Chang Hee Won, Seoul (KR)
Assigned to MULTIPLEYE CO., LTD., Seoul (KR)
Filed by MULTIPLEYE CO., LTD., Seoul (KR)
Filed on Jul. 15, 2022, as Appl. No. 17/865,855.
Claims priority of application No. 10-2021-0126023 (KR), filed on Sep. 23, 2021.
Prior Publication US 2023/0086983 A1, Mar. 23, 2023
Int. Cl. G06T 7/593 (2017.01); H04N 13/271 (2018.01); H04N 13/156 (2018.01); G06T 7/70 (2017.01); G06T 7/10 (2017.01); G06T 7/13 (2017.01); G06T 7/60 (2017.01); H04N 23/90 (2023.01); H04N 13/282 (2018.01); H04N 13/00 (2018.01)
CPC G06T 7/593 (2017.01) [G06T 7/10 (2017.01); G06T 7/13 (2017.01); G06T 7/60 (2013.01); G06T 7/70 (2017.01); H04N 13/156 (2018.05); H04N 13/271 (2018.05); H04N 23/90 (2023.01); G06T 2207/10012 (2013.01); G06T 2207/20021 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30244 (2013.01); G06T 2207/30252 (2013.01); H04N 13/282 (2018.05); H04N 2013/0081 (2013.01); H04N 2013/0092 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method for acquiring a distance from a moving body to at least one object located in any direction of the moving body, comprising steps of:
(a) acquiring, at an image processing device, a 1-st image to a p-th image generated by a 1-st camera to a p-th camera, spaced apart from one another on the moving body and capable of covering all directions of the moving body by using each of their respective Field of Views (FOVs), and inputting, by the image processing device, the 1-st image to the p-th image to a sweep network and instructing the sweep network to (i) project a plurality of pixels on each of the 1-st image to the p-th image onto N main virtual geometries, respectively formed on a basis of a predetermined main reference point or a predetermined main reference virtual geometry, to thereby generate a plurality of main stereoscopic images, and (ii) apply three-dimensional (3D) concatenation operation to the plurality of the main stereoscopic images and thus generate an initial four-dimensional (4D) cost volume;
(b) inputting, by the image processing device, the initial 4D cost volume to a cost volume computation network, including a plurality of 3D convolution layers and their corresponding 3D deconvolution layers, to thereby generate a final main 3D cost volume; and
(c) performing, by the image processing device, (i) processes of (i-1) projecting a plurality of pixels on a k-th image, generated from a k-th camera among the 1-st camera to the p-th camera, onto M k-th sub virtual geometries, respectively formed on a basis of a predetermined k-th sub reference point, to thereby generate a plurality of k-th sub stereoscopic images, (i-2) converting each of positions of each of a plurality of pixels on the plurality of the k-th sub stereoscopic images on the basis of the predetermined main reference point, to thereby acquire k-th 3D pixel coordinates, (i-3) applying the k-th 3D pixel coordinates to the final main 3D cost volume to thereby acquire a k-th sub cost factor, (i-4) generating a k-th sub cost volume by using the k-th sub cost factor, (i-5) generating one or more k-th sub inverse distance indices corresponding to one or more k-th sub inverse distances by using the k-th sub cost volume, wherein each of the k-th sub inverse distances represents each of inverse values of each of a (k_1)-st sub separation distance to a (k_M)-th sub separation distance, each of the (k_1)-st sub separation distance to the (k_M)-th sub separation distance meaning each of distances between the predetermined k-th sub reference point of the k-th camera and the M k-th sub virtual geometries, and (i-6) acquiring at least one sub separation distance among the (k_1)-st sub separation distance to the (k_M)-th sub separation distance by referring to the k-th sub inverse distances extracted from the k-th sub inverse distance indices, and (ii) processes of (ii-1) generating one or more main inverse distance indices corresponding to one or more main inverse distances by using the final main 3D cost volume, wherein each of the main inverse distances represents each of inverse values of each of N main separation distances, each of the N main separation distances meaning each of distances between the predetermined main reference point and the N main virtual geometries or between the predetermined main reference virtual geometry and the N main virtual geometries, (ii-2) acquiring at least one main separation distance among the N main separation distances by referring to the main inverse distances extracted from the main inverse distance indices, and thus (ii-3) acquiring a distance from the moving body to at least one object located in any direction of the moving body by referring to the at least one sub separation distance among the (k_1)-st sub separation distance to the (k_M)-th sub separation distance and the at least one main separation distance among the N main separation distances.
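The sweep of step (a) — projecting each camera's pixels onto N candidate virtual geometries and stacking the results into a cost volume — can be illustrated with a deliberately tiny analogue. The sketch below is a hypothetical 1-D two-camera plane sweep, not the patented multi-camera spherical sweep: every candidate shell at depth d implies a pixel shift (disparity), and stacking per-candidate photometric costs yields the cost volume.

```python
import numpy as np

F, B, W = 100.0, 0.5, 64                    # focal (px), baseline (m), width (px)

def render(z_true, cam_x):
    """1-D pinhole image of a single world point at (x=0, z=z_true):
    one bright pixel at its projection into the camera at x=cam_x."""
    img = np.zeros(W)
    u = int(round(W / 2 + F * (0.0 - cam_x) / z_true))
    img[u] = 1.0
    return img

def sweep_cost_volume(img_ref, img_other, depths):
    """Toy analogue of claim step (a): warp the second image onto a
    candidate shell at each depth d (shift by the implied disparity
    F*B/d) and stack per-pixel photometric costs into an (N, W) volume."""
    cost = np.empty((len(depths), W))
    for n, d in enumerate(depths):
        disp = int(round(F * B / d))        # pixel shift implied by shell d
        warped = np.roll(img_other, disp)   # crude warp; real sweeps resample
        cost[n] = (img_ref - warped) ** 2
    return cost
```

In the patent the p cameras cover all directions, the N geometries surround the main reference point, and step (b) refines the raw concatenated volume with 3D convolution/deconvolution layers; in this toy, a plain argmin over the raw costs already recovers the depth of the rendered point.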
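Steps (i-5)/(i-6) and (ii-1)/(ii-3) read metric distances out of a cost volume via inverse distance indices. A minimal sketch of that readout, assuming uniform sampling in inverse distance and a soft-argmin over the candidate axis — both common choices in sweep-stereo systems, neither spelled out verbatim by the claim:

```python
import numpy as np

def inverse_distance_candidates(d_min, d_max, n):
    """N shells sampled uniformly in inverse distance (1/d), matching the
    claim's 'inverse distance indices': near range gets finer resolution."""
    return np.linspace(1.0 / d_max, 1.0 / d_min, n)   # ascending 1/d

def soft_argmin_distance(cost, inv_dists):
    """Toy analogue of steps (ii-1)/(ii-2): a softmax over the candidate
    axis of an (N, H, W) cost volume gives per-shell probabilities; the
    expected inverse distance is then inverted back to a separation
    distance per pixel."""
    shifted = -(cost - cost.min(axis=0, keepdims=True))   # stable softmax
    prob = np.exp(shifted)
    prob /= prob.sum(axis=0, keepdims=True)
    expected_inv = (prob * inv_dists[:, None, None]).sum(axis=0)
    return 1.0 / expected_inv
```

The same readout applies to the k-th sub cost volume of step (i-4), with the (k_1)-st to (k_M)-th sub separation distances taking the place of the N main separation distances.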