CPC G06T 1/0014 (2013.01) [G06T 1/20 (2013.01); G06T 7/174 (2017.01); G06T 7/593 (2017.01); G06T 7/90 (2017.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20221 (2013.01); H04N 13/204 (2018.05)] | 21 Claims |
1. A system, comprising:
a communication interface configured to receive image data from each of a plurality of sensors associated with a workspace wherein the image data comprises for each sensor in the plurality of sensors one or both of visual image information and depth information, and the plurality of image sensors comprises a plurality of cameras; and
a processor coupled to the communication interface and configured to:
merge image data from the plurality of sensors to generate a merged point cloud data;
perform segmentation based on visual image data from a subset of the sensors in the plurality of sensors to generate a segmentation result, wherein:
the segmentation result is obtained based on performing segmentation using RGB data from a camera;
the segmentation result comprises a plurality of RGB pixels; and
a subset of the plurality of RGB pixels is identified based at least in part on determination that the corresponding RGB pixels are associated with an object boundary;
use one or both of the merged point cloud data and the segmentation result to generate a merged three dimensional and segmented view of the workspace, including by:
mapping RGB pixels identified in the segmentation result to corresponding depth pixels to obtain mapped depth pixel information;
using the segmentation result and the mapped depth pixel information to de-project to a point cloud with segmented shapes around points for each object;
for each of the plurality of cameras, labelling the point cloud generated by that camera for each object and computing a corresponding centroid; and
using nearest neighbor computations between centroids of corresponding object point clouds of the plurality of cameras to segment objects within the workspace; and
use one or both of the merged point cloud data and the segmentation result to determine a strategy to grasp an object present in the workspace using a robotic arm.
|
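The de-projection, centroid, and nearest-neighbor steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a pinhole camera model, RGB and depth images that are already pixel-aligned (so mapping an RGB pixel to its depth pixel is a direct index lookup), and centroids already expressed in a common workspace frame. All function names, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_dist` threshold are hypothetical and do not appear in the claim.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def deproject(u: int, v: int, depth_m: float,
              fx: float, fy: float, cx: float, cy: float) -> Point:
    """Back-project pixel (u, v) with depth (meters) into the camera frame
    using a pinhole model (assumed; the claim names no camera model)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def segment_to_points(mask_pixels: Sequence[Tuple[int, int]],
                      depth: Sequence[Sequence[float]],
                      fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Map the RGB pixels of one segmented object to depth pixels and
    de-project them. Assumes depth[v][u] is aligned with the RGB image."""
    points = []
    for u, v in mask_pixels:
        z = depth[v][u]
        if z > 0:  # skip pixels with no valid depth reading
            points.append(deproject(u, v, z, fx, fy, cx, cy))
    return points


def centroid(points: Sequence[Point]) -> Point:
    """Arithmetic mean of a non-empty object point cloud."""
    n = len(points)
    return (sum(p[0] for p in points) / n,
            sum(p[1] for p in points) / n,
            sum(p[2] for p in points) / n)


def match_by_nearest_centroid(centroids_a: Dict[str, Point],
                              centroids_b: Dict[str, Point],
                              max_dist: float) -> Dict[str, str]:
    """For each labelled object seen by camera A, find the label of the
    nearest centroid seen by camera B, if one lies within max_dist.
    Both inputs are assumed to be in the same workspace frame."""
    matches = {}
    for label_a, ca in centroids_a.items():
        best_label, best_d = None, max_dist
        for label_b, cb in centroids_b.items():
            d = math.dist(ca, cb)
            if d <= best_d:
                best_label, best_d = label_b, d
        if best_label is not None:
            matches[label_a] = best_label
    return matches
```

In this sketch, each camera would call `segment_to_points` once per segmented object, label the resulting point cloud, and compute its `centroid`. Pairs returned by `match_by_nearest_centroid` would then be treated as views of the same physical object. A brute-force search is used for clarity; a spatial index such as a k-d tree would be a reasonable substitute for larger scenes.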