US 12,283,064 B2
Shape information generation apparatus, control apparatus, loading/unloading apparatus, logistics system, non-transitory computer-readable medium, and control method
Rosen Diankov, Tokyo (JP); Xutao Ye, Tokyo (JP); and Ziyan Zhou, Tokyo (JP)
Assigned to MUJIN, INC., Tokyo (JP)
Filed by MUJIN, INC., Tokyo (JP)
Filed on Dec. 1, 2023, as Appl. No. 18/526,261.
Application 18/526,261 is a continuation of application No. 17/188,160, filed on Mar. 1, 2021, granted, now 11,836,939.
Application 17/188,160 is a continuation of application No. 16/739,184, filed on Jan. 10, 2020, granted, now 10,970,866, issued on Apr. 6, 2021.
Application 16/739,184 is a continuation of application No. PCT/JP2019/023739, filed on Jun. 14, 2019.
Claims priority of application No. 2018-194064 (JP), filed on Oct. 15, 2018.
Prior Publication US 2024/0095941 A1, Mar. 21, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 7/55 (2017.01); B25J 9/00 (2006.01); B25J 9/16 (2006.01); B65G 59/02 (2006.01); B65G 67/02 (2006.01); B65G 67/04 (2006.01); B65G 67/24 (2006.01); G06T 5/50 (2006.01); G06T 7/50 (2017.01); G06T 7/593 (2017.01)
CPC G06T 7/55 (2017.01) [B25J 9/0093 (2013.01); B25J 9/1697 (2013.01); B65G 59/02 (2013.01); B65G 67/02 (2013.01); B65G 67/04 (2013.01); B65G 67/24 (2013.01); G06T 5/50 (2013.01); G06T 7/50 (2017.01); G06T 7/593 (2017.01); B65G 2201/0235 (2013.01); B65G 2814/0311 (2013.01); G06T 2207/10028 (2013.01)] 15 Claims
OG exemplary drawing
 
1. An information processing apparatus comprising:
a communication interface configured to communicate with a first camera disposed at a first location and having a first camera field of view, and with a second camera disposed at a second location and having a second camera field of view;
at least one processor configured, when a plurality of objects are in the first camera field of view and in the second camera field of view, to:
acquire a first three-dimensional (3D) image that represents a plurality of object surfaces associated with the plurality of objects, wherein a portion of the first 3D image represents a first surface region which is part of the plurality of object surfaces;
acquire a second three-dimensional (3D) image that represents the plurality of object surfaces, wherein the second 3D image is generated by the second camera, and wherein a portion of the second 3D image represents a second surface region which is part of the plurality of object surfaces and overlaps with at least a portion of the first 3D image; and
generate a composite depth map that combines information from the first 3D image and information from the second 3D image, wherein the composite depth map includes features, identified based on the first 3D image and/or the second 3D image, corresponding to the plurality of objects.
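The composite depth map of claim 1 might be sketched as follows. This is a hypothetical illustration, not the patented method: the `composite_depth_map` helper, the `None`-as-invalid convention, and the fill/nearer-surface merge rule are all assumptions about how two overlapping depth maps from differently placed cameras could be combined.

```python
def composite_depth_map(depth_a, depth_b):
    """Combine two same-sized depth maps from two cameras (hypothetical
    sketch).  None marks a pixel with no valid reading from that camera.
    Where only one camera sees the surface, its reading is used; where
    both see it, the nearer (smaller) depth is kept."""
    out = []
    for row_a, row_b in zip(depth_a, depth_b):
        row = []
        for da, db in zip(row_a, row_b):
            if da is None:
                row.append(db)          # fill gap from the second camera
            elif db is None:
                row.append(da)          # fill gap from the first camera
            else:
                row.append(min(da, db)) # keep the nearer surface
        out.append(row)
    return out

# Two 1x3 depth maps: each camera misses one pixel the other sees.
merged = composite_depth_map([[1.0, None, 3.0]], [[1.5, 2.5, None]])
print(merged)  # [[1.0, 2.5, 3.0]]
```

Under these assumptions, occlusion gaps in one camera's view are filled by the other camera, which is one plausible reading of "combines information from the first 3D image and information from the second 3D image".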
 
7. A non-transitory computer-readable medium having instructions thereon that, when executed by at least one processor of an information processing apparatus, cause the at least one processor to:
acquire a first three-dimensional (3D) image when a plurality of objects are in a first camera field of view and in a second camera field of view, wherein the first 3D image represents a plurality of object surfaces associated with the plurality of objects, wherein the first 3D image is generated by a first camera, and wherein a portion of the first 3D image represents a first surface region which is part of the plurality of object surfaces;
acquire a second 3D image that represents the plurality of object surfaces, wherein the second 3D image is generated by a second camera, and wherein a portion of the second 3D image represents a second surface region which is part of the plurality of object surfaces and overlaps with at least a portion of the first 3D image; and
generate a composite depth map that combines information from the first 3D image and information from the second 3D image, wherein the composite depth map comprises features, identified based on the first 3D image and/or the second 3D image, corresponding to the plurality of objects.
 
12. An information processing apparatus comprising:
a communication interface configured to communicate with at least a camera having a camera field of view; and
at least one processor configured, when a stack of objects is in the camera field of view, to:
acquire a 3D image which represents one or more object surfaces of the stack of objects,
determine, based on the 3D image, a positional relationship among the stack of objects, wherein the positional relationship indicates how the stack of objects are arranged relative to each other,
select, based on the positional relationship of the stack of objects, one of the objects for robot interaction, and
determine, based on the positional relationship of the stack of objects, an unloading direction in which the one object is to be unloaded.
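The selection and unloading-direction steps of claim 12 could be sketched as below. Everything here is an assumption for illustration: the `stack` record layout (`id`, `top_z` for top-surface height, `above` listing objects resting on each one), the `plan_unload` helper, and the lift-versus-slide rule are not drawn from the patent itself.

```python
def plan_unload(stack):
    """Pick one object for robot interaction and an unloading direction
    (hypothetical sketch).  `stack` is a list of dicts, each with an
    'id', a 'top_z' top-surface height, and an 'above' list naming the
    objects resting on it (the positional relationship)."""
    # Only objects with nothing resting on them can be moved safely.
    free = [o for o in stack if not o["above"]]
    # Prefer the highest free object.
    target = max(free, key=lambda o: o["top_z"])
    # If the chosen object sits at the stack's highest level, lift it
    # straight up; otherwise slide it out sideways to avoid collisions
    # with taller neighbors.
    highest = max(o["top_z"] for o in stack)
    direction = "up" if target["top_z"] == highest else "sideways"
    return target["id"], direction

# Object A rests on top of object B, so A is selected and lifted.
stack = [
    {"id": "A", "top_z": 2.0, "above": []},
    {"id": "B", "top_z": 1.0, "above": ["A"]},
]
print(plan_unload(stack))  # ('A', 'up')
```

The same rule yields a sideways pull when the only free object is shorter than a blocked neighbor, which matches the claim's idea that both the selection and the unloading direction are determined from the positional relationship.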