US 12,217,236 B2
Overlap detection for an item recognition system
Shiyuan Yang, Jersey City, NJ (US); and Shray Chandra, Jersey City, NJ (US)
Assigned to Maplebear Inc., San Francisco, CA (US)
Filed by Maplebear Inc., San Francisco, CA (US)
Filed on Apr. 21, 2022, as Appl. No. 17/726,389.
Claims priority of provisional application 63/177,937, filed on Apr. 21, 2021.
Prior Publication US 2022/0343308 A1, Oct. 27, 2022
Int. Cl. G06Q 20/20 (2012.01); G01G 19/414 (2006.01); G06Q 20/18 (2012.01); G06T 7/10 (2017.01); G06T 7/50 (2017.01); G06V 10/10 (2022.01); G06V 10/22 (2022.01); G06V 10/70 (2022.01); G06V 10/94 (2022.01); G06V 20/60 (2022.01); G06V 20/64 (2022.01); G07G 1/00 (2006.01); H04N 23/90 (2023.01)
CPC G06Q 20/208 (2013.01) [G01G 19/4144 (2013.01); G06Q 20/18 (2013.01); G06T 7/10 (2017.01); G06T 7/50 (2017.01); G06V 10/16 (2022.01); G06V 10/23 (2022.01); G06V 10/70 (2022.01); G06V 10/945 (2022.01); G06V 20/60 (2022.01); G06V 20/64 (2022.01); G07G 1/0036 (2013.01); H04N 23/90 (2023.01); G06T 2200/24 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]
21 Claims
1. An item recognition system comprising:
a receiving surface;
a top camera coupled to a top portion of the item recognition system, wherein the top camera is configured to capture images of the receiving surface from a top-down view;
one or more peripheral cameras coupled to one or more side portions of the item recognition system, wherein the one or more peripheral cameras are configured to capture images of the receiving surface from different peripheral views;
a user interface;
a processor; and
a non-transitory, computer-readable medium storing instructions that, when executed by the processor, cause the processor to:
access a top image comprising an image captured by the top camera, wherein the top image depicts a first item and a second item on the receiving surface;
access one or more peripheral images, each comprising an image captured by a peripheral camera of the one or more peripheral cameras;
generate a pixel-wise mask for the top image based on the top image, wherein the pixel-wise mask indicates one or more portions of the top image where an item is depicted;
apply an overlap detection model to the top image, the one or more peripheral images, and the pixel-wise mask to detect whether the first item overlaps with the second item, wherein the overlap detection model is a machine-learning model trained to detect overlapping items in top images based on the top images, peripheral images, and pixel-wise masks of the top images; and
responsive to detecting that the first item overlaps with the second item, present a notification of the overlap through the user interface.
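The instruction sequence recited in claim 1 (access the images, generate a pixel-wise mask, apply an overlap detection model, notify on overlap) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the claim does not specify how the mask is generated or how the model is built, so `generate_pixel_mask` uses a simple background threshold, `toy_overlap_model` is a hand-written heuristic standing in for the trained machine-learning model (it ignores the peripheral images, which the claimed model would use), and `ConsoleUI` is a hypothetical user interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Image = List[List[int]]  # assumed: grayscale image as rows of pixel intensities
Mask = List[List[int]]   # 1 where an item is depicted, 0 elsewhere


def generate_pixel_mask(top_image: Image, background: int = 0, tol: int = 10) -> Mask:
    """Stand-in mask generator: flag pixels that differ from a uniform background."""
    return [[1 if abs(p - background) > tol else 0 for p in row] for row in top_image]


def count_regions(mask: Mask) -> int:
    """Count 4-connected regions of item pixels in the mask."""
    seen = set()
    regions = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if not value or (r, c) in seen:
                continue
            regions += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < len(mask) and 0 <= nx < len(mask[ny])
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return regions


def toy_overlap_model(top: Image, peripherals: Sequence[Image], mask: Mask) -> bool:
    """Placeholder for the trained model: two items that merge into fewer than
    two mask regions are treated as overlapping. Not the patented model."""
    return count_regions(mask) < 2


@dataclass
class ConsoleUI:
    """Hypothetical user interface that records the notifications it is given."""
    messages: List[str] = field(default_factory=list)

    def notify(self, message: str) -> None:
        self.messages.append(message)


OverlapModel = Callable[[Image, Sequence[Image], Mask], bool]


def check_overlap(top_image: Image, peripheral_images: Sequence[Image],
                  model: OverlapModel, ui: ConsoleUI) -> bool:
    """Follow the claimed order: mask, then model, then notification on overlap."""
    mask = generate_pixel_mask(top_image)
    overlaps = model(top_image, peripheral_images, mask)
    if overlaps:
        ui.notify("Items appear to overlap; please separate them.")
    return overlaps
```

In this sketch the model is passed in as a callable taking the top image, the peripheral images, and the mask, mirroring the three inputs the claim recites, so a trained model could replace the heuristic without changing `check_overlap`.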