US 12,450,883 B2
Systems and methods for processing images captured at a product storage facility
Raghava Balusu, Achanta (IN); Avinash M. Jade, Bangalore (IN); Lingfeng Zhang, Dallas, TX (US); William C. Robinson, Jr., Centerton, AR (US); Benjamin R. Ellison, San Francisco, CA (US); Srinivas Muktevi, Bengaluru (IN); Amit Jhunjhunwala, Bangalore (IN); Zhaoliang Duan, Frisco, TX (US); Siddhartha Chakraborty, Kolkata (IN); Ashlin Ghosh, Ernakulam (IN); and Mingquan Yuan, Flower Mound, TX (US)
Assigned to Walmart Apollo, LLC, Bentonville, AR (US)
Filed by WALMART APOLLO, LLC, Bentonville, AR (US)
Filed on Jan. 24, 2023, as Appl. No. 18/158,925.
Prior Publication US 2024/0249505 A1, Jul. 25, 2024
Int. Cl. G06V 10/774 (2022.01); G06V 10/74 (2022.01); G06V 10/94 (2022.01); G06V 20/50 (2022.01); G06V 10/82 (2022.01)

CPC G06V 10/774 (2022.01) [G06V 10/761 (2022.01); G06V 10/945 (2022.01); G06V 20/50 (2022.01); G06V 10/82 (2022.01)]

18 Claims

1. A system for processing captured images of objects at a product storage facility, the system comprising:

a trained machine learning model configured to:

process unprocessed captured images, wherein at least some of the unprocessed captured images depict objects in the product storage facility; and

output processed images; and

a control circuit configured to:

associate each of the processed images into one of a first group, a second group, or a third group,

wherein the first group corresponds to at least one of (a) images depicting one or more objects that are not detected by the trained machine learning model as being associated with a recognized product but a recognized price tag was detected as being associated with the recognized product, or (b) images depicting the one or more objects having at least one of a textual similarity or a visual similarity with a product description stored in a database but the trained machine learning model did not detect as being associated with the recognized product,

wherein the second group corresponds to images depicting one or more objects that are detected by the trained machine learning model as being associated with more than one recognized product, and

wherein the third group corresponds to images depicting one or more objects that the trained machine learning model is unable to detect as depicting an object;

remove the images associated with the third group from the processed images;

calculate a similarity score for each of the processed images in the first group, each similarity score representing the textual similarity or the visual similarity between the processed image and previously processed images stored in the database that are associated with false-negatives;

remove at least one processed image from the first group based on the similarity score for the at least one processed image; and

output remaining processed images associated with the first group and processed images associated with the second group to be used to retrain the trained machine learning model.
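
The control-circuit steps recited in claim 1 (grouping, discarding the third group, scoring the first group, and filtering it before retraining) can be illustrated with a minimal sketch. This is not the patented implementation; it only shows one possible reading of the claim. All names (ProcessedImage, assign_group, select_retraining_images), both thresholds, the use of cosine similarity over embeddings as the "visual similarity", and the rule that first-group images too similar to stored false-negatives are the ones removed are assumptions made for illustration. The claim says only that removal is "based on the similarity score". Images that fit none of the three groups (for example, exactly one recognized product) are simply skipped here, which is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from math import sqrt
from typing import Optional


class Group(Enum):
    FIRST = 1   # object present but not tied to a recognized product
    SECOND = 2  # object tied to more than one recognized product
    THIRD = 3   # model could not detect an object at all


@dataclass
class ProcessedImage:
    """Hypothetical record for one image output by the trained model."""
    image_id: str
    object_detected: bool
    recognized_products: list[str] = field(default_factory=list)
    price_tag_product: Optional[str] = None   # product named by a recognized price tag
    description_similarity: float = 0.0       # similarity to a stored product description
    embedding: tuple[float, ...] = ()         # assumed visual feature vector


def assign_group(img: ProcessedImage, desc_threshold: float = 0.8) -> Optional[Group]:
    """Map an image to one of the three claimed groups (None if none applies)."""
    if not img.object_detected:
        return Group.THIRD
    if len(img.recognized_products) > 1:
        return Group.SECOND
    missed = not img.recognized_products
    has_evidence = (img.price_tag_product is not None
                    or img.description_similarity >= desc_threshold)
    if missed and has_evidence:
        return Group.FIRST
    return None


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_score(img: ProcessedImage, known_false_negatives: list) -> float:
    """Highest similarity between this image and stored false-negative embeddings."""
    return max((cosine(img.embedding, e) for e in known_false_negatives), default=0.0)


def select_retraining_images(images, known_false_negatives, max_similarity=0.95):
    """Return the first-group images that survive filtering, plus the second group."""
    first, second = [], []
    for img in images:
        group = assign_group(img)
        if group is Group.FIRST:
            first.append(img)
        elif group is Group.SECOND:
            second.append(img)
        # Third-group images and ungrouped images are not carried forward.
    kept_first = [img for img in first
                  if similarity_score(img, known_false_negatives) < max_similarity]
    return kept_first + second
```

Under these assumptions, a first-group image whose embedding nearly matches a stored false-negative is treated as redundant and dropped, so the retraining set gains only examples the database does not already cover.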