US 11,922,729 B1
Detecting interactions with non-discretized items and associating interactions with actors using digital images
Kaustav Kundu, Seattle, WA (US); Pahal Kamlesh Dalal, Seattle, WA (US); Nishitkumar Ashokkumar Desai, Redmond, WA (US); Jayakrishnan Kumar Eledath, Princeton Junction, NJ (US); Geoffrey A. Franz, Seattle, WA (US); Gerard Guy Medioni, Los Angeles, CA (US); Hoi Cheung Pang, Bellevue, WA (US); and Rakesh Ramakrishnan, Issaquah, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Feb. 13, 2023, as Appl. No. 18/168,247.
Application 18/168,247 is a continuation of application No. 16/436,500, filed on Jun. 10, 2019, granted, now Pat. No. 11,580,785.
This patent is subject to a terminal disclaimer.
Int. Cl. G06V 40/20 (2022.01); G06N 20/00 (2019.01); G06T 7/246 (2017.01); G06V 20/64 (2022.01); G06V 40/10 (2022.01)

CPC G06V 40/25 (2022.01) [G06N 20/00 (2019.01); G06T 7/246 (2017.01); G06V 20/64 (2022.01); G06V 40/103 (2022.01); G06V 40/28 (2022.01); G06T 2207/30241 (2013.01)]

20 Claims

1. A system comprising:

a first camera having a first field of view;

a second camera having a second field of view;

a storage unit, wherein at least a portion of the storage unit is within each of the first field of view and the second field of view;

a first container of non-discretized items disposed on the portion of the storage unit; and

a server in communication with the first camera and the second camera, wherein the server comprises at least one processor configured to at least:

receive a first image from the first camera, wherein the first image was captured at a first time;

receive a second image from the second camera, wherein the second image was captured at a second time;

provide at least the first image and the second image to a first machine learning tool as inputs, wherein the first machine learning tool is trained to detect at least a portion of an arm within imaging data;

receive at least a first output from the first machine learning tool;

generate a first regression vector for the first image based at least in part on the first output, wherein the first regression vector associates a pixel of the first image corresponding to a portion of a product space including the first container with a pixel of the first image corresponding to one of the first actor or the second actor;

generate a second regression vector for the second image based at least in part on the first output, wherein the second regression vector associates a pixel of the second image corresponding to a portion of the product space with a pixel of the second image corresponding to one of the first actor or the second actor;

provide at least the first regression vector and the second regression vector to a second machine learning tool as inputs;

receive at least a second output from the second machine learning tool; and

in response to receiving at least the second output,

identify a subset of a period of time during which at least one interaction with the product space occurred based at least in part on the second output, wherein the period of time includes the first time and the second time; and

determine that the first actor executed an event of interest with the first container during the period of time based at least in part on the second output.
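
The final steps of claim 1 identify a subset of a period of time during which an interaction with the product space occurred. The claim does not state how the second output is structured, so the following is only a minimal sketch under an assumed form: one interaction score per captured frame, thresholded and grouped into contiguous spans. The function name `interaction_intervals`, the score format, and the threshold value are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple


def interaction_intervals(
    times: Sequence[float],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, not specified by the claims
) -> List[Tuple[float, float]]:
    """Group consecutive frames whose score meets the threshold into (start, end) spans."""
    intervals: List[Tuple[float, float]] = []
    start = None  # capture time of the first frame in the current span
    last = None   # capture time of the most recent frame in the current span
    for t, s in zip(times, scores):
        if s >= threshold:
            if start is None:
                start = t
            last = t
        elif start is not None:
            # The score dropped below the threshold, so close the open span.
            intervals.append((start, last))
            start = None
    if start is not None:
        # The period ended while a span was still open.
        intervals.append((start, last))
    return intervals


# Hypothetical per-frame scores standing in for the second output.
frame_times = [0, 1, 2, 3, 4, 5]
frame_scores = [0.1, 0.7, 0.9, 0.2, 0.8, 0.3]
spans = interaction_intervals(frame_times, frame_scores)
```

Each returned span is a candidate subset of the overall period; attributing the span to a particular actor would be a separate step.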

5. A method comprising:

capturing a first image by a first imaging device, wherein the first imaging device has a first field of view that includes at least a portion of a first inventory area having a first container with a non-discretized item disposed therein and at least a first actor;

generating a first pair of pixels by at least a first processor unit, wherein the first pair of pixels associates a pixel of the first image corresponding to the first actor with a pixel of the first image corresponding to the portion of the first inventory area;

determining a first probability, wherein the first probability is one of a probability that the first pair of pixels represents an event of interest involving at least one of the first container or the non-discretized item, a probability that the first pair of pixels does not represent any events, or a probability that the first pair of pixels represents an event other than the event of interest;

determining that the first actor executed the event of interest based at least in part on the first pair of pixels and the first probability; and

storing an indication that the event of interest is associated with the first actor in at least one data store.
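
Claim 5 recites a pair of pixels and a probability drawn from three alternatives (event of interest, no event, other event). One way this could be realized is sketched below, assuming a classifier that emits three raw scores per pixel pair which are normalized with a softmax. The names `PixelPair`, `classify_pair`, and `record_event`, the 0.5 threshold, and the list used as a data store are all hypothetical illustrations rather than the patented implementation.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# The three outcomes enumerated in claim 5, in an assumed fixed order.
LABELS = ("event_of_interest", "no_event", "other_event")


@dataclass(frozen=True)
class PixelPair:
    actor_px: Tuple[int, int]  # pixel corresponding to the actor
    area_px: Tuple[int, int]   # pixel corresponding to the inventory area


def classify_pair(logits: Sequence[float]) -> Dict[str, float]:
    """Convert three raw scores into probabilities with a numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}


def record_event(
    actor_id: str,
    pair: PixelPair,
    probs: Dict[str, float],
    data_store: List[dict],
    threshold: float = 0.5,  # assumed decision threshold
) -> bool:
    """Store an indication for the actor when the event-of-interest probability is high enough."""
    if probs["event_of_interest"] < threshold:
        return False
    data_store.append({"actor": actor_id, "pair": pair, "p": probs["event_of_interest"]})
    return True


store: List[dict] = []
pair = PixelPair(actor_px=(120, 340), area_px=(150, 410))
probs = classify_pair([2.0, 0.0, 0.0])  # hypothetical classifier scores
stored = record_event("actor_1", pair, probs, store)
```

The softmax guarantees the three probabilities sum to one, which matches the mutually exclusive outcomes listed in the claim.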

17. A method comprising:

capturing a first image by a first imaging device having a first field of view, wherein the first image was captured at a first time, wherein a first container comprising a first volume of a first non-discretized item is within the first field of view at the first time;

capturing a second image by a second imaging device having a second field of view, wherein the second image was captured at approximately the first time, wherein the second field of view overlaps the first field of view at least in part, and wherein the first container is within the second field of view at approximately the first time;

detecting at least a portion of a first actor within at least one of the first image or the second image;

detecting at least a portion of a second actor within at least one of the first image or the second image;

generating a first regression vector for the first image, wherein the first regression vector associates at least one pixel of the first container with at least one pixel of one of the first actor or the second actor;

generating a second regression vector for the second image, wherein the second regression vector associates at least one pixel of the first container with at least one pixel of one of the first actor or the second actor;

determining, based at least in part on the first regression vector and the second regression vector, that the first actor executed at least one interaction with a product space including the first container; and

storing an indication that the first actor deposited at least some of the first non-discretized item into a second container.
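
Claim 17 relies on one regression vector per camera view, each associating a pixel of the first container with a pixel of one of two actors. The claims do not say how the two vectors are combined, so the sketch below makes several assumptions: each vector is represented as an origin pixel plus an offset, each actor is summarized by a single representative pixel per view, and the views vote on the actor nearest to where the vector points. The function names and all coordinates are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

Pixel = Tuple[float, float]
# One view: (container pixel, regression offset, actor name -> actor pixel in that view).
View = Tuple[Pixel, Pixel, Dict[str, Pixel]]


def nearest_actor(origin: Pixel, offset: Pixel, actors: Dict[str, Pixel]) -> str:
    """Follow the regression vector from the container pixel and pick the closest actor."""
    target = (origin[0] + offset[0], origin[1] + offset[1])
    return min(actors, key=lambda name: math.dist(target, actors[name]))


def attribute_interaction(views: Iterable[View]) -> Optional[str]:
    """Let each view vote for an actor; return None when there are no views or the top votes tie."""
    votes = Counter(nearest_actor(o, d, a) for o, d, a in views)
    ranked = votes.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical pixel coordinates for two overlapping camera views.
view_1: View = ((100, 200), (-40, -60), {"actor_1": (58, 142), "actor_2": (150, 90)})
view_2: View = ((300, 220), (30, -80), {"actor_1": (325, 150), "actor_2": (200, 100)})
winner = attribute_interaction([view_1, view_2])
```

Returning `None` on a tie is a design choice for this sketch; a system with only two views might instead defer to a learned model, as claim 1 does with its second machine learning tool.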