US 11,657,595 B1
Detecting and locating actors in scenes based on degraded or supersaturated depth data
Samuel Nathan Hallman, Seattle, WA (US); Petko Tsonev, Issaquah, WA (US); Michael Francis O'Malley, Seattle, WA (US); Jayakrishnan Eledath, Kenmore, WA (US); Jue Wang, Seattle, WA (US); and Tian Lan, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Feb. 8, 2021, as Appl. No. 17/170,722.
Application 17/170,722 is a continuation of application No. 16/220,461, filed on Dec. 14, 2018, granted, now 10,915,783.
This patent is subject to a terminal disclaimer.
Int. Cl. G06V 10/75 (2022.01); G06T 5/50 (2006.01); G06N 3/08 (2023.01); G06V 20/64 (2022.01); G06V 40/10 (2022.01)
CPC G06V 10/751 (2022.01) [G06N 3/08 (2013.01); G06T 5/50 (2013.01); G06V 20/647 (2022.01); G06V 40/10 (2022.01); G06T 2207/10028 (2013.01)]
20 Claims
 
1. A system comprising:
an imaging device having a field of view, wherein the imaging device comprises an image sensor, a time-of-flight sensor, a computer processor and a memory component; and
a shelving unit, wherein at least a portion of the shelving unit is within the field of view,
wherein the memory component has stored thereon executable instructions that, as a result of being executed by at least the processor, cause the imaging device to at least:
capture a first visual image and a first depth image at a first time;
determine that a first number of saturated pixels within the first depth image is less than a predetermined threshold;
determine a first position of an actor at a first time based at least in part on the first depth image;
capture a second visual image and a second depth image at a second time;
determine that a second number of saturated pixels within the second depth image exceeds the predetermined threshold;
in response to determining that the second number of saturated pixels exceeds the predetermined threshold,
detect a representation of the actor in at least a portion of the second visual image;
predict a distance from the imaging device to the actor based at least in part on the second visual image; and
determine a second position of the actor at the second time based at least in part on the portion of the second image and the distance.
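
The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the sentinel value marking a saturated pixel, the pinhole back-projection, the treatment of a count exactly equal to the threshold, and every function and parameter name below (count_saturated, back_project, locate_actor, locate_from_depth, detect_in_visual, predict_distance) are assumptions made only for illustration. The detector and the distance predictor are passed in as callables because the claim does not say how they are built.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Assumption: a saturated depth pixel is encoded with this sentinel value.
SATURATED = 0

Pixel = Tuple[float, float]
Point3D = Tuple[float, float, float]


def count_saturated(depth_image: List[List[int]], sentinel: int = SATURATED) -> int:
    """Count the depth pixels that carry the saturation sentinel."""
    return sum(1 for row in depth_image for value in row if value == sentinel)


def back_project(pixel: Pixel, distance: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Place a point `distance` away along the ray through `pixel`.

    Assumes an ideal pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy); the result is in the camera's own frame.
    """
    u, v = pixel
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    norm = math.sqrt(sum(c * c for c in ray))
    return (distance * ray[0] / norm,
            distance * ray[1] / norm,
            distance * ray[2] / norm)


def locate_actor(visual_image: object,
                 depth_image: List[List[int]],
                 threshold: int,
                 locate_from_depth: Callable[[List[List[int]]], Point3D],
                 detect_in_visual: Callable[[object], Pixel],
                 predict_distance: Callable[[object, Pixel], float],
                 intrinsics: Sequence[float]) -> Tuple[str, Point3D]:
    """Return (source, position), choosing the source by saturation count.

    The claim only covers counts below or above the threshold; sending a
    count equal to the threshold down the depth path is an assumption.
    """
    if count_saturated(depth_image) > threshold:
        # Depth data is too degraded: fall back to the visual image.
        pixel = detect_in_visual(visual_image)
        distance = predict_distance(visual_image, pixel)
        fx, fy, cx, cy = intrinsics
        return "visual", back_project(pixel, distance, fx, fy, cx, cy)
    # Depth data is usable: locate the actor from the depth image.
    return "depth", locate_from_depth(depth_image)
```

In this sketch the first capture would take the "depth" branch and the second capture, with its saturated-pixel count above the threshold, would take the "visual" branch, combining the detected image location with the predicted distance to obtain a position.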