| CPC G06V 10/764 (2022.01) [G06Q 10/087 (2013.01); G06T 7/11 (2017.01); G06T 7/579 (2017.01); G06T 7/73 (2017.01); G06T 19/006 (2013.01); G06V 10/751 (2022.01); G06V 10/774 (2022.01); G06V 10/809 (2022.01); G06V 10/817 (2022.01); G06V 10/82 (2022.01); G06V 20/20 (2022.01); G06V 20/70 (2022.01); G06T 2200/24 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20092 (2013.01); G06T 2210/12 (2013.01)] | 33 Claims |

|
1. A system for identifying and tracking an inventory of products on one or more shelves, comprising a mobile device including:
an image sensor;
at least one processor; and
a non-transitory computer-readable medium having instructions that, when executed by the processor, cause the processor to perform the following steps:
apply a simultaneous localization and mapping in three dimensions (SLAM 3D) program, on images of a shelf input from the image sensor, to thereby generate a plurality of 3D-mapped bounding boxes, each 3D-mapped bounding box representing a three-dimensional location and boundaries of an unidentified product from the inventory;
capture a plurality of two-dimensional images of the shelf;
assign an identification to each product displayed in the plurality of two-dimensional images using a deep neural network, wherein the deep neural network is a hierarchical system of multiple neural networks running dependently in a sequence;
associate each identified product in a respective two-dimensional image with a corresponding 3D-mapped bounding box;
repeat the assign and associate steps on a plurality of images of the shelf captured from different angles, to thereby generate a plurality of identifications of products associated with each 3D-mapped bounding box;
aggregate the plurality of identifications associated with each 3D-mapped bounding box; and
apply a voting analysis to the aggregated identifications to thereby determine an identification of the product in each 3D-mapped bounding box,
wherein the mobile device comprises a long-term memory for storing each hierarchical deep neural network for each shelf on a floor plan, and the processor is further configured to upload deep neural network levels for identification of products on a particular shelf to a short-term memory of the mobile device in advance of imaging of said shelf, so that the identification is performed solely as an edge computing process.
|
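
The aggregate and voting steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify the voting rule, so a simple plurality vote is assumed here, with ties going to the label observed first. The function names, the `(box_id, label)` observation format, and the product labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def aggregate_identifications(observations):
    """Group per-image labels by the 3D-mapped bounding box they were associated with.

    `observations` is an iterable of (box_id, label) pairs, one pair per
    identification made in a single two-dimensional image (assumed format).
    """
    by_box = defaultdict(list)
    for box_id, label in observations:
        by_box[box_id].append(label)
    return dict(by_box)


def vote(labels):
    """Plurality vote (an assumption); ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def identify_boxes(observations):
    """Return one final identification per bounding box."""
    aggregated = aggregate_identifications(observations)
    return {box_id: vote(labels) for box_id, labels in aggregated.items()}


# Hypothetical identifications of two boxes gathered from several viewing angles.
observations = [
    ("box_1", "cola_330ml"),
    ("box_1", "cola_zero_330ml"),
    ("box_1", "cola_330ml"),
    ("box_2", "chips_salt"),
]
result = identify_boxes(observations)
```

In this sketch a single misidentification from one viewing angle is outvoted by the consistent identifications from the other angles, which appears to be the purpose of repeating the assign and associate steps across multiple images.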