US 11,991,344 B2
Systems, methods and apparatuses for stereo vision and tracking
Tej Tadi, Lausanne (CH); Leandre Bolomey, Lausanne (CH); Nicolas Fremaux, Lausanne (CH); Jose Rubio, Lausanne (CH); Jonas Ostlund, Lausanne (CH); Sylvain Cardin, Lausanne (CH); Flavio Roth, Lausanne (CH); Renaud Ott, Lausanne (CH); Frederic Condolo, Lausanne (CH); Nicolas Bourdaud, Lausanne (CH); Flavio Levi Capitao Cantante, Lausanne (CH); Corentin Barbier, Lausanne (CH); and Ieltxu Gomez Lorenzo, Lausanne (CH)
Assigned to MINDMAZE GROUP SA, Lausanne (CH)
Filed by MINDMAZE HOLDING SA, Lausanne (CH)
Filed on Sep. 14, 2021, as Appl. No. 17/474,078.
Application 17/474,078 is a continuation of application No. 16/532,604, filed on Aug. 6, 2019, abandoned.
Application 16/532,604 is a continuation in part of application No. PCT/IB2018/000386, filed on Feb. 7, 2018.
Claims priority of provisional application 62/598,487, filed on Dec. 14, 2017.
Claims priority of provisional application 62/553,953, filed on Sep. 4, 2017.
Claims priority of provisional application 62/456,050, filed on Feb. 7, 2017.
Prior Publication US 2022/0182598 A1, Jun. 9, 2022
Int. Cl. H04N 13/383 (2018.01); G02B 27/01 (2006.01)
CPC H04N 13/383 (2018.05) [G02B 27/0172 (2013.01); G02B 2027/0138 (2013.01)] 23 Claims
OG exemplary drawing
 
1. A stereo vision procurement apparatus for obtaining stereo visual data, comprising: a stereo RGB camera;
a depth sensor;
an RGB-D fusion module;
a processor; and
a plurality of tracking devices to track movement of a subject,
wherein:
the processor is configured to process data from the tracking devices to form a plurality of sub-features,
each of said stereo RGB camera and said depth sensor is configured to provide pixel data corresponding to a plurality of pixels,
said RGB-D fusion module is configured to combine RGB pixel data from said stereo RGB camera and depth information pixel data from said depth sensor to form stereo visual pixel data (SVPD), and
said RGB-D fusion module is implemented in an FPGA (field-programmable gate array) and wherein said sub-features are combined by said FPGA to form a feature to track movements of the subject;
further comprising: a memory; and
wherein said processor is configured to perform a defined set of operations in response to receiving a corresponding instruction selected from an instruction set of codes, and said instruction set of codes includes:
a first set of codes for operating said RGB-D fusion module to synchronize RGB pixel data and depth pixel data, and for creating a disparity map; and
a second set of codes for creating a point cloud from said disparity map and said depth pixel data;
wherein said disparity map is generated by executing a plurality of instructions, wherein said instructions comprise performing a matching cost computation by measuring a similarity of pixels in left and right images to produce costs; aggregating said costs to form a 3-D cost map; performing disparity selection to generate a 2-D disparity map; and refining said 2-D disparity map to generate said disparity map.
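The four disparity-map steps recited in the claim (matching cost computation, cost aggregation, disparity selection, refinement) follow the standard stereo-matching decomposition, but the claim does not disclose particular algorithms for any step. The sketch below is a minimal block-matching instance, assuming absolute-difference costs, box-filter aggregation, winner-take-all selection, and 3x3 median refinement; these are illustrative choices, not the claimed FPGA implementation.

```python
import numpy as np

def disparity_map(left, right, max_disp=16, win=5):
    """Sketch of the four claimed steps on a rectified grayscale pair:
    (1) matching cost computation, (2) aggregation into a 3-D cost map,
    (3) disparity selection, (4) refinement of the 2-D disparity map."""
    h, w = left.shape
    k = win // 2
    big = 1e6  # penalty for candidate disparities that fall outside the image

    # 1) Matching cost: absolute difference per pixel and candidate disparity,
    #    comparing left pixel x with right pixel x - d.
    costs = np.full((max_disp, h, w), big)
    for d in range(max_disp):
        costs[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])

    # 2) Aggregation: box-filter each cost slice (sum over a win x win window).
    pad = np.pad(costs, ((0, 0), (k, k), (k, k)), mode='edge')
    agg = np.zeros_like(costs)
    for dy in range(win):
        for dx in range(win):
            agg += pad[:, dy:dy + h, dx:dx + w]

    # 3) Disparity selection: winner-take-all over the aggregated cost volume.
    disp = np.argmin(agg, axis=0)

    # 4) Refinement: 3x3 median filter to suppress speckle in the 2-D map.
    dpad = np.pad(disp, 1, mode='edge')
    windows = [dpad[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.median(np.stack(windows), axis=0).astype(int)
```

On a synthetic pair in which the right image is the left image shifted by a constant number of columns, the recovered disparity equals that shift away from the image borders.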
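The second set of codes creates a point cloud from the disparity map, but the claim does not give the back-projection math. A common pinhole-stereo relation, Z = f * B / d, can serve as a sketch; the focal length `f`, `baseline`, and principal point `cx`, `cy` are hypothetical calibration parameters introduced here for illustration, not taken from the patent.

```python
import numpy as np

def disparity_to_point_cloud(disp, f, baseline, cx, cy):
    """Back-project a 2-D disparity map into an N x 3 point cloud using the
    pinhole stereo relation Z = f * baseline / d (parameters are assumed)."""
    h, w = disp.shape
    ys, xs = np.mgrid[0:h, 0:w]
    valid = disp > 0                      # zero disparity maps to infinity; skip
    z = f * baseline / disp[valid]        # depth from disparity
    x = (xs[valid] - cx) * z / f          # lateral offset from principal point
    y = (ys[valid] - cy) * z / f
    return np.column_stack([x, y, z])
```

For example, a uniform disparity of 2 pixels with f = 100 and a 0.1 m baseline places every point at depth Z = 100 * 0.1 / 2 = 5 m.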