US 12,125,146 B1
Multimodal 3D deep learning fusion system and method for reducing the need of 3D training dataset of 3D object tracking for enterprise digital twin mixed reality
Yiyong Tan, Mountain View, CA (US); Bhaskar Banerjee, Mountain View, CA (US); and Rishi Ranjan, Mountain View, CA (US)
Assigned to GridRaster, Inc., Mountain View, CA (US)
Filed by GridRaster, Inc., Mountain View, CA (US)
Filed on Jan. 13, 2022, as Appl. No. 17/575,091.
Application 17/575,091 is a continuation of application No. 17/320,968, filed on May 14, 2021, granted, now 11,250,637.
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 19/00 (2011.01); G06F 18/25 (2023.01); G06N 3/045 (2023.01); G06N 5/04 (2023.01); G06N 20/10 (2019.01)

CPC G06T 19/006 (2013.01) [G06F 18/25 (2023.01); G06N 3/045 (2023.01); G06N 5/04 (2013.01); G06N 20/10 (2019.01)]

12 Claims

1. A 3D digital twin mixed reality environment generating method comprising:

tracking, on a backend computer system having a processor, a digital twin of an actual object in a 3D scene that overlays the actual object to generate a mixed reality environment having the digital twin overlaying the actual object in the mixed reality environment;

generating the mixed reality environment including the 3D scene and the digital twin; and

wherein tracking the digital twin to the actual object in the 3D scene further comprises:

receiving, at the backend computer system having the processor and instructions wherein the processor executes the instructions, data about the 3D scene and the digital twin;

training, on the backend computer system, at least two deep learning models using at least one 3D benchmark training data set;

predicting, on the backend computer system, at least two sets of labels for the 3D scene data using the trained at least two deep learning models;

determining a first histogram for each trained deep learning model;

merging, on the backend computer system, the at least two sets of labels generated from the trained deep learning models;

training a machine learning model using the merged sets of labels from the trained deep learning models which reduce a complexity of a point cloud of the 3D scene by representing raw RGB and XYZ data of the point cloud in a histogram/distribution of labels of each 3D point; and

inferring, on the backend computer system, the digital twin for the actual object that overlays the actual object in the mixed reality environment using the trained machine learning model.
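
The label-fusion steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the two deep learning models are assumed to be already trained, and their per-point label predictions are given as plain lists. The names `NUM_LABELS`, `label_histogram`, `merged_features`, and `NearestCentroid`, as well as the sample labels and the "engine"/"wing" targets, are hypothetical. Merging is assumed here to mean concatenating the per-model histograms, and a nearest-centroid classifier stands in for the unspecified machine learning model.

```python
from collections import Counter

NUM_LABELS = 3  # hypothetical size of the label vocabulary shared by both models


def label_histogram(labels, num_labels=NUM_LABELS):
    """Normalized histogram of the per-point labels predicted by one model."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(num_labels)]


def merged_features(labels_a, labels_b):
    """Merge two label sets by concatenating their histograms (an assumption)."""
    return label_histogram(labels_a) + label_histogram(labels_b)


class NearestCentroid:
    """Stand-in for the downstream machine learning model (an assumption)."""

    def fit(self, features, targets):
        sums, counts = {}, Counter(targets)
        for feature, target in zip(features, targets):
            acc = sums.setdefault(target, [0.0] * len(feature))
            for i, value in enumerate(feature):
                acc[i] += value
        # Average the feature vectors of each target class.
        self.centroids = {
            t: [v / counts[t] for v in acc] for t, acc in sums.items()
        }
        return self

    def predict(self, feature):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(feature, centroid))

        return min(self.centroids, key=lambda t: sq_dist(self.centroids[t]))


# Hypothetical per-point labels from two trained models, one tuple per scene:
# (labels from model A, labels from model B, object identity).
train_scenes = [
    ([0, 0, 0, 1], [0, 0, 1, 1], "engine"),
    ([0, 0, 1, 0], [0, 0, 0, 1], "engine"),
    ([2, 2, 1, 2], [2, 2, 2, 1], "wing"),
    ([2, 1, 2, 2], [2, 2, 2, 2], "wing"),
]
X = [merged_features(a, b) for a, b, _ in train_scenes]
y = [target for _, _, target in train_scenes]

model = NearestCentroid().fit(X, y)
prediction = model.predict(merged_features([0, 1, 0, 0], [0, 0, 0, 0]))
print(prediction)
```

In this sketch each scene is reduced from raw per-point data to a six-element vector of label frequencies, which is the kind of complexity reduction the claim attributes to the histogram/distribution representation.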