| CPC G06T 15/20 (2013.01) [A63F 13/525 (2014.09); G06T 19/00 (2013.01); G06V 20/41 (2022.01); G06V 40/10 (2022.01); G06T 2210/21 (2013.01)] | 20 Claims |

|
1. A method, comprising, by at least one processor:
receiving a two-dimensional (2D) video captured by an imaging device, the imaging device having a first height and a first angle;
detecting objects in a scene frame of the 2D video using digital image processing;
determining object image coordinates of the detected objects in the scene frame;
storing the detected objects in an asset database;
deploying a virtual camera in a three-dimensional (3D) environment to create a virtual image frame in the 3D environment;
generating a scene floor in the 3D environment in a plane below the virtual camera;
adjusting the virtual camera to have a second height and a second angle to adjust the virtual image frame so that the second height and the second angle of the virtual camera match the first height and the first angle of the imaging device, wherein the adjusting the virtual camera to have the second height and the second angle at least partially comprises estimating the first height and the first angle of the imaging device that captured the 2D video, wherein the estimating is at least partially based on processing each of one or more frames of the 2D video utilizing one or more machine-learning algorithms; and
generating an extended reality (XR) coordinate location relative to the scene floor for placing a respective one detected object of the detected objects in the 3D environment, the XR coordinate location being a point of intersection, on the scene floor, of a ray cast from the virtual camera through the virtual image frame that translates to a determined object image coordinate for the respective one detected object in the scene frame,
wherein the imaging device comprises one of: a red, green, blue (RGB) camera device, a light detection and ranging (LiDAR) image capture device, a charge-coupled device (CCD) sensor device, or a complementary metal-oxide semiconductor (CMOS) sensor device.
|
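The ray-cast step recited in the claim can be illustrated with a minimal sketch. Under assumed conventions not specified in the claim — a pinhole virtual camera at height `height` above a ground plane y = 0, looking down the negative z axis and pitched downward by `pitch_deg` about the x axis, with illustrative intrinsics `fx`, `fy`, `cx`, `cy` — the XR coordinate location is the point where the ray through a detected object's image coordinate (u, v) meets the floor:

```python
import numpy as np

def floor_intersection(u, v, fx, fy, cx, cy, height, pitch_deg):
    """Cast a ray from the virtual camera through pixel (u, v) and
    return its intersection with the ground plane y = 0, or None if
    the ray never reaches the floor.

    Assumed (illustrative) setup: camera at (0, height, 0), looking
    along -z, pitched downward by pitch_deg about the x axis.
    """
    # Ray direction in camera coordinates (pinhole model, y up).
    d_cam = np.array([(u - cx) / fx, -(v - cy) / fy, -1.0])

    # Rotate the ray by the downward pitch about the x axis.
    t = np.radians(-pitch_deg)  # negative: tilt toward the floor
    rot_x = np.array([
        [1.0, 0.0, 0.0],
        [0.0, np.cos(t), -np.sin(t)],
        [0.0, np.sin(t), np.cos(t)],
    ])
    d_world = rot_x @ d_cam

    origin = np.array([0.0, height, 0.0])
    if d_world[1] >= 0:
        return None  # ray is horizontal or points upward
    # Solve origin.y + s * d_world.y = 0 for the scale s.
    s = -origin[1] / d_world[1]
    return origin + s * d_world
```

For example, with the camera 2 units above the floor and pitched 45 degrees downward, the ray through the principal point lands 2 units in front of the camera on the floor; with zero pitch, the same ray is horizontal and the function returns `None`. The choice of axes, intrinsics, and pitch convention here is an assumption for illustration, not part of the claimed method.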