US 12,223,595 B2
Method and system for mixing static scene and live annotations for efficient labeled image dataset collection
Matthew A. Shreve, Mountain View, CA (US); and Jeyasri Subramanian, Sunnyvale, CA (US)
Assigned to Xerox Corporation, Norwalk, CT (US)
Filed by Palo Alto Research Center Incorporated, Palo Alto, CA (US)
Filed on Aug. 2, 2022, as Appl. No. 17/879,480.
Prior Publication US 2024/0046568 A1, Feb. 8, 2024
Int. Cl. G06T 17/20 (2006.01); G06V 10/70 (2022.01); G06V 20/64 (2022.01); G09G 5/37 (2006.01)
CPC G06T 17/20 (2013.01) [G06V 20/64 (2022.01); G06T 2219/004 (2013.01)] 19 Claims
OG exemplary drawing
 
1. A computer-implemented method, comprising:
obtaining, by a first augmented reality (AR) recording device based on input from one or more sensors, a three-dimensional (3D) mesh of a scene with a plurality of physical objects;
marking, by the first AR recording device while in an online mode, first annotations for a physical object displayed in the 3D mesh,
wherein marking while in the online mode comprises a user associated with the first AR recording device moving around the scene and marking the first annotations in a live view setting by using tools on the first AR recording device;
switching from the online mode to an offline mode by setting a mode on the first AR recording device;
displaying, on the first AR recording device while in the offline mode, the 3D mesh including a first projection indicating a two-dimensional (2D) bounding area corresponding to the marked first annotations;
marking, by the first AR recording device while in the offline mode, second annotations for the physical object or another physical object displayed in the 3D mesh,
wherein marking while in the offline mode comprises the user marking the second annotations in a static view setting by using the tools on the first AR recording device or an input device;
switching from the offline mode to the online mode by setting the mode on the first AR recording device;
displaying, on the first AR recording device while in the online mode, the 3D mesh including a second projection indicating a 2D bounding area corresponding to the marked second annotations; and
training a machine learning model based on the first annotations marked while in the online mode on the first AR recording device and the second annotations marked while in the offline mode.
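
The claim's offline/online display steps turn annotations marked on the 3D mesh into a 2D bounding area shown on the device. A minimal sketch of one way that projection could work is below; the function name, pinhole-camera assumptions, and parameter layout are illustrative and not taken from the patent.

```python
# Hypothetical sketch: projecting 3D annotation points marked on the scene mesh
# into a 2D bounding area for a given camera pose. Names and conventions here
# (project_bbox, world-to-camera extrinsics) are assumptions for illustration.
import numpy as np

def project_bbox(points_3d, K, R, t, image_size):
    """Project Nx3 world-space annotation points into a 2D bounding box.

    points_3d  : (N, 3) array of 3D points marked on the mesh (world frame)
    K          : (3, 3) camera intrinsic matrix
    R, t       : world-to-camera rotation (3, 3) and translation (3,)
    image_size : (width, height) of the AR device's camera frame
    Returns (x_min, y_min, x_max, y_max) clipped to the image, or None if the
    annotation lies entirely behind the camera.
    """
    pts_cam = points_3d @ R.T + t          # world -> camera coordinates
    in_front = pts_cam[:, 2] > 1e-6        # keep only points in front of the camera
    if not np.any(in_front):
        return None
    pts_cam = pts_cam[in_front]
    pts_img = (K @ pts_cam.T).T            # pinhole projection
    pts_img = pts_img[:, :2] / pts_img[:, 2:3]
    w, h = image_size
    x_min, y_min = np.clip(pts_img.min(axis=0), [0, 0], [w - 1, h - 1])
    x_max, y_max = np.clip(pts_img.max(axis=0), [0, 0], [w - 1, h - 1])
    return float(x_min), float(y_min), float(x_max), float(y_max)
```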
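
The final step trains a model on annotations gathered in both the live (online) pass and the static-mesh (offline) pass. The sketch below merely pools the two sources into one COCO-style label file that a standard object detector could consume; the record layout, field names, and provenance tag are assumptions, not the patent's data format.

```python
# Hypothetical sketch: pooling bounding-box annotations from the online (live view)
# and offline (static mesh) passes into a single COCO-style dataset for training.
import json

def merge_annotations(online_records, offline_records, out_path="labels.json"):
    """Each record: {"image": str, "bbox": [x_min, y_min, x_max, y_max], "label": str}."""
    images, annotations, categories, image_ids = [], [], {}, {}
    for source, records in (("online", online_records), ("offline", offline_records)):
        for rec in records:
            img_id = image_ids.setdefault(rec["image"], len(image_ids))
            if img_id == len(images):                  # first time this image is seen
                images.append({"id": img_id, "file_name": rec["image"]})
            cat_id = categories.setdefault(rec["label"], len(categories))
            x0, y0, x1, y1 = rec["bbox"]
            annotations.append({
                "id": len(annotations),
                "image_id": img_id,
                "category_id": cat_id,
                "bbox": [x0, y0, x1 - x0, y1 - y0],    # COCO uses x, y, width, height
                "source": source,                      # keep provenance of each mark
            })
    coco = {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
    }
    with open(out_path, "w") as f:
        json.dump(coco, f)
    return coco
```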