US 12,254,679 B2
Systems and methods for generating three-dimensional annotations for training a machine learning model
Gellert Nacsa, Nagykovácsi (HU); Domonkos Huszar, Kistarcsa (HU); Felician Benda, Budapest (HU); Akos Kiss, Budapest (HU); Istvan S. Horvath, Budapest (HU); Gabor Majoros, Tokaji utca (HU); and Csaba Rekeczky, Monte Sereno, CA (US)
Assigned to Verizon Patent and Licensing Inc., Basking Ridge, NJ (US)
Filed by Verizon Patent and Licensing Inc., Basking Ridge, NJ (US)
Filed on Mar. 2, 2022, as Appl. No. 17/653,213.
Prior Publication US 2023/0281975 A1, Sep. 7, 2023
Int. Cl. G06T 7/00 (2017.01); G06T 7/70 (2017.01); G06T 7/80 (2017.01); G06V 10/24 (2022.01); G06V 10/26 (2022.01); G06V 10/776 (2022.01)
CPC G06V 10/776 (2022.01) [G06T 7/80 (2017.01); G06V 10/248 (2022.01); G06V 10/26 (2022.01); G06T 2207/10012 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]
20 Claims
 
1. A method, comprising:
receiving, by a device, a video and corresponding camera information associated with a camera that captured the video;
selecting, by the device, an object in the video and a wire model for the object;
adjusting, by the device, one or more of an orientation, a location, or a size of the wire model to align the wire model on the object in a frame of the video, based on the corresponding camera information and to generate an adjusted wire model;
identifying, by the device, the object in another frame of the video;
aligning, by the device, the adjusted wire model on the object in the other frame;
interpolating, by the device, the adjusted wire model for the object for intermediate frames of the video between the frame and the other frame, wherein the interpolation is based on three-dimensional position data of the object, three-dimensional orientation data of the object, and lens distortion of the camera;
generating, by the device, three-dimensional annotations for the video based on the adjusted wire models for the frame, the intermediate frames, and the other frame; and
training, by the device, a machine learning model based on the three-dimensional annotations.
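
The interpolating step of claim 1 can be illustrated with a minimal sketch. The claim states only that the interpolation is based on three-dimensional position data, three-dimensional orientation data, and lens distortion of the camera; it does not give formulas. The sketch below therefore assumes linear interpolation of position, spherical linear interpolation (slerp) of a unit-quaternion orientation, and a pinhole projection with a single radial distortion coefficient. All function names and parameters (`lerp`, `slerp`, `rotate`, `project`, `interpolate_wire_model`, `focal`, `center`, `k1`) are hypothetical and are not taken from the patent.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between two tuples of equal length."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # flip one quaternion so the shorter arc is used
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:  # nearly identical: normalized lerp avoids dividing by ~0
        q = lerp(q0, q1, t)
        n = math.sqrt(sum(c * c for c in q))
        return tuple(c / n for c in q)
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

def rotate(q, v):
    """Rotate 3D vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q.xyz, v)
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    # v' = v + w * t + cross(q.xyz, t)
    return (vx + w * tx + (y * tz - z * ty),
            vy + w * ty + (z * tx - x * tz),
            vz + w * tz + (x * ty - y * tx))

def project(point, focal, center, k1):
    """Pinhole projection of a camera-frame point (z forward, z > 0).

    Assumption: one-coefficient radial distortion, scale = 1 + k1 * r^2.
    """
    x, y, z = point
    xn, yn = x / z, y / z
    scale = 1 + k1 * (xn * xn + yn * yn)
    return (focal * xn * scale + center[0], focal * yn * scale + center[1])

def interpolate_wire_model(vertices, key_a, key_b, frame, focal, center, k1):
    """Project wire-model vertices for an intermediate frame.

    key_a and key_b are (frame_index, position, quaternion) tuples taken from
    the two frames in which the wire model was aligned on the object.
    """
    frame_a, pos_a, rot_a = key_a
    frame_b, pos_b, rot_b = key_b
    t = (frame - frame_a) / (frame_b - frame_a)
    pos = lerp(pos_a, pos_b, t)    # interpolated 3D position
    rot = slerp(rot_a, rot_b, t)   # interpolated 3D orientation
    pixels = []
    for v in vertices:
        rx, ry, rz = rotate(rot, v)
        cam_point = (rx + pos[0], ry + pos[1], rz + pos[2])
        pixels.append(project(cam_point, focal, center, k1))
    return pixels
```

Interpolating the pose in three dimensions and only then projecting through the distorted camera model keeps the wire model rigid between the two aligned frames. Interpolating the two-dimensional image coordinates directly would not account for perspective or lens distortion.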