US 11,676,278 B2
Deep learning for dense semantic segmentation in video with automated interactivity and improved temporal coherence
Anthony Rhodes, Santa Clara, CA (US); and Manan Goel, Portland, OR (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Sep. 26, 2019, as Appl. No. 16/584,709.
Prior Publication US 2020/0026928 A1, Jan. 23, 2020
Int. Cl. G06K 9/00 (2022.01); G06T 7/11 (2017.01); G06T 7/20 (2017.01); G06T 7/174 (2017.01); G06T 7/00 (2017.01); G06T 7/70 (2017.01); G06V 20/40 (2022.01); G06F 18/24 (2023.01); G06F 18/21 (2023.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 10/44 (2022.01)
CPC G06T 7/11 (2017.01) [G06F 18/217 (2023.01); G06F 18/24 (2023.01); G06T 7/174 (2017.01); G06T 7/20 (2013.01); G06T 7/70 (2017.01); G06T 7/97 (2017.01); G06V 10/454 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 20/41 (2022.01); G06V 20/49 (2022.01); G06T 2200/24 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]
22 Claims
 
1. A system for providing segmentation in video comprising:
a memory to store a current video frame; and
one or more processors coupled to the memory, the one or more processors to:
generate a convolutional neural network input comprising the current video frame, a temporally previous video frame, an object of interest indicator frame comprising one or more indicators of an object of interest in the current video frame, a motion frame comprising motion indicators indicative of motion from the previous video frame to the current video frame, and a plurality of feature frames each comprising features compressed from feature layers of an object classification convolutional neural network as applied to the current video frame;
apply a segmentation convolutional neural network to the convolutional neural network input to generate a plurality of candidate segmentations of the current video frame; and
select one of the candidate segmentations as a final segmentation corresponding to the current video frame.
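
The data flow recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names, the channel counts (3-channel video frames, single-channel indicator, motion, and feature frames), the 1x1-convolution stand-in for the segmentation network, and the confidence-based selection rule are all assumptions made for illustration; the claim itself does not specify them.

```python
import numpy as np

def build_cnn_input(current, previous, indicator, motion, feature_frames):
    """Stack every per-pixel input along the channel axis, giving (H, W, C).

    Assumed shapes (hypothetical): current/previous (H, W, 3);
    indicator, motion, and each feature frame (H, W).
    """
    planes = [current, previous, indicator[..., None], motion[..., None]]
    planes += [f[..., None] for f in feature_frames]
    return np.concatenate(planes, axis=-1).astype(np.float32)

def toy_segmentation_net(cnn_input, weights):
    """Stand-in for the segmentation CNN: a 1x1 convolution plus sigmoid.

    weights has shape (C, K); each of the K outputs is one candidate
    segmentation, returned as an array of shape (K, H, W).
    """
    logits = cnn_input @ weights              # (H, W, C) @ (C, K) -> (H, W, K)
    probs = 1.0 / (1.0 + np.exp(-logits))     # per-pixel foreground likelihood
    return np.moveaxis(probs, -1, 0)

def select_final(candidates):
    """Pick one candidate. The rule here (highest mean distance from 0.5,
    i.e. the most confident map) is an assumption, not taken from the claim."""
    scores = [np.abs(c - 0.5).mean() for c in candidates]
    return candidates[int(np.argmax(scores))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h, w = 4, 5
    current = rng.random((h, w, 3))
    previous = rng.random((h, w, 3))
    indicator = np.zeros((h, w))
    indicator[1, 2] = 1.0                     # one positive "click" on the object
    motion = rng.random((h, w))
    features = [rng.random((h, w)) for _ in range(2)]

    x = build_cnn_input(current, previous, indicator, motion, features)
    candidates = toy_segmentation_net(x, rng.standard_normal((x.shape[-1], 3)))
    final = select_final(candidates)
    print(x.shape, candidates.shape, final.shape)
```

In this sketch the three claimed operations map onto the three functions: input generation, candidate generation, and selection. A real system would replace the toy network with a trained segmentation CNN and would use its own selection mechanism.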