US 12,231,666 B2
Front-end architecture for neural network based video coding
Hilmi Enes Egilmez, San Diego, CA (US); Ankitesh Kumar Singh, San Diego, CA (US); Muhammed Zeyd Coban, Carlsbad, CA (US); and Marta Karczewicz, San Diego, CA (US)
Assigned to QUALCOMM Incorporated, San Diego, CA (US)
Filed by QUALCOMM Incorporated, San Diego, CA (US)
Filed on Dec. 8, 2021, as Appl. No. 17/643,383.
Claims priority of provisional application 63/131,802, filed on Dec. 30, 2020.
Claims priority of provisional application 63/124,016, filed on Dec. 10, 2020.
Prior Publication US 2022/0191523 A1, Jun. 16, 2022
Int. Cl. H04N 19/42 (2014.01); G06N 3/048 (2023.01); H04N 19/124 (2014.01); H04N 19/172 (2014.01); H04N 19/186 (2014.01); H04N 19/91 (2014.01); G06N 3/045 (2023.01)
CPC H04N 19/42 (2014.11) [G06N 3/048 (2023.01); H04N 19/124 (2014.11); H04N 19/172 (2014.11); H04N 19/186 (2014.11); H04N 19/91 (2014.11); G06N 3/045 (2023.01); G06T 2207/10016 (2013.01); G06T 2207/20084 (2013.01)]
60 Claims
 
1. A method of processing video data, the method comprising:
generating, by a first convolutional layer of an encoder sub-network of a neural network system, output values associated with a luminance channel of a frame;
generating, by a second convolutional layer of the encoder sub-network, output values associated with at least one chrominance channel of the frame;
generating, by a third convolutional layer based on the output values associated with the luminance channel of the frame and the output values associated with the at least one chrominance channel of the frame, a combined representation of the frame; and
generating encoded video data based on the combined representation of the frame.
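
The data flow recited in claim 1 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. Everything beyond the three convolutional layers is an assumption made for illustration: a 4:2:0 frame layout (chrominance at half the luminance resolution), a stride-2 luminance layer so both branches reach the same spatial size, channel concatenation ahead of a 1x1 third layer, random weights, and rounding as a stand-in for the quantization and entropy coding that would produce the encoded video data. The names conv2d, random_kernel, and encode_frame are hypothetical.

```python
import random

def conv2d(x, w, stride=1):
    """Unpadded 2D convolution (cross-correlation form).
    x: [C_in][H][W] nested lists; w: [C_out][C_in][k][k]."""
    k = len(w[0][0])
    h_out = (len(x[0]) - k) // stride + 1
    w_out = (len(x[0][0]) - k) // stride + 1
    return [[[sum(w[o][c][i][j] * x[c][r * stride + i][q * stride + j]
                  for c in range(len(x)) for i in range(k) for j in range(k))
              for q in range(w_out)]
             for r in range(h_out)]
            for o in range(len(w))]

def random_kernel(rng, c_out, c_in, k):
    """Hypothetical helper: untrained random weights, for illustration only."""
    return [[[[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]

def encode_frame(luma, chroma, w_luma, w_chroma, w_combine):
    # First convolutional layer: output values for the luminance channel.
    # Stride 2 is an assumption that brings luma down to chroma resolution.
    y_feat = conv2d([luma], w_luma, stride=2)
    # Second convolutional layer: output values for the chrominance channels.
    uv_feat = conv2d(chroma, w_chroma, stride=1)
    # Third convolutional layer: combined representation of the frame,
    # computed here over the channel-wise concatenation of both branches.
    combined = conv2d(y_feat + uv_feat, w_combine, stride=1)
    # Stand-in for "generating encoded video data": round to integers.
    return [[[round(v) for v in row] for row in ch] for ch in combined]

rng = random.Random(0)
luma = [[rng.random() for _ in range(8)] for _ in range(8)]          # 8x8 Y plane
chroma = [[[rng.random() for _ in range(4)] for _ in range(4)]
          for _ in range(2)]                                         # 4x4 U and V planes
latent = encode_frame(
    luma, chroma,
    random_kernel(rng, 2, 1, 2),   # luma layer: 1 -> 2 channels, 2x2 kernel
    random_kernel(rng, 2, 2, 1),   # chroma layer: 2 -> 2 channels, 1x1 kernel
    random_kernel(rng, 3, 4, 1),   # combining layer: 4 -> 3 channels, 1x1 kernel
)
print(len(latent), len(latent[0]), len(latent[0][0]))
```

Processing luminance and chrominance in separate first layers lets each branch use its own kernel size and stride, so planes of different resolution can be merged without resampling the input frame first.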