US 12,223,719 B2
Apparatus and method for prediction of video frame based on deep learning
Kun Fan, Seoul (KR); Chung-In Joung, Seoul (KR); Seungjun Baek, Seoul (KR); and Seunghwan Byun, Seoul (KR)
Assigned to Korea University Research and Business Foundation, Seoul (KR)
Filed by KOREA UNIVERSITY RESEARCH AND BUSINESS FOUNDATION, Seoul (KR)
Filed on Dec. 13, 2021, as Appl. No. 17/548,824.
Claims priority of application No. 10-2020-0173072 (KR), filed on Dec. 11, 2020; and application No. 10-2020-0186716 (KR), filed on Dec. 29, 2020.
Prior Publication US 2022/0189171 A1, Jun. 16, 2022
Int. Cl. G06V 20/40 (2022.01); G06N 3/08 (2023.01); G06V 10/82 (2022.01)
CPC G06V 20/46 (2022.01) [G06N 3/08 (2013.01); G06V 10/82 (2022.01)] 12 Claims
OG exemplary drawing
 
1. An apparatus for predicting a video frame, the apparatus comprising:
a level encoder configured to extract and learn at least one feature from a video frame;
a feature learning unit configured to learn based on the at least one feature or to transmit predicted feature data corresponding to the at least one feature; and
a level decoder configured to obtain and learn a predicted video frame based on the predicted feature data,
wherein the level encoder receives each of the first to (T−1)th video frames and extracts at least one feature from each of the first to (T−1)th video frames, where “T” is a natural number equal to or greater than 2,
wherein the feature learning unit is trained based on at least one feature extracted from each of the first to (T−1)th video frames,
wherein the level encoder receives the T-th video frame,
wherein the level decoder obtains a (T+1)th predicted video frame corresponding to the T-th video frame,
wherein the level encoder receives the (T+1)th predicted video frame, and
wherein the level decoder obtains a (T+2)th predicted video frame corresponding to the (T+1)th predicted video frame.
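
The claim describes an encoder / recurrent feature learner / decoder rollout: condition on frames 1 to T−1, feed the T-th frame to obtain the (T+1)th prediction, then feed that prediction back through the encoder to obtain the (T+2)th. The following is a minimal PyTorch sketch of that rollout, not the patented implementation: the module names (LevelEncoder, FeatureLearner, LevelDecoder), layer choices, and the LSTM-based feature learner are illustrative assumptions, since the claim does not specify the networks' internals.

```python
# Minimal sketch of the claimed rollout (illustrative only; module internals
# are assumptions, not taken from the patent specification).
import torch
import torch.nn as nn


class LevelEncoder(nn.Module):
    """Extracts a feature map from a single video frame."""
    def __init__(self, in_ch=3, feat_ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
        )

    def forward(self, frame):
        return self.net(frame)


class FeatureLearner(nn.Module):
    """Learns temporal dynamics over per-frame features and emits predicted
    feature data for the next frame (approximated here with an LSTM cell
    over the flattened feature map)."""
    def __init__(self, feat_dim):
        super().__init__()
        self.rnn = nn.LSTMCell(feat_dim, feat_dim)

    def forward(self, feat, state=None):
        h, c = self.rnn(feat.flatten(1), state)
        return h.view_as(feat), (h, c)


class LevelDecoder(nn.Module):
    """Reconstructs a predicted video frame from predicted feature data."""
    def __init__(self, feat_ch=16, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(feat_ch, out_ch, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, feat):
        return self.net(feat)


def rollout(frames, encoder, learner, decoder, extra_steps=2):
    """frames: (B, T, C, H, W). Conditions on the first T-1 frames, then
    predicts the (T+1)th and (T+2)th frames by feeding each prediction
    back through the encoder."""
    state = None
    # Condition on the first T-1 observed frames.
    for t in range(frames.size(1) - 1):
        _, state = learner(encoder(frames[:, t]), state)

    current = frames[:, -1]          # the T-th observed frame
    preds = []
    for _ in range(extra_steps):
        pred_feat, state = learner(encoder(current), state)
        current = decoder(pred_feat)  # predicted next frame
        preds.append(current)
    return preds                      # [(T+1)th, (T+2)th predictions]


# Toy usage with hypothetical shapes: four observed 16x16 RGB frames.
B, T, C, H, W = 1, 4, 3, 16, 16
enc, dec = LevelEncoder(C, 16), LevelDecoder(16, C)
feat_dim = 16 * (H // 4) * (W // 4)   # two stride-2 convs downsample by 4
learner = FeatureLearner(feat_dim)
preds = rollout(torch.rand(B, T, C, H, W), enc, learner, dec)
```

The feedback loop in rollout() corresponds to the last two wherein clauses: each predicted frame is re-encoded so the decoder can produce the frame after it.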