US 12,236,556 B2
Video resolution enhancement method, storage medium, and electronic device
Lijie Zhang, Beijing (CN); and Dan Zhu, Beijing (CN)
Assigned to BOE TECHNOLOGY GROUP CO., LTD., Beijing (CN)
Appl. No. 17/762,199
Filed by BOE TECHNOLOGY GROUP CO., LTD., Beijing (CN)
PCT Filed Apr. 19, 2021, PCT No. PCT/CN2021/088187 § 371(c)(1), (2) Date Mar. 21, 2022, PCT Pub. No. WO2021/213340, PCT Pub. Date Oct. 28, 2021.
Claims priority of application No. 202010326998.X (CN), filed on Apr. 23, 2020.
Prior Publication US 2022/0292638 A1, Sep. 15, 2022
Int. Cl. G06T 3/4053 (2024.01); G06T 5/20 (2006.01); G06V 10/82 (2022.01)

CPC G06T 3/4053 (2013.01) [G06T 5/20 (2013.01); G06V 10/82 (2022.01)]

20 Claims

1. A method for enhancing a video resolution, comprising:

obtaining multiple frames of images as input data, and obtaining initial data by performing feature extraction on the input data using a first three-dimensional convolutional layer;

obtaining first feature data by performing down-sampling on the initial data at a preset multiple;

obtaining first reference data by performing a convolution operation on the first feature data using a second three-dimensional convolutional layer to merge the first feature data into one frame; and

obtaining first output data by performing up-sampling on the first reference data at the preset multiple;

wherein the method further comprises:

performing an Nth super-resolution operation on the first feature data, the super-resolution operation comprising a down-sampling operation, a first feature extraction operation, a merging operation, a second feature extraction operation, and an up-sampling operation, wherein

the down-sampling operation comprises performing down-sampling on the first feature data at the preset multiple;

the first feature extraction operation comprises performing the first feature extraction operation on the down-sampled first feature data by using the first three-dimensional convolutional layer to obtain third feature data;

the merging operation comprises performing a convolution operation on the third feature data by using the second three-dimensional convolutional layer to merge the third feature data into one frame to obtain second reference data;

the second feature extraction operation comprises performing the second feature extraction operation on stacked data of the second reference data and an (N+1)th output result by using the first three-dimensional convolutional layer to obtain fourth feature data; and

the up-sampling operation comprises performing up-sampling on the fourth feature data at the preset multiple to obtain third output data; and

updating the first reference data with the third output data;

wherein an input of the Nth down-sampling operation is an output of the first feature extraction operation of the (N−1)th super-resolution operation, and N is a positive integer starting from 1.
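
The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the claim does not specify kernel sizes, weights, channel counts, or the sampling methods, so the sketch assumes single-channel frames, fixed (not learned) kernels, average pooling for down-sampling, and nearest-neighbour repetition for up-sampling. All function names (conv3d, downsample, upsample, base_branch, chained_features) are hypothetical. The sketch covers the four base steps and the chaining rule in which each down-sampling operation consumes the previous feature extraction output; it omits the second feature extraction on stacked data and the update of the first reference data.

```python
def conv3d(vol, kernel, pad_t):
    """Naive single-channel 3D cross-correlation with zero padding.
    Height and width keep their size; time is padded by pad_t per side."""
    T, H, W = len(vol), len(vol[0]), len(vol[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    ph, pw = kh // 2, kw // 2
    out_t = T + 2 * pad_t - kt + 1
    out = [[[0.0] * W for _ in range(H)] for _ in range(out_t)]
    for t in range(out_t):
        for y in range(H):
            for x in range(W):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            tt, yy, xx = t + dt - pad_t, y + dy - ph, x + dx - pw
                            if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                                acc += kernel[dt][dy][dx] * vol[tt][yy][xx]
                out[t][y][x] = acc
    return out


def downsample(vol, s):
    """Assumed down-sampling: average-pool every frame by factor s."""
    return [[[sum(f[y * s + i][x * s + j] for i in range(s) for j in range(s)) / (s * s)
              for x in range(len(f[0]) // s)]
             for y in range(len(f) // s)]
            for f in vol]


def upsample(vol, s):
    """Assumed up-sampling: nearest-neighbour repetition by factor s."""
    return [[[f[y // s][x // s] for x in range(len(f[0]) * s)]
             for y in range(len(f) * s)]
            for f in vol]


def identity_kernel(k=3):
    """Placeholder for the first 3D convolutional layer (passes data through)."""
    ker = [[[0.0] * k for _ in range(k)] for _ in range(k)]
    ker[k // 2][k // 2][k // 2] = 1.0
    return ker


def temporal_mean_kernel(t):
    """Placeholder for the second 3D layer: depth t, so t frames merge into one."""
    return [[[1.0 / t]] for _ in range(t)]


def base_branch(frames, scale, extract_k, merge_k):
    """Feature extraction, down-sampling, merging, and up-sampling in sequence."""
    initial = conv3d(frames, extract_k, pad_t=len(extract_k) // 2)  # T x H x W
    first_feature = downsample(initial, scale)                      # T x H/s x W/s
    first_reference = conv3d(first_feature, merge_k, pad_t=0)       # 1 x H/s x W/s
    first_output = upsample(first_reference, scale)                 # 1 x H x W
    return first_feature, first_reference, first_output


def chained_features(first_feature, scale, extract_k, levels):
    """Each level down-samples the previous level's extracted features."""
    feats, current = [], first_feature
    for _ in range(levels):
        current = conv3d(downsample(current, scale), extract_k,
                         pad_t=len(extract_k) // 2)
        feats.append(current)
    return feats


if __name__ == "__main__":
    # Three 8x8 frames with constant values 1.0, 2.0, 3.0.
    frames = [[[float(t + 1)] * 8 for _ in range(8)] for t in range(3)]
    feat, ref, out = base_branch(frames, 2, identity_kernel(), temporal_mean_kernel(3))
    print(len(out), len(out[0]), len(out[0][0]))
```

In this sketch the merge into one frame follows from giving the second kernel a temporal depth equal to the number of input frames and applying no temporal padding, which is one plausible reading of "merge the first feature data into one frame"; a trained network would use learned multi-channel kernels instead of the fixed placeholders above.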