| CPC G06T 3/4053 (2013.01) [G06T 5/20 (2013.01); G06V 10/82 (2022.01)] | 20 Claims |

|
1. A method for enhancing a video resolution, comprising:
obtaining multiple frames of images as input data, and obtaining initial data by performing feature extraction on the input data using a first three-dimensional convolutional layer;
obtaining first feature data by performing down-sampling on the initial data at a preset multiple;
obtaining first reference data by performing a convolution operation on the first feature data using a second three-dimensional convolutional layer to merge the first feature data into one frame; and
obtaining first output data by performing up-sampling on the first reference data at the preset multiple;
wherein the method further comprises:
performing an Nth super-resolution operation on the first feature data, the super-resolution operation comprising a down-sampling operation, a first feature extraction operation, a merging operation, a second feature extraction operation, and an up-sampling operation, wherein
the down-sampling operation comprises performing down-sampling on the first feature data at the preset multiple;
the first feature extraction operation comprises performing feature extraction on the down-sampled first feature data by using the first three-dimensional convolutional layer to obtain third feature data;
the merging operation comprises performing a convolution operation on the third feature data by using the second three-dimensional convolutional layer to merge the third feature data into one frame to obtain second reference data;
the second feature extraction operation comprises performing feature extraction on stacked data of the second reference data and an (N+1)th output result by using the first three-dimensional convolutional layer to obtain fourth feature data; and
the up-sampling operation comprises performing up-sampling on the fourth feature data at the preset multiple to obtain third output data; and
updating the first reference data with the third output data;
wherein an input of the Nth down-sampling operation is an output of the first feature extraction operation of the (N−1)th super-resolution operation, and N is a positive integer starting from 1.
|
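
The first four steps of claim 1 (feature extraction, down-sampling, merging into one frame, up-sampling) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. Everything here is an assumption made for illustration: the function names `conv3d`, `downsample`, `upsample`, and `base_branch`, the single-channel data layout, average pooling as the down-sampler, nearest-neighbour repetition as the up-sampler, and fixed kernels in place of learned weights. The claim does not specify any of these. The iterative super-resolution operations recited after "wherein the method further comprises" are not covered.

```python
from typing import List, Tuple

# Assumed layout: one channel, indexed [frame][row][col].
Volume = List[List[List[float]]]


def conv3d(x: Volume, k: Volume, merge_time: bool = False) -> Volume:
    """Single-channel 3D cross-correlation with zero padding in space.

    merge_time=False: time is zero-padded too, so the frame count is kept
    (odd kernel sizes assumed).
    merge_time=True: no temporal padding, so a kernel as deep as the input
    collapses all frames into one.
    """
    T, H, W = len(x), len(x[0]), len(x[0][0])
    kt, kh, kw = len(k), len(k[0]), len(k[0][0])
    pt = 0 if merge_time else kt // 2
    ph, pw = kh // 2, kw // 2
    out_t = T - kt + 1 + 2 * pt
    out = [[[0.0] * W for _ in range(H)] for _ in range(out_t)]
    for t in range(out_t):
        for i in range(H):
            for j in range(W):
                acc = 0.0
                for a in range(kt):
                    for b in range(kh):
                        for c in range(kw):
                            tt, ii, jj = t + a - pt, i + b - ph, j + c - pw
                            if 0 <= tt < T and 0 <= ii < H and 0 <= jj < W:
                                acc += k[a][b][c] * x[tt][ii][jj]
                out[t][i][j] = acc
    return out


def downsample(x: Volume, s: int) -> Volume:
    """Average-pool each frame by the preset multiple s (assumed method)."""
    H, W = len(x[0]), len(x[0][0])
    return [[[sum(f[i * s + a][j * s + b] for a in range(s) for b in range(s)) / (s * s)
              for j in range(W // s)] for i in range(H // s)] for f in x]


def upsample(x: Volume, s: int) -> Volume:
    """Nearest-neighbour up-sampling of each frame by s (assumed method)."""
    H, W = len(x[0]), len(x[0][0])
    return [[[f[i // s][j // s] for j in range(W * s)] for i in range(H * s)] for f in x]


def base_branch(frames: Volume, k_feat: Volume, k_merge: Volume,
                s: int) -> Tuple[Volume, Volume, Volume]:
    """Feature extraction, down-sampling, merging, and up-sampling in order."""
    initial = conv3d(frames, k_feat)                            # initial data
    first_feature = downsample(initial, s)                      # first feature data
    first_reference = conv3d(first_feature, k_merge, merge_time=True)  # one frame
    first_output = upsample(first_reference, s)                 # first output data
    return first_feature, first_reference, first_output


# Usage: three 4x4 frames, an identity feature kernel, a temporal-mean merge kernel.
frames = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
feat, ref, out = base_branch(frames, [[[1.0]]], [[[1 / 3]], [[1 / 3]], [[1 / 3]]], 2)
print(len(feat), len(ref), len(out), len(out[0]), len(out[0][0]))
```

With a preset multiple of 2, the three 4x4 input frames become three 2x2 feature frames, the merge kernel (as deep as the frame count, with no temporal padding) reduces them to a single 2x2 frame, and up-sampling restores a single 4x4 frame.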