US 12,394,103 B2
Class-specific neural network for video compressed sensing
Yifei Pei, Santa Clara, CA (US); Ying Liu, Santa Clara, CA (US); Nam Ling, Santa Clara, CA (US); Lingzhi Liu, San Jose, CA (US); Yongxiong Ren, San Jose, CA (US); and Ming Kai Hsu, Fremont, CA (US)
Assigned to KWAI INC., Palo Alto, CA (US); and SANTA CLARA UNIVERSITY, Santa Clara, CA (US)
Filed by KWAI INC., Palo Alto, CA (US); and SANTA CLARA UNIVERSITY, Santa Clara, CA (US)
Filed on Mar. 15, 2022, as Appl. No. 17/695,684.
Claims priority of provisional application 63/161,431, filed on Mar. 15, 2021.
Prior Publication US 2022/0292727 A1, Sep. 15, 2022
Int. Cl. G06K 9/00 (2022.01); G06T 9/00 (2006.01); H04N 19/176 (2014.01); H04N 19/625 (2014.01)
CPC G06T 9/002 (2013.01) [H04N 19/176 (2014.11); H04N 19/625 (2014.11)]
7 Claims
 
1. A method for video compressed sensing by a class-specific neural network, comprising:
classifying, by a Gaussian-mixture model (GMM), video frame blocks with a plurality of clusters and assigning the video frame blocks to the plurality of clusters;
receiving, by a plurality of encoders, the video frame blocks;
generating, by the plurality of encoders, a plurality of compressed-sensed frame block vectors, wherein the plurality of encoders respectively correspond to the plurality of clusters,
wherein each encoder comprises a flatten layer, a discrete cosine transform (DCT) transform layer, and a trainable compressed sensing layer;
predicting, by a logistic regression classifier, class labels for the plurality of compressed-sensed frame block vectors without recording clustering information, wherein the logistic regression classifier predicts the class labels by respectively maximizing probabilities of the plurality of compressed-sensed frame block vectors;
sending, by the logistic regression classifier, the compressed-sensed frame block vectors to a plurality of decoders, and
wherein the plurality of compressed-sensed frame block vectors are assigned to the plurality of decoders based on the class labels for the plurality of compressed-sensed frame block vectors, wherein the class labels are saved in a hashmap data structure.
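The data flow recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: all function and class names (`dct_matrix`, `gmm_assign`, `Encoder`, `predict_label`, `route`) are hypothetical, the GMM is assumed to have diagonal covariances with parameters supplied by the caller, the sensing matrix is random rather than trained, every encoder is assumed to output vectors of the same length, and the classifier weights are placeholders.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def gmm_assign(x, weights, means, variances):
    """Pick the most likely mixture component for a flattened block x.
    Assumption: diagonal covariances; means and variances have shape (K, n)."""
    log_lik = np.log(weights) - 0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances, axis=1
    )
    return int(np.argmax(log_lik))

class Encoder:
    """One encoder per cluster: flatten -> DCT -> linear sensing layer."""

    def __init__(self, block_size, m, rng):
        d1 = dct_matrix(block_size)
        # 2-D DCT applied to a row-major flattened block as a single matrix.
        self.dct = np.kron(d1, d1)
        n = block_size * block_size
        # Stand-in for the trainable compressed sensing layer (random, untrained).
        self.phi = rng.standard_normal((m, n)) / np.sqrt(m)

    def __call__(self, block):
        x = block.reshape(-1)   # flatten layer
        s = self.dct @ x        # DCT layer
        return self.phi @ s     # compressed sensing layer

def predict_label(W, b, y):
    """Multinomial logistic regression prediction for one vector y.
    The argmax of the softmax probabilities equals the argmax of the logits."""
    return int(np.argmax(W @ y + b))

def route(vectors, W, b):
    """Predict a class label per vector and group vector indices by label."""
    labels = {}   # hashmap: vector index -> predicted class label
    queues = {}   # hashmap: class label -> indices destined for that decoder
    for idx, y in enumerate(vectors):
        c = predict_label(W, b, y)
        labels[idx] = c
        queues.setdefault(c, []).append(idx)
    return labels, queues

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, B, M = 3, 8, 16                      # clusters, block size, measurements
    blocks = rng.random((5, B, B))          # toy frame blocks
    weights = np.full(K, 1.0 / K)           # placeholder GMM parameters
    means = rng.random((K, B * B))
    variances = np.ones((K, B * B))
    encoders = [Encoder(B, M, rng) for _ in range(K)]
    # Encoder side: cluster each block, then encode it with that cluster's encoder.
    # The cluster index is not kept alongside the encoded vector.
    vectors = [
        encoders[gmm_assign(blk.reshape(-1), weights, means, variances)](blk)
        for blk in blocks
    ]
    # Decoder side: labels are re-predicted from the vectors alone.
    W, b = rng.standard_normal((K, M)), np.zeros(K)   # placeholder classifier
    labels, queues = route(vectors, W, b)
    print(labels, queues)
```

In this sketch only the compressed-sensed vectors cross from the encoder side to the decoder side; the decoder side recovers a class label for each vector from the classifier and stores it in a Python dict, which plays the role of the hashmap named in the claim.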