US 12,147,771 B2
Topical vector-quantized variational autoencoders for extractive summarization of video transcripts
Sangwoo Cho, Sammamish, WA (US); Franck Dernoncourt, San Jose, CA (US); Timothy Jeewun Ganter, Woodinville, WA (US); Trung Huu Bui, San Jose, CA (US); Nedim Lipka, Campbell, CA (US); Varun Manjunatha, Newton, MA (US); Walter Chang, San Jose, CA (US); Hailin Jin, San Jose, CA (US); and Jonathan Brandt, Santa Cruz, CA (US)
Assigned to ADOBE INC., San Jose, CA (US)
Filed by ADOBE INC., San Jose, CA (US)
Filed on Jun. 29, 2021, as Appl. No. 17/361,878.
Prior Publication US 2022/0414338 A1, Dec. 29, 2022
Int. Cl. G06F 40/35 (2020.01); G06F 40/279 (2020.01)
CPC G06F 40/35 (2020.01) [G06F 40/279 (2020.01)]
20 Claims
 
1. A method comprising:
receiving text including an utterance;
generating a semantic embedding of the utterance using an embedding network;
generating a plurality of feature vectors based on the semantic embedding using a convolution network;
identifying a first plurality of latent codes respectively corresponding to the plurality of feature vectors by identifying a closest latent code from a second plurality of latent codes of a codebook to each corresponding feature vector of the plurality of feature vectors, wherein the second plurality of latent codes of the codebook discretizes a semantic space based on a number of dimensions of the semantic space, and wherein the closest latent code is identified by computing a similarity between the closest latent code and the corresponding feature vector;
identifying a prominent code among the first plurality of latent codes; and
generating an indication that the utterance is a summary utterance based on the prominent code.
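
The quantization and prominent-code steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the feature vectors have already been produced by the embedding and convolution networks, uses Euclidean distance as the similarity measure (the claim does not fix one), takes the most frequent code as the "prominent" code, and uses a hypothetical set summary_codes to decide whether the utterance is a summary utterance. All function and parameter names are illustrative.

```python
import math
from collections import Counter


def nearest_code(feature, codebook):
    """Index of the codebook entry closest to `feature`.

    Assumption: similarity is measured by Euclidean distance,
    so the closest code is the one with the smallest distance.
    """
    return min(range(len(codebook)),
               key=lambda k: math.dist(feature, codebook[k]))


def quantize(features, codebook):
    """Map each feature vector to the index of its closest latent code."""
    return [nearest_code(f, codebook) for f in features]


def prominent_code(codes):
    """Assumption: the prominent code is the most frequent one.

    Ties are broken in favor of the code that appears first.
    """
    return Counter(codes).most_common(1)[0][0]


def is_summary_utterance(features, codebook, summary_codes):
    """Indicate whether the utterance is a summary utterance.

    `summary_codes` is a hypothetical set of code indices treated as
    topical; the claim only says the indication is based on the
    prominent code.
    """
    codes = quantize(features, codebook)
    return prominent_code(codes) in summary_codes
```

For example, with a three-entry codebook [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)] and feature vectors [(0.9, 1.1), (0.1, 0.8), (1.2, 0.9), (0.8, 0.7)], quantize assigns the codes [1, 2, 1, 1], so code 1 is the prominent code and the utterance is flagged when summary_codes contains 1.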