US 12,190,867 B2
Keyword detection for audio content
Zvi Figov, Modiin (IL)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on May 31, 2022, as Appl. No. 17/804,603.
Claims priority of provisional application 63/363,283, filed on Apr. 20, 2022.
Prior Publication US 2023/0343329 A1, Oct. 26, 2023
Int. Cl. G10L 15/22 (2006.01); G06F 40/279 (2020.01); G06F 40/40 (2020.01); G10L 15/04 (2013.01); G10L 15/08 (2006.01); G10L 25/57 (2013.01)
CPC G10L 15/08 (2013.01) [G06F 40/279 (2020.01); G06F 40/40 (2020.01); G10L 15/04 (2013.01); G10L 15/22 (2013.01); G10L 25/57 (2013.01); G10L 2015/088 (2013.01)] 20 Claims
 
1. A method for detecting keywords for audio content, the method comprising:
segmenting the audio content into a plurality of audio segments;
generating a plurality of text segments corresponding to the plurality of audio segments;
generating a plurality of phrase candidate values using a textual analysis of the plurality of text segments;
generating a plurality of sentence embedding values using a sentence embedding analysis of the plurality of text segments;
calculating an average sentence embedding value using the plurality of sentence embedding values;
comparing each phrase candidate value of the plurality of phrase candidate values to the average sentence embedding value;
labeling each phrase candidate value having a comparison value above a threshold value as a keyword; and
presenting the keyword as a stream of the audio content progresses.
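
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the audio has already been segmented and transcribed, so it starts from text segments. Every name in it (toy_embed, phrase_candidates, detect_keywords, stream_keywords, DIM, STOPWORDS) is hypothetical. The hashed bag-of-words vector stands in for a real sentence embedding model, cosine similarity is assumed as the comparison, and the threshold of 0.3 is arbitrary.

```python
import math
import re
import zlib

DIM = 64  # hypothetical embedding size
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "as", "for", "on"}


def tokens(text):
    """Lowercase word tokens of a text segment."""
    return re.findall(r"[a-z']+", text.lower())


def toy_embed(text):
    """Stand-in for a sentence embedding model (assumption):
    hashed bag-of-words vector, L2-normalized."""
    vec = [0.0] * DIM
    for word in tokens(text):
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity, assumed here as the comparison value."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def phrase_candidates(text_segments):
    """Toy textual analysis: unique non-stopword unigrams, in order of appearance."""
    seen, out = set(), []
    for segment in text_segments:
        for word in tokens(segment):
            if word not in STOPWORDS and word not in seen:
                seen.add(word)
                out.append(word)
    return out


def detect_keywords(text_segments, threshold=0.3):
    """Label candidates whose similarity to the average embedding exceeds threshold."""
    embeddings = [toy_embed(s) for s in text_segments]
    if not embeddings:
        return []
    # Average sentence embedding: component-wise mean over all segments.
    average = [sum(col) / len(embeddings) for col in zip(*embeddings)]
    return [c for c in phrase_candidates(text_segments)
            if cosine(toy_embed(c), average) > threshold]


def stream_keywords(text_segments, threshold=0.3):
    """Yield (segment_index, keyword) at the segment where each keyword first occurs."""
    keywords = set(detect_keywords(text_segments, threshold))
    shown = set()
    for index, segment in enumerate(text_segments):
        for word in tokens(segment):
            if word in keywords and word not in shown:
                shown.add(word)
                yield index, word


if __name__ == "__main__":
    segments = ["the neural network learns speech",
                "speech models use a neural network"]
    for index, keyword in stream_keywords(segments):
        print(f"segment {index}: {keyword}")
```

In this sketch the average is computed over all segments before anything is presented, which suits recorded content. A live stream would need a running average updated as each segment arrives; that variant is not shown.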