US 12,444,419 B1
Method and apparatus for generating text from audio
Manbinder Pal Singh, Coral Springs, FL (US)
Filed by Citrix Systems, Inc., Fort Lauderdale, FL (US)
Filed on Dec. 16, 2021, as Appl. No. 17/644,608.
Int. Cl. G10L 13/08 (2013.01); G10L 15/18 (2013.01); G10L 15/26 (2006.01)
CPC G10L 15/26 (2013.01) [G10L 15/1815 (2013.01)]
17 Claims
 
1. A method comprising:
detecting a first event with use of a sensor, the first event to occur while audio data is output on a computing device, the audio data including speech;
generating a transcript of the audio data using a speech-to-text engine;
identifying a first location in the transcript of the speech from the audio data based on a signal from the sensor that detected an occurrence of the first event;
identifying a portion of the transcript that includes the first location based on one or more timestamps that define the portion of the transcript relative to the first location;
generating a link to audio that is associated with the extracted portion of the transcript; and
providing the extracted portion of the transcript and the link to an application;
wherein the identified portion of the transcript starts at the first location, and an extracted portion includes a semantically-continuous block of sentences, which the first location is part of, the semantically-continuous block of sentences including a sentence that is being output during detection of the first event.
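
The steps recited in claim 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation. The `Sentence` record, the integer `topic` label that stands in for semantic continuity, the helper names `locate`, `extract_block`, `make_link`, and `handle_event`, and the `example.invalid` URL are all hypothetical. The transcript is assumed to have already been produced by a speech-to-text engine with per-sentence timestamps, and the sensor signal is reduced to a single event time in seconds. The link uses the W3C Media Fragments `#t=start,end` form as one possible way to point at the matching audio.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sentence:
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    topic: int    # hypothetical label; equal labels mean "semantically continuous"


def locate(transcript: List[Sentence], event_time: float) -> Optional[int]:
    """Return the index of the sentence being output at event_time, if any."""
    for i, s in enumerate(transcript):
        if s.start <= event_time < s.end:
            return i
    return None


def extract_block(transcript: List[Sentence], idx: int) -> Tuple[int, int]:
    """Grow idx into the contiguous run of sentences sharing its topic label."""
    lo = hi = idx
    topic = transcript[idx].topic
    while lo > 0 and transcript[lo - 1].topic == topic:
        lo -= 1
    while hi + 1 < len(transcript) and transcript[hi + 1].topic == topic:
        hi += 1
    return lo, hi  # inclusive bounds


def make_link(base_url: str, start: float, end: float) -> str:
    """Build a Media Fragments style link to the audio span [start, end]."""
    return f"{base_url}#t={start:g},{end:g}"


def handle_event(
    transcript: List[Sentence], event_time: float, base_url: str
) -> Optional[Tuple[str, str]]:
    """Return (extracted text, audio link) for a sensor event, or None."""
    idx = locate(transcript, event_time)
    if idx is None:
        return None  # the event did not fall inside any transcribed sentence
    lo, hi = extract_block(transcript, idx)
    text = " ".join(s.text for s in transcript[lo:hi + 1])
    link = make_link(base_url, transcript[lo].start, transcript[hi].end)
    return text, link


# Made-up demo data; none of it comes from the patent.
DEMO = [
    Sentence("Welcome, everyone.", 0.0, 4.0, topic=0),
    Sentence("Budget is next.", 4.0, 9.0, topic=1),
    Sentence("We need two more licenses.", 9.0, 15.0, topic=1),
    Sentence("Any other business?", 15.0, 20.0, topic=2),
]
DEMO_URL = "https://example.invalid/meeting.mp3"
DEMO_RESULT = handle_event(DEMO, 10.0, DEMO_URL)
```

In this sketch an event at 10 seconds falls inside the third sentence, and the extracted block grows backward to include the second sentence because both carry the same topic label. A real system would need an actual semantic segmentation step in place of the precomputed `topic` field, and the resulting text and link would then be handed to the receiving application.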