CPC G10L 25/84 (2013.01) [G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 25/18 (2013.01)]
20 Claims
1. A method comprising:
determining, by one or more computer processors coupled to memory, an audio file associated with video content;
generating a plurality of audio segments using the audio file, the plurality of audio segments comprising a first segment and a second segment;
determining that the first segment comprises first voice activity;
determining that the second segment comprises second voice activity;
determining that voice activity is present between a first timestamp associated with the first segment and a second timestamp associated with the second segment;
generating an empty subtitle file comprising an indication that the voice activity is present between the first timestamp and the second timestamp; and
generating text data representing the voice activity that is present between the first timestamp and the second timestamp.
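
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The claim does not specify a detector, segment length, or subtitle format, so the sketch makes assumptions: a simple mean-energy threshold stands in for the voice activity detector, segments have a fixed length, and the subtitle file is written in SubRip (SRT) layout. All names (`Segment`, `split_audio`, `has_voice`, `voiced_ranges`, `srt_time`, `build_subtitles`) and the `transcribe` callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Segment:
    start: float                 # segment start timestamp, in seconds
    end: float                   # segment end timestamp, in seconds
    samples: Sequence[float]     # audio samples, assumed to lie in [-1.0, 1.0]


def split_audio(samples: Sequence[float], rate: int, seg_seconds: float) -> List[Segment]:
    """Generate a plurality of fixed-length audio segments from the audio."""
    step = int(rate * seg_seconds)
    return [
        Segment(i / rate, min(i + step, len(samples)) / rate, samples[i:i + step])
        for i in range(0, len(samples), step)
    ]


def has_voice(seg: Segment, threshold: float = 0.01) -> bool:
    """Assumed stand-in detector: mean signal energy above a threshold."""
    if not seg.samples:
        return False
    energy = sum(x * x for x in seg.samples) / len(seg.samples)
    return energy > threshold


def voiced_ranges(segments: List[Segment]) -> List[Tuple[float, float]]:
    """Merge adjacent voiced segments into (first timestamp, second timestamp) ranges."""
    ranges: List[Tuple[float, float]] = []
    for seg in segments:
        if not has_voice(seg):
            continue
        if ranges and abs(ranges[-1][1] - seg.start) < 1e-9:
            # Contiguous with the previous voiced range: extend it.
            ranges[-1] = (ranges[-1][0], seg.end)
        else:
            ranges.append((seg.start, seg.end))
    return ranges


def srt_time(t: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def build_subtitles(
    ranges: List[Tuple[float, float]],
    transcribe: Optional[Callable[[float, float], str]] = None,
) -> str:
    """Build SRT text. Without `transcribe`, cues carry timestamps only (the
    'empty subtitle file'); with it, each cue is filled with text data."""
    cues = []
    for n, (start, end) in enumerate(ranges, 1):
        text = transcribe(start, end) if transcribe else ""
        cues.append(f"{n}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(cues)
```

In this sketch, calling `build_subtitles(ranges)` yields cues that mark where voice activity is present without any text, and calling it again with a `transcribe` callback (for example, a wrapper around a speech recognizer) populates those same time ranges with text.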