CPC G10L 25/57 (2013.01) [G10L 15/1815 (2013.01); G10L 15/26 (2013.01); G10L 17/00 (2013.01); G10L 25/87 (2013.01)] | 19 Claims |
1. A method of using machine natural language processing to analyze language in transcribed camera footage comprising:
extracting at least one audio segment from a body camera video track;
detecting voice activity to identify starting and ending timestamps of voice;
transcribing the at least one audio segment;
identifying audio of at least one speaker;
separating the audio of the at least one speaker;
scoring the audio of the at least one speaker after separation to identify interactions of interest;
wherein a voice detection model is used to analyze the at least one audio segment to identify the starting and ending timestamps of voice; and,
wherein each word in the at least one audio segment is assigned a start and stop time.
|
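For illustration only, the voice activity detection step recited in claim 1 (identifying starting and ending timestamps of voice in an audio segment) can be sketched with a simple frame-energy threshold. This is an assumption made for the example: the claim recites a voice detection model without specifying its type, and the function name `detect_voice_segments`, the 20 ms frame length, and the energy threshold are all hypothetical and not taken from the patent.

```python
from typing import List, Sequence, Tuple


def detect_voice_segments(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,        # hypothetical frame length, not from the claim
    threshold: float = 0.01,   # hypothetical mean-square energy threshold
) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps in seconds for runs of high-energy frames.

    A crude stand-in for a voice detection model: a frame counts as voice
    when its mean squared amplitude is at or above the threshold.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len  # trailing partial frame is ignored
    segments: List[Tuple[float, float]] = []
    start = None  # start time of the currently open voiced run, if any

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        t = i * frame_len / sample_rate  # time at the start of this frame
        if energy >= threshold and start is None:
            start = t                     # voice begins at this frame
        elif energy < threshold and start is not None:
            segments.append((start, t))   # voice ended at this frame's start
            start = None

    if start is not None:
        # Audio ended while voice was still active; close the last run.
        segments.append((start, n_frames * frame_len / sample_rate))
    return segments


if __name__ == "__main__":
    # Synthetic input: 0.1 s of silence, 0.2 s of signal, 0.1 s of silence.
    rate = 1000
    audio = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
    print(detect_voice_segments(audio, rate))
```

In a pipeline of the kind the claim describes, segments produced this way would bound the audio passed on to the transcribing and speaker separation steps. A trained voice detection model would take the place of the energy threshold used here.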