US 12,217,756 B2
Systems and methods for improved digital transcript creation using automated speech recognition
Robert Ackerman, Boca Raton, FL (US); Anthony J. Vaglica, Silver Spring, MD (US); Holli Goldman, Richboro, PA (US); Amber Hickman, Swedesboro, NJ (US); Walter Barrett, Gibbsboro, NJ (US); Cameron Turner, Palo Alto, CA (US); and Shawn Rutledge, Seattle, WA (US)
Assigned to AUDAX PRIVATE DEBT LLC, Boston, MA (US)
Filed by Magna Legal Services, LLC, Philadelphia, PA (US)
Filed on Sep. 2, 2021, as Appl. No. 17/465,509.
Application 17/465,509 is a continuation of application No. 16/570,699, filed on Sep. 13, 2019, abandoned.
Claims priority of provisional application 62/730,700, filed on Sep. 13, 2018.
Prior Publication US 2022/0059096 A1, Feb. 24, 2022
Int. Cl. G10L 15/26 (2006.01); G06F 17/18 (2006.01); G06V 40/16 (2022.01)

CPC G10L 15/26 (2013.01) [G06F 17/18 (2013.01); G06V 40/172 (2022.01)]

17 Claims

15. A device, comprising:

a memory;

a display;

a user interface; and

one or more processors operatively coupled to the memory, wherein the one or more processors are configured to execute instructions causing the one or more processors to:

obtain a video recording of a testimony given by a first deponent;

obtain a transcript of the testimony;

scan the video recording to locate one or more emotional cues and non-verbal cues;

link the located one or more emotional cues and non-verbal cues to corresponding portions of the transcript;

update the corresponding portions of the transcript with indications of the corresponding located one or more emotional cues and non-verbal cues;

update the corresponding portions of the video recording with indications of the corresponding located one or more emotional cues and non-verbal cues, wherein:

at least one of the indications comprises a graphical overlay on a frame of the video recording visible during playback of the video recording;

the graphical overlay includes a visual indicator displaying derived insights corresponding to the located one or more emotional cues and non-verbal cues, wherein the derived insights comprise a bar graph depicting an amount of the one or more emotional cues and non-verbal cues; and

play the video recording with the graphical overlay.
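
The linking, transcript-update, and bar-graph steps recited in claim 15 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes cues have already been detected and timestamped by some upstream video analysis, and that the transcript is split into time-aligned segments; the names `Segment`, `Cue`, `link_cues`, `annotate`, `cue_counts`, and `text_bars` are hypothetical and do not appear in the patent. Rendering the overlay onto video frames is out of scope here, so the bar graph is approximated as text.

```python
from bisect import bisect_right
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A time-aligned portion of the transcript (hypothetical structure)."""
    start: float  # seconds from the start of the recording
    end: float
    text: str
    cues: list = field(default_factory=list)


@dataclass
class Cue:
    """An emotional or non-verbal cue located in the video (hypothetical)."""
    time: float  # seconds from the start of the recording
    label: str


def link_cues(segments, cues):
    """Attach each cue to the segment whose time span contains it.

    Assumes segments are sorted by start time and do not overlap.
    Cues that fall outside every segment are left unlinked.
    """
    starts = [s.start for s in segments]
    for cue in cues:
        i = bisect_right(starts, cue.time) - 1
        if i >= 0 and cue.time < segments[i].end:
            segments[i].cues.append(cue.label)
    return segments


def annotate(segment):
    """Return the segment text with bracketed cue indications appended."""
    if not segment.cues:
        return segment.text
    return f"{segment.text} [{', '.join(segment.cues)}]"


def cue_counts(segments):
    """Count how often each cue label was linked, for a bar-graph overlay."""
    return Counter(label for s in segments for label in s.cues)


def text_bars(counts, width=20):
    """Render the counts as text bars, scaled so the largest fills `width`."""
    peak = max(counts.values(), default=0)
    lines = []
    for label, n in sorted(counts.items()):
        bar = "#" * round(width * n / peak)
        lines.append(f"{label:<12}{bar} {n}")
    return lines
```

In a real player the counts returned by `cue_counts` would feed whatever drawing layer composites the overlay onto each frame; the binary search in `link_cues` is only one reasonable way to match timestamps to transcript portions.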