US 12,086,564 B2
System and method for voice morphing in a data annotator tool
Dylan H. Ross, Los Angeles, CA (US)
Assigned to SoundHound AI IP, LLC., Santa Clara, CA (US)
Filed by SoundHound, Inc., Santa Clara, CA (US)
Filed on Nov. 30, 2021, as Appl. No. 17/539,182.
Application 17/539,182 is a division of application No. 16/578,386, filed on Sep. 22, 2019, granted, now Pat. No. 11,205,056.
Prior Publication US 2022/0092273 A1, Mar. 24, 2022
Int. Cl. G10L 15/18 (2013.01); G06F 40/56 (2020.01); G06F 40/58 (2020.01); G10L 15/06 (2013.01); G10L 19/125 (2013.01); G10L 19/26 (2013.01); G10L 21/013 (2013.01)
CPC G06F 40/56 (2020.01) [G06F 40/58 (2020.01); G10L 15/06 (2013.01); G10L 15/18 (2013.01); G10L 19/125 (2013.01); G10L 19/265 (2013.01); G10L 21/013 (2013.01); G10L 2021/0135 (2013.01)]
17 Claims
 
1. A system for transcribing natural language speech, the system comprising: a computer implementing a data annotator tool that performs:
receiving an audio clip comprising the natural language speech from a server;
morphing the audio clip to a morphed audio clip where the audio clip is pitch shifted, frequency shifted, and pitch shifted a second time;
playing the morphed audio clip for a human being;
receiving a transcription input from the human being for the morphed audio clip; and
providing the transcription input to a memory, wherein the data annotator tool further comprises:
a first UI area that allows the human being to play the morphed audio clip;
a second UI area that allows the human being to enter the transcription input of the morphed audio clip;
a third UI area that allows the human being to enter a gender input for the morphed audio clip;
a fourth UI area that allows the human being to enter an accent input for the morphed audio clip; and
a fifth UI area that allows the human being to enter a noise input of the morphed audio clip.
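
The morphing step recited in claim 1 (pitch shift, frequency shift, second pitch shift) can be illustrated with a minimal sketch. This is not the patented implementation. The function names, the parameters pitch_ratio and shift_hz, and the choice to make the second pitch shift the inverse of the first are all assumptions made for illustration. The pitch shift here is a naive resampling that also changes clip duration, and the frequency shift uses single-sideband modulation of an analytic signal computed with a slow direct DFT, so it is only suitable for short test signals.

```python
import cmath
import math


def resample_pitch_shift(samples, ratio):
    """Naive pitch shift by linear-interpolation resampling.

    ratio > 1 raises pitch and shortens the clip; ratio < 1 does the opposite.
    (Assumption: the claim does not say which pitch-shift method is used.)
    """
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(len(samples) / ratio))
    out = []
    for i in range(n_out):
        pos = i * ratio
        j = min(int(pos), last)
        frac = pos - j
        nxt = samples[min(j + 1, last)]
        out.append((1 - frac) * samples[j] + frac * nxt)
    return out


def analytic_signal(samples):
    """Return the complex analytic signal using a direct O(n^2) DFT."""
    n = len(samples)
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or 2 * k == n:
            gain = 1  # keep DC and Nyquist unchanged
        elif 2 * k < n:
            gain = 2  # double positive frequencies
        else:
            gain = 0  # discard negative frequencies
        spectrum[k] *= gain
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def frequency_shift(samples, shift_hz, sample_rate):
    """Shift every frequency component by shift_hz (single-sideband modulation)."""
    if not samples:
        return []
    analytic = analytic_signal(samples)
    return [
        (a * cmath.exp(2j * math.pi * shift_hz * t / sample_rate)).real
        for t, a in enumerate(analytic)
    ]


def morph_clip(samples, sample_rate, pitch_ratio, shift_hz):
    """Pitch shift, frequency shift, then pitch shift a second time.

    Assumption: the second pitch shift uses 1 / pitch_ratio, which roughly
    restores the original duration. The claim does not specify this.
    """
    first = resample_pitch_shift(samples, pitch_ratio)
    shifted = frequency_shift(first, shift_hz, sample_rate)
    return resample_pitch_shift(shifted, 1.0 / pitch_ratio)
```

For example, morph_clip(samples, 8000, 1.25, 250.0) would apply the three stages to a list of float samples at an assumed 8 kHz sample rate. Because the frequency shift moves all components by a fixed number of hertz rather than by a ratio, the harmonic relationships of the voice are altered between the two pitch shifts, which is one plausible way such a chain could disguise a speaker while keeping the words intelligible for transcription.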