US 12,217,738 B2
Method for generating training data and method for post-processing of speech recognition using the same
Heuiseok Lim, Seoul (KR); and Chanjun Park, Seoul (KR)
Assigned to Korea University Research and Business Foundation, Seoul (KR)
Filed by Korea University Research and Business Foundation, Seoul (KR)
Filed on May 9, 2022, as Appl. No. 17/739,383.
Claims priority of application No. 10-2021-0060914 (KR), filed on May 11, 2021.
Prior Publication US 2022/0366894 A1, Nov. 17, 2022
Int. Cl. G10L 15/06 (2013.01); G06F 40/166 (2020.01); G10L 13/00 (2006.01)
CPC G10L 15/063 (2013.01) [G06F 40/166 (2020.01); G10L 13/00 (2013.01)]
6 Claims
 
1. A training data construction method performed by a computing apparatus comprising at least one processor, the training data construction method comprising:
    converting first text data, which is an initial text of mono corpus including a plurality of sentences, to first speech data using a predetermined text-to-speech scheme;
    generating second speech data based on the first speech data, the generating comprising:
        adding noise to the first speech data;
        performing a frequency transformation on the first speech data;
        removing a portion of a predetermined frequency band of the first speech data; and
        performing a retransformation to a time domain to generate the second speech data; and
    converting the second speech data to second text data, which is a final text,
    wherein the second text data includes a plurality of sentences corresponding to each of the plurality of sentences included in the first text data, thereby constituting a parallel corpus with the first text data.
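
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the claim does not specify the noise type, the transform, or the band, so Gaussian noise, a naive discrete Fourier transform, and a caller-chosen band in hertz are all assumptions made here. The function names (`add_noise`, `dft`, `inverse_dft`, `remove_band`, `degrade_speech`, `build_parallel_corpus`) and the `tts` and `stt` callables are hypothetical placeholders for whatever text-to-speech and speech recognition systems are actually used.

```python
import cmath
import math
import random


def add_noise(samples, noise_std=0.01, seed=0):
    """Add zero-mean Gaussian noise to each sample (noise type is an assumption)."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def dft(samples):
    """Naive O(n^2) discrete Fourier transform; a real pipeline would use an FFT."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform back to the time domain, keeping the real part."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def remove_band(spectrum, sample_rate, low_hz, high_hz):
    """Zero every bin in [low_hz, high_hz], including the mirrored negative-frequency bins."""
    n = len(spectrum)
    out = list(spectrum)
    for k in range(n):
        freq = min(k, n - k) * sample_rate / n
        if low_hz <= freq <= high_hz:
            out[k] = 0j
    return out


def degrade_speech(samples, sample_rate, low_hz, high_hz, noise_std=0.01):
    """First speech data -> second speech data: noise, transform, band removal, retransform."""
    noisy = add_noise(samples, noise_std)
    spectrum = dft(noisy)
    filtered = remove_band(spectrum, sample_rate, low_hz, high_hz)
    return inverse_dft(filtered)


def build_parallel_corpus(sentences, tts, stt, sample_rate, low_hz, high_hz):
    """Pair each recognized (second) sentence with its original (first) sentence.

    `tts` and `stt` are hypothetical callables supplied by the caller.
    """
    pairs = []
    for sentence in sentences:
        first_speech = tts(sentence)
        second_speech = degrade_speech(first_speech, sample_rate, low_hz, high_hz)
        pairs.append((stt(second_speech), sentence))
    return pairs
```

Each resulting pair holds a recognized sentence and the original sentence it came from, which is the sentence-level correspondence the claim describes as a parallel corpus. Keeping one output sentence per input sentence is what preserves that alignment.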