US 11,942,074 B2
Learning data acquisition apparatus, model learning apparatus, methods and programs for the same
Takaaki Fukutomi, Tokyo (JP); Takashi Nakamura, Tokyo (JP); and Kiyoaki Matsui, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 17/429,737
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Jan. 29, 2020, PCT No. PCT/JP2020/003062
§ 371(c)(1), (2) Date Aug. 10, 2021,
PCT Pub. No. WO2020/166322, PCT Pub. Date Aug. 20, 2020.
Claims priority of application No. 2019-022516 (JP), filed on Feb. 12, 2019.
Prior Publication US 2022/0101828 A1, Mar. 31, 2022
Int. Cl. G10L 15/06 (2013.01); G06N 20/00 (2019.01); G10L 21/0208 (2013.01); G10L 25/78 (2013.01); G10L 25/81 (2013.01); G10L 25/84 (2013.01); G10L 25/87 (2013.01)
CPC G10L 15/063 (2013.01) [G06N 20/00 (2019.01); G10L 21/0208 (2013.01); G10L 25/78 (2013.01); G10L 25/81 (2013.01); G10L 25/84 (2013.01); G10L 25/87 (2013.01); G10L 2025/783 (2013.01); G10L 2025/786 (2013.01)]
16 Claims
 
1. A learning data acquisition device comprising a processor configured to execute operations comprising:
determining an influence degree on voice recognition accuracy caused by a change of a signal-to-noise ratio, based on a result of voice recognition on kth noise superimposed voice data and a result of voice recognition on k−1th noise superimposed voice data, wherein K is an integer of 2 or larger, k=2, 3, . . . , K, and a signal-to-noise ratio of the kth noise superimposed voice data is smaller than a signal-to-noise ratio of the k−1th noise superimposed voice data;
obtaining a largest signal-to-noise ratio SNRapply among signal-to-noise ratios of the k−1th noise superimposed voice data when the influence degree meets a given threshold condition; and
acquiring noise superimposed voice data having a signal-to-noise ratio that is equal to or larger than the signal-to-noise ratio SNRapply, as learning data.
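
The three operations of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation and makes several assumptions the claim leaves open: the "result of voice recognition" is reduced to one accuracy value per signal-to-noise ratio, the influence degree is taken to be the accuracy drop between the k−1th and kth data, and the threshold condition is taken to be "drop greater than a threshold". The names find_snr_apply, select_learning_data, snrs_db, accuracies and threshold are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def find_snr_apply(
    snrs_db: Sequence[float],
    accuracies: Sequence[float],
    threshold: float,
) -> Optional[float]:
    """Return SNRapply, or None if no step meets the threshold condition.

    snrs_db[i] is the SNR of the (i+1)th noise superimposed voice data and
    must be strictly decreasing; accuracies[i] is an assumed scalar summary
    of the voice recognition result at that SNR.
    """
    if len(snrs_db) != len(accuracies) or len(snrs_db) < 2:
        raise ValueError("need K >= 2 SNRs with one accuracy value each")

    candidates: List[float] = []
    for k in range(1, len(snrs_db)):
        if not snrs_db[k] < snrs_db[k - 1]:
            raise ValueError("SNRs must be strictly decreasing")
        # Assumed influence degree: accuracy lost when the SNR is lowered
        # from the (k-1)th level to the kth level.
        influence = accuracies[k - 1] - accuracies[k]
        # Assumed threshold condition: the drop exceeds the threshold.
        if influence > threshold:
            candidates.append(snrs_db[k - 1])

    # SNRapply is the largest SNR of the (k-1)th data among all steps
    # at which the condition is met.
    return max(candidates) if candidates else None


def select_learning_data(
    samples: Sequence[Tuple[str, float]],
    snr_apply: float,
) -> List[Tuple[str, float]]:
    """Keep (utterance_id, snr_db) pairs whose SNR is >= SNRapply."""
    return [s for s in samples if s[1] >= snr_apply]
```

For example, with SNRs of 20, 15, 10, 5 and 0 dB, accuracies of 0.95, 0.94, 0.92, 0.70 and 0.40, and a threshold of 0.1, the condition is first met when the SNR drops from 10 dB to 5 dB, so find_snr_apply returns 10.0 and only data at 10 dB or above is kept as learning data.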