US 11,942,074 B2
Learning data acquisition apparatus, model learning apparatus, methods and programs for the same
Takaaki Fukutomi, Tokyo (JP); Takashi Nakamura, Tokyo (JP); and Kiyoaki Matsui, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 17/429,737
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Jan. 29, 2020, PCT No. PCT/JP2020/003062 § 371(c)(1), (2) Date Aug. 10, 2021, PCT Pub. No. WO2020/166322, PCT Pub. Date Aug. 20, 2020.
Claims priority of application No. 2019-022516 (JP), filed on Feb. 12, 2019.
Prior Publication US 2022/0101828 A1, Mar. 31, 2022
Int. Cl. G10L 15/06 (2013.01); G06N 20/00 (2019.01); G10L 21/0208 (2013.01); G10L 25/78 (2013.01); G10L 25/81 (2013.01); G10L 25/84 (2013.01); G10L 25/87 (2013.01)

CPC G10L 15/063 (2013.01) [G06N 20/00 (2019.01); G10L 21/0208 (2013.01); G10L 25/78 (2013.01); G10L 25/81 (2013.01); G10L 25/84 (2013.01); G10L 25/87 (2013.01); G10L 2025/783 (2013.01); G10L 2025/786 (2013.01)]

16 Claims

1. A learning data acquisition device comprising a processor configured to execute operations comprising:

determining an influence degree on voice recognition accuracy caused by a change of a signal-to-noise ratio, based on a result of voice recognition on k-th noise superimposed voice data and a result of voice recognition on (k−1)-th noise superimposed voice data, wherein K is an integer of 2 or larger, k=2, 3, . . . , K, and a signal-to-noise ratio of the k-th noise superimposed voice data is smaller than a signal-to-noise ratio of the (k−1)-th noise superimposed voice data;

obtaining a largest signal-to-noise ratio SNR_apply among signal-to-noise ratios of the (k−1)-th noise superimposed voice data when the influence degree meets a given threshold condition; and

acquiring noise superimposed voice data having a signal-to-noise ratio that is equal to or larger than the signal-to-noise ratio SNR_apply, as learning data.
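
The three operations of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not define how the influence degree is computed or what the threshold condition is, so the sketch assumes the influence degree is the drop in recognition accuracy between level k−1 and level k, and that the condition is met when the drop is at least a threshold. The function names, the (id, SNR) sample layout, and the fallback of keeping all samples when no step meets the condition are likewise hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def find_snr_apply(snrs_db: Sequence[float],
                   accuracies: Sequence[float],
                   threshold: float) -> Optional[float]:
    """Return SNR_apply, or None if no step meets the assumed condition.

    snrs_db[i] is the SNR of the (i+1)-th noise superimposed voice data and
    must be strictly decreasing; accuracies[i] is the recognition accuracy
    measured on that data (assumed to be computed elsewhere).
    """
    if len(snrs_db) != len(accuracies) or len(snrs_db) < 2:
        raise ValueError("need K >= 2 levels with one accuracy per level")
    if any(a <= b for a, b in zip(snrs_db, snrs_db[1:])):
        raise ValueError("SNR levels must be strictly decreasing")

    candidates = []
    for k in range(1, len(snrs_db)):  # 0-based index of the k-th level
        # Assumption: influence degree = accuracy lost by lowering the SNR
        # from level k-1 to level k.
        influence = accuracies[k - 1] - accuracies[k]
        # Assumption: the threshold condition is "influence >= threshold".
        if influence >= threshold:
            candidates.append(snrs_db[k - 1])
    # Largest SNR among the (k-1)-th levels that met the condition.
    return max(candidates) if candidates else None


def select_learning_data(samples: Sequence[Tuple[str, float]],
                         snr_apply: Optional[float]) -> List[Tuple[str, float]]:
    """Keep (id, snr_db) samples whose SNR is equal to or larger than snr_apply."""
    if snr_apply is None:  # hypothetical fallback: keep everything
        return list(samples)
    return [s for s in samples if s[1] >= snr_apply]


# Illustrative numbers only, not measured results.
snr_apply = find_snr_apply([20, 15, 10, 5, 0],
                           [0.95, 0.94, 0.92, 0.70, 0.40],
                           threshold=0.1)
learning_data = select_learning_data([("a", 20), ("b", 10), ("c", 5)], snr_apply)
```

With these illustrative values, the first accuracy drop of at least 0.1 occurs between the 10 dB and 5 dB levels, so SNR_apply is 10 and only samples "a" and "b" are kept as learning data.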