US 11,990,136 B2
Speech recognition device, search device, speech recognition method, search method, and program
Tetsuo Amakasu, Tokyo (JP); Kaname Kasahara, Tokyo (JP); Takafumi Hikichi, Tokyo (JP); and Masayuki Sugizaki, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 17/428,276
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Jan. 24, 2020, PCT No. PCT/JP2020/002558
§ 371(c)(1), (2) Date Aug. 4, 2021,
PCT Pub. No. WO2020/162229, PCT Pub. Date Aug. 13, 2020.
Claims priority of application No. 2019-019476 (JP), filed on Feb. 6, 2019.
Prior Publication US 2022/0108699 A1, Apr. 7, 2022
Int. Cl. G10L 15/00 (2013.01); G06F 16/245 (2019.01); G06N 3/04 (2023.01); G10L 15/02 (2006.01); G10L 15/04 (2013.01); G10L 15/14 (2006.01); G10L 15/16 (2006.01); G10L 15/22 (2006.01); G10L 15/32 (2013.01); G10L 15/08 (2006.01)
CPC G10L 15/32 (2013.01) [G06F 16/245 (2019.01); G06N 3/04 (2013.01); G10L 15/02 (2013.01); G10L 15/04 (2013.01); G10L 15/142 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/088 (2013.01)] 9 Claims
 
1. A speech recognition device comprising a processor configured to execute operations comprising:
performing first speech recognition processing using a first method on speech data of a conversation made by a plurality of speakers and outputting a speech recognition result for each of respective uttered speech segments of the plurality of speakers;
determining, on the basis of a result of the first speech recognition processing, a subject segment of the conversation, wherein the subject segment represents a segment of the speech data including a part of the conversation with utterances about a subject; and
performing second speech recognition processing using a second method higher in accuracy than the first method on speech data in the segment determined to be the subject segment by the determiner and outputting a speech recognition result as a subject text.
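
The three operations recited in claim 1 can be read as a two-pass pipeline. The Python sketch below is an illustration only, not the patented implementation. The claim does not say how the subject segment is determined from the first-pass result, so the cue-phrase heuristic here is an assumption, as are all names (`Utterance`, `find_subject_segment`, `recognize_subject`) and the two recognizers, which are passed in as plain callables standing in for the first method and the more accurate second method.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Utterance:
    """One uttered speech segment of one speaker (hypothetical structure)."""
    speaker: str
    start: float  # seconds from the start of the conversation
    end: float    # seconds from the start of the conversation
    audio: str    # stand-in for the segment's audio samples


# A recognizer maps a segment's audio to text. Both passes share this shape.
Recognizer = Callable[[str], str]


def find_subject_segment(
    first_pass: Sequence[Tuple[Utterance, str]],
    open_cues: Sequence[str],
    close_cues: Sequence[str],
) -> Optional[Tuple[int, int]]:
    """Return inclusive (first, last) utterance indices of the subject segment.

    Assumed heuristic, not taken from the claim: the segment opens at the
    first utterance whose first-pass text contains an opening cue and closes
    at the next later utterance containing a closing cue, or at the end of
    the conversation if no closing cue follows.
    """
    start: Optional[int] = None
    for i, (_, text) in enumerate(first_pass):
        lowered = text.lower()
        if start is None:
            if any(cue in lowered for cue in open_cues):
                start = i
        elif any(cue in lowered for cue in close_cues):
            return start, i
    if start is None:
        return None
    return start, len(first_pass) - 1


def recognize_subject(
    utterances: Sequence[Utterance],
    fast: Recognizer,
    accurate: Recognizer,
    open_cues: Sequence[str],
    close_cues: Sequence[str],
) -> Tuple[List[Tuple[Utterance, str]], Optional[str]]:
    """Run the two-pass flow and return (first-pass results, subject text)."""
    # First pass: the cheaper method on every uttered speech segment.
    first_pass = [(u, fast(u.audio)) for u in utterances]

    # Determination: locate the subject segment from the first-pass text.
    span = find_subject_segment(first_pass, open_cues, close_cues)
    if span is None:
        return first_pass, None

    # Second pass: the more accurate method, only inside the subject segment.
    first, last = span
    subject_text = " ".join(
        accurate(u.audio) for u in utterances[first:last + 1]
    )
    return first_pass, subject_text
```

In this reading, the more expensive second method runs only on the utterances inside the detected segment, which is the apparent motivation for splitting recognition into two passes. A caller would supply real recognizers for `fast` and `accurate`; any pair of callables from audio to text fits the sketch.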