US 12,094,450 B2
Speech processing device, speech processing method, and non-transitory computer readable medium storing program
Ling Guo, Tokyo (JP); Hitoshi Yamamoto, Tokyo (JP); and Takafumi Koshinaka, Tokyo (JP)
Assigned to NEC CORPORATION, Tokyo (JP)
Appl. No. 17/616,224
Filed by NEC Corporation, Tokyo (JP)
PCT Filed Jun. 7, 2019, PCT No. PCT/JP2019/022805
§ 371(c)(1), (2) Date Dec. 3, 2021,
PCT Pub. No. WO2020/246041, PCT Pub. Date Dec. 10, 2020.
Prior Publication US 2022/0238097 A1, Jul. 28, 2022
Int. Cl. G10L 15/00 (2013.01); G10L 15/04 (2013.01); G10L 15/08 (2006.01); G10L 25/51 (2013.01)
CPC G10L 15/04 (2013.01) [G10L 15/08 (2013.01); G10L 25/51 (2013.01)]
10 Claims
 
1. A speech processing device comprising:
at least one memory storing instructions; and
at least one processor configured to execute the instructions stored in the memory to:
divide predetermined first speech into a plurality of first speech segments;
divide second speech in which a plurality of types of speech of multiple speakers are mixed into a plurality of second speech segments;
calculate first scores indicating similarities among the plurality of first speech segments, second scores indicating similarities among the plurality of second speech segments, and third scores indicating similarities between the plurality of first speech segments and the plurality of second speech segments;
calculate a threshold value based on the first scores indicating the similarities among the plurality of first speech segments;
classify the plurality of second speech segments into one or more clusters respectively having one or more similarities higher than a similarity indicated by the threshold value;
calculate whether speech corresponding to the first speech is contained in each of the one or more clusters; and
calculate a similarity between each of the one or more clusters and the first speech and determine based on a calculation result of whether the speech corresponding to the first speech is contained in each of the one or more clusters.
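The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and several choices in it are assumptions the claim does not specify: each speech segment is represented by a fixed-length feature vector, similarity is cosine similarity, the threshold is the minimum of the first scores, clusters are formed by linking any two second segments whose score exceeds the threshold, and a cluster is judged to contain the first speech when its mean third score exceeds the threshold. The names `cosine` and `find_target_clusters` are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_target_clusters(first_segs, second_segs):
    """Return a list of (member_indices, cluster_score, contains_first_speech)."""
    # First scores: similarities among the first speech segments.
    first_scores = [cosine(a, b) for a, b in combinations(first_segs, 2)]

    # Threshold derived from the first scores.
    # Assumption: use the lowest similarity seen among the first segments.
    threshold = min(first_scores)

    # Second scores drive the clustering: link two second segments when
    # their similarity exceeds the threshold (union-find over the links).
    n = len(second_segs)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if cosine(second_segs[i], second_segs[j]) > threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    # Third scores: similarities between first and second segments.
    # Assumption: a cluster's similarity to the first speech is the mean
    # third score over its members, compared against the same threshold.
    results = []
    for members in clusters.values():
        third = [cosine(second_segs[m], f) for m in members for f in first_segs]
        score = sum(third) / len(third)
        results.append((members, score, score > threshold))
    return results


# Toy 2-D vectors standing in for segment features (illustrative data only).
enrolled = [(1.0, 0.0), (0.8, 0.6), (0.96, 0.28)]
mixed = [(0.9, 0.3), (0.0, 1.0), (1.0, 0.2), (0.1, 1.0)]
for members, score, contains in find_target_clusters(enrolled, mixed):
    print(members, contains)
```

With the toy data, segments 0 and 2 of the mixed speech form one cluster that is judged to contain the first speech, and segments 1 and 3 form a second cluster that is not. Deriving the threshold from the first speech itself means it adapts to how variable that speaker's segments are, rather than relying on a fixed global value.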