US 11,798,563 B2
Method, apparatus and device for voiceprint recognition of original speech, and storage medium
Yuechao Guo, Shenzhen (CN); Yixuan Qiao, Shenzhen (CN); Yijun Tang, Shenzhen (CN); Jun Wang, Shenzhen (CN); Peng Gao, Shenzhen (CN); and Guotong Xie, Shenzhen (CN)
Assigned to PING AN TECHNOLOGY (SHENZHEN) CO., LTD., Shenzhen (CN)
Appl. No. 17/617,296
Filed by PING AN TECHNOLOGY (SHENZHEN) CO., LTD., Shenzhen (CN)
PCT Filed Aug. 26, 2020, PCT No. PCT/CN2020/111439
§ 371(c)(1), (2) Date Dec. 7, 2021,
PCT Pub. No. WO2021/217978, PCT Pub. Date Nov. 4, 2021.
Claims priority of application No. 202010351208.3 (CN), filed on Apr. 28, 2020.
Prior Publication US 2022/0254350 A1, Aug. 11, 2022
Int. Cl. G10L 17/06 (2013.01); G10L 17/02 (2013.01); G10L 17/18 (2013.01); G10L 25/18 (2013.01); G10L 25/21 (2013.01)
CPC G10L 17/06 (2013.01) [G10L 17/02 (2013.01); G10L 17/18 (2013.01); G10L 25/18 (2013.01); G10L 25/21 (2013.01)] 20 Claims
 
1. A method for a voiceprint recognition of an original speech, comprising:
obtaining original speech data, and segmenting the original speech data based on a preset time length to obtain segmented speech data;
performing a tail-biting convolution processing and a discrete Fourier transform on the segmented speech data through a preset convolution filter bank to obtain voiceprint feature data corresponding to the segmented speech data;
pooling the voiceprint feature data corresponding to the segmented speech data through a preset deep neural network to obtain a target voiceprint feature;
performing an embedded vector transformation on the target voiceprint feature to obtain voiceprint feature vectors corresponding to the target voiceprint feature; and
performing a calculation on the voiceprint feature vectors through a preset loss function to obtain target voiceprint data, wherein the preset loss function comprises a cosine similarity matrix loss function and a minimum mean square error matrix loss function.
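For illustration only, the first and last steps of claim 1 can be sketched as below. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not define how the two loss terms are computed or combined, so the function names (`segment`, `combined_loss`), the 0/1 target matrix built from speaker labels, the form of each term, and the `weight` parameter are all hypothetical. The convolution filter bank, the pooling network and the embedding step are not modelled here.

```python
import math


def segment(samples, sample_rate, seconds):
    """Split a sample sequence into consecutive pieces of a preset time length."""
    size = int(sample_rate * seconds)
    if size <= 0:
        raise ValueError("segment length must cover at least one sample")
    # The final piece may be shorter than `size` (assumption: no padding).
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all voiceprint feature vectors."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


def combined_loss(vectors, labels, weight=0.5):
    """Hypothetical mix of a cosine-similarity term and a mean-square-error term.

    Assumption: the target matrix is 1.0 where two vectors share a speaker
    label and 0.0 otherwise. The claim does not specify this.
    """
    sim = similarity_matrix(vectors)
    n = len(vectors)
    cos_total = 0.0
    mse_total = 0.0
    for i in range(n):
        for j in range(n):
            same = labels[i] == labels[j]
            target = 1.0 if same else 0.0
            # Cosine term: pull same-speaker pairs together and penalise
            # positive similarity between different speakers.
            cos_total += (1.0 - sim[i][j]) if same else max(0.0, sim[i][j])
            # MSE term: squared distance of the similarity from its target.
            mse_total += (sim[i][j] - target) ** 2
    count = n * n
    return weight * cos_total / count + (1.0 - weight) * mse_total / count


if __name__ == "__main__":
    pieces = segment(list(range(10)), sample_rate=4, seconds=1.0)
    print(len(pieces))
    print(combined_loss([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [0, 0, 1]))
```

In this sketch, vectors from the same speaker that point in the same direction, and vectors from different speakers that are orthogonal, give a loss of zero; any other arrangement increases both terms.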