US 11,721,324 B2
Providing high quality speech recognition
Yuan Jin, Shanghai (CN); Xi Xi Liu, Shanghai (CN); Li ping Wang, Shanghai (CN); Fan Xiao Xin, Shanghai (CN); and Zheng Ping Chu, Shanghai (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jun. 9, 2021, as Appl. No. 17/343,431.
Prior Publication US 2022/0399006 A1, Dec. 15, 2022
Int. Cl. G10L 15/30 (2013.01); G10L 15/02 (2006.01); G06N 3/08 (2023.01); G10L 25/51 (2013.01); G10L 25/30 (2013.01)
CPC G10L 15/02 (2013.01) [G06N 3/08 (2013.01); G10L 25/30 (2013.01); G10L 25/51 (2013.01)]
20 Claims
 
1. A computer-implemented method for providing high quality speech recognition, the method comprising:
selecting a first speech-to-text model to perform speech recognition of words spoken by a customer;
selecting a second speech-to-text model to perform speech recognition of words spoken by an agent;
analyzing combined results of said first and said second speech-to-text models to generate a reference speech-to-text result;
reprocessing cached customer speech data with a plurality of speech-to-text models to perform speech recognition of said customer's spoken words in response to a confidence rate of a speech-to-text result performed by said first speech-to-text model not exceeding a threshold value;
performing a similarity analysis on results of said plurality of speech-to-text models with respect to said reference speech-to-text result;
assigning similarity scores for each of said plurality of speech-to-text models based on said similarity analysis; and
selecting one of said plurality of speech-to-text models with a highest similarity score as a new speech-to-text model for speech-to-text processing of words spoken by said customer during an ongoing call.
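
The reselection steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify a similarity measure, so the word-level `difflib.SequenceMatcher` ratio used here is an assumption. The names `SttModel`, `similarity`, and `reselect_model` are hypothetical, and each model is assumed to be a callable that returns a transcript and a confidence value for cached audio. The reference speech-to-text result is taken as an input, since the claim does not say how the combined customer and agent results are analyzed to produce it.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Tuple

# Hypothetical model interface (an assumption, not from the claim):
# a callable taking cached audio bytes and returning (transcript, confidence).
SttModel = Callable[[bytes], Tuple[str, float]]


def similarity(candidate: str, reference: str) -> float:
    """Word-level similarity in [0, 1]; the metric itself is an assumption."""
    return SequenceMatcher(
        None, candidate.lower().split(), reference.lower().split()
    ).ratio()


def reselect_model(
    current_name: str,
    models: Dict[str, SttModel],
    cached_audio: bytes,
    reference_text: str,
    threshold: float,
) -> Tuple[str, Dict[str, float]]:
    """Return the model to use for the customer, plus the similarity scores.

    The current model is kept if its confidence exceeds the threshold.
    Otherwise the cached customer audio is reprocessed with every candidate
    model, each result is scored against the reference result, and the
    highest-scoring model is selected.
    """
    _, confidence = models[current_name](cached_audio)
    if confidence > threshold:
        return current_name, {}

    # Reprocess the cached customer speech with the plurality of models
    # and assign each one a similarity score against the reference.
    scores = {
        name: similarity(model(cached_audio)[0], reference_text)
        for name, model in models.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores
```

In this sketch a tie between models is resolved in favor of whichever appears first in `models`; the claim does not address ties, so that choice is arbitrary.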