US 12,462,810 B2
System and method using machine learned voiceprint sets for efficient determination of voice membership using enhanced score normalization and locality sensitive hashing
Vijay K. Gurbani, Lisle, IL (US); Yu Zhou, Naperville, IL (US); and Bopsi Chandramouli, Palatine, IL (US)
Assigned to VAIL SYSTEMS, INC., Deerfield, IL (US)
Filed by Vail Systems, Inc., Deerfield, IL (US)
Filed on May 25, 2023, as Appl. No. 18/323,576.
Claims priority of provisional application 63/365,344, filed on May 26, 2022.
Prior Publication US 2023/0386476 A1, Nov. 30, 2023
Int. Cl. G10L 17/12 (2013.01); G10L 17/02 (2013.01); G10L 25/30 (2013.01)
CPC G10L 17/12 (2013.01) [G10L 17/02 (2013.01); G10L 25/30 (2013.01)]
20 Claims
 
1. A computer-implemented method for recognizing a user of a communicating device as belonging to a list of known users from an utterance included in a voice signal received from the communicating device, the method comprising:
applying an utterance of a speaker to a machine learning voiceprint extraction model to extract a voiceprint set comprising an i-vector or a speaker embedding based on the utterance;
outputting the voiceprint set by the machine learning voiceprint extraction model;
applying the output voiceprint set to a machine learning model to compute an utterance match score based on the voiceprint set, or to a machine learning hashing model to reduce the voiceprint set to a reduced dimension voiceprint set and apply the reduced dimension voiceprint set to the machine learning model to compute the utterance match score based on the reduced dimension voiceprint set;
outputting the utterance match score by the machine learning model;
applying the output match score to a machine learning score normalization model (NL-NORM) to calibrate the match score;
comparing the calibrated match score to a match score threshold; and,
when the calibrated match score is greater than the match score threshold, identifying the user as belonging to a list of known users.