US 12,266,350 B1
Pronunciation features for language models
Siddha Ganju, Santa Clara, CA (US); Ruthie Lyle, Durham, NC (US); and Steven Dalton, Cary, NC (US)
Assigned to Nvidia Corporation, Santa Clara, CA (US)
Filed by Nvidia Corporation, Santa Clara, CA (US)
Filed on Jan. 25, 2022, as Appl. No. 17/583,812.
Claims priority of provisional application 63/272,952, filed on Oct. 28, 2021.
Claims priority of provisional application 63/181,934, filed on Apr. 29, 2021.
Int. Cl. G10L 15/16 (2006.01); G10L 13/08 (2013.01); G10L 15/02 (2006.01)
CPC G10L 15/16 (2013.01) [G10L 13/08 (2013.01); G10L 15/02 (2013.01); G10L 2015/025 (2013.01)]
17 Claims
 
1. A computer-implemented method, comprising:
receiving an auditory input including at least one phoneme forming at least a portion of a word;
determining, using a trained machine learning system, a similarity between the at least one phoneme and a target word;
obtaining one or more properties of a user providing the auditory input;
determining, based at least in part on the one or more properties, one or more tuning parameters;
determining the similarity is within a range of tolerance, the range of tolerance being tunable based at least on the one or more properties; and
providing confirmation of the at least one phoneme.
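
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the trained machine learning system is replaced here by a simple sequence-similarity ratio from the standard library, and the names `UserProperties`, `tuning_parameters`, `confirm`, the chosen properties (age, native-speaker status), and all numeric values are hypothetical assumptions made only for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class UserProperties:
    # Hypothetical properties; the claim does not enumerate them.
    age: int
    is_native_speaker: bool


def similarity(heard_phonemes, target_phonemes):
    """Stand-in for the trained machine learning system.

    Returns a ratio in [0, 1], where 1.0 means identical phoneme sequences.
    """
    return SequenceMatcher(None, heard_phonemes, target_phonemes).ratio()


def tuning_parameters(user):
    """Derive tuning parameters from user properties (illustrative values)."""
    width = 0.10  # assumed baseline width of the tolerance range
    if user.age < 8:
        width += 0.15  # assumed: be more lenient with young learners
    if not user.is_native_speaker:
        width += 0.10  # assumed: be more lenient with non-native speakers
    return {"tolerance_width": width}


def confirm(heard_phonemes, target_phonemes, user):
    """Return True when the similarity lies within the tuned tolerance range."""
    score = similarity(heard_phonemes, target_phonemes)
    params = tuning_parameters(user)
    lower_bound = 1.0 - params["tolerance_width"]
    return lower_bound <= score <= 1.0


# Usage: the same near-miss pronunciation is confirmed or rejected
# depending on who is speaking.
target = ["K", "AE", "T"]
heard = ["K", "AE", "D"]
adult = UserProperties(age=30, is_native_speaker=True)
child_learner = UserProperties(age=6, is_native_speaker=False)
print(confirm(heard, target, adult))
print(confirm(heard, target, child_learner))
```

In this sketch the range of tolerance is a single lower bound on the similarity score, widened by the user's properties. A real system could equally tune other parameters, such as per-phoneme weights, which the claim language leaves open.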