US 11,769,012 B2
Automated system and method to prioritize language model and ontology expansion and pruning
Ian Roy Beaver, Spokane, WA (US); and Christopher James Jeffs, Roswell, GA (US)
Assigned to Verint Americas Inc., Alpharetta, GA (US)
Filed by Verint Americas Inc., Alpharetta, GA (US)
Filed on Mar. 25, 2020, as Appl. No. 16/829,101.
Claims priority of provisional application 62/824,429, filed on Mar. 27, 2019.
Prior Publication US 2020/0311346 A1, Oct. 1, 2020
Int. Cl. G10L 15/22 (2006.01); G10L 15/197 (2013.01); G06F 40/295 (2020.01); G06F 16/36 (2019.01); G06F 16/33 (2019.01); G10L 15/18 (2013.01)
CPC G06F 40/295 (2020.01) [G06F 16/3344 (2019.01); G06F 16/367 (2019.01); G10L 15/18 (2013.01); G10L 15/197 (2013.01); G10L 15/22 (2013.01)]
23 Claims
1. A non-transitory computer readable medium comprising instructions that, when executed by a processor of a processing system, cause the processor to perform a method of updating a language model for a language domain for an interactive virtual assistant, the method comprising:
monitoring business related textual data from a plurality of platforms;
collecting trending n-grams from the textual data over a sliding time window;
comparing the n-grams to a vocabulary in a data model to identify terms comprising one of the trending n-grams existing in the data model;
for a term comprising the one of the trending n-grams in the data model and appearing in a new context, wherein the new context is not associated with the trending n-gram in the data model, passing the term in the new context to a human for determination if the term should be added to a training example in the new context for retraining the data model;
for a term comprising the one of the trending n-grams in the data model and not appearing in a new context, ignoring the term;
for a term comprising the one of the trending n-grams and not in the data model, checking the term for frequency of use in the textual data and adding the term in context to the training example if the frequency of use has reached a predetermined threshold;
recompiling the language model based on the term in context; and
adopting the recompiled language model in an interactive virtual assistant.
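
The routing logic recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SlidingWindowCounts`, `triage`, and `known_contexts`, the whitespace tokenizer, the use of a count-based window, and the "defer" outcome for below-threshold terms (a case the claim does not address) are all assumptions made for illustration.

```python
from collections import Counter, deque


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class SlidingWindowCounts:
    """Counts n-grams over the most recent `window` time steps.

    Hypothetical helper: the claim only says n-grams are collected
    "over a sliding time window" and does not specify a data structure.
    """

    def __init__(self, window, n=1):
        self.window = window
        self.n = n
        self.steps = deque()     # one Counter per time step
        self.totals = Counter()  # aggregate counts for the current window

    def add_step(self, texts):
        step = Counter()
        for text in texts:
            # Naive whitespace tokenization (assumption).
            step.update(ngrams(text.lower().split(), self.n))
        self.steps.append(step)
        self.totals.update(step)
        if len(self.steps) > self.window:
            # Expire the oldest step and drop entries that fell to zero.
            self.totals.subtract(self.steps.popleft())
            self.totals = +self.totals


def triage(term, context, frequency, known_contexts, threshold):
    """Route one trending term along the branches of claim 1.

    `known_contexts` maps each term in the data model to the set of
    contexts already associated with it (an assumed representation).
    """
    if term in known_contexts:
        if context not in known_contexts[term]:
            return "human_review"    # in the model, new context
        return "ignore"              # in the model, no new context
    if frequency >= threshold:
        return "add_to_training"     # not in the model, threshold reached
    return "defer"                   # below threshold: not covered by the claim


if __name__ == "__main__":
    counts = SlidingWindowCounts(window=2)
    counts.add_step(["reset my esim", "esim not working"])
    counts.add_step(["esim activation"])
    model = {"roaming": {"billing"}}
    print(triage("esim", "activation", counts.totals["esim"], model, threshold=3))
    print(triage("roaming", "travel", 5, model, threshold=3))
```

In this sketch, terms routed to "add_to_training" or approved after "human_review" would then feed the recompilation step; how the language model is recompiled and deployed is outside what the example covers.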