US 11,681,878 B2
Methods and apparatus for creating domain-specific intended-meaning natural language processing pipelines
Adam Polak, Riverside, CT (US); and Ryan David Gleeson, Los Angeles, CA (US)
Assigned to Ernst & Young U.S. LLP, New York, NY (US)
Filed by Ernst & Young U.S. LLP, New York, NY (US)
Filed on Nov. 8, 2022, as Appl. No. 17/982,760.
Claims priority of provisional application 63/281,755, filed on Nov. 22, 2021.
Prior Publication US 2023/0161965 A1, May 25, 2023
Int. Cl. G06F 17/00 (2019.01); G06F 40/30 (2020.01)
CPC G06F 40/30 (2020.01)
20 Claims
 
1. A method, comprising:
receiving, via a processor, a dataset that includes a plurality of input texts, each input text from the plurality of input texts associated with a content category from a plurality of content categories based on a comparison between that input text and an intended meaning that is common for each comparison;
for each model in a plurality of models, running, via the processor, that model on each input text from the plurality of input texts to generate an average similarity/dissimilarity score for each content category from the plurality of content categories;
selecting, via the processor and based on the average similarity/dissimilarity score for each content category from the plurality of content categories for each model in the plurality of models, at least one model from the plurality of models to determine whether an input text is similar/dissimilar to the intended meaning; and
generating, via the processor, at least one content category-specific natural language processing pipeline associated with at least one content category included in the plurality of content categories, the average similarity/dissimilarity score for the at least one content category being outside an acceptable range.
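The four steps of claim 1 can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from the patent: the two toy scoring models (`token_overlap`, `char_overlap`), the per-category acceptable ranges, the rule of selecting the model with the fewest out-of-range categories, and the placeholder pipeline step names. The claim itself does not specify how models score texts, how the selection is made, or what a pipeline contains.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical model type: (input_text, intended_meaning) -> similarity in [0, 1].
Model = Callable[[str, str], float]


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def token_overlap(text: str, meaning: str) -> float:
    """Toy model (assumption): word-level Jaccard similarity."""
    return jaccard(set(text.lower().split()), set(meaning.lower().split()))


def char_overlap(text: str, meaning: str) -> float:
    """Toy model (assumption): character-level Jaccard similarity."""
    return jaccard(set(text.lower()), set(meaning.lower()))


def average_scores(model: Model, dataset: List[Tuple[str, str]],
                   meaning: str) -> Dict[str, float]:
    """Run one model on every input text and average the scores per category."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for text, category in dataset:
        buckets[category].append(model(text, meaning))
    return {category: mean(scores) for category, scores in buckets.items()}


def out_of_range(averages: Dict[str, float],
                 ranges: Dict[str, Tuple[float, float]]) -> List[str]:
    """Categories whose average score falls outside the acceptable range."""
    return sorted(c for c, avg in averages.items()
                  if not ranges[c][0] <= avg <= ranges[c][1])


def select_model(models: Dict[str, Model], dataset: List[Tuple[str, str]],
                 meaning: str, ranges: Dict[str, Tuple[float, float]]):
    """Assumed selection rule: fewest out-of-range categories wins."""
    per_model = {name: average_scores(m, dataset, meaning)
                 for name, m in models.items()}
    best = min(per_model, key=lambda n: len(out_of_range(per_model[n], ranges)))
    return best, per_model[best]


def build_pipelines(averages: Dict[str, float],
                    ranges: Dict[str, Tuple[float, float]]) -> Dict[str, List[str]]:
    """One placeholder pipeline per out-of-range category (step names are made up)."""
    return {c: ["normalize_text", f"{c}_specific_rules", "score_with_selected_model"]
            for c in out_of_range(averages, ranges)}


# Illustrative data: every text is compared against the same intended meaning.
meaning = "the payment is late"
dataset = [
    ("the payment is late", "paraphrase"),
    ("payment is late", "paraphrase"),
    ("the cat sat outside", "unrelated"),
    ("the payment is not late", "negation"),
]
ranges = {"paraphrase": (0.7, 1.0), "unrelated": (0.0, 0.3), "negation": (0.0, 0.3)}
models = {"token_overlap": token_overlap, "char_overlap": char_overlap}

best, averages = select_model(models, dataset, meaning, ranges)
pipelines = build_pipelines(averages, ranges)
print(best, averages, pipelines)
```

With this toy data, the word-level model is selected because only the "negation" category falls outside its acceptable range, so a single category-specific pipeline is generated for "negation".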