CPC G06F 40/30 (2020.01) [G06F 16/951 (2019.01); G06N 20/00 (2019.01)] | 20 Claims |
8. A system, comprising:
a memory; and
at least one processor coupled to the memory and configured to perform instructions that cause the at least one processor to perform operations comprising:
identifying a natural language processor (NLP) trained on a first set of documents, wherein the NLP is trained to perform a set of functionality based on the first set of documents;
determining an industry in which the NLP is to be configured to perform the set of functionality;
identifying a set of words corresponding to the industry;
identifying a set of sentences including at least a subset of the set of words corresponding to the industry;
scoring the set of sentences based on a similarity to one another;
identifying a subset of the set of sentences that exceed a similarity threshold; and
training the NLP with the subset of the set of sentences that exceed the similarity threshold, wherein the trained NLP with the subset is configured to perform the set of functionality within the industry with a greater accuracy than an NLP trained on only the first set of documents.
|
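The sentence-selection steps recited in claim 8 (identify industry words, collect sentences containing them, score the sentences by similarity to one another, keep those above a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the claim does not specify a similarity measure, so bag-of-words cosine similarity averaged over the other candidate sentences is assumed here, "at least a subset of the set of words" is read as "at least one industry word", and the names `tokenize`, `cosine`, and `select_training_sentences` are hypothetical. The final training step is omitted because it depends on the particular NLP model.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_training_sentences(sentences, industry_words, threshold):
    """Return (sentence, score) pairs whose score exceeds the threshold.

    Assumption: a sentence's score is its mean cosine similarity to
    every other candidate sentence; the claim leaves this unspecified.
    """
    vocab = {w.lower() for w in industry_words}
    # Candidate sentences contain at least one industry word.
    candidates = [s for s in sentences if vocab & set(tokenize(s))]
    bags = [Counter(tokenize(s)) for s in candidates]
    selected = []
    for i, sentence in enumerate(candidates):
        sims = [cosine(bags[i], bags[j]) for j in range(len(bags)) if j != i]
        score = sum(sims) / len(sims) if sims else 0.0
        if score > threshold:
            selected.append((sentence, score))
    return selected


# Illustrative data only; the industry and sentences are made up.
industry_words = {"patient", "dose", "insulin"}
sentences = [
    "The patient received a dose of insulin",
    "The patient received a dose of antibiotics",
    "Stock prices fell sharply on Monday",
    "Dose schedules vary",
]
for sentence, score in select_training_sentences(sentences, industry_words, 0.5):
    print(f"{score:.3f}  {sentence}")
```

The selected sentences would then be passed to whatever fine-tuning routine the already-trained NLP exposes. A production system would more plausibly use embeddings or TF-IDF weighting in place of raw word counts; the plain counts here only keep the sketch dependency-free.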