| CPC G06N 20/00 (2019.01) [G06F 40/40 (2020.01); G16H 10/20 (2018.01); G16H 50/20 (2018.01)] | 11 Claims |

|
8. A system comprising:
at least one processor; and
at least one memory, the at least one memory not constituting a transitory propagating data signal, wherein the at least one memory is configured to cause the at least one processor to:
receive a plurality of training natural-language text strings;
identify one or more noun phrases in each of the plurality of training natural-language text strings;
determine one or more sentiment-qualified topics based on the one or more noun phrases;
determine an entity class for each of the one or more sentiment-qualified topics;
modify each training natural-language text string to preserve a location of each sentiment-qualified topic in each training natural-language text string while replacing all information about an identity of each sentiment-qualified topic;
use the modified plurality of training natural-language strings and the entity class for each of the one or more sentiment-qualified topics to train a machine learning model to determine a sentiment of each sentiment-qualified topic by at least performing a largest-first comparison of substrings of each of the plurality of training natural-language text strings to identify longer, multi-word topics to an exclusion of included shorter topics, and comparing the longer, multi-word topics to a list of topics or named entities specified for a domain; and
store the sentiment with each training natural-language text string by restoring the identity and using the location associated with each sentiment-qualified topic in each training natural-language text string.
|
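
As a non-authoritative illustration, the largest-first comparison, the location-preserving replacement, and the restoring step recited in claim 8 can be sketched in Python as follows. This is only one possible reading of the claim language. The function names, the `<TOPIC_n>` placeholder format, the whitespace tokenization, the `max_len` limit, and the sample domain topic list are all assumptions made for illustration and do not come from the claim. Training of the machine learning model is omitted.

```python
from dataclasses import dataclass


@dataclass
class TopicSpan:
    start: int         # token index of the first word of the topic
    end: int           # token index one past the last word of the topic
    text: str          # original topic text (its "identity")
    entity_class: str  # class looked up in the hypothetical domain list


def find_topics_largest_first(tokens, domain_topics, max_len=4):
    """At each position, try the longest candidate phrase first.

    A match on a longer, multi-word topic consumes its tokens, so shorter
    topics contained in it (e.g. "pain" inside "chest pain") are excluded.
    """
    spans = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            entity_class = domain_topics.get(phrase.lower())
            if entity_class is not None:
                spans.append(TopicSpan(i, i + n, phrase, entity_class))
                i += n
                break
        else:
            i += 1  # no topic starts at this token
    return spans


def mask_topics(tokens, spans):
    """Replace each topic with a placeholder, keeping its position."""
    masked, cursor = [], 0
    for k, span in enumerate(spans):
        masked.extend(tokens[cursor:span.start])
        masked.append(f"<TOPIC_{k}>")  # hides the identity, keeps the slot
        cursor = span.end
    masked.extend(tokens[cursor:])
    return masked


def restore_topics(masked_tokens, spans):
    """Put the original topic text back at each placeholder location."""
    originals = {f"<TOPIC_{k}>": s.text for k, s in enumerate(spans)}
    return " ".join(originals.get(t, t) for t in masked_tokens)


if __name__ == "__main__":
    # Hypothetical domain list mapping topics to entity classes.
    domain_topics = {
        "chest pain": "SYMPTOM",
        "pain": "SYMPTOM",
        "blood pressure": "VITAL_SIGN",
        "pressure": "VITAL_SIGN",
    }
    text = "The chest pain improved but blood pressure remains high"
    tokens = text.split()
    spans = find_topics_largest_first(tokens, domain_topics)
    masked = mask_topics(tokens, spans)
    print(" ".join(masked))
    print([(s.text, s.entity_class) for s in spans])
    print(restore_topics(masked, spans))
```

In this sketch the placeholder carries only an index, so the masked string reveals where a topic sits but not what it is. The entity class is kept alongside in `TopicSpan`, mirroring the claim's use of the entity class as a separate training input.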