| CPC G06N 5/04 (2013.01) [G06F 16/81 (2019.01); G06N 20/00 (2019.01)] | 19 Claims |

|
1. A system comprising:
a memory having instructions stored thereon; and
at least one processor to execute the instructions to:
generate statistical data from one or more training documents stored in random access memory (RAM) comprising at least one of blogs, messages, and emails, the one or more training documents comprising positive or negative discussions regarding one of a brand, product, and service and filter the one or more training documents to clean textual content, and create a plurality of classification rules, including creating at least one topic model-based classification rule using the statistical data, the at least one topic model-based classification rule is formatted as an XML file and stored in RAM;
evaluate the at least one topic model-based classification rule using a precision equation and a recall equation, the precision equation comprising
$\text{precision} = \frac{N(dc+)}{N(dc+) + N(dc-)}$ and the recall equation comprising
$\text{recall} = \frac{N(dc+)}{N(dc)}$ wherein N(dc+) is a number of test documents correctly classified to a category C based on the at least one topic model-based classification rule, N(dc−) is a number of test documents incorrectly classified to the category C based on the at least one topic model-based classification rule, and N(dc) denotes a number of test documents that should be classified to the category C; and
create at least one query-based classification rule using one or more user defined categories and the statistical data, the at least one query-based classification rule is formatted as an XML file stored in RAM.
|
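
The evaluation step recited in claim 1 can be illustrated with a minimal Python sketch. It assumes the standard definitions of precision and recall, which are consistent with the quantities N(dc+), N(dc−), and N(dc) defined in the claim. The function names, the use of document-ID sets, and the sample IDs are hypothetical and are not taken from the patent.

```python
from typing import Set, Tuple


def precision(n_correct: int, n_incorrect: int) -> float:
    """N(dc+) / (N(dc+) + N(dc-)); 0.0 if the rule assigned no documents to C."""
    assigned = n_correct + n_incorrect
    return n_correct / assigned if assigned else 0.0


def recall(n_correct: int, n_expected: int) -> float:
    """N(dc+) / N(dc); 0.0 if no test document belongs to C."""
    return n_correct / n_expected if n_expected else 0.0


def evaluate_rule(predicted: Set[str], expected: Set[str]) -> Tuple[float, float]:
    """Score one classification rule for a single category C.

    predicted: IDs of test documents the rule classified to C (assumed input).
    expected:  IDs of test documents that should be classified to C.
    """
    n_correct = len(predicted & expected)    # N(dc+)
    n_incorrect = len(predicted - expected)  # N(dc-)
    n_expected = len(expected)               # N(dc)
    return precision(n_correct, n_incorrect), recall(n_correct, n_expected)


# Hypothetical test set: the rule assigns four documents to C, three correctly,
# while five documents should have been assigned to C.
predicted_ids = {"d1", "d2", "d3", "d4"}
expected_ids = {"d1", "d2", "d3", "d5", "d6"}
p, r = evaluate_rule(predicted_ids, expected_ids)
print(f"precision={p:.2f} recall={r:.2f}")
```

Guarding against a zero denominator is a design choice of this sketch; the claim does not state how an empty result set is handled.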