US 12,223,441 B2
Systems and methods for classifying electronic documents
Caimei Lu, Kirkwood, MO (US); Ricky John Prosser, St. Charles, MO (US); and Michael Wayne Bryars, Chesterfield, MO (US)
Assigned to TSG Technologies, LLC, St. Louis, MO (US)
Filed by TSG Technologies, LLC, Brentwood, MO (US)
Filed on Mar. 8, 2024, as Appl. No. 18/600,109.
Application 18/600,109 is a continuation of application No. 16/780,413, filed on Feb. 3, 2020, granted, now 11,928,606.
Application 16/780,413 is a continuation of application No. 15/629,332, filed on Jun. 21, 2017, granted, now 10,579,646.
Application 15/629,332 is a continuation of application No. 15/069,661, filed on Mar. 14, 2016, granted, now 9,710,540, issued on Jul. 18, 2017.
Application 15/069,661 is a continuation of application No. 13/839,817, filed on Mar. 15, 2013, granted, now 9,298,814, issued on Mar. 29, 2016.
Prior Publication US 2024/0211781 A1, Jun. 27, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 5/04 (2023.01); G06F 16/81 (2019.01); G06N 20/00 (2019.01)
CPC G06N 5/04 (2013.01) [G06F 16/81 (2019.01); G06N 20/00 (2019.01)]
19 Claims
1. A system comprising:
a memory having instructions stored thereon; and
at least one processor to execute the instructions to:
generate statistical data from one or more training documents stored in random access memory (RAM) comprising at least one of blogs, messages, and emails, the one or more training documents comprising positive or negative discussions regarding one of a brand, product, and service and filter the one or more training documents to clean textual content, and create a plurality of classification rules, including creating at least one topic model-based classification rule using the statistical data, the at least one topic model-based classification rule is formatted as an XML file and stored in RAM;
evaluate the at least one topic model-based classification rule using a precision equation and a recall equation, the precision equation comprising

\[ \text{Precision} = \frac{N(dc+)}{N(dc+) + N(dc-)} \]

and the recall equation comprising

\[ \text{Recall} = \frac{N(dc+)}{N(dc)} \]

wherein N(dc+) is a number of test documents correctly classified to a category C based on the at least one topic model-based classification rule, N(dc−) is a number of test documents incorrectly classified to the category C based on the at least one topic model-based classification rule, and N(dc) denotes a number of test documents that should be classified to the category C; and
create at least one query-based classification rule using one or more user defined categories and the statistical data, the at least one query based classification rule is formatted as an XML file stored in RAM.
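
For illustration only, the evaluation step recited in claim 1 can be sketched in Python. The sketch assumes the conventional definitions, which are consistent with the quantities the claim defines: precision is N(dc+) divided by the sum of N(dc+) and N(dc−), and recall is N(dc+) divided by N(dc). The function name, parameter names, and the zero-denominator handling are hypothetical and are not taken from the patent.

```python
def precision_recall(n_correct, n_incorrect, n_expected):
    """Evaluate one classification rule for a category C.

    n_correct   -- N(dc+): test documents correctly classified to C
    n_incorrect -- N(dc-): test documents incorrectly classified to C
    n_expected  -- N(dc):  test documents that should be classified to C

    Returns (precision, recall). A zero denominator yields 0.0, which is
    an assumption of this sketch rather than something the claim states.
    """
    assigned = n_correct + n_incorrect  # all documents the rule placed in C
    precision = n_correct / assigned if assigned else 0.0
    recall = n_correct / n_expected if n_expected else 0.0
    return precision, recall


# Hypothetical counts: the rule assigns 50 documents to C, 40 of them
# correctly, and 50 test documents actually belong to C.
precision, recall = precision_recall(n_correct=40, n_incorrect=10, n_expected=50)
```

With these hypothetical counts both values come out to 0.8. A rule that scores poorly on either measure could then be revised or discarded, although the claim itself does not recite a threshold.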