US 12,079,572 B2
Rule-based machine learning classifier creation and tracking platform for feedback text analysis
Sathia Prabhu Thirumal, Bothell, WA (US); Christopher Lawrence Laterza, Issaquah, WA (US); Manoj Kumar Rawat, Bellevue, WA (US); Karan Singh Rekhi, Sammamish, WA (US); Natarajan Arumugam, Bothell, WA (US); and Pranav Jayant Farswani, Seattle, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on May 17, 2021, as Appl. No. 17/322,720.
Prior Publication US 2022/0366138 A1, Nov. 17, 2022
Int. Cl. G06F 40/279 (2020.01); G06F 16/33 (2019.01); G06F 16/338 (2019.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01)

CPC G06F 40/279 (2020.01) [G06F 16/3331 (2019.01); G06F 16/338 (2019.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01)]

20 Claims

1. A system for creating a machine-learning classifier for a database, comprising:

a processor; and

machine-readable media including instructions which, when executed by the processor, cause the processor to:

accept a topic and a set of keywords manually selected by one user, wherein the user is a subject matter expert regarding the topic;

create a keyword topic classifier based on the topic and the set of keywords, wherein the keyword topic classifier classifies an entry in the database as pertinent to the topic by checking whether the entry contains a word in the set of keywords;

process a subset of entries in the database with a labelling component to automatically run the keyword topic classifier on the subset of entries in the database and create a training data set, wherein the training data set is selected to include both positive examples and negative examples pertinent to the topic;

use a model-building component to create a machine-learning classifier from the training data set, the model-building component being configured to use a machine-learning algorithm to identify additional classifying elements to include in the machine-learning classifier;

determine a metric for the machine-learning classifier and for the keyword topic classifier, based on at least a recall value and a precision value for each classifier using a human-created validation test sample;

select, based on the metric or a user input in response to a display of the metric, a classifier from a group consisting of the machine-learning classifier and the keyword topic classifier; and

store a topic name and a set of keywords associated with the selected classifier for retrieval by a second user.
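
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it (tokenize, make_keyword_classifier, label_entries, learn_terms, f1_score, build_topic) is hypothetical. Two assumptions stand in for details the claim leaves open: the "machine-learning algorithm" is replaced by a toy term-frequency heuristic that adds words seen in several positive examples and in no negative example, and the "metric" is taken to be the F1 score, which is one common way to combine precision and recall. The store is a plain dictionary standing in for whatever persistence a real system would use.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case an entry and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def make_keyword_classifier(keywords):
    """Keyword topic classifier: an entry is on-topic if it contains any keyword."""
    wanted = {k.lower() for k in keywords}
    return lambda entry: bool(tokenize(entry) & wanted)


def label_entries(entries, classifier):
    """Labelling component: run the classifier to get (entry, label) pairs."""
    return [(entry, classifier(entry)) for entry in entries]


def learn_terms(training_set, keywords, min_pos=2):
    """Toy model builder (assumption, not the patent's algorithm): keep the
    original keywords and add words that occur in at least `min_pos` positive
    examples and in no negative example."""
    pos, neg = Counter(), Counter()
    for entry, label in training_set:
        (pos if label else neg).update(tokenize(entry))
    extra = {w for w, n in pos.items() if n >= min_pos and neg[w] == 0}
    return {k.lower() for k in keywords} | extra


def f1_score(classifier, validation):
    """Metric (assumed to be F1) from precision and recall on a
    human-labelled validation sample of (entry, truth) pairs."""
    tp = sum(1 for e, t in validation if classifier(e) and t)
    fp = sum(1 for e, t in validation if classifier(e) and not t)
    fn = sum(1 for e, t in validation if not classifier(e) and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def build_topic(topic, keywords, entries, validation, store):
    """Run the claimed steps end to end and store the selected keyword set."""
    keyword_clf = make_keyword_classifier(keywords)
    training_set = label_entries(entries, keyword_clf)
    learned = learn_terms(training_set, keywords)
    ml_clf = make_keyword_classifier(learned)
    scores = {
        "keyword": f1_score(keyword_clf, validation),
        "ml": f1_score(ml_clf, validation),
    }
    # On a tie, max() keeps the first key, so the simpler keyword classifier wins.
    chosen = max(scores, key=scores.get)
    selected = learned if chosen == "ml" else {k.lower() for k in keywords}
    store[topic] = sorted(selected)  # retrievable later by a second user
    return chosen, scores


# Illustrative data only; these feedback entries are invented for the sketch.
entries = [
    "app crashes and will freeze on startup",
    "crash then freeze after login",
    "love the new dark theme",
    "please add dark theme to login page",
]
validation = [
    ("screen will freeze every time", True),
    ("crash on launch", True),
    ("great theme", False),
    ("freeze frame feature is nice", False),
]
store = {}
chosen, scores = build_topic("crashes", ["crash", "crashes"], entries, validation, store)
```

In this sketch the learned classifier gains recall by picking up a co-occurring term, at the cost of some precision, which is why the claim compares both classifiers on a human-created validation sample before one is selected and stored.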