US 12,405,933 B2
Content management system for trained machine learning models
Thomas Guzik, Edina, MN (US); Muhammad Adeel, Edina, MN (US); and Ryan Kucera, Columbia Heights, MN (US)
Assigned to Getac Technology Corporation, Taipei (TW); and WHP Workflow Solutions, Inc., North Charleston, SC (US)
Filed by Getac Technology Corporation, Taipei (TW); and WHP Workflow Solutions, Inc., North Charleston, SC (US)
Filed on Nov. 30, 2020, as Appl. No. 17/107,763.
Prior Publication US 2022/0171750 A1, Jun. 2, 2022
Int. Cl. G06F 16/21 (2019.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06F 16/219 (2019.01) [G06N 5/04 (2013.01); G06N 20/00 (2019.01)]
19 Claims
 
1. One or more non-transitory computer-readable storage media storing computer-executable instructions that, if executed by one or more processors, cause the one or more processors to perform acts comprising:
creating a first dataset comprised of first data and first annotations;
storing the first dataset in one or more data records in one or more data stores accessible by a content management system (CMS);
training a first machine learning (ML) model using the first dataset, until an output of the first ML model meets a first statistical confidence greater than a first threshold, the trained first ML model configured to output a first value of a first dependent variable derived from data of the first dataset, wherein the data of the first dataset is used as a first independent variable;
storing the trained first ML model in the one or more data records in the one or more data stores accessible by the CMS, wherein the one or more data records are comprised of fields for one or more of trained ML models, datasets, data underlying the datasets, statistical confidences of outputs of the trained ML models, and statistical confidence thresholds, and the data records:
associate the trained ML models with respective datasets that are associated with respective underlying data;
associate the trained ML models with respective statistical confidences and corresponding respective thresholds; and
associate the trained ML models with at least some metadata indicating respective one or more independent variables and respective one or more dependent variables;
retrieving a second dataset comprised of second data and second annotations;
training a second ML model using the second dataset, until an output of the second ML model meets a second statistical confidence greater than a second threshold, the trained second ML model configured to output a second value of the first dependent variable derived from data of the second dataset, wherein the data of the second dataset is used as a second independent variable different from the first independent variable;
retrieving a recommendation from the CMS to combine the second dataset with the first dataset based at least on the first statistical confidence, the second statistical confidence, and an association between the first dependent variable and a second dependent variable of the trained first ML model;
creating a third dataset from a combination of the first data, the second data, the first annotations, and the second annotations;
training a third ML model using the third dataset, until an output of the third ML model meets a third statistical confidence greater than a third threshold, the trained third ML model configured to output a third value of the first dependent variable derived from the first and second independent variables;
storing the third dataset in the one or more data records in the one or more data stores; and
tracking one or more performance measures of the trained first ML model with respect to one or more criteria for selecting one or more content sources.
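
For illustration only, the data records and the dataset-combination recommendation recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names Dataset, ModelRecord, recommend_combination, and combine_datasets are hypothetical, and the recommendation rule shown (a shared dependent variable, differing independent variables, and both confidences above their thresholds) is one simplified reading of the claim language.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Dataset:
    """Hypothetical dataset: underlying data paired with annotations."""
    name: str
    data: List[str]
    annotations: List[str]


@dataclass
class ModelRecord:
    """Hypothetical CMS data record associating a trained model with its
    dataset, statistical confidence, threshold, and variable metadata."""
    model_id: str
    dataset: Dataset
    confidence: float                   # statistical confidence of the output
    threshold: float                    # threshold the confidence must exceed
    independent_vars: Tuple[str, ...]   # metadata: model inputs
    dependent_vars: Tuple[str, ...]     # metadata: model outputs

    def meets_threshold(self) -> bool:
        return self.confidence > self.threshold


def recommend_combination(a: ModelRecord, b: ModelRecord) -> bool:
    """Assumed rule: recommend combining the datasets when both models
    exceed their thresholds, predict a common dependent variable, and
    use different independent variables."""
    shares_dependent = bool(set(a.dependent_vars) & set(b.dependent_vars))
    differs_independent = set(a.independent_vars) != set(b.independent_vars)
    return (shares_dependent and differs_independent
            and a.meets_threshold() and b.meets_threshold())


def combine_datasets(a: Dataset, b: Dataset, name: str) -> Dataset:
    """Build a third dataset from the data and annotations of both inputs."""
    return Dataset(name, a.data + b.data, a.annotations + b.annotations)
```

In this sketch, a caller would check recommend_combination on two stored records and, if it returns True, pass both datasets to combine_datasets to obtain the third dataset used to train the third model. The training loop itself and the performance tracking are omitted because the claim does not specify them.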