US 12,265,594 B2
Data processing methods and systems for simulating changes to categorical features
András Németh, Budapest (HU); Dániel Darabos, Budapest (HU); Péter Erben, Budapest (HU); and Dávid Herskovics, Budapest (HU)
Assigned to Lynx Analytics Pte. Ltd., Singapore (SG)
Filed by Lynx Analytics Pte. Ltd., Singapore (SG)
Filed on May 2, 2022, as Appl. No. 17/734,194.
Claims priority of application No. 10202105059R (SG), filed on May 14, 2021.
Prior Publication US 2022/0374654 A1, Nov. 24, 2022
Int. Cl. G06F 18/214 (2023.01); G06T 7/10 (2017.01); G16H 50/20 (2018.01)
CPC G06F 18/214 (2023.01) [G06T 7/10 (2017.01); G16H 50/20 (2018.01); G06T 2207/20081 (2013.01)]
27 Claims
 
1. A data processing method of simulating changes to categorical features of a subject, the method comprising:
receiving a current categorical feature set for the subject, the current categorical feature set for the subject comprising a plurality of categorical features for the subject, each categorical feature indicating a category from a plurality of possible categories into which the subject falls;
inputting categorical features from the current categorical feature set for the subject into a set of trained machine learning models, each trained machine learning model of the set of trained machine learning models being configured to predict a respective outcome value for the subject from one or more of the categorical features for the subject, and thereby generating a set of current predicted outcome values for the subject;
generating a plurality of simulated categorical feature sets for the subject by varying respective categorical features for the subject;
inputting categorical features from the simulated categorical feature sets for the subject into the set of trained machine learning models, and thereby generating a plurality of simulated sets of predicted outcome values for the subject; and
storing a predicted outcome dataset for the subject, the predicted outcome dataset comprising the set of current predicted outcome values for the subject and the plurality of simulated sets of predicted outcome values for the subject.
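The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patentee's implementation. Every name in it is hypothetical, including `predict_all`, `simulate_feature_sets`, `build_predicted_outcome_dataset`, and the features `smoker` and `activity`. The two "trained models" are plain lookup functions standing in for real trained machine learning models. The sketch also assumes that "varying respective categorical features" means changing one feature at a time to each of its other possible categories, and that "storing" can be represented by serialising the dataset to JSON; the claim itself does not specify either choice.

```python
import json
from typing import Callable, Dict, List

FeatureSet = Dict[str, str]            # feature name -> category
Model = Callable[[FeatureSet], float]  # stand-in for a trained model


def predict_all(models: Dict[str, Model], features: FeatureSet) -> Dict[str, float]:
    """Run every model on one feature set; one outcome value per model."""
    return {name: model(features) for name, model in models.items()}


def simulate_feature_sets(
    current: FeatureSet, categories: Dict[str, List[str]]
) -> List[FeatureSet]:
    """Assumption: vary one feature at a time to each other possible category."""
    simulated = []
    for feature, options in categories.items():
        for option in options:
            if option != current[feature]:
                simulated.append({**current, feature: option})
    return simulated


def build_predicted_outcome_dataset(
    current: FeatureSet,
    categories: Dict[str, List[str]],
    models: Dict[str, Model],
) -> dict:
    """Collect current and simulated predicted outcome values for one subject."""
    return {
        "current": {
            "features": dict(current),
            "outcomes": predict_all(models, current),
        },
        "simulated": [
            {"features": fs, "outcomes": predict_all(models, fs)}
            for fs in simulate_feature_sets(current, categories)
        ],
    }


# Hypothetical example data: lookup tables in place of trained models.
categories = {"smoker": ["yes", "no"], "activity": ["low", "medium", "high"]}
models = {
    "risk": lambda f: {"yes": 0.8, "no": 0.4}[f["smoker"]],
    "cost": lambda f: {"low": 300, "medium": 200, "high": 100}[f["activity"]],
}
current = {"smoker": "yes", "activity": "low"}

dataset = build_predicted_outcome_dataset(current, categories, models)
stored = json.dumps(dataset)  # assumption: JSON text stands in for "storing"
```

Here each model reads only the features it needs, which mirrors the claim language that a model predicts its outcome "from one or more of the categorical features". Varying one feature at a time keeps the number of simulated sets linear in the number of categories; varying combinations of features would be an equally valid reading but grows multiplicatively.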