US 12,224,071 B2
Generation of simulated patient data for training predicted medical outcome analysis engine
Lukasz R. Kiljanek, Chesapeake Beach, MD (US)
Filed by Lukasz R. Kiljanek, Chesapeake Beach, MD (US)
Filed on Oct. 10, 2019, as Appl. No. 16/599,109.
Claims priority of provisional application 62/743,789, filed on Oct. 10, 2018.
Prior Publication US 2020/0118691 A1, Apr. 16, 2020
Int. Cl. G16H 50/50 (2018.01); G06N 20/20 (2019.01); G16H 50/20 (2018.01)

CPC G16H 50/50 (2018.01) [G06N 20/20 (2019.01); G16H 50/20 (2018.01)]

80 Claims

1. A method of generating and processing simulated patient information, the method comprising:

receiving feature parameters and one or more possible outcomes, wherein the feature parameters and the one or more possible outcomes both correspond to features, wherein each feature parameter of the feature parameters identifies a distribution of possible values for one feature of the features, wherein the features include one or more patient characteristics, wherein the one or more possible outcomes include one or more possible diagnoses corresponding to the features;

generating, based on the feature parameters and the one or more possible outcomes, a simulated patient population dataset that includes a plurality of simulated patient datasets, wherein each simulated patient dataset of the plurality of simulated patient datasets represents one simulated patient and includes feature values randomly selected according to the distribution of possible values that respectively correspond to each feature;

generating at least one machine learning model by training the at least one machine learning model using training data, wherein the training data includes at least the simulated patient population dataset, wherein the at least one machine learning model includes one or more decision trees having nodes, wherein a first subset of the nodes corresponds to specific features of the features, wherein a second subset of the nodes corresponds to at least one specific outcome of the one or more possible outcomes, wherein at least a first node of the nodes corresponds to a symptom, wherein at least a second node of the nodes corresponds to a diagnosis, wherein a query dataset identifies query feature values for the features associated with a query patient, wherein the at least one machine learning model is configured to generate one or more respective outcome probabilities of each of one or more predicted outcomes in response to input of the query dataset into the at least one machine learning model, wherein the one or more predicted outcomes includes at least a subset of the one or more possible outcomes, wherein the one or more respective outcome probabilities include a first outcome probability determined using the at least one machine learning model based on the query dataset and a second outcome probability determined using the at least one machine learning model based on the query dataset;

identifying a difference between the first outcome probability and the second outcome probability;

determining that the difference is below a threshold; and

adding an additional feature value to the query dataset based on the difference being below a threshold.
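
The first generating step of claim 1 can be illustrated with a minimal sketch, assuming each feature parameter is a small tuple naming a distribution. The feature names, the distribution kinds, and the helpers `sample_feature` and `generate_population` are hypothetical and do not come from the patent. The sketch covers only the random selection of feature values; how outcomes are attached to simulated patients is not specified in the claim and is left out.

```python
import random

# Hypothetical feature parameters: each entry names one feature and the
# distribution its values are drawn from. None of these come from the patent.
FEATURE_PARAMS = {
    "age": ("normal", 55.0, 15.0),   # mean, standard deviation
    "fever": ("bernoulli", 0.3),     # probability that the symptom is present
    "blood_type": ("categorical", {"A": 0.40, "B": 0.10, "AB": 0.05, "O": 0.45}),
}


def sample_feature(spec, rng):
    """Draw one value for a feature according to its distribution spec."""
    kind = spec[0]
    if kind == "normal":
        return rng.gauss(spec[1], spec[2])
    if kind == "bernoulli":
        return rng.random() < spec[1]
    if kind == "categorical":
        values = list(spec[1])
        weights = [spec[1][v] for v in values]
        return rng.choices(values, weights=weights, k=1)[0]
    raise ValueError(f"unknown distribution kind: {kind!r}")


def generate_population(feature_params, n_patients, seed=None):
    """Return a list of simulated patient datasets, one dict per patient."""
    rng = random.Random(seed)
    return [
        {name: sample_feature(spec, rng) for name, spec in feature_params.items()}
        for _ in range(n_patients)
    ]


population = generate_population(FEATURE_PARAMS, n_patients=1000, seed=0)
```

Passing a fixed `seed` makes the simulated population reproducible, which is convenient when the same dataset is reused as training data.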
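
The model-training step of claim 1 describes decision trees in which some nodes correspond to features (such as a symptom) and others to outcomes (such as a diagnosis). The following is a rough sketch of one such tree, assuming yes/no features and a Gini-impurity split rule. The split rule, the toy training rows, and the names `build_tree` and `predict_proba` are illustrative assumptions, not the patent's method. Internal nodes hold a feature, and leaves hold outcome probabilities.

```python
from collections import Counter


def make_leaf(labels):
    """Outcome node: store each outcome's relative frequency as a probability."""
    n = len(labels)
    return {"outcomes": {o: c / n for o, c in Counter(labels).items()}}


def gini(labels):
    """Gini impurity of a list of outcome labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, features, max_depth=3):
    """Recursively split on the yes/no feature with the lowest weighted Gini."""
    if max_depth == 0 or len(set(labels)) == 1:
        return make_leaf(labels)
    best_feature, best_score = None, None
    for f in features:
        yes = [l for r, l in zip(rows, labels) if r[f]]
        no = [l for r, l in zip(rows, labels) if not r[f]]
        if not yes or not no:
            continue  # this feature does not separate the patients
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best_score is None or score < best_score:
            best_feature, best_score = f, score
    if best_feature is None:
        return make_leaf(labels)
    rest = [f for f in features if f != best_feature]
    yes_pairs = [(r, l) for r, l in zip(rows, labels) if r[best_feature]]
    no_pairs = [(r, l) for r, l in zip(rows, labels) if not r[best_feature]]
    return {
        "feature": best_feature,  # feature node (e.g. a symptom)
        "yes": build_tree([r for r, _ in yes_pairs], [l for _, l in yes_pairs],
                          rest, max_depth - 1),
        "no": build_tree([r for r, _ in no_pairs], [l for _, l in no_pairs],
                         rest, max_depth - 1),
    }


def predict_proba(tree, query):
    """Walk from the root to a leaf and return its outcome probabilities."""
    while "feature" in tree:
        tree = tree["yes"] if query[tree["feature"]] else tree["no"]
    return tree["outcomes"]


# Toy simulated patients (hypothetical symptoms and diagnoses).
rows = [
    {"fever": True, "cough": True},
    {"fever": True, "cough": True},
    {"fever": True, "cough": False},
    {"fever": False, "cough": True},
    {"fever": False, "cough": True},
    {"fever": False, "cough": False},
]
labels = ["flu", "flu", "flu", "cold", "cold", "healthy"]

tree = build_tree(rows, labels, ["fever", "cough"])
probs = predict_proba(tree, {"fever": False, "cough": True})
```

Here `probs` is a mapping from each predicted outcome to its probability for the query patient, which is the kind of output the later comparison steps operate on.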
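
The final three steps of claim 1 (identifying the difference, comparing it to a threshold, and adding a feature value) can be sketched as follows. The function name `refine_query`, the use of an absolute difference, and the way the additional feature value is supplied by the caller are assumptions made for illustration. The claim does not say how the additional feature value is chosen or obtained.

```python
def refine_query(query, first_prob, second_prob, threshold,
                 extra_feature, extra_value):
    """Return the query dataset, extended by one feature value when the two
    outcome probabilities are too close to distinguish.

    Assumption: "difference" is taken as the absolute difference between
    the two probabilities.
    """
    difference = abs(first_prob - second_prob)
    if difference < threshold:
        updated = dict(query)  # copy so the caller's query is not mutated
        updated[extra_feature] = extra_value
        return updated
    return query


# Two outcomes are nearly tied, so an additional feature value is added.
close_call = refine_query({"fever": True}, 0.48, 0.45, 0.10, "cough", True)

# One outcome clearly dominates, so the query is returned unchanged.
clear_call = refine_query({"fever": True}, 0.90, 0.10, 0.10, "cough", True)
```

The extended query could then be fed back into the model to obtain updated outcome probabilities, though the text of claim 1 itself stops at adding the feature value.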