US 11,915,807 B1
Machine learning extraction of clinical variable values for subjects from clinical record data
Jeremy Canfield, Brooklyn, NY (US); Nisha Singh, Queens, NY (US); Marc Knight, Pembroke Pines, FL (US); Kimberly Wiederkehr, Brooklyn, NY (US); Sarina Dass, New York, NY (US); John Ritten, Brooklyn, NY (US); Ashley Allen, Union, MO (US); Andrea Ratzlaff, Virginia Beach, VA (US); Stacie Sienicki, Goshen, IN (US); Katherine Harrison, Brooklyn, NY (US); Will Shapiro, New York, NY (US); and Brett Wittmershaus, Brooklyn, NY (US)
Assigned to Flatiron Health, Inc., New York, NY (US)
Filed by Flatiron Health, Inc., New York, NY (US)
Filed on Oct. 11, 2022, as Appl. No. 17/964,004.
Int. Cl. G16H 10/60 (2018.01)
CPC G16H 10/60 (2018.01)
20 Claims
 
1. A method of using machine learning to configure a graphical user interface (GUI) to guide extraction of values of clinical variables from clinical record data for a plurality of subjects, the method comprising:
using at least one processor to perform:
obtaining clinical record data associated with the plurality of subjects;
generating a GUI on a computing device for assigning values to the clinical variables for the plurality of subjects in a dataset, the clinical variables comprising:
a subset of clinical variables designated as hybrid variables that can have their values assigned by machine learning model prediction or by manual extraction; and
a subset of clinical variables designated as non-hybrid variables that cannot have their values assigned by machine learning prediction;
extracting values of the clinical variables for the plurality of subjects from the clinical record data associated with the plurality of subjects, the extracting comprising configuring the GUI responsive to machine learning model predictions of hybrid variable values, the configuring comprising:
for each of the hybrid variables to be assigned a value for each of the plurality of subjects:
processing, using a machine learning model, clinical record data associated with the subject to obtain a predicted hybrid variable value and a confidence score associated with the predicted hybrid variable value, the machine learning model comprising a neural network including multiple layers, the processing comprising:
 generating a token vector comprising a plurality of words extracted from the clinical record data associated with the subject;
 generating a numeric representation of the token vector as feature values;
 processing the numeric representation of the token vector using parameters of the neural network to obtain a vector; and
 processing the vector using parameters of the neural network to obtain an output classification indicating the predicted hybrid variable value and a corresponding probability value as the confidence score associated with the predicted hybrid variable value;
determining, using the confidence score associated with the predicted hybrid variable value, whether to assign the predicted hybrid variable value to the hybrid variable for the subject;
in response to determining to assign the predicted hybrid variable value to the hybrid variable for the subject:
 generating, in the GUI, a GUI portion presenting the predicted hybrid variable value as a hybrid variable value for the subject; and
 configuring the GUI portion to restrict user modification of the hybrid variable value for the subject through the GUI; and
in response to determining to not assign the predicted hybrid variable value to the hybrid variable for the subject:
 generating a GUI portion configured to receive user input indicating a manually extracted value of the hybrid variable for the subject; and
 in response to receiving user input indicating the manually extracted value of the hybrid variable through the GUI portion, assigning the manually extracted value to the hybrid variable for the subject.
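
The model-processing steps recited in claim 1 (token vector, numeric representation, intermediate vector, output classification with a probability) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the bag-of-words encoding, the single hidden layer with ReLU, the softmax output, and the toy names `VOCAB`, `LABELS`, and `PARAMS` with hand-picked weights. The claim itself only requires a multi-layer neural network that yields a predicted value and a probability used as the confidence score.

```python
import math
import re

# Hypothetical toy vocabulary, labels, and weights (not from the patent).
VOCAB = ["metastatic", "stage", "iv", "no", "evidence"]
LABELS = ["metastatic", "non-metastatic"]
PARAMS = {
    "w1": [[2.0, 0.5, 1.0, 0.0, 0.0],   # hidden unit 0 responds to positive cues
           [0.0, 0.0, 0.0, 1.5, 1.5]],  # hidden unit 1 responds to negating cues
    "b1": [0.0, 0.0],
    "w2": [[1.0, 0.0],
           [0.0, 1.0]],
    "b2": [0.0, 0.0],
}

def tokenize(text):
    """Token vector: lowercase words extracted from the record text."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bag_of_words(tokens, vocabulary):
    """Numeric representation: per-vocabulary-word counts (an assumed encoding)."""
    index = {word: i for i, word in enumerate(vocabulary)}
    counts = [0.0] * len(vocabulary)
    for token in tokens:
        if token in index:
            counts[index[token]] += 1.0
    return counts

def dense(x, weights, bias):
    """One fully connected layer: weights is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    """Numerically stable softmax so the outputs sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def predict(text, vocabulary=VOCAB, params=PARAMS, labels=LABELS):
    """Return (predicted value, probability used as the confidence score)."""
    features = bag_of_words(tokenize(text), vocabulary)
    hidden = relu(dense(features, params["w1"], params["b1"]))   # intermediate vector
    probs = softmax(dense(hidden, params["w2"], params["b2"]))   # output classification
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]
```

Calling `predict("Stage IV metastatic disease")` on this toy model returns the label `"metastatic"` together with its softmax probability, which plays the role of the confidence score in the later determining step.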
 
10. A system that uses machine learning to configure GUIs to guide extraction of values of clinical variables from clinical record data for a plurality of subjects, the system comprising:
at least one processor; and
at least one non-transitory computer-readable storage medium storing instructions that, when executed by the at least one processor, cause the at least one processor to:
obtain clinical record data associated with a plurality of subjects;
generate a GUI on a computing device for assigning values to the clinical variables for the plurality of subjects in a dataset, the clinical variables comprising:
a subset of clinical variables designated as hybrid variables that can have their values assigned by machine learning model prediction or by manual extraction; and
a subset of clinical variables designated as non-hybrid variables that cannot have their values assigned by machine learning prediction;
extract values of the clinical variables for the plurality of subjects from the clinical record data associated with the plurality of subjects, the extracting comprising configuring the GUI responsive to machine learning model predictions of hybrid variable values, the configuring comprising:
for each of the hybrid variables to be assigned a value for each of the plurality of subjects:
process, using a machine learning model, clinical record data associated with the subject to obtain a predicted hybrid variable value and a confidence score associated with the predicted hybrid variable value, the machine learning model comprising a neural network including multiple layers, the processing comprising:
 generating a token vector comprising a plurality of words extracted from the clinical record data associated with the subject;
 generating a numeric representation of the token vector as feature values;
 processing the numeric representation of the token vector using parameters of the neural network to obtain a vector; and
 processing the vector using parameters of the neural network to obtain an output classification indicating the predicted hybrid variable value and a corresponding probability value as the confidence score associated with the predicted hybrid variable value;
determine, using the confidence score associated with the predicted hybrid variable value, whether to assign the predicted hybrid variable value to the hybrid variable for the subject;
in response to determining to assign the predicted hybrid variable value to the hybrid variable for the subject:
 generate, in the GUI, a GUI portion presenting the predicted hybrid variable value as a hybrid variable value for the subject; and
 configure the GUI portion to restrict user modification of the hybrid variable value for the subject through the GUI; and
in response to determining to not assign the predicted hybrid variable value to the hybrid variable for the subject:
 generate a GUI portion configured to receive user input indicating a manually extracted value of the hybrid variable for the subject; and
 in response to receiving user input indicating the manually extracted value of the hybrid variable through the GUI portion, assign the manually extracted value to the hybrid variable for the subject.
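
The determining and GUI-configuring steps recited above can be sketched as a small routing function. This is an illustration under stated assumptions, not the claimed implementation: the claims say only that the confidence score is used to decide, so the fixed `threshold` comparison is an assumption, as are the `FieldConfig` record, its field names, and the `record_manual_value` helper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FieldConfig:
    """Hypothetical description of one GUI portion for a (subject, variable) pair."""
    subject_id: str
    variable: str
    value: Optional[str]
    read_only: bool   # True when user modification is restricted
    source: str       # "model" or "manual"

def configure_field(subject_id, variable, predicted_value, confidence, threshold=0.9):
    """Decide whether to assign the prediction (assumed rule: confidence >= threshold)."""
    if confidence >= threshold:
        # Prediction accepted: present it and restrict user modification.
        return FieldConfig(subject_id, variable, predicted_value, True, "model")
    # Prediction rejected: leave the field empty and open for manual extraction.
    return FieldConfig(subject_id, variable, None, False, "manual")

def record_manual_value(field, value):
    """Assign a manually extracted value, refusing edits to restricted fields."""
    if field.read_only:
        raise PermissionError("value was assigned by the model and is not editable")
    field.value = value
    return field
```

With these assumptions, a prediction at confidence 0.97 produces a read-only field holding the predicted value, while one at 0.55 produces an empty, editable field that accepts a value through `record_manual_value`.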
 
19. At least one non-transitory computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a method of using machine learning to configure GUIs to guide extraction of values of clinical variables from clinical record data for a plurality of subjects, the method comprising:
obtaining clinical record data associated with a plurality of subjects;
generating a GUI on a computing device for assigning values to the clinical variables for the plurality of subjects in a dataset, the clinical variables comprising:
a subset of clinical variables designated as hybrid variables that can have their values assigned by machine learning model prediction or by manual extraction; and
a subset of clinical variables designated as non-hybrid variables that cannot have their values assigned by machine learning prediction;
extracting values of the clinical variables for the plurality of subjects from the clinical record data associated with the plurality of subjects, the extracting comprising configuring the GUI responsive to machine learning model predictions of hybrid variable values, the configuring comprising:
for each of the hybrid variables to be assigned a value for each of the plurality of subjects:
processing, using a machine learning model, clinical record data associated with the subject to obtain a predicted hybrid variable value and a confidence score associated with the predicted hybrid variable value, the machine learning model comprising a neural network including multiple layers, the processing comprising:
generating a token vector comprising a plurality of words extracted from the clinical record data associated with the subject;
generating a numeric representation of the token vector as feature values;
processing the numeric representation of the token vector using parameters of the neural network to obtain a vector; and
processing the vector using parameters of the neural network to obtain an output classification indicating the predicted hybrid variable value and a corresponding probability value as the confidence score associated with the predicted hybrid variable value;
determining, using the confidence score associated with the predicted hybrid variable value, whether to assign the predicted hybrid variable value to the hybrid variable for the subject;
in response to determining to assign the predicted hybrid variable value to the hybrid variable for the subject:
generating, in the GUI, a GUI portion presenting the predicted hybrid variable value as a hybrid variable value for the subject; and
configuring the GUI portion to restrict user modification of the hybrid variable value for the subject through the GUI; and
in response to determining to not assign the predicted hybrid variable value to the hybrid variable for the subject:
generating, in the GUI, a GUI portion configured to receive user input indicating a manually extracted value of the hybrid variable for the subject; and
in response to receiving user input indicating the manually extracted value of the hybrid variable through the GUI portion, assigning the manually extracted value to the hybrid variable for the subject.
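
The split between hybrid and non-hybrid variables, applied to every subject, can be pictured as building a work plan before any prediction runs. The sketch below is an assumption-laden illustration: the function `plan_extraction`, the mode strings, and the example variable names are hypothetical and do not appear in the patent.

```python
from itertools import product

def plan_extraction(subjects, variables, hybrid_names):
    """List (subject, variable, mode) for every pairing.

    Hybrid variables are first sent to the model and fall back to manual
    extraction if the prediction is not assigned; non-hybrid variables are
    always extracted manually.
    """
    plan = []
    for subject, variable in product(subjects, variables):
        mode = "model_or_manual" if variable in hybrid_names else "manual_only"
        plan.append((subject, variable, mode))
    return plan
```

For example, with subjects `["s1", "s2"]`, variables `["metastatic_status", "smoking_history"]`, and only `metastatic_status` designated hybrid, the plan has four entries, and both `smoking_history` entries are marked `manual_only`.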