US 12,354,720 B2
Machine learning extraction of clinical variable values for subjects from clinical record data
Brett Wittmershaus, Brooklyn, NY (US); Guy Amster, Hoboken, NJ (US); Michael Waskom, New York, NY (US); Natalie Roher, New York, NY (US); Nisha Singh, Queens, NY (US); Sharang Phadke, New York, NY (US); and Will Shapiro, New York, NY (US)
Assigned to Flatiron Health, Inc., New York, NY (US)
Filed by Flatiron Health, Inc., New York, NY (US)
Filed on Dec. 20, 2023, as Appl. No. 18/391,514.
Application 18/391,514 is a continuation of application No. 17/963,998, filed on Oct. 11, 2022, granted, now 11,854,675.
Prior Publication US 2024/0347150 A1, Oct. 17, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G16H 10/60 (2018.01); G06F 16/35 (2025.01); G06N 20/00 (2019.01)

CPC G16H 10/60 (2018.01) [G06F 16/35 (2019.01); G06N 20/00 (2019.01)]

20 Claims

1. A method of using machine learning to automatically extract values of variables for a plurality of subjects from clinical record data associated with the plurality of subjects, the method comprising:

using at least one processor to perform:

generating, using the clinical record data, a dataset storing values of the variables for the plurality of subjects, the variables comprising:

a first variable designated as a hybrid variable that is configured for assignment by performance of machine learning (ML) prediction; and

a second variable configured for assignment by manual extraction without performance of ML prediction,

wherein the generating comprises:

setting, for each of the plurality of subjects, a value of the first variable in the dataset at least in part by:

processing, using an ML model trained to predict a value of the first variable, clinical record data associated with the subject to obtain a ML predicted value of the first variable and an associated confidence score;

determining, using the confidence score associated with the ML predicted value of the first variable, whether to set the value of the first variable for the subject to the ML predicted value of the first variable;

when it is determined to set the value of the first variable for the subject to the ML predicted value of the first variable, setting, in the dataset, the value of the first variable for the subject to the ML predicted value of the first variable; and

when it is determined to not set the value of the first variable for the subject to the ML predicted value:

obtaining a manually extracted value of the first variable; and

setting, in the dataset, the value of the first variable for the subject to the manually extracted value of the first variable; and

setting, for the plurality of subjects, values of the second variable in the dataset to manually extracted values of the second variable without obtaining ML predicted values of the second variable.
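
The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function and variable names, the toy predictor, the lookup table standing in for manual extraction, and the fixed 0.9 threshold are all assumptions made for illustration. In particular, the claim only requires that the confidence score be used to decide whether to accept the ML predicted value; comparing it against a single cutoff is one possible way to do that.

```python
from typing import Callable, Dict, Tuple

# Hypothetical type aliases; the claim does not name any of these.
Predictor = Callable[[str], Tuple[str, float]]   # record text -> (value, confidence)
ManualExtractor = Callable[[str, str], str]      # (subject_id, variable) -> value


def build_dataset(
    records: Dict[str, str],
    predict: Predictor,
    extract_manually: ManualExtractor,
    hybrid_var: str,
    manual_var: str,
    threshold: float = 0.9,  # assumed cutoff, not taken from the claim
) -> Dict[str, Dict[str, str]]:
    """Fill one hybrid variable and one manual-only variable per subject."""
    dataset: Dict[str, Dict[str, str]] = {}
    for subject_id, record in records.items():
        row: Dict[str, str] = {}

        # First (hybrid) variable: run the ML model, then decide from the
        # confidence score whether to keep the prediction.
        predicted, confidence = predict(record)
        if confidence >= threshold:
            row[hybrid_var] = predicted
        else:
            # Prediction not accepted: fall back to manual extraction.
            row[hybrid_var] = extract_manually(subject_id, hybrid_var)

        # Second variable: always manual; no ML prediction is obtained.
        row[manual_var] = extract_manually(subject_id, manual_var)

        dataset[subject_id] = row
    return dataset


# Toy stand-ins for the ML model and for human abstraction (illustrative only).
def toy_predict(record: str) -> Tuple[str, float]:
    if "stage iv" in record.lower():
        return "IV", 0.97
    return "unknown", 0.40


MANUAL_VALUES = {
    ("s1", "smoking_status"): "never",
    ("s2", "stage"): "II",
    ("s2", "smoking_status"): "former",
}


def toy_manual(subject_id: str, variable: str) -> str:
    return MANUAL_VALUES[(subject_id, variable)]


records = {
    "s1": "Pathology note: Stage IV adenocarcinoma.",
    "s2": "Follow-up visit, staging unclear.",
}

dataset = build_dataset(records, toy_predict, toy_manual, "stage", "smoking_status")
print(dataset)
```

In this toy run, subject `s1` keeps the model's prediction for the hybrid variable because its confidence clears the assumed threshold, so no manual value for `stage` is ever requested for that subject. Subject `s2` falls below the threshold and receives the manually extracted value. The second variable is filled manually for both subjects.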