US 11,709,910 B1
Systems and methods for imputing missing values in data sets
Armand E. Prieditis, Arcata, CA (US)
Assigned to Cigna Intellectual Property, Inc., Wilmington, DE (US)
Filed by Cigna Intellectual Property, Inc., Wilmington, DE (US)
Filed on Mar. 18, 2019, as Appl. No. 16/356,500.
Int. Cl. G06F 17/10 (2006.01); G16H 10/60 (2018.01); G06F 40/174 (2020.01); G06F 40/18 (2020.01)
CPC G06F 17/10 (2013.01) [G06F 40/18 (2020.01); G06F 40/174 (2020.01); G16H 10/60 (2018.01)]
22 Claims
1. A system comprising:
a computer readable medium including a data set with data stored in rows and N columns,
wherein each of the rows is associated with one individual patient,
wherein each of the N columns is associated with one type of data for patients, and
wherein N is an integer greater than one; and
one or more processors configured to:
(i) initialize missing values in M ones of the N columns in the data set with M values for the M ones of the N columns, respectively,
wherein M is an integer that is greater than zero and less than or equal to N;
(ii) generate M mathematical models for the M ones of the N columns of the data set having one or more missing values, respectively, based on non-missing values of the other ones of the N columns in the data set;
(iii) for each of the rows of the data set having one or more missing values, update ones of the M values for the M ones of the N columns based on non-missing values of that row of the data set, the ones of the M mathematical models, respectively, and ones of the M values for other ones of the M ones of the N columns with missing values; and
(iv) fill missing values in the M ones of the N columns in the data set with the M values, respectively.
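
Steps (i) through (iv) of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim leaves the form of the "mathematical models" open, and everything specific below is an assumption made for illustration. In particular, the sketch assumes column means for initialization, ordinary least-squares linear models fitted only on rows with no missing values, a fixed number of update sweeps, and `None` as the marker for a missing value. The names `solve`, `fit_linear`, `impute`, and `sweeps` are hypothetical.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_linear(rows, target, others):
    """Least-squares fit of column `target` on columns `others` plus an intercept."""
    feats = [[1.0] + [r[k] for k in others] for r in rows]
    y = [r[target] for r in rows]
    p = len(others) + 1
    # Normal equations: (A^T A) w = A^T y
    AtA = [[sum(f[a] * f[c] for f in feats) for c in range(p)] for a in range(p)]
    Aty = [sum(f[a] * t for f, t in zip(feats, y)) for a in range(p)]
    return solve(AtA, Aty)


def impute(data, sweeps=5):
    """Return a filled copy of `data` (list of rows; None marks a missing value)."""
    n_cols = len(data[0])
    is_missing = [[v is None for v in row] for row in data]
    # The M columns that contain at least one missing value.
    target_cols = [j for j in range(n_cols) if any(m[j] for m in is_missing)]

    # (i) Initialize each missing value with one value per column (assumed: the mean).
    filled = [list(row) for row in data]
    for j in target_cols:
        seen = [row[j] for row in data if row[j] is not None]
        mean = sum(seen) / len(seen)
        for row in filled:
            if row[j] is None:
                row[j] = mean

    # (ii) One model per target column, fitted on non-missing values only
    # (assumed: rows with no missing entries, enough of them for a unique fit).
    complete = [row for row, m in zip(data, is_missing) if not any(m)]
    models = {}
    for j in target_cols:
        others = [k for k in range(n_cols) if k != j]
        models[j] = (others, fit_linear(complete, j, others))

    # (iii) For each row with gaps, update each gap from that row's observed
    # values, the column's model, and the current values of the row's other gaps.
    for _ in range(sweeps):
        for row, m in zip(filled, is_missing):
            for j in (c for c in target_cols if m[c]):
                others, coef = models[j]
                row[j] = coef[0] + sum(w * row[k] for w, k in zip(coef[1:], others))

    # (iv) The updated values now occupy the formerly missing cells.
    return filled
```

Calling `impute(table)` on a list of equal-length numeric rows returns a new table and leaves the input unchanged. Repeating the sweep matters only for rows with more than one gap, where each prediction depends on the current estimate of the others.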