US 11,709,910 B1
Systems and methods for imputing missing values in data sets
Armand E. Prieditis, Arcata, CA (US)
Assigned to Cigna Intellectual Property, Inc., Wilmington, DE (US)
Filed by Cigna Intellectual Property, Inc., Wilmington, DE (US)
Filed on Mar. 18, 2019, as Appl. No. 16/356,500.
Int. Cl. G06F 17/10 (2006.01); G16H 10/60 (2018.01); G06F 40/174 (2020.01); G06F 40/18 (2020.01)
CPC G06F 17/10 (2013.01) [G06F 40/18 (2020.01); G06F 40/174 (2020.01); G16H 10/60 (2018.01)]
22 Claims
 
1. A system comprising:
a computer readable medium including a data set with data stored in rows and N columns,
wherein each of the rows is associated with one individual patient,
wherein each of the N columns is associated with one type of data for patients, and
wherein N is an integer greater than one; and
one or more processors configured to:
(i) initialize missing values in M ones of the N columns in the data set with M values for the M ones of the N columns, respectively,
wherein M is an integer that is greater than zero and less than or equal to N;
(ii) generate M mathematical models for the M ones of the N columns of the data set having one or more missing values, respectively, based on non-missing values of the other ones of the N columns in the data set;
(iii) for each of the rows of the data set having one or more missing values, update ones of the M values for the M ones of the N columns based on non-missing values of that row of the data set, the ones of the M mathematical models, respectively, and ones of the M values for other ones of the M ones of the N columns with missing values; and
(iv) fill missing values in the M ones of the N columns in the data set with the M values, respectively.
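
Steps (i) through (iv) above can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading of the claim under stated assumptions: missing values are encoded as NaN, the initial values are column means, each "mathematical model" is an ordinary least-squares linear fit made on the rows that have no missing values, and the update in step (iii) is repeated for a fixed number of sweeps. The function name impute_missing and the parameter n_sweeps are hypothetical and do not appear in the claim.

```python
import numpy as np


def impute_missing(data, n_sweeps=10):
    """Illustrative sketch of steps (i)-(iv); NaN marks a missing value."""
    X = np.array(data, dtype=float)      # rows = patients, N columns = data types
    missing = np.isnan(X)
    n_rows, n_cols = X.shape
    # The M columns that contain at least one missing value.
    m_cols = [j for j in range(n_cols) if missing[:, j].any()]
    complete = ~missing.any(axis=1)      # rows with no missing values
    if m_cols and not complete.any():
        raise ValueError("this sketch needs at least one fully observed row")

    # (i) Initialize missing entries (assumption: mean of the observed values).
    for j in m_cols:
        X[missing[:, j], j] = X[~missing[:, j], j].mean()

    # (ii) One model per column with missing values (assumption: linear least
    # squares with an intercept, fitted on fully observed rows only).
    models = {}
    for j in m_cols:
        others = [k for k in range(n_cols) if k != j]
        A = np.column_stack([np.ones(int(complete.sum())), X[complete][:, others]])
        coef, *_ = np.linalg.lstsq(A, X[complete, j], rcond=None)
        models[j] = (others, coef)

    # (iii) Update each estimate from the row's observed values, the model for
    # that column, and the current estimates for the row's other missing columns.
    for _ in range(n_sweeps):
        for j in m_cols:
            others, coef = models[j]
            rows = missing[:, j]
            A = np.column_stack([np.ones(int(rows.sum())), X[rows][:, others]])
            X[rows, j] = A @ coef

    # (iv) X now holds the original data with the missing entries filled.
    return X
```

For example, if the third column equals the sum of the first two in every fully observed row, a row [2, 2, NaN] would be filled with a value close to 4. A different model family or a convergence test in place of the fixed sweep count would fit the same outline.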