US 12,229,188 B2
Machine learning techniques for generating disease prediction utilizing cross-temporal semi-structured input data
Michael J. McCarthy, Dublin (IE); Kieran O'Donoghue, Dublin (IE); Mostafa Bayomi, Dublin (IE); Neill Michael Byrne, Dublin (IE); and Vijay S. Nori, Roswell, GA (US)
Assigned to Optum Services (Ireland) Limited, Dublin (IE)
Filed by Optum Services (Ireland) Limited, Dublin (IE)
Filed on May 17, 2022, as Appl. No. 17/663,771.
Prior Publication US 2023/0376532 A1, Nov. 23, 2023
Int. Cl. G06F 16/84 (2019.01); G06N 3/045 (2023.01); G06N 5/04 (2023.01)
CPC G06F 16/84 (2019.01) [G06N 3/045 (2023.01); G06N 5/04 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
identifying, by one or more processors, a non-codified input data object that is generated by a respective data ingestion source of a plurality of data ingestion sources and is associated with a respective temporal marker of a plurality of temporal markers;
generating, by the one or more processors, an inferred record set comprising one or more inferred records based at least in part on the non-codified input data object, wherein: (i) an inferred record of the inferred record set is associated with a record field set comprising a plurality of record fields, and (ii) the plurality of record fields for the inferred record comprises an ingestion source identifier field, a source-specific data type identifier field, and a data value field;
generating, by the one or more processors, a discretized data value code for the inferred record based at least in part on (i) a subrange mapping scheme for a source-specific data type identifier code associated with the source-specific data type identifier field and (ii) the data value field for the inferred record;
generating, by the one or more processors, an inferred codified field set comprising a plurality of inferred codified fields based at least in part on the one or more inferred records, wherein: (i) an inferred codified field of the plurality of inferred codified fields is associated with the inferred record and is generated based at least in part on the plurality of record fields for the inferred record, and (ii) the inferred codified field comprises an ingestion source identifier code that is generated based at least in part on the ingestion source identifier field for the inferred record, a source-specific data type identifier code associated with the source-specific data type identifier field for the inferred record, and the discretized data value code;
generating, by the one or more processors, a temporally-arranged codified field set comprising a temporal arrangement of a group of input codified fields, wherein: (i) the group of input codified fields comprises the inferred codified field set and one or more input codified field sets, (ii) an input codified field of the one or more input codified field sets is associated with a corresponding temporal marker, and (iii) the corresponding temporal marker for the inferred codified field is generated based at least in part on the respective temporal marker for the non-codified input data object that is used to generate the inferred codified field;
generating, by the one or more processors and using a temporally encoded prediction machine learning model, a predictive output based at least in part on the temporally-arranged codified field set; and
performing, by the one or more processors, one or more prediction-based actions based at least in part on the temporally-arranged codified field set.
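
The data-preparation steps recited above (discretizing a data value through a subrange mapping scheme, composing an inferred codified field, and arranging codified fields by temporal marker) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name here (InferredRecord, SUBRANGE_CUTS, discretize, codify, arrange), the cut points, the "source|type|R<n>" code format, and the sample data are assumptions made only for illustration. The temporally encoded prediction machine learning model and the prediction-based actions are not shown.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InferredRecord:
    """Hypothetical inferred record with the three claimed record fields."""
    source_id: str   # ingestion source identifier field
    data_type: str   # source-specific data type identifier field
    value: float     # data value field
    marker: date     # temporal marker taken from the non-codified input object


# Assumed subrange mapping scheme: for each data type, ascending cut points
# split the value range into subranges numbered 0..len(cuts).
# The cut points below are illustrative only.
SUBRANGE_CUTS = {
    "HBA1C": [5.7, 6.5],
    "SYS_BP": [120.0, 140.0],
}


def discretize(record: InferredRecord) -> int:
    """Return the index of the subrange that contains the record's value."""
    cuts = SUBRANGE_CUTS[record.data_type]
    return bisect_right(cuts, record.value)


def codify(record: InferredRecord) -> str:
    """Compose one codified field: source code, data type code, value code."""
    return f"{record.source_id}|{record.data_type}|R{discretize(record)}"


def arrange(inferred, codified_inputs):
    """Merge inferred and already-codified fields, ordered by temporal marker.

    `codified_inputs` is an iterable of (marker, code) pairs.
    """
    merged = [(r.marker, codify(r)) for r in inferred] + list(codified_inputs)
    return [code for _, code in sorted(merged, key=lambda pair: pair[0])]


# Example data (entirely made up).
records = [
    InferredRecord("LAB", "HBA1C", 6.1, date(2021, 3, 1)),
    InferredRecord("VITALS", "SYS_BP", 145.0, date(2021, 1, 15)),
]
codified_inputs = [(date(2021, 2, 1), "CLAIMS|ICD10|E11.9")]

sequence = arrange(records, codified_inputs)
print(sequence)
```

In this sketch, the resulting ordered list of codes stands in for the temporally-arranged codified field set that a downstream sequence model would consume. Sorting on the marker alone keeps the original relative order of fields that share a marker.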