US 11,954,602 B1
Hybrid-input predictive data analysis
Daniel J. Mulcahy, Cambridge, MA (US); Subhash Seelam, Exton, PA (US); Damian Kelly, Kildare (IE); Vijay S. Nori, Roswell, GA (US); and Adam Russell, Arlington, MA (US)
Assigned to Optum, Inc., Minnetonka, MN (US)
Filed by Optum, Inc., Minnetonka, MN (US)
Filed on Feb. 17, 2020, as Appl. No. 16/792,635.
Claims priority of provisional application 62/872,455, filed on Jul. 10, 2019.
Int. Cl. G06N 5/02 (2023.01); G06F 40/284 (2020.01); G06N 20/00 (2019.01)
CPC G06N 5/02 (2013.01) [G06F 40/284 (2020.01); G06N 20/00 (2019.01)]
19 Claims
 
1. A computer-implemented method comprising:
receiving, by one or more processors, a vocabulary data object associated with an input text data object for an entity, wherein:
(i) the vocabulary data object identifies (a) one or more tokenized terms and (b) a per-term numeric representation for each of the one or more tokenized terms,
(ii) the one or more tokenized terms are determined based at least in part on a cross-object frequency measure for a selected training term from one or more training text data objects, and
(iii) the input text data object comprises a medical note for the entity;
determining, by the one or more processors and based at least in part on the vocabulary data object, a per-input-entity tokenized representation for the input text data object, wherein:
(i) the input text data object comprises a plurality of input terms,
(ii) the plurality of input terms comprises one or more mapped input terms associated with one or more predetermined per-term numeric representations in the vocabulary data object, and
(iii) the per-input-entity tokenized representation comprises an ordered sequence of the one or more predetermined per-term numeric representations from the vocabulary data object that are associated with the one or more mapped input terms;
generating, by the one or more processors and using a hybrid-input predictive model, a prediction score for the entity based at least in part on (a) the per-input-entity tokenized representation and (b) an input structured data object associated with the entity;
generating, by the one or more processors and based at least in part on the prediction score satisfying a predictive threshold, a predictive output associated with the entity, wherein the predictive output comprises a medical prediction for the entity; and
initiating, by the one or more processors, the performance of one or more prediction-based actions based at least in part on the predictive output.
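
The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it (build_vocabulary, tokenize, prediction_score, predictive_output, min_doc_freq) is hypothetical. It assumes the cross-object frequency measure is a simple document frequency (the number of training texts containing a term), that terms are whitespace-separated lowercase words, and that the hybrid-input predictive model can be stood in for by a toy logistic scorer over token weights and structured features. The claim itself fixes none of these choices.

```python
import math
from collections import Counter


def build_vocabulary(training_texts, min_doc_freq=2):
    """Map each retained term to an integer ID.

    Assumption: a term is kept when it appears in at least `min_doc_freq`
    distinct training texts (one possible cross-object frequency measure).
    """
    doc_freq = Counter()
    for text in training_texts:
        # set() so a term counts once per text, however often it repeats
        doc_freq.update(set(text.lower().split()))
    kept = sorted(term for term, df in doc_freq.items() if df >= min_doc_freq)
    # IDs start at 1; 0 is left unused (for example, for padding)
    return {term: idx for idx, term in enumerate(kept, start=1)}


def tokenize(text, vocab):
    """Ordered sequence of IDs for the input terms found in the vocabulary.

    Terms absent from the vocabulary (unmapped terms) are skipped.
    """
    return [vocab[term] for term in text.lower().split() if term in vocab]


def prediction_score(token_ids, structured, term_weights, feature_weights, bias=0.0):
    """Toy stand-in for a hybrid-input model.

    Sums hypothetical per-token weights and weighted structured features,
    then squashes the total with a logistic function.
    """
    z = bias
    z += sum(term_weights.get(tid, 0.0) for tid in token_ids)
    z += sum(feature_weights.get(name, 0.0) * value
             for name, value in structured.items())
    return 1.0 / (1.0 + math.exp(-z))


def predictive_output(score, threshold=0.5):
    """Flag the entity when the score satisfies the predictive threshold."""
    return {"flagged": score >= threshold, "score": score}


if __name__ == "__main__":
    notes = [
        "patient reports chest pain",
        "chest pain on exertion",
        "no pain reported",
    ]
    vocab = build_vocabulary(notes)
    ids = tokenize("Severe chest pain and chest tightness", vocab)
    score = prediction_score(
        ids,
        structured={"age": 60},           # hypothetical structured input
        term_weights={1: 0.5, 2: 0.25},   # hypothetical learned weights
        feature_weights={"age": 0.01},
    )
    print(vocab, ids, predictive_output(score))
```

In this sketch the text branch and the structured branch are combined by simple addition before the logistic function. A production model would more plausibly learn embeddings for the token sequence and merge them with the structured features in later layers, and the prediction-based action of the final step would be triggered from the flagged output.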