CPC G06F 16/24568 (2019.01) [G06F 16/2255 (2019.01); G06F 16/2272 (2019.01); G06F 16/254 (2019.01); G06F 21/6245 (2013.01)] | 18 Claims |
1. A computer-implemented method for data analysis, comprising:
receiving, for each of a plurality of patients, unstructured medical information for the patient from a plurality of different sources of medical information and generating a data object for the patient using a plurality of different models that provide structure for processing the unstructured medical information for the different sources of medical information;
selecting a data type for at least one data object in a plurality of data objects that is optimal for encoding the unstructured information into the at least one data object based on properties of the at least one data object, wherein the at least one data object comprises at least one header and a plurality of data components, wherein the at least one header comprises information regarding the selected data type and memory mappings of the plurality of data components within a body of the at least one data object;
encoding the unstructured information in the at least one data object of the selected data type, wherein the unstructured information is encoded within the plurality of data components in a serialized in-memory byte-stream format;
receiving a search query to analyze patient medical data with a plurality of parameters related to a plurality of data components for a plurality of patient demographics;
determining a particular data component relevant to the search query;
retrieving a data value directly from the particular data component of the at least one data object using the header of the at least one data object to identify a memory location of the particular data component and without deserialization of the at least one data object, wherein the data value is retrieved in a serialized in-memory byte-stream format; and
generating an identification of a cohort of patients for the search query that includes the data value.
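The header-guided retrieval recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the byte layout, the field identifiers (AGE, DIAGNOSIS), the function names (encode, read_component, find_cohort), and the patient keys are all hypothetical choices made for illustration. The sketch assumes a header that stores a type tag and an offset table, so a single component can be sliced out of the byte stream without decoding the rest of the object.

```python
import struct

# Hypothetical layout (an assumption, not specified by the claim):
#   header: type tag (1 byte), component count (1 byte), then one entry
#           per component: field id (1 byte), offset (uint32), length (uint32)
#   body:   the component payloads, concatenated
HEADER_FMT = "<BB"
ENTRY_FMT = "<BII"
HEADER_SIZE = struct.calcsize(HEADER_FMT)
ENTRY_SIZE = struct.calcsize(ENTRY_FMT)

# Illustrative field identifiers.
AGE, DIAGNOSIS = 1, 2


def encode(type_tag, components):
    """Serialize {field_id: payload_bytes} into one header-plus-body byte string."""
    body_start = HEADER_SIZE + len(components) * ENTRY_SIZE
    entries = []
    body = bytearray()
    for field_id, payload in components.items():
        # Record where this payload will sit, measured from the start of the object.
        entries.append(
            struct.pack(ENTRY_FMT, field_id, body_start + len(body), len(payload))
        )
        body += payload
    header = struct.pack(HEADER_FMT, type_tag, len(components))
    return header + b"".join(entries) + bytes(body)


def read_component(buf, field_id):
    """Return a zero-copy view of one component, located through the header only."""
    _type_tag, count = struct.unpack_from(HEADER_FMT, buf, 0)
    pos = HEADER_SIZE
    for _ in range(count):
        fid, offset, length = struct.unpack_from(ENTRY_FMT, buf, pos)
        if fid == field_id:
            # The value stays in its serialized byte form; nothing else is decoded.
            return memoryview(buf)[offset:offset + length]
        pos += ENTRY_SIZE
    return None


def find_cohort(objects, field_id, wanted):
    """Return the keys of all objects whose component bytes equal `wanted`."""
    return [
        key for key, buf in objects.items()
        if read_component(buf, field_id) == wanted
    ]
```

For example, given objects = {"p1": encode(1, {AGE: b"\x2a", DIAGNOSIS: b"E11"}), "p2": encode(1, {AGE: b"\x30", DIAGNOSIS: b"I10"})}, the call find_cohort(objects, DIAGNOSIS, b"E11") returns ["p1"]. The comparison is made on the serialized bytes themselves, which mirrors the claim's requirement that the value be retrieved without deserializing the data object.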