CPC G06F 16/258 (2019.01) [G06F 16/125 (2019.01); G06F 16/1748 (2019.01); G06F 16/906 (2019.01); G16H 15/00 (2018.01)] | 19 Claims |
1. One or more non-transitory media having instructions which, when executed by one or more hardware processors, cause the one or more hardware processors to facilitate a plurality of operations, the operations comprising:
receiving a plurality of records from one or more sources disparate from a first source of records;
transforming the plurality of records to a Fast Healthcare Interoperability Resources (FHIR) format, wherein:
the transforming includes:
identifying, by a pre-processor, one or more parameters in the plurality of records; and
performing a grouping operation at the pre-processor based on the identified one or more parameters, and
the one or more parameters comprise at least one item selected from a group consisting of:
a coding system parameter selected from a group consisting of:
RxNorm; and
CVX; and
a type parameter selected from a group consisting of:
a codable concept; and
free text;
calculating a probability of duplication for each particular record of the plurality of records at least with respect to one other record of the plurality of records, wherein a first probability of duplication for a first record with respect to another record indicates a likelihood that the first record is a copy of the other record,
wherein:
the calculating comprises utilizing one or more rules,
the one or more rules evaluate a variable of a field within the plurality of records to determine an outcome from a set of possible outcomes,
the set of possible outcomes is selected from a group consisting of:
a match;
a mismatch; and
a determination that the variable is null, and
the one or more rules assign a numerical value to the outcome for computing the probability of duplication;
based at least on the probability of duplication for each particular record of the plurality of records, classifying the particular record into one of a plurality of collections;
assigning weights to respective records within the plurality of collections, wherein within a collection of the plurality of collections, a first weight is assigned to a first record and a second weight is assigned to a second record, the second weight differing from the first weight;
selecting a record from each collection, of the plurality of collections, to include in a reduced set of records, wherein selecting the record from each collection comprises selecting the first record based on the first weight being greater than the second weight;
generating the reduced set of records comprising the selected record from each collection of the plurality of collections;
generating an updated set of electronic records based at least in part on the reduced set of records; and
writing the updated set of electronic records to the first source.
|
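The deduplication steps recited in claim 1 (pre-processor grouping, rule-based scoring with match / mismatch / null outcomes, classification into collections, and weight-based selection) can be illustrated with a minimal Python sketch. This is not the patented implementation. The `Record` fields, the `SCORES` values, the `FIELDS` list, the `THRESHOLD`, the normalization of the summed score into a value between 0 and 1, and the greedy assignment of records to collections are all hypothetical choices made for illustration. The FHIR transformation and the write-back to the first source are omitted.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical numerical values assigned to the three rule outcomes.
SCORES = {"match": 1.0, "mismatch": -1.0, "null": 0.0}
# Hypothetical fields that the rules compare between two records.
FIELDS = ("code", "date", "dose")
# Hypothetical cutoff above which two records share a collection.
THRESHOLD = 0.5


@dataclass
class Record:
    id: str
    system: str            # coding system parameter, e.g. "RxNorm" or "CVX"
    kind: str              # type parameter: "codable" or "free_text"
    code: Optional[str]
    date: Optional[str]
    dose: Optional[str]
    weight: float          # hypothetical pre-assigned record weight


def outcome(a: Optional[str], b: Optional[str]) -> str:
    """Evaluate one field: null if either value is missing, else match/mismatch."""
    if a is None or b is None:
        return "null"
    return "match" if a == b else "mismatch"


def duplicate_probability(r1: Record, r2: Record) -> float:
    """Sum the per-field scores and rescale from [-n, n] to [0, 1]."""
    total = sum(SCORES[outcome(getattr(r1, f), getattr(r2, f))] for f in FIELDS)
    return (total + len(FIELDS)) / (2 * len(FIELDS))


def reduce_records(records: list[Record]) -> list[Record]:
    # Pre-processor grouping on the identified parameters.
    groups: dict[tuple[str, str], list[Record]] = {}
    for r in records:
        groups.setdefault((r.system, r.kind), []).append(r)

    reduced = []
    for group in groups.values():
        # Greedy classification: join the first collection whose
        # representative (its first member) looks like a duplicate.
        collections: list[list[Record]] = []
        for r in group:
            for c in collections:
                if duplicate_probability(r, c[0]) >= THRESHOLD:
                    c.append(r)
                    break
            else:
                collections.append([r])
        # Keep the highest-weight record from each collection.
        reduced.extend(max(c, key=lambda x: x.weight) for c in collections)
    return reduced


# Made-up sample data, for illustration only.
a = Record("a", "RxNorm", "codable", "197361", "2020-01-01", "5 mg", 0.9)
b = Record("b", "RxNorm", "codable", "197361", "2020-01-01", None, 0.4)
c = Record("c", "RxNorm", "codable", "310965", "2021-06-01", "200 mg", 0.7)
d = Record("d", "CVX", "codable", "140", "2020-10-01", None, 0.5)

print([r.id for r in reduce_records([a, b, c, d])])  # ['a', 'c', 'd']
```

In this sample, records `a` and `b` agree on two fields and have one null comparison, so they fall into the same collection and only the higher-weight `a` survives. Record `c` mismatches on every field and forms its own collection, and `d` is kept apart by the pre-processor grouping because it uses a different coding system.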