CPC G06F 16/248 (2019.01) | 20 Claims
1. A system for merging data in a plurality of database records based on data in a search query related to determining a subject consumer's credit risk, the system comprising:
a processor; and
a memory device having a plurality of instructions stored thereon that, when executed by the processor, cause the processor to:
arrange a search query search engine and a search query matching engine to be in communication with the processor and an Internet accessible database, the database comprising a plurality of unstructured, incomplete, or inconsistently formatted data about a plurality of consumers from a free form data source, the data being stored in respective database fields in the plurality of database records;
in response to receiving the search query comprising a search field and communicated over the Internet to the search query search engine by a remote application to search for and retrieve credit-related data corresponding to the subject consumer, determine a subset of a plurality of normalized database records from an initial set of search results, the step of determining the subset being accomplished by:
converting and standardizing the search query and the plurality of database records via exact and pattern substitutions using regular expressions into a normalized search query and the plurality of normalized database records, based on a normalization rule, wherein the normalized search query comprises a normalized search field and each of the plurality of normalized database records comprises a normalized database field; and
refining the initial set of search results to determine the subset of the plurality of normalized database records corresponding to the subject consumer, wherein the subset of the plurality of normalized database records meets qualifying criteria that are based on a matching strength metric, and wherein the matching strength metric is associated with each of the plurality of normalized database records and is assigned based on a comparison, by the search query matching engine via the processor, of the normalized search field and the normalized database field of each of the plurality of normalized database records;
determine, by the search query matching engine, a degree of similarity between the normalized search field of the normalized search query and the normalized database field of each of the plurality of normalized database records;
assign, by the search query matching engine, a similarity score associated with each of the plurality of normalized database records, based on the degree of similarity;
order, by the search query matching engine, the plurality of normalized database records to produce an ordered set of the plurality of normalized database records, based on the similarity score associated with each of the plurality of normalized database records;
compare, by the search query matching engine, a base record of the ordered set with remaining records of the ordered set, the base record having the similarity score that is highest;
merge, by the search query matching engine, the base record and one of the remaining records of the ordered set to produce a merged record, based on comparing the base record of the ordered set with the remaining records of the ordered set, using the processor; and
transmit, by the search query matching engine to the remote application over the Internet, an ordered subset of the ordered set from the processor, the ordered subset comprising one or more of the base record, the merged record, or the remaining records.
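For illustration only, the normalize, score, order, and merge steps recited in claim 1 might be sketched as follows. This is not the claimed implementation. The rule table `NORMALIZATION_RULES`, the function names, the `address` field, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example. The sketch also merges every remaining record whose normalized field equals the base record's, which is just one possible merge policy.

```python
import re
from difflib import SequenceMatcher

# Hypothetical normalization rules (pattern substitutions via regular expressions).
NORMALIZATION_RULES = [
    (re.compile(r"[^\w\s]"), " "),      # punctuation becomes a space
    (re.compile(r"\bstreet\b"), "st"),  # standardize common address words
    (re.compile(r"\bavenue\b"), "ave"),
    (re.compile(r"\s+"), " "),          # collapse runs of whitespace
]


def normalize(value):
    """Lower-case a free-form value and apply each substitution rule in order."""
    text = value.lower()
    for pattern, replacement in NORMALIZATION_RULES:
        text = pattern.sub(replacement, text)
    return text.strip()


def similarity(a, b):
    """Similarity score in [0, 1]; SequenceMatcher is an assumed stand-in metric."""
    return SequenceMatcher(None, a, b).ratio()


def match_and_merge(query_field, records, field="address", threshold=0.8):
    """Normalize, score, filter, order, and merge records against one query field."""
    normalized_query = normalize(query_field)

    # Score each normalized record and keep those meeting the qualifying criterion.
    scored = []
    for record in records:
        normalized = dict(record)
        normalized[field] = normalize(record.get(field, ""))
        score = similarity(normalized_query, normalized[field])
        if score >= threshold:
            scored.append((score, normalized))

    # Order by similarity score, highest first (ties keep their input order).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if not scored:
        return []

    # The base record is the one with the highest similarity score.
    merged = dict(scored[0][1])
    remaining = []
    for _, record in scored[1:]:
        if record[field] == merged[field]:
            # Fill fields that are empty in the base record from the matching record.
            for key, value in record.items():
                if not merged.get(key):
                    merged[key] = value
        else:
            remaining.append(record)

    # Ordered subset: the merged record first, then any unmerged records.
    return [merged] + remaining
```

Under these assumptions, a query of "12 Main St" against two records with the addresses "12 Main Street." and "12  MAIN ST" normalizes both to "12 main st", so the two records are merged into a single record that combines their non-empty fields.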