CPC G06F 16/3322 (2019.01) [G06F 16/334 (2019.01); G06F 16/335 (2019.01); G06F 16/338 (2019.01)] | 20 Claims |
1. A computer-implemented method for record matching in a database system, the method comprising:
identifying records representing respective entities, wherein a record of the identified records comprises structured attributes;
assigning an initial contribution weight to the structured attributes;
identifying one or more unstructured data objects corresponding to the records;
processing the one or more unstructured data objects to identify unstructured attribute values corresponding to respective records of the identified records;
identifying entity relation scores corresponding to the identified records, wherein an entity relation score indicates how often an entity represented by a record occurs alongside a selected entity;
comparing two records based, at least in part, on the updated contribution weight of the selected structured attribute and a comparison of the entity relation scores and the unstructured attribute values of the two records to determine a similarity level between the two records;
selecting unstructured attribute values that are present with respect to the identified records; and
responsive to determining a structured attribute value of the structured attribute values does not match any of the selected unstructured attributes, replacing the contribution weight of said structured attribute by an updated contribution weight indicative of the similarity between the two records of the identified records.
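The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Record`, `contribution_weights`, and `similarity`, the fixed down-weight of 0.5, the use of Jaccard overlap for unstructured attribute values, and the equal averaging of the three components are all assumptions made for illustration. The claim itself does not specify how the updated contribution weight is computed or how the components are combined.

```python
from __future__ import annotations
from dataclasses import dataclass, field

INITIAL_WEIGHT = 1.0   # initial contribution weight (value is an assumption)
UPDATED_WEIGHT = 0.5   # hypothetical down-weight for unconfirmed attributes


@dataclass
class Record:
    structured: dict[str, str]                       # attribute name -> value
    unstructured_values: set[str] = field(default_factory=set)
    relation_score: float = 0.0                      # assumed to lie in [0, 1]


def contribution_weights(record: Record, selected_values: set[str]) -> dict[str, float]:
    """Keep the initial weight when a structured value matches a selected
    unstructured value; otherwise replace it with the updated weight."""
    return {
        name: INITIAL_WEIGHT if value.lower() in selected_values else UPDATED_WEIGHT
        for name, value in record.structured.items()
    }


def jaccard(a: set[str], b: set[str]) -> float:
    """Set overlap, used here as an assumed comparison of unstructured values."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(r1: Record, r2: Record, selected_values: set[str]) -> float:
    """Combine weighted structured matches, unstructured overlap, and
    closeness of entity relation scores into one similarity level."""
    w1 = contribution_weights(r1, selected_values)
    w2 = contribution_weights(r2, selected_values)
    total = matched = 0.0
    for name in r1.structured.keys() & r2.structured.keys():
        weight = min(w1[name], w2[name])
        total += weight
        if r1.structured[name] == r2.structured[name]:
            matched += weight
    structured_sim = matched / total if total else 0.0
    unstructured_sim = jaccard(r1.unstructured_values, r2.unstructured_values)
    relation_sim = 1.0 - abs(r1.relation_score - r2.relation_score)
    return (structured_sim + unstructured_sim + relation_sim) / 3


# Usage: two records whose names are confirmed by unstructured text,
# while the city values are not and therefore contribute less.
a = Record({"name": "Ann Lee", "city": "Oslo"}, {"ann lee", "acme"}, 0.8)
b = Record({"name": "Ann Lee", "city": "Bergen"}, {"ann lee", "acme"}, 0.6)
selected = a.unstructured_values | b.unstructured_values
level = similarity(a, b, selected)
```

In this sketch the selected unstructured attribute values are simply the union of the values extracted for the two records, and the city attribute is down-weighted because neither city appears among them.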