CPC G06F 7/14 (2013.01) [G06F 16/215 (2019.01); G06F 16/2365 (2019.01); G06F 16/244 (2019.01); G06F 16/24556 (2019.01)] | 20 Claims
1. A method comprising:
de-duplicating a first database table and a second database table to generate a first de-duplicated database table and a second de-duplicated database table, the de-duplicating for a respective database table comprising:
performing pairwise comparisons on the respective database table to determine related pairs of records stored in the respective database table having a degree of similarity that exceeds a preset threshold,
identifying clusters based on the pairwise comparisons, and
consolidating redundant records in the respective database table using the clusters to generate a respective de-duplicated database table;
performing third pairwise comparisons by comparing the first de-duplicated database table and second de-duplicated database table; and
generating a merged database table based on the third pairwise comparisons.
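
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not specify a similarity measure, a threshold value, a clustering algorithm, or a consolidation or merge policy. The sketch assumes records are dictionaries with the same keys, uses `difflib.SequenceMatcher` as a stand-in similarity measure, picks a hypothetical threshold of 0.8, clusters related pairs with union-find, keeps the first record of each cluster, and merges by dropping records of the second table that match a record of the first. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

THRESHOLD = 0.8  # hypothetical preset threshold; the claim gives no value


def similarity(a, b):
    """Assumed similarity measure: string ratio over the joined field values."""
    def text(r):
        return " ".join(str(r[k]).lower() for k in sorted(r))
    return SequenceMatcher(None, text(a), text(b)).ratio()


def related_pairs(records, threshold=THRESHOLD):
    """Pairwise comparisons within one table; keep pairs above the threshold."""
    return [(i, j) for i, j in combinations(range(len(records)), 2)
            if similarity(records[i], records[j]) > threshold]


def find_clusters(n, pairs):
    """Group record indices into clusters with union-find (an assumed choice)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def deduplicate(records, threshold=THRESHOLD):
    """Consolidate each cluster to its first record (an assumed policy)."""
    groups = find_clusters(len(records), related_pairs(records, threshold))
    return [records[g[0]] for g in groups]


def merge_tables(first, second, threshold=THRESHOLD):
    """De-duplicate both tables, compare across them, and build a merged table."""
    a = deduplicate(first, threshold)
    b = deduplicate(second, threshold)
    # Third pairwise comparisons: every record of a against every record of b.
    matched = {j for ra in a for j, rb in enumerate(b)
               if similarity(ra, rb) > threshold}
    return a + [rb for j, rb in enumerate(b) if j not in matched]


if __name__ == "__main__":
    first = [
        {"name": "Alice Smith", "city": "Boston"},
        {"name": "Alice Smith", "city": "Boston"},
        {"name": "Bob Jones", "city": "Denver"},
    ]
    second = [
        {"name": "Alice Smith", "city": "Boston"},
        {"name": "Carol White", "city": "Austin"},
    ]
    for row in merge_tables(first, second):
        print(row)
```

De-duplicating each table before the cross-table comparison means the third set of comparisons runs over fewer records than comparing the raw tables would. The all-pairs loops here are quadratic in table size, so a practical system would likely restrict comparisons to candidate pairs, which this sketch does not attempt.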