US 12,314,268 B1
Semantic matching model for data de-duplication or master data management
Balakumaran Vaithyalingam, Frisco, TX (US); Abhishek Seth, Deoband (IN); Trent A. Gray-Donald, Ottawa (CA); and Soma Shekar Naganna, Bangalore (IN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Mar. 1, 2024, as Appl. No. 18/593,057.
Int. Cl. G06F 16/00 (2019.01); G06F 16/2457 (2019.01)
CPC G06F 16/24573 (2019.01)
20 Claims
 
1. A computer-implemented method, comprising:
generating, by a processor set, a semantic label based on a first frequency of similar values in a data field;
profiling, by the processor set, a data source;
identifying, by the processor set and based on the profiling, critical data elements (CDEs) and non-CDEs within the data source;
auto-persisting, by the processor set, the CDEs;
matching, by the processor set, the CDEs together under the semantic label;
establishing, by the processor set, virtual objects for the non-CDEs; and
joining, by the processor set, the CDEs and the virtual objects.
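
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it is hypothetical. It assumes that a field counts as a CDE when its fill rate meets a threshold, that "similar values" are values equal after case and punctuation are stripped, that the semantic label is the most frequent raw spelling within such a group, and that a virtual object is a (record index, field name) reference resolved on demand rather than a stored copy.

```python
from collections import Counter


def normalize(value):
    """Strip case, punctuation, and spaces so similar values compare equal (assumed rule)."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def is_filled(record, field):
    return record.get(field) not in (None, "")


def profile(records):
    """Profile the data source: fill rate of every field across all records."""
    fields = sorted({name for rec in records for name in rec})
    return {
        name: sum(1 for rec in records if is_filled(rec, name)) / len(records)
        for name in fields
    }


def identify_cdes(fill_rates, threshold=0.9):
    """Split fields into CDEs and non-CDEs by fill rate (assumed criterion)."""
    cdes = [name for name, rate in fill_rates.items() if rate >= threshold]
    non_cdes = [name for name, rate in fill_rates.items() if rate < threshold]
    return cdes, non_cdes


def semantic_labels(records, field):
    """Map each normalized value to its most frequent raw spelling."""
    spellings = {}
    for rec in records:
        if is_filled(rec, field):
            spellings.setdefault(normalize(rec[field]), Counter())[rec[field]] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in spellings.items()}


def deduplicate(records, key_field, threshold=0.9):
    """Persist CDEs per label, keep non-CDEs as references, then join both."""
    cdes, non_cdes = identify_cdes(profile(records), threshold)
    labels = semantic_labels(records, key_field)
    persisted = {}  # label -> CDE values copied from the first matching record
    virtual = {}    # label -> (record index, field) references to non-CDE values
    for index, rec in enumerate(records):
        if not is_filled(rec, key_field):
            continue  # records without the key field cannot be matched here
        label = labels[normalize(rec[key_field])]
        persisted.setdefault(label, {n: rec[n] for n in cdes if is_filled(rec, n)})
        refs = virtual.setdefault(label, [])
        refs.extend((index, n) for n in non_cdes if is_filled(rec, n))
    return [
        {"label": label, "cdes": persisted[label], "virtual": virtual[label]}
        for label in persisted
    ]
```

Called as deduplicate(records, "name") on a list of dictionaries, the sketch returns one joined entry per semantic label. Keeping non-CDE values as references mirrors the claim's distinction between persisted CDEs and virtual objects, although the claim itself does not specify how either is stored.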