US 12,174,876 B2
Method for identifying a semantic type of data contained in a column of a table
Guy Shaked, Be'er Sheva (IL); Dima Alberg, Arad (IL); and David Greenfield, Woodstock (GB)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Aug. 31, 2022, as Appl. No. 17/899,799.
Prior Publication US 2024/0070184 A1, Feb. 29, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/35 (2019.01)
CPC G06F 16/355 (2019.01)
22 Claims
 
1. A method comprising:
using a first fingerprint-generation technique, generating a first target-column fingerprint set for a target column of a table in a database system based on values contained in the target column;
performing first comparisons between the first target-column fingerprint set and each of a plurality of first semantic-type fingerprint sets that were generated using the first fingerprint-generation technique, wherein:
each first semantic-type fingerprint set corresponds to a respective semantic-type of a plurality of semantic-types, and
each first semantic-type fingerprint set represents a plurality of values of the respective semantic-type;
based on the first comparisons, generating a plurality of first similarity measures, wherein the plurality of first similarity measures includes a first similarity measure for each semantic-type of the plurality of semantic-types;
based, at least in part, on the plurality of first similarity measures, determining that the target column contains values that correspond to a particular semantic-type of the plurality of semantic-types; and
wherein the method is performed by one or more computing devices.
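
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify the fingerprint-generation technique or the similarity measure, so the choices here are assumptions made only for illustration: a fingerprint is taken to be a character-class pattern of a value, a fingerprint set is the set of such patterns over a column, and similarity is the Jaccard index. All function names (`shape`, `fingerprint_set`, `jaccard`, `classify`) are hypothetical and do not come from the patent.

```python
import re


def shape(value: str) -> str:
    """Assumed fingerprint: collapse letter runs to 'A' and digit runs to '9'."""
    s = re.sub(r"[A-Za-z]+", "A", value)
    return re.sub(r"[0-9]+", "9", s)


def fingerprint_set(values) -> frozenset:
    """Generate a fingerprint set from the values contained in a column."""
    return frozenset(shape(str(v)) for v in values)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Assumed similarity measure: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(target_values, reference_values_by_type):
    """Compare the target column against every semantic-type fingerprint set.

    Returns the best-matching semantic type and one similarity measure
    per semantic type.
    """
    target_fp = fingerprint_set(target_values)
    similarities = {
        semantic_type: jaccard(target_fp, fingerprint_set(values))
        for semantic_type, values in reference_values_by_type.items()
    }
    best = max(similarities, key=similarities.get)
    return best, similarities


# Hypothetical reference values for three semantic types.
references = {
    "email": ["bob@site.org", "carol@mail.example.net"],
    "date": ["2022-08-31", "2024-02-29"],
    "phone": ["555-1234", "555-0199"],
}

best, similarities = classify(["alice@example.com", "dave@corp.io"], references)
print(best, similarities)
```

Both the target column and the reference values pass through the same `fingerprint_set` function, mirroring the claim's requirement that the semantic-type fingerprint sets were generated using the same technique as the target-column fingerprint set. A production system would likely use a more robust fingerprint and a decision threshold rather than a bare maximum.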