US 11,720,580 B1
Entity matching with machine learning fuzzy logic
Elliot Hirsch, New York, NY (US); Johannes Beil, Copenhagen (DK); Lauren Brown, London (GB); Nicolas Prettejohn, Bath (GB); and Paul Baseotto, Poole (GB)
Assigned to Palantir Technologies Inc., Denver, CO (US)
Filed by Palantir Technologies Inc., Denver, CO (US)
Filed on Mar. 1, 2022, as Appl. No. 17/683,986.
Claims priority of provisional application 63/156,524, filed on Mar. 4, 2021.
Int. Cl. G06F 7/00 (2006.01); G06F 16/2458 (2019.01); G06F 16/248 (2019.01)
CPC G06F 16/2468 (2019.01) [G06F 16/248 (2019.01)]
18 Claims
 
1. A computerized method, performed by a computing system having one or more hardware computer processors and one or more non-transitory computer readable storage devices storing software instructions executable by the computing system to perform the computerized method comprising:
providing a user interface including controls allowing a user to select:
a first attribute of a first data set comprising a plurality of first data records;
a second attribute of a second data set comprising a plurality of second data records; and
one or more matching algorithms, wherein each of the matching algorithms is configured to output a match score indicative of likelihood of a match between two data records;
for each pair of first data records and second data records:
executing the selected matching algorithms on property values of the pair of data records to generate a plurality of match scores;
determining, based on a match scoring algorithm, an overall match score for the pair of data records based on at least some of the plurality of match scores associated with respective attributes;
displaying a results user interface indicating at least a first candidate pair of data records having an overall match score greater than a predefined overall score threshold;
receiving user feedback indicating whether the first candidate pair of data records are accurately identified as a match; and
updating the match scoring algorithm based on the user feedback.
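
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not specify the matching algorithms, the form of the match scoring algorithm, or how feedback updates it. The sketch assumes two hypothetical matching algorithms (`exact_match` and `fuzzy_match`, the latter built on the standard library's `difflib.SequenceMatcher`), a weighted average as the match scoring algorithm, and a simple gradient-style weight adjustment as the feedback update. All names, the learning rate, and the threshold are illustrative assumptions.

```python
from difflib import SequenceMatcher
from itertools import product


def exact_match(a: str, b: str) -> float:
    """Hypothetical matching algorithm: 1.0 on equality, else 0.0."""
    return 1.0 if a == b else 0.0


def fuzzy_match(a: str, b: str) -> float:
    """Hypothetical matching algorithm: case-insensitive similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class WeightedScorer:
    """Assumed match scoring algorithm: a weighted average of match scores."""

    def __init__(self, algorithm_names, learning_rate=0.5):
        self.weights = {name: 1.0 for name in algorithm_names}
        self.learning_rate = learning_rate

    def overall(self, scores):
        total = sum(self.weights[name] for name in scores)
        return sum(self.weights[name] * s for name, s in scores.items()) / total

    def update(self, scores, is_match):
        """Nudge weights so the overall score moves toward the user's label."""
        predicted = self.overall(scores)
        error = (1.0 if is_match else 0.0) - predicted
        total = sum(self.weights[name] for name in scores)
        for name, s in scores.items():
            # Derivative of the weighted average with respect to this weight.
            step = self.learning_rate * error * (s - predicted) / total
            # Keep weights positive so the average stays well defined.
            self.weights[name] = max(0.01, self.weights[name] + step)


def find_candidates(first, second, first_attr, second_attr,
                    algorithms, scorer, threshold):
    """Score every pair of records and keep those above the threshold."""
    results = []
    for rec_a, rec_b in product(first, second):
        scores = {
            name: fn(rec_a[first_attr], rec_b[second_attr])
            for name, fn in algorithms.items()
        }
        overall = scorer.overall(scores)
        if overall > threshold:
            results.append((rec_a, rec_b, scores, overall))
    return results
```

In this sketch the user's selections correspond to the `first_attr`, `second_attr`, and `algorithms` arguments. A results user interface would display the output of `find_candidates`, and each confirmation or rejection from the user would be passed to `WeightedScorer.update` together with that pair's match scores.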