| CPC G06F 16/285 (2019.01) [G06F 16/215 (2019.01); G06F 16/24575 (2019.01)] | 20 Claims |

|
1. A multi-cluster data storage system, comprising:
a first computing cluster of a first datacenter, the first computing cluster comprising a first database instance executing on a first server, the first computing cluster storing a first set of records;
a second computing cluster of a second datacenter separate from the first datacenter, the second computing cluster comprising a second database instance executing on a second server, the second computing cluster storing a second set of records;
a search server executing separately from the first server and the second server, the search server storing a multi-cluster index having a sorted object identifier key; and
a duplicate record detector comprising one or more processors and one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
retrieving a first set of records from the first computing cluster;
paring the first set of records into a first pared subset of records, based at least in part on the multi-cluster index;
retrieving a second set of records from the second computing cluster;
paring the second set of records into a second pared subset of records, based at least in part on the multi-cluster index; and
determining a duplicate record, based at least in part on comparing the first pared subset of records and the second pared subset of records.
|
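The operations recited in claim 1 (retrieve, pare against the multi-cluster index, compare) can be illustrated with a minimal Python sketch. The claim does not specify how paring or comparison is performed, so everything here is an assumption made for illustration: the index is modeled as a sorted list of object identifiers, paring keeps only records whose identifier appears in that index, and two records count as duplicates when their identifier and payload both match. The names `in_sorted_index`, `pare`, `find_duplicates`, and the `id` and `payload` fields are hypothetical.

```python
from bisect import bisect_left


def in_sorted_index(index, key):
    """Binary-search a sorted list of object identifiers (assumed index layout)."""
    pos = bisect_left(index, key)
    return pos < len(index) and index[pos] == key


def pare(records, index):
    """Keep only records whose identifier appears in the multi-cluster index.

    Returns a dict keyed by identifier so the later comparison is a key lookup.
    """
    return {r["id"]: r for r in records if in_sorted_index(index, r["id"])}


def find_duplicates(first_records, second_records, index):
    """Compare the two pared subsets and return identifiers present in both.

    Assumption: a duplicate is a record with the same identifier and payload
    in both clusters. The claim leaves the comparison criterion open.
    """
    first_pared = pare(first_records, index)
    second_pared = pare(second_records, index)
    shared_ids = first_pared.keys() & second_pared.keys()
    return sorted(
        oid
        for oid in shared_ids
        if first_pared[oid]["payload"] == second_pared[oid]["payload"]
    )


# Hypothetical data: the index must already be sorted by object identifier.
index = ["a1", "b2", "c3"]
cluster_one = [
    {"id": "a1", "payload": "x"},
    {"id": "b2", "payload": "y"},
    {"id": "z9", "payload": "q"},  # not in the index, so pared away
]
cluster_two = [
    {"id": "b2", "payload": "y"},
    {"id": "c3", "payload": "w"},
    {"id": "z9", "payload": "q"},  # not in the index, so pared away
]

print(find_duplicates(cluster_one, cluster_two, index))  # prints ['b2']
```

In this sketch the sorted key is what makes paring cheap: each membership check is a binary search rather than a scan, and the full comparison runs only over the reduced subsets.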