US 12,411,818 B2
Method and apparatus of comparing data of heterogeneous data sources, device, and storage medium
Zhao Luo, Beijing (CN); Fengtian Wang, Beijing (CN); Shaozhong He, Beijing (CN); and Zezhong Wang, Beijing (CN)
Assigned to Beijing Volcano Engine Technology Co., Ltd., Beijing (CN)
Filed by Beijing Volcano Engine Technology Co., Ltd., Beijing (CN)
Filed on Nov. 21, 2024, as Appl. No. 18/955,901.
Claims priority of application No. 202311666657.7 (CN), filed on Dec. 6, 2023.
Prior Publication US 2025/0190404 A1, Jun. 12, 2025
Int. Cl. G06F 7/00 (2006.01); G06F 16/21 (2019.01); G06F 16/25 (2019.01); G06F 16/27 (2019.01)
CPC G06F 16/211 (2019.01) [G06F 16/258 (2019.01); G06F 16/27 (2019.01)]
17 Claims
 
1. A method of comparing data of heterogeneous data sources, comprising:
obtaining first metadata and second metadata corresponding to the heterogeneous data sources to be compared;
determining a data isomorphism benchmark from the heterogeneous data sources based on the first metadata and the second metadata, and determining a first target data table and a second target data table corresponding to the heterogeneous data sources;
associating the first target data table with the second target data table to determine a comparison indicator of the heterogeneous data sources;
determining, based on the first target data table and the second target data table, a target comparison manner between the first target data table and the second target data table; and
generating, according to the target comparison manner and the data isomorphism benchmark, a data comparison result corresponding to the comparison indicator;
wherein the first metadata comprises a first update frequency and a first data amount, the second metadata comprises a second update frequency and a second data amount; and
determining the data isomorphism benchmark from the heterogeneous data sources based on the first metadata and the second metadata comprises:
comparing the first update frequency and the second update frequency, to determine a first target data source with a slower update frequency; and
determining the first target data source as the data isomorphism benchmark;
and/or
comparing the first data amount and the second data amount, to determine a second target data source with a larger data amount; and
determining the second target data source as the data isomorphism benchmark.
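
The benchmark selection recited at the end of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names SourceMetadata, updates_per_day, row_count, benchmark_by_update_frequency, and benchmark_by_data_amount are hypothetical, as are the units chosen for update frequency and data amount and the rule that a tie falls to the first source. The claim specifies none of these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceMetadata:
    """Hypothetical metadata record for one data source."""
    name: str
    updates_per_day: float  # assumed unit for "update frequency"
    row_count: int          # assumed unit for "data amount"


def benchmark_by_update_frequency(first: SourceMetadata,
                                  second: SourceMetadata) -> SourceMetadata:
    """Pick the source with the slower update frequency as the benchmark.

    Assumption: on a tie, the first source is returned.
    """
    return first if first.updates_per_day <= second.updates_per_day else second


def benchmark_by_data_amount(first: SourceMetadata,
                             second: SourceMetadata) -> SourceMetadata:
    """Pick the source with the larger data amount as the benchmark.

    Assumption: on a tie, the first source is returned.
    """
    return first if first.row_count >= second.row_count else second


# Illustrative values only; they do not come from the patent.
warehouse = SourceMetadata("warehouse", updates_per_day=1, row_count=5_000_000)
stream_store = SourceMetadata("stream_store", updates_per_day=1440, row_count=200_000)

by_frequency = benchmark_by_update_frequency(warehouse, stream_store)
by_amount = benchmark_by_data_amount(warehouse, stream_store)
```

With these illustrative values both criteria select the same source. Because the claim joins the two criteria with "and/or", they may also be applied separately, and the sketch leaves open how a disagreement between them would be resolved.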