US 12,079,472 B2
Data reduction method, apparatus, computing device, and storage medium for forming index information based on fingerprints
Bang Liu, Saint Petersburg (RU); Liyu Wang, Beijing (CN); Kun Guan, Saint Petersburg (RU); Wen Yang, Chengdu (CN); and Jianqiang Shen, Hangzhou (CN)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by HUAWEI TECHNOLOGIES CO., LTD., Guangdong (CN)
Filed on Apr. 29, 2022, as Appl. No. 17/732,675.
Application 17/732,675 is a continuation of application No. PCT/CN2020/120990, filed on Oct. 14, 2020.
Claims priority of application No. 201911061340.4 (CN), filed on Nov. 1, 2019.
Prior Publication US 2022/0253222 A1, Aug. 11, 2022
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/0608 (2013.01) [G06F 3/064 (2013.01); G06F 3/0671 (2013.01)]
17 Claims
 
1. A data reduction method, comprising:
obtaining fingerprints of to-be-reduced data blocks;
forming an index set based on the fingerprints of the to-be-reduced data blocks by using index information of data blocks with identical fingerprints, the index set comprising the index information of the data blocks, and the index information including addresses of the data blocks; and
performing, in the to-be-reduced data blocks based on the fingerprints of the to-be-reduced data blocks, data reduction processing on data blocks with index information in a same index set,
wherein the fingerprints of the data blocks are similar fingerprints or to-be-deduplicated fingerprints, the similar fingerprints are for determining whether similar deduplication can be performed on the data blocks, and the to-be-deduplicated fingerprints are for determining whether the data blocks can be deduplicated, and
wherein the index information of each of the data blocks is indicated by a key-value pair comprising a key and a value corresponding to the key, and
wherein, in the key-value pair indicating each of the data blocks, the key is the similar fingerprint of the data block, and the value comprises both the address of the data block and the to-be-deduplicated fingerprint of the data block.
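
The key-value layout recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `IndexValue` type, the SHA-256 hashes, and especially the toy similar fingerprint (a hash of the first 8 bytes) are assumptions chosen only for illustration. The sketch keys each block's index information by its similar fingerprint, stores the address and the to-be-deduplicated fingerprint as the value, groups entries with identical keys into one index set, and then decides, within each set, which blocks might be deduplicated and which might be handled by similar deduplication against a base block.

```python
import hashlib
from collections import defaultdict
from typing import NamedTuple


class IndexValue(NamedTuple):
    """Value half of the key-value pair: block address plus dedup fingerprint."""
    address: int
    dedup_fp: str


def dedup_fingerprint(block: bytes) -> str:
    # Assumption: a full-content hash serves as the to-be-deduplicated fingerprint.
    return hashlib.sha256(block).hexdigest()


def similar_fingerprint(block: bytes, prefix_len: int = 8) -> str:
    # Assumption: toy stand-in that treats blocks sharing a prefix as "similar".
    # A real system would use a proper similarity-preserving fingerprint.
    return hashlib.sha256(block[:prefix_len]).hexdigest()[:16]


def build_index_sets(blocks: dict[int, bytes]) -> dict[str, list[IndexValue]]:
    """Group index information by key (similar fingerprint) into index sets."""
    index_sets: dict[str, list[IndexValue]] = defaultdict(list)
    for address, block in blocks.items():
        key = similar_fingerprint(block)
        index_sets[key].append(IndexValue(address, dedup_fingerprint(block)))
    return dict(index_sets)


def plan_reduction(index_sets: dict[str, list[IndexValue]]) -> dict[int, tuple]:
    """Decide, per block address, how it would be reduced within its index set."""
    plan: dict[int, tuple] = {}
    for entries in index_sets.values():
        base = entries[0].address          # first block in the set is the base
        seen: dict[str, int] = {}          # dedup fingerprint -> first address
        for entry in entries:
            if entry.dedup_fp in seen:
                # Identical dedup fingerprint: candidate for deduplication.
                plan[entry.address] = ("dedup", seen[entry.dedup_fp])
            else:
                seen[entry.dedup_fp] = entry.address
                if entry.address == base:
                    plan[entry.address] = ("store", None)
                else:
                    # Same similar fingerprint only: candidate for similar
                    # deduplication against the base block.
                    plan[entry.address] = ("delta", base)
    return plan


if __name__ == "__main__":
    blocks = {
        0: b"AAAAAAAA-tail-1",
        4096: b"AAAAAAAA-tail-1",   # identical to the block at address 0
        8192: b"AAAAAAAA-tail-2",   # similar to the block at address 0
        12288: b"BBBBBBBB-other",   # unrelated
    }
    for address, action in plan_reduction(build_index_sets(blocks)).items():
        print(address, action)
```

Because the lookup is confined to entries that share a key, the comparison of to-be-deduplicated fingerprints only happens among blocks whose index information is in the same index set, which is the narrowing step the claim describes.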