CPC G06F 3/0641 (2013.01) [G06F 3/0608 (2013.01); G06F 3/0683 (2013.01)] | 20 Claims |
1. A method for de-duplicating data, comprising:
determining a target physical block in a first storage device, wherein a plurality of data blocks in the target physical block are to be transferred to a second storage device;
determining a compression ratio of a target data block in the plurality of data blocks;
determining a target hash value of the target data block in response to the compression ratio being lower than a threshold compression ratio; and
determining a de-duplication operation for the target data block based on the target hash value and a de-duplication hash table, the de-duplication hash table storing hash values of data blocks that have been transferred from the first storage device to the second storage device;
wherein determining the de-duplication operation comprises:
determining, from a plurality of logically contiguous data blocks, a group of logically contiguous data blocks starting from the target data block, wherein hash values of the group of logically contiguous data blocks hit data blocks in the de-duplication hash table that are located in contiguous physical space;
determining whether the number of data blocks in the group of logically contiguous data blocks exceeds a threshold number; and
de-duplicating the group of logically contiguous data blocks in response to the number exceeding the threshold number.
|
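The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the claim does not fix: the compression ratio is taken as uncompressed size divided by compressed size (measured with zlib), the hash is SHA-256, the de-duplication hash table is a plain dict mapping a digest to an integer physical block address in the second storage device, and "contiguous physical space" is modeled as consecutive integer addresses. The names `compression_ratio`, `contiguous_hit_run`, and `plan_dedup` are hypothetical.

```python
import hashlib
import zlib


def compression_ratio(block: bytes) -> float:
    """Assumed definition: uncompressed size / compressed size."""
    if not block:
        return 1.0
    return len(block) / len(zlib.compress(block))


def contiguous_hit_run(blocks, start, hash_table):
    """Collect physical addresses for the longest run of logically
    contiguous blocks, beginning at `start`, whose hashes hit table
    entries located at consecutive physical addresses."""
    run = []
    for block in blocks[start:]:
        addr = hash_table.get(hashlib.sha256(block).digest())
        if addr is None:
            break  # hash miss ends the run
        if run and addr != run[-1] + 1:
            break  # hit, but not physically contiguous
        run.append(addr)
    return run


def plan_dedup(blocks, hash_table, ratio_threshold, min_run):
    """Return {logical index: physical address} for blocks to de-duplicate.

    A target block is hashed only if its compression ratio is below
    `ratio_threshold`; a run is de-duplicated only if its length
    exceeds `min_run`. Simplification (assumption): only the first
    block of a run is gated by the compression ratio.
    """
    plan = {}
    i = 0
    while i < len(blocks):
        if compression_ratio(blocks[i]) < ratio_threshold:
            run = contiguous_hit_run(blocks, i, hash_table)
            if len(run) > min_run:
                for offset, addr in enumerate(run):
                    plan[i + offset] = addr
                i += len(run)
                continue
        i += 1
    return plan
```

In this sketch a block that hits the hash table but belongs to a run no longer than `min_run` is left out of the plan, which mirrors the claim's condition that de-duplication occurs in response to the number of blocks exceeding the threshold number.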