US 11,789,917 B2
Data deduplication in a storage system
Doron Tal, Geva Carmel (IL); and Yosef Shatsky, Karnei Shomron (IL)
Assigned to Dell Products L.P., Round Rock, TX (US)
Filed by Dell Products L.P., Round Rock, TX (US)
Filed on Jan. 25, 2022, as Appl. No. 17/583,365.
Prior Publication US 2023/0237029 A1, Jul. 27, 2023
Int. Cl. G06F 16/215 (2019.01); G06F 16/22 (2019.01)
CPC G06F 16/215 (2019.01) [G06F 16/2246 (2019.01)]
20 Claims
 
1. A method, comprising:
receiving, by a storage control system, a first data block to be written to a primary storage;
generating, by the storage control system, a content signature for the first data block;
adding, by the storage control system, a first entry for the first data block into a persistent deduplication database, wherein the first entry for the first data block comprises a key which comprises the content signature for the first data block, and wherein the persistent deduplication database comprises a tree data structure which comprises elements that are configured to store entries for data blocks;
merging, by the storage control system, the entries of at least two elements of the tree data structure to generate a set of merged entries which comprises the first entry for the first data block;
determining, by the storage control system, whether the set of merged entries includes a second data block having a second entry with a key that matches the key of the first entry of the first data block; and
commencing, by the storage control system, a deduplication process to determine if the first entry and the second entry correspond to duplicate data blocks, in response to determining that the first entry and the second entry in the set of merged entries have matching keys.
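The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes SHA-256 as the content signature, models each element of the tree data structure as a plain in-memory list of entries (persistence is omitted), and treats a byte-for-byte comparison as the deduplication process. The names `Entry`, `DedupStore`, `write`, and `merge` are hypothetical and do not appear in the claim.

```python
import hashlib
from dataclasses import dataclass, field
from itertools import groupby


@dataclass(frozen=True)
class Entry:
    key: str      # content signature of the data block (assumed: SHA-256 hex digest)
    address: int  # hypothetical location of the block in primary storage


@dataclass
class DedupStore:
    primary: dict = field(default_factory=dict)     # address -> block bytes
    elements: list = field(default_factory=list)    # each element is a list of Entry
    duplicates: list = field(default_factory=list)  # (kept_address, duplicate_address)
    next_address: int = 0

    def write(self, block: bytes) -> int:
        """Receive a block, generate its signature, and add an entry to a new element."""
        address = self.next_address
        self.next_address += 1
        self.primary[address] = block
        key = hashlib.sha256(block).hexdigest()
        self.elements.append([Entry(key, address)])
        return address

    def merge(self, i: int, j: int) -> list:
        """Merge elements i and j; entries with matching keys trigger a dedup check."""
        if i == j:
            raise ValueError("merge requires two distinct elements")
        merged = sorted(self.elements[i] + self.elements[j],
                        key=lambda e: (e.key, e.address))
        survivors = []
        # After sorting, entries with the same key are adjacent, so matches
        # surface as a by-product of the merge rather than a separate lookup.
        for _, group in groupby(merged, key=lambda e: e.key):
            group = list(group)
            first = group[0]
            survivors.append(first)
            for other in group[1:]:
                # Matching keys only start the check; the block contents decide.
                if self.primary[first.address] == self.primary[other.address]:
                    self.duplicates.append((first.address, other.address))
                else:
                    survivors.append(other)  # signature collision, keep both
        # Replace the two source elements with the single merged element.
        for idx in sorted((i, j), reverse=True):
            del self.elements[idx]
        self.elements.append(survivors)
        return survivors
```

In this sketch, writing `b"hello"`, `b"world"`, and `b"hello"` again and then calling `merge(0, 2)` records the third block as a duplicate of the first. A key match is discovered only when elements are merged, not at write time, which mirrors the ordering of the claimed steps.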