| CPC G06F 16/1748 (2019.01) [G06F 16/152 (2019.01)] | 20 Claims |

|
1. A method comprising:
providing or making accessible an on-drive deduplication (“dedupe”) index, the on-drive dedupe index including a plurality of index entries, each index entry including a digest of a data page and an address associated with a location where the data page is stored, each digest having a digest prefix, each data page being associated with a reference count, each index entry being assigned to a bucket data structure (“bucket”) defined by a respective digest prefix;
for each data page associated with a reference count decremented to zero (0), logging, in a dedupe log, a digest prefix of the data page and a corresponding address associated with a location where the data page is stored; and
for each bucket of the on-drive dedupe index:
constructing, dynamically and on-demand, an address bag data structure (“address bag”);
storing, in the address bag, one or more addresses from the dedupe log whose corresponding digest prefix is the same as the respective digest prefix defining the bucket; and
removing, from the bucket, each index entry that includes an address matching one of the addresses in the address bag, the index entry being regarded as a stale index entry.
|
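The steps of claim 1 can be illustrated with a minimal in-memory sketch. It is not the claimed on-drive implementation: the class `DedupeIndex`, its methods `add`, `add_ref`, `release`, and `clean`, and the constant `PREFIX_LEN` are hypothetical names chosen for illustration, and digests are assumed to be hex strings whose first `PREFIX_LEN` characters serve as the digest prefix.

```python
from collections import defaultdict

PREFIX_LEN = 2  # assumption: leading characters of the digest form the prefix


class DedupeIndex:
    """Toy in-memory stand-in for the on-drive dedupe index (illustrative only)."""

    def __init__(self):
        self.buckets = defaultdict(list)  # digest prefix -> [(digest, address), ...]
        self.ref_counts = {}              # address -> reference count
        self.dedupe_log = []              # [(digest prefix, address), ...]

    def add(self, digest, address):
        """Create an index entry in the bucket defined by the digest prefix."""
        self.buckets[digest[:PREFIX_LEN]].append((digest, address))
        self.ref_counts[address] = 1

    def add_ref(self, address):
        """Record one more reference to an already stored data page."""
        self.ref_counts[address] += 1

    def release(self, digest, address):
        """Drop one reference; log the page once its count reaches zero."""
        self.ref_counts[address] -= 1
        if self.ref_counts[address] == 0:
            self.dedupe_log.append((digest[:PREFIX_LEN], address))

    def clean(self):
        """Remove stale entries bucket by bucket; return how many were removed."""
        removed = 0
        for prefix, entries in self.buckets.items():
            # Build the address bag on demand from log records for this prefix.
            address_bag = {addr for p, addr in self.dedupe_log if p == prefix}
            if not address_bag:
                continue
            # Keep only entries whose address is not in the bag.
            kept = [entry for entry in entries if entry[1] not in address_bag]
            removed += len(entries) - len(kept)
            entries[:] = kept  # update the bucket in place
        self.dedupe_log.clear()
        return removed
```

In this sketch, a page whose count drops to zero stays in its bucket until `clean` runs, so the index is only touched once per bucket rather than once per released page. Rescanning the whole log for every bucket is a simplification made for brevity; a real implementation would presumably group log records by prefix first.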