US 11,989,158 B2
Maintaining retention policies in a block storage, multi-client dedup domain
Radia J. Perlman, Redmond, WA (US); and Kalyan C. Gunda, Bangalore (IN)
Assigned to EMC IP HOLDING COMPANY LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on Jan. 28, 2021, as Appl. No. 17/160,783.
Prior Publication US 2022/0237148 A1, Jul. 28, 2022
Int. Cl. G06F 7/00 (2006.01); G06F 9/54 (2006.01); G06F 16/11 (2019.01); G06F 16/174 (2019.01); G06F 16/176 (2019.01)
CPC G06F 16/125 (2019.01) [G06F 9/547 (2013.01); G06F 16/1748 (2019.01); G06F 16/176 (2019.01)]
15 Claims
 
1. A method, comprising:
maintaining, at a server, a deduplication data structure comprising one or more entries, and each of the one or more entries comprises a respective fingerprint, and a respective pointer that points to a physical location on a disk where data associated with that fingerprint is stored;
maintaining, at the server, a ClientBlockList data structure comprising one or more entries, and each of the entries comprises a respective handle, a respective retention date, and a respective block;
receiving, at the server, a write request for writing a block of data, wherein the write request identifies a handle, retention date, and the block;
computing, at the server, a fingerprint of the block identified in the write request;
determining, by the server, whether the fingerprint of the block is in the deduplication data structure, and when the fingerprint of the block is not in the deduplication data structure, storing the block identified in the write request at a location in the deduplication data structure, and adding, to the deduplication data structure, an entry that includes the fingerprint of the block and a pointer that points to the location; and
adding, to the ClientBlockList data structure, an entry that identifies the handle, the retention date, and the fingerprint of the block,
wherein, when the handle is already included in an entry of the ClientBlockList data structure, the entry of the ClientBlockList data structure is overwritten by the entry that identifies the handle, the retention date, and the fingerprint of the block,
wherein the deduplication data structure spans multiple client domains, and
wherein a deduplication process performed with respect to the block identified in the write request is performed without reference to the retention date of that block.
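
The write path recited in claim 1 can be illustrated with a minimal in-memory sketch. This is not the patented implementation. The class and attribute names (DedupServer, dedup_index, client_block_list, disk), the use of SHA-256 as the fingerprint function, the list standing in for physical disk locations, and the convention that a handle is globally unique across clients are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class ClientBlockEntry:
    """One ClientBlockList entry: handle, retention date, fingerprint."""
    handle: str
    retention_date: date
    fingerprint: str


class DedupServer:
    """Hypothetical server holding both data structures from claim 1."""

    def __init__(self) -> None:
        # Stand-in for physical storage: the list index plays the role
        # of a disk location (an assumption, not from the claim).
        self.disk: list[bytes] = []
        # Deduplication data structure: fingerprint -> location pointer.
        # One shared index serves every client domain.
        self.dedup_index: dict[str, int] = {}
        # ClientBlockList keyed by handle, so re-using a handle
        # overwrites the earlier entry.
        self.client_block_list: dict[str, ClientBlockEntry] = {}

    def write(self, handle: str, retention_date: date, block: bytes) -> str:
        # Compute the fingerprint of the block (SHA-256 is assumed here).
        fingerprint = hashlib.sha256(block).hexdigest()

        # Deduplication step: the retention date is never consulted.
        if fingerprint not in self.dedup_index:
            self.disk.append(block)
            self.dedup_index[fingerprint] = len(self.disk) - 1

        # Add or overwrite the ClientBlockList entry for this handle.
        self.client_block_list[handle] = ClientBlockEntry(
            handle, retention_date, fingerprint
        )
        return fingerprint


server = DedupServer()
# Two clients write identical data with different retention dates.
fp_a = server.write("clientA/h1", date(2030, 1, 1), b"hello")
fp_b = server.write("clientB/h7", date(2025, 6, 1), b"hello")
# Client A rewrites its handle with new data and a new retention date.
server.write("clientA/h1", date(2031, 1, 1), b"world")
```

In this sketch the two identical blocks share a single stored copy even though their retention dates differ, and the second write to "clientA/h1" replaces that handle's earlier entry rather than adding a new one.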