US 12,340,107 B2
Deduplication selection and optimization
John Colgrove, Los Altos, CA (US); Ronald Karr, Palo Alto, CA (US); and Ethan L. Miller, Santa Cruz, CA (US)
Assigned to PURE STORAGE, INC., Santa Clara, CA (US)
Filed by PURE STORAGE, INC., Santa Clara, CA (US)
Filed on Jul. 17, 2023, as Appl. No. 18/353,264.
Application 18/353,264 is a continuation of application No. 16/194,119, filed on Nov. 16, 2018, granted, now 11,704,036.
Application 16/194,119 is a continuation of application No. 15/333,903, filed on Oct. 25, 2016, granted, now 10,133,503.
Claims priority of provisional application 62/330,728, filed on May 2, 2016.
Prior Publication US 2023/0359381 A1, Nov. 9, 2023
Int. Cl. G06F 12/00 (2006.01); G06F 3/06 (2006.01); G06F 12/1018 (2016.01); G06F 16/22 (2019.01); G06F 16/23 (2019.01); G06F 16/25 (2019.01)
CPC G06F 3/0641 (2013.01) [G06F 3/061 (2013.01); G06F 3/0619 (2013.01); G06F 3/0665 (2013.01); G06F 3/0689 (2013.01); G06F 12/1018 (2013.01); G06F 16/2255 (2019.01); G06F 16/2365 (2019.01); G06F 16/258 (2019.01)]
15 Claims
 
1. A storage system comprising:
a plurality of storage devices comprising flash memory; and
a processing device, operatively coupled to the plurality of storage devices, configured to:
generate a hash value for a portion of data blocks to be stored at a particular storage device of the plurality of storage devices;
determine that the hash value matches a corresponding hash value of a data block currently stored at the particular storage device;
select one of a first deduplication process or a second deduplication process to be performed by the processing device based on one or more performance metrics, the one or more performance metrics comprise a type of storage medium the other data blocks are retrieved from, wherein the selected one of the first deduplication process or the second deduplication process determines whether remaining portions of the data blocks to be stored at the particular storage device match other data blocks currently stored at the particular storage device, and wherein when performing the second deduplication process, the processing device is further configured to:
retrieve the other data blocks currently stored at the particular storage device that are associated with the data block currently stored at the particular storage device; and
determine whether one or more of the remaining portions of the data blocks to be stored at the particular storage device match the other data blocks; and
perform, by the processing device, the selected one of the first deduplication process or the second deduplication process.
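
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the names `Device`, `first_process`, `second_process`, `select_process`, and `deduplicate` are hypothetical; SHA-256 stands in for the unspecified hash; "associated with" is read as "stored at the addresses immediately following the matched block"; the first deduplication process, which claim 1 does not define, is assumed to hash each remaining block and look it up in an index; and the selection policy (prefer the second process on flash) is an assumed example of a metric based on the type of storage medium.

```python
import hashlib
from dataclasses import dataclass, field


def block_hash(block: bytes) -> str:
    """Hash one data block (SHA-256 is an assumption; the claim names no hash)."""
    return hashlib.sha256(block).hexdigest()


@dataclass
class Device:
    """Hypothetical stand-in for one storage device."""
    medium: str                                   # assumed labels: "flash", "disk"
    blocks: list = field(default_factory=list)    # stored blocks, indexed by address
    index: dict = field(default_factory=dict)     # hash value -> address

    def store(self, block: bytes) -> int:
        self.blocks.append(block)
        addr = len(self.blocks) - 1
        self.index.setdefault(block_hash(block), addr)
        return addr


def first_process(device: Device, remaining: list) -> list:
    """Assumed behaviour: hash every remaining block and probe the index."""
    return [device.index.get(block_hash(b)) for b in remaining]


def second_process(device: Device, match_addr: int, remaining: list) -> list:
    """Retrieve the stored blocks that follow the matched block and
    compare them directly with the remaining incoming blocks."""
    start = match_addr + 1
    neighbours = device.blocks[start:start + len(remaining)]
    result = []
    for i, block in enumerate(remaining):
        if i < len(neighbours) and neighbours[i] == block:
            result.append(start + i)
        else:
            result.append(None)
    return result


def select_process(device: Device) -> str:
    """Assumed policy: reads of neighbouring blocks are treated as cheap
    on flash, so the second process is preferred there."""
    return "second" if device.medium == "flash" else "first"


def deduplicate(device: Device, incoming: list):
    """Hash the first incoming block, and on a match run the selected
    process over the remaining blocks. Returns (process name, matches),
    where each match is a stored address or None."""
    head, remaining = incoming[0], incoming[1:]
    match_addr = device.index.get(block_hash(head))
    if match_addr is None:
        return None, []          # no hash match, so neither process runs
    choice = select_process(device)
    if choice == "second":
        return choice, second_process(device, match_addr, remaining)
    return choice, first_process(device, remaining)


flash = Device(medium="flash")
disk = Device(medium="disk")
for dev in (flash, disk):
    for blk in (b"A", b"B", b"C"):
        dev.store(blk)

flash_result = deduplicate(flash, [b"A", b"B", b"X"])
disk_result = deduplicate(disk, [b"A", b"B", b"X"])
```

In this sketch both devices report that the second incoming block duplicates the stored block at address 1 and that the third has no match. Only the path differs: the flash device compares retrieved neighbouring blocks directly, while the disk device hashes each remaining block.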