US 12,067,236 B2
Data stability in data storage system
Huihui Cheng, Sunnyvale, CA (US); Gunjan Dang, Campbell, CA (US); Michael Goldsby, San Francisco, CA (US); Yanwei Jiang, Sunnyvale, CA (US); Aswin Karumbunathan, San Francisco, CA (US); Peter E. Kirkpatrick, Mountain View, CA (US); Naveen Neelakantam, Mountain View, CA (US); Neil Buda Vachharajani, Menlo Park, CA (US); and Junming Zhu, Sunnyvale, CA (US)
Assigned to PURE STORAGE, INC., Santa Clara, CA (US)
Filed by PURE STORAGE, INC., Mountain View, CA (US)
Filed on Dec. 13, 2019, as Appl. No. 16/714,029.
Application 16/714,029 is a continuation of application No. 15/416,385, filed on Jan. 26, 2017, granted, now 10,540,095.
Claims priority of provisional application 62/374,460, filed on Aug. 12, 2016.
Prior Publication US 2020/0117361 A1, Apr. 16, 2020
Int. Cl. G06F 3/06 (2006.01); G06F 12/02 (2006.01); G06F 16/13 (2019.01); G06F 16/14 (2019.01); G06F 16/16 (2019.01)
CPC G06F 3/061 (2013.01) [G06F 3/0665 (2013.01); G06F 3/0685 (2013.01); G06F 12/0261 (2013.01); G06F 16/13 (2019.01); G06F 16/14 (2019.01); G06F 16/16 (2019.01)] 15 Claims
 
1. A system comprising:
a storage array comprising a plurality of storage volumes; and
a storage controller coupled to the storage array, the storage controller comprising a processing device, the processing device to:
perform a sampling of data storage items in an append-only file system that permits new data values for the data storage items and permits old data values for the data storage items to remain in place as no longer used, to identify a sample set of data storage items;
apply a first percentile threshold value and a second threshold percentile value to the sample set of data storage items to identify three data storage item groups based on an age of the data storage items, the first and the second percentile thresholds associated with an age characteristic;
perform a garbage collection process on the append-only file system to identify stale data storage items in a first logical storage segment and active data storage items in the first logical storage segment; and
write a first active data storage item associated with a first group of the three data storage item groups from the first logical storage segment to one of a second logical storage segment or a third logical storage segment, the second logical storage segment and the third logical storage segment comprising other active data storage items associated with the first group, and the first active data storage item and a subset of the other active data storage items are grouped in a same logical storage segment based on a likelihood of garbage collection.
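
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Item`, `age_thresholds`, `group_of`, and `collect_segment` are hypothetical, age is assumed to be a single number per item, the percentiles use a simple nearest-rank rule, and each age group is assumed to have one open destination segment, whereas the claim allows a group to span more than one.

```python
import random
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical data storage item in an append-only layout."""
    key: str
    age: float           # assumed age metric, e.g. time since last overwrite
    active: bool = True  # False: a newer value superseded this one (stale)


def age_thresholds(items, sample_size, p1, p2, rng):
    """Sample items and return the ages at integer percentiles p1 < p2."""
    sample = rng.sample(items, min(sample_size, len(items)))
    ages = sorted(it.age for it in sample)
    n = len(ages)

    def nearest_rank(p):
        # ceil(p * n / 100) in integer arithmetic, clamped to a valid index
        rank = -(-p * n // 100)
        return ages[max(0, min(n - 1, rank - 1))]

    return nearest_rank(p1), nearest_rank(p2)


def group_of(item, t1, t2):
    """Map an item to one of three age groups: 0 (young), 1, 2 (old)."""
    if item.age <= t1:
        return 0
    if item.age <= t2:
        return 1
    return 2


def collect_segment(segment, destinations, t1, t2):
    """Garbage-collect one segment.

    Active items are rewritten into the destination segment of their age
    group, so items with a similar likelihood of becoming stale end up
    together. Stale items are returned and the source segment is emptied.
    """
    stale = [it for it in segment if not it.active]
    for it in segment:
        if it.active:
            destinations[group_of(it, t1, t2)].append(it)
    segment.clear()
    return stale
```

Under these assumptions, a caller would compute `t1, t2` once from a sample, then pass each segment selected for garbage collection to `collect_segment` together with a mapping from group number (0, 1, 2) to a list acting as that group's destination segment.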