US 12,346,290 B2
Workload allocation for file system maintenance
Steven Henry Haber, Seattle, WA (US); Noah Trent Nelson, Seattle, WA (US); and Thomas Scott Urban, Seattle, WA (US)
Assigned to Qumulo, Inc., Seattle, WA (US)
Filed by Qumulo, Inc., Seattle, WA (US)
Filed on Jul. 13, 2022, as Appl. No. 17/864,190.
Prior Publication US 2024/0020268 A1, Jan. 18, 2024
Int. Cl. G06F 16/11 (2019.01); G06F 11/14 (2006.01); G06F 16/16 (2019.01)
CPC G06F 16/128 (2019.01) [G06F 11/1451 (2013.01); G06F 16/162 (2019.01); G06F 2201/84 (2013.01)]
28 Claims
 
1. A method for managing data in a file system over a network using one or more processors that execute instructions that are configured to cause performance of actions, comprising:
providing the file system that includes one or more storage nodes and a plurality of snapshots, wherein each snapshot is associated with a plurality of data blocks; and
in response to deleting one or more snapshots of the plurality of snapshots, performing further actions, including:
determining a plurality of dead blocks associated with the one or more deleted snapshots, wherein each dead block is a data block that is unassociated with one or more undeleted snapshots;
adding the plurality of dead blocks to a plurality of dead trees located on a randomly determined selection of the one or more storage nodes;
determining an urgency score based on a workload model that includes one or more individual workload sub-models for each of one or more file system metrics and one or more other metrics, wherein an overall urgency score is determined based on a highest urgency score determined by an individual workload sub-model for a plurality of delete tasks, and wherein the one or more other metrics are based on configuration information for a current state of the file system and include one or more of a network utilization, a cloud computing cost structure, a hardware capability, an ongoing rebalancing operation, or a rate of transactions;
determining initiation of the plurality of delete tasks for the one or more storage nodes based on the overall urgency score;
determining a portion of the storage nodes sequentially based on a total number of delete tasks that are initiated for each storage node in the portion, wherein a sequential order of the portion of the storage nodes is determined by an amount of the total number of initiated delete tasks that correspond to each storage node in the portion and configuration information for one or more of local requirements or local circumstances; and
executing each corresponding amount of initiated delete tasks concurrently on the portion of storage nodes to perform further actions including:
determining one or more dead blocks on the portion of storage nodes that are associated with the one or more deleted snapshots; and
deleting the one or more determined dead blocks on the portion of storage nodes and the one or more dead blocks added to the one or more dead trees on the one or more randomly determined storage nodes, wherein a storage capacity associated with the one or more deleted snapshots is returned to the file system.
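
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every function name, the in-memory dictionaries standing in for snapshots and dead trees, and the example metric names are hypothetical. The claim does not state whether nodes with more initiated delete tasks come first or last, so the descending order used here is an assumption, and the configuration information for local requirements or local circumstances is omitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def find_dead_blocks(deleted_snapshots, live_snapshots):
    """Return blocks of deleted snapshots that no undeleted snapshot references."""
    live = set().union(*live_snapshots.values())
    dead = set()
    for blocks in deleted_snapshots.values():
        dead |= set(blocks) - live
    return dead


def assign_to_dead_trees(dead_blocks, nodes, rng=None):
    """Add each dead block to the dead tree of a randomly chosen storage node."""
    rng = rng or random.Random()
    trees = {node: set() for node in nodes}
    for block in sorted(dead_blocks):
        trees[rng.choice(nodes)].add(block)
    return trees


def overall_urgency(sub_models, metrics):
    """Overall score is the highest score reported by any individual sub-model."""
    return max(model(metrics) for model in sub_models)


def order_nodes(tasks_per_node):
    """Order nodes by initiated delete tasks, most first (assumed direction)."""
    return sorted(tasks_per_node, key=lambda n: (-tasks_per_node[n], n))


def delete_concurrently(trees, ordered_nodes):
    """Run one delete task per node concurrently; return blocks freed per node."""
    def delete_task(node):
        freed = len(trees[node])
        trees[node].clear()  # stands in for returning capacity to the file system
        return node, freed

    with ThreadPoolExecutor() as pool:
        return dict(pool.map(delete_task, ordered_nodes))
```

In this sketch a caller would compute the dead blocks, spread them across dead trees, consult `overall_urgency` to decide whether to initiate delete tasks, and then pass the result of `order_nodes` to `delete_concurrently`. Each sub-model is a plain callable that maps a metrics dictionary to a score, which keeps the "highest score wins" rule a single `max` call.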