US 11,797,208 B2
Backend deduplication awareness
Miles Mulholland, Eastleigh (GB); Eric John Bartlett, Chard (GB); Dominic Tomkins, Alton (GB); and Alex Dicks, Winchester (GB)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Sep. 17, 2021, as Appl. No. 17/478,046.
Prior Publication US 2023/0089939 A1, Mar. 23, 2023
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/0641 (2013.01) [G06F 3/061 (2013.01); G06F 3/0608 (2013.01); G06F 3/0679 (2013.01)]
18 Claims
 
1. A computer-implemented method, comprising:
performing an input/output (IO) operation to a physical address of a disk;
receiving a deduplication information from a backend storage controller associated with the performed IO operation;
translating the physical address of the disk to a logical address to apply to a plurality of storage extents in a virtualizing layer;
constructing a graph including a plurality of nodes and a plurality of edges connecting the plurality of nodes, wherein each node of the plurality of nodes represents a respective storage extent of the plurality of storage extents in the virtualization layer, and wherein each edge of the plurality of edges connects two nodes and represents a deduplication relationship between two respective storage extents corresponding to the two nodes, wherein each edge includes a corresponding deduplication edge weight representing a number of deduplications between the two respective storage extents indicated by the backend storage controller; and
identifying at least one subgraph within the constructed graph, wherein the identified at least one subgraph includes a cluster of nodes connected by corresponding edges, wherein the cluster of nodes is disconnected from other nodes of the plurality of nodes in the constructed graph, wherein the identified at least one subgraph represents a storage extent cluster from the plurality of storage extents, wherein the storage extent cluster is selectable for garbage collection as a group.
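
The graph construction and subgraph identification steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. It assumes that deduplication information from the backend controller has already been translated into pairs of extent identifiers, and it treats each disconnected subgraph as a connected component of an undirected weighted graph. The names `build_dedup_graph`, `find_extent_clusters`, and the `(extent_a, extent_b)` event format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_dedup_graph(extents, dedup_events):
    """Build an undirected weighted graph of storage extents.

    extents: iterable of extent identifiers (hypothetical; one node each).
    dedup_events: iterable of (extent_a, extent_b) pairs, one per
        deduplication reported between two extents (assumed format).
    Returns {extent: {neighbour: edge_weight}}, where the edge weight is
    the number of deduplications seen between the two extents.
    """
    graph = {extent: defaultdict(int) for extent in extents}
    for a, b in dedup_events:
        if a == b:
            continue  # deduplication inside one extent adds no edge
        graph[a][b] += 1
        graph[b][a] += 1
    return graph


def find_extent_clusters(graph):
    """Return the connected components of the graph as sets of extents.

    Each component is a group of nodes with no edge to any node outside
    the group, i.e. a candidate storage extent cluster.
    """
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        cluster = set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            cluster.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(cluster)
    return clusters


# Usage with made-up data: five extents and four deduplication events.
extents = [0, 1, 2, 3, 4]
events = [(0, 1), (0, 1), (1, 2), (3, 4)]
graph = build_dedup_graph(extents, events)
clusters = find_extent_clusters(graph)
```

In this made-up example, extents 0, 1, and 2 form one cluster and extents 3 and 4 form another. Because no deduplication relationship crosses between the two groups, each could in principle be selected for garbage collection as a group. An extent with no deduplication relationships would appear as a single-node component, which a caller might choose to filter out.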