US 11,989,437 B1
Compression orchestration on a remote data replication facility
Ramesh Doddaiah, Westborough, MA (US); and Owen Martin, Hopedale, MA (US)
Assigned to Dell Products, L.P., Hopkinton, MA (US)
Filed by Dell Products, L.P., Hopkinton, MA (US)
Filed on Apr. 13, 2023, as Appl. No. 18/299,765.
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/065 (2013.01) [G06F 3/0604 (2013.01); G06F 3/0679 (2013.01)]
20 Claims
 
1. A method of compression orchestration on a remote data replication facility including a primary storage array and a remote storage array, comprising:
synchronizing extents of data between the primary storage array and the remote storage array over the remote data replication facility as host write operations occur on the extents of data on the primary storage array;
transmitting an IO activity heat map from the primary storage array to the remote storage array on the remote data replication facility;
exchanging compressibility heat maps between the primary storage array and remote storage array over the remote data replication facility, each respective compressibility heat map containing per-extent compressibility information determined by the respective primary storage array or remote storage array;
creating a per-extent compressibility forecast model for each extent of data by each of the primary storage array and remote storage array, each per-extent compressibility forecast model being based on a set of previously observed compressibility values for the extent over a preceding set of previous time periods;
using the exchanged compressibility heat maps to update the per-extent compressibility forecast models;
determining a forecast compressibility value for each extent from the updated per-extent compressibility forecast models for an upcoming time period;
selecting a first set of extents to be compressed by the primary storage array based on the IO activity heat map, per-extent forecast compressibility values for each extent determined from the updated per-extent compressibility forecast models on the primary storage array, and a first target data reduction rate on the primary storage array; and
selecting a second set of extents to be compressed by the remote storage array based on the IO activity heat map, per-extent forecast compressibility values for each extent determined from the updated per-extent compressibility forecast models on the remote storage array, and a second target data reduction rate on the remote storage array.
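
The forecasting and selection steps recited above can be illustrated with a minimal sketch. The claim does not specify a forecast model, a selection order, or how the target data reduction rate is measured, so everything below is an assumption made for illustration only: the names ExtentForecast, update_models, and select_extents are hypothetical, the model is a simple moving average over the last few observed values, compressibility is expressed as the fraction of an extent's space that compression is expected to save, extents are treated as equal in size, and the coldest extents are chosen first.

```python
from dataclasses import dataclass, field


@dataclass
class ExtentForecast:
    """Hypothetical per-extent model: moving average of recent observations."""
    window: int = 4
    history: list = field(default_factory=list)

    def update(self, observed: float) -> None:
        # Keep only the most recent `window` observed compressibility values.
        self.history.append(observed)
        self.history = self.history[-self.window:]

    def forecast(self) -> float:
        # Forecast for the upcoming period; 0.0 if nothing has been observed.
        return sum(self.history) / len(self.history) if self.history else 0.0


def update_models(models: dict, heat_map: dict) -> None:
    """Fold a compressibility heat map (extent id -> value) into the models."""
    for extent_id, value in heat_map.items():
        models.setdefault(extent_id, ExtentForecast()).update(value)


def select_extents(io_heat: dict, models: dict, target_reduction: float) -> list:
    """Pick cold extents until forecast savings reach the target.

    Assumption: target_reduction is the fraction of total capacity to save,
    with all extents the same size.
    """
    def predicted(extent_id):
        model = models.get(extent_id)
        return model.forecast() if model else 0.0

    needed = target_reduction * len(io_heat)
    selected, saved = [], 0.0
    # Coldest first; among equally cold extents, most compressible first.
    for extent_id in sorted(io_heat, key=lambda e: (io_heat[e], -predicted(e))):
        if saved >= needed:
            break
        selected.append(extent_id)
        saved += predicted(extent_id)
    return selected


# IO activity heat map sent by the primary array (higher means hotter).
io_heat = {"e1": 90, "e2": 5, "e3": 10, "e4": 1}

# Primary array: fold in its own observations, then the remote array's map.
primary_models = {}
update_models(primary_models, {"e1": 0.5, "e2": 0.6, "e3": 0.2, "e4": 0.4})
update_models(primary_models, {"e1": 0.5, "e2": 0.4, "e3": 0.2, "e4": 0.4})

chosen = select_extents(io_heat, primary_models, target_reduction=0.2)
print(chosen)  # ['e4', 'e2']
```

In this sketch the remote storage array would run the same functions against its own models and its own target data reduction rate, which is why the two arrays can end up compressing different sets of extents from the same IO activity heat map.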