US 11,755,229 B2
Archival task processing in a data storage system
Shiv Shankar Kumar, Pune (IN); and Avadut Mungre, North Goa (IN)
Assigned to EMC IP HOLDING COMPANY LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on Jun. 25, 2020, as Appl. No. 16/911,814.
Prior Publication US 2021/0405878 A1, Dec. 30, 2021
Int. Cl. G06F 12/00 (2006.01); G06F 3/06 (2006.01); G06F 21/62 (2013.01); G06F 21/60 (2013.01)
CPC G06F 3/065 (2013.01) [G06F 3/0604 (2013.01); G06F 3/067 (2013.01); G06F 3/0619 (2013.01); G06F 3/0653 (2013.01); G06F 3/0659 (2013.01); G06F 3/0683 (2013.01); G06F 21/602 (2013.01); G06F 21/6218 (2013.01)]
20 Claims
 
1. A data storage system, comprising:
a memory that stores computer executable components; and
a processor that executes computer executable components stored in the memory, wherein the computer executable components comprise:
a file designation component that selects a file stored by a primary computing cluster of the data storage system for archival to a remote storage system;
a cluster selection component that identifies secondary computing clusters of the data storage system that are distinct from the primary computing cluster and have respective copies of the file, wherein the cluster selection component selects, from the secondary computing clusters after identifying the secondary computing clusters and in response to determining that respective second file systems of the secondary computing clusters utilize second file system versions that are different from a first file system version of a first file system utilized by the primary computing cluster, a first secondary computing cluster having a second file system, of the second file systems, that utilizes a second file system version, of the second file system versions, that is closest to the first file system version; and
an archival management component that, in response to determining that a copy, of the respective copies of the file and stored by the first secondary computing cluster, matches the file stored by the primary computing cluster, directs the first secondary computing cluster to archive the copy of the file to the remote storage system instead of the primary computing cluster.
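The selection and verification logic recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The names `Cluster`, `version_distance`, and `plan_archival` are hypothetical, as is the representation of a file system version as a tuple of integers. The claim does not define how "closest" is measured, so the sketch assumes a component-wise absolute difference compared from the most significant component down. It also assumes that a "match" is established by comparing SHA-256 digests and that archival falls back to the primary computing cluster when no matching copy is found; neither detail is stated in the claim.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

Version = Tuple[int, ...]


@dataclass
class Cluster:
    """Hypothetical stand-in for a computing cluster of the data storage system."""
    name: str
    fs_version: Version        # assumed encoding, e.g. (9, 1, 0)
    files: Dict[str, bytes]    # path -> file content held by this cluster


def version_distance(a: Version, b: Version) -> Version:
    """Assumed closeness metric: per-component absolute difference.

    Shorter versions are padded with zeros. Tuples compare lexicographically,
    so a difference in an earlier (more significant) component dominates.
    """
    width = max(len(a), len(b))
    pa = a + (0,) * (width - len(a))
    pb = b + (0,) * (width - len(b))
    return tuple(abs(x - y) for x, y in zip(pa, pb))


def digest(data: bytes) -> str:
    """Content fingerprint used to decide whether a copy matches (assumption)."""
    return hashlib.sha256(data).hexdigest()


def plan_archival(primary: Cluster, clusters: Sequence[Cluster], path: str) -> str:
    """Return the name of the cluster that should archive `path`."""
    # Identify secondary clusters: distinct from the primary and holding a copy.
    secondaries = [c for c in clusters if c is not primary and path in c.files]
    if not secondaries:
        return primary.name  # assumed fallback, not recited in the claim

    # Select the secondary whose file system version is closest to the primary's.
    chosen = min(
        secondaries,
        key=lambda c: version_distance(c.fs_version, primary.fs_version),
    )

    # Offload archival only if the secondary's copy matches the primary's file.
    if digest(chosen.files[path]) == digest(primary.files[path]):
        return chosen.name
    return primary.name  # assumed fallback when the copy does not match


primary = Cluster("primary", (9, 1, 0), {"/data/report": b"payload"})
older = Cluster("older", (8, 5, 0), {"/data/report": b"payload"})
nearer = Cluster("nearer", (9, 0, 4), {"/data/report": b"payload"})
no_copy = Cluster("no_copy", (9, 1, 1), {"/data/other": b"x"})

target = plan_archival(primary, [primary, older, nearer, no_copy], "/data/report")
print(target)
```

In this example, `no_copy` is excluded because it holds no copy of the file, and `nearer` is selected over `older` because its version differs from the primary's only in the minor and patch components. Under a different closeness metric the choice could differ.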