US 12,346,210 B2
Method and system for idempotent synthetic full backups in storage devices
Deepthi Urs, Bangalore (IN); Shraddha Chunekar, Indore (IN); Adrian Dobrean, Oakville (CA); Navneet Upadhyay, Greater Noida (IN); Sunder Ramesh Andra, Bangalore (IN); and Amith Ramachandran, Bangalore (IN)
Assigned to EMC IP HOLDING COMPANY LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on Nov. 11, 2021, as Appl. No. 17/524,578.
Prior Publication US 2023/0143903 A1, May 11, 2023
Int. Cl. G06F 11/14 (2006.01)
CPC G06F 11/1451 (2013.01) [G06F 2201/84 (2013.01)]
17 Claims
 
1. A method for generating synthetic full backups, the method comprising:
performing a verification that a previous backup of source data stored in a data domain (DD) is a failed synthetic full backup (SFB);
obtaining, based on the verification and based on a user initiated backup request to perform another backup of the source data to the DD, a latest snapshot of the source data, wherein the latest snapshot of the source data is a most recently taken successful snapshot of the source data before the failed SFB, wherein a source node is configured as a Hadoop cluster and stores the source data;
obtaining, based on the verification, a prior snapshot of the source data, wherein the prior snapshot is created before the latest snapshot and used to perform the failed SFB;
generating a snapshot difference report using the latest snapshot and the prior snapshot, wherein the snapshot difference report comprises a delete list, a rename list, and a copy list comprising a list of data items;
making a first determination that a copy started file (CSF) does not exist in the previous backup, wherein the non-existence of the CSF indicates that neither a delete operation nor a rename operation was performed on the previous backup;
making, after the first determination, a second determination, using the copy list, that a first portion of the data items in the copy list exists in the previous backup, and a second portion of the data items does not exist in the previous backup;
performing, based on the second determination and using the snapshot difference report, a copy operation on a copy of the previous backup to copy the second portion of the data items to the DD to obtain a SFB; and
updating, upon obtaining the SFB, permissions and attributes associated with files and/or folders of the source data included in the SFB to verify a status of the SFB during a next synthetic full backup cycle.
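
The steps recited in claim 1 can be illustrated with a minimal sketch on a local filesystem. It is not the patented implementation. The names `SnapshotDiff`, `split_copy_list`, `resume_sfb`, and the marker name `.copy_started` are hypothetical, and ordinary directories stand in for the Hadoop snapshots and the DD. The claim does not recite what happens to the delete and rename lists once the CSF is found missing; the sketch assumes they are applied at that point and the marker is then written. As in the claim wording, the second determination tests only whether an item exists, not whether its content is current.

```python
import shutil
from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical marker name; the claim only calls it a "copy started file" (CSF).
CSF_NAME = ".copy_started"


@dataclass
class SnapshotDiff:
    """Snapshot difference report; all paths are relative to the source root."""
    delete: list = field(default_factory=list)  # items removed since the prior snapshot
    rename: list = field(default_factory=list)  # (old_path, new_path) pairs
    copy: list = field(default_factory=list)    # data items to copy to the backup


def split_copy_list(backup: Path, copy_list):
    """Second determination: split the copy list into present and missing items."""
    present = [p for p in copy_list if (backup / p).exists()]
    missing = [p for p in copy_list if not (backup / p).exists()]
    return present, missing


def resume_sfb(latest_snapshot: Path, previous_backup: Path,
               new_backup: Path, diff: SnapshotDiff):
    """Complete a failed SFB on a copy of the previous backup.

    Returns the list of items copied during this run.
    """
    # Work on a copy so the failed backup itself is left untouched.
    shutil.copytree(previous_backup, new_backup)

    csf = new_backup / CSF_NAME
    if not csf.exists():
        # First determination: no CSF, so no delete or rename was performed.
        # Assumption: apply both lists now, then record the marker.
        for rel in diff.delete:
            target = new_backup / rel
            if target.is_dir():
                shutil.rmtree(target)
            elif target.exists():
                target.unlink()
        for old, new in diff.rename:
            if (new_backup / old).exists():
                (new_backup / new).parent.mkdir(parents=True, exist_ok=True)
                (new_backup / old).rename(new_backup / new)
        csf.touch()

    # Copy only the portion of the copy list that is not yet in the backup.
    _, missing = split_copy_list(new_backup, diff.copy)
    for rel in missing:
        dest = new_backup / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        # copy2 also carries over timestamps and permission bits.
        shutil.copy2(latest_snapshot / rel, dest)
    return missing
```

Because items already present are skipped and the delete and rename steps are guarded by the marker, rerunning the routine after an interruption should not repeat work that already completed, which is the idempotent behavior the title refers to.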