| CPC G06F 11/1451 (2013.01) [G06F 2201/84 (2013.01)] | 17 Claims |

|
1. A method for generating synthetic full backups, the method comprising:
performing a verification that a previous backup of source data stored in a data domain (DD) is a failed synthetic full backup (SFB);
obtaining, based on the verification and based on a user-initiated backup request to perform another backup of the source data to the DD, a latest snapshot of the source data, wherein the latest snapshot of the source data is a most recently taken successful snapshot of the source data before the failed SFB, wherein a source node is configured as a Hadoop cluster and stores the source data;
obtaining, based on the verification, a prior snapshot of the source data, wherein the prior snapshot is created before the latest snapshot and used to perform the failed SFB;
generating a snapshot difference report using the latest snapshot and the prior snapshot, wherein the snapshot difference report comprises a delete list, a rename list, and a copy list comprising a list of data items;
making a first determination that a copy started file (CSF) does not exist in the previous backup, wherein the non-existence of the CSF indicates that neither a delete operation nor a rename operation was performed on the previous backup;
making, after the first determination, a second determination, using the copy list, that a first portion of the data items in the copy list exists in the previous backup, and a second portion of the data items does not exist in the previous backup;
performing, based on the second determination and using the snapshot difference report, a copy operation on a copy of the previous backup to copy the second portion of the data items to the DD to obtain an SFB; and
updating, upon obtaining the SFB, permissions and attributes associated with files and/or folders of the source data included in the SFB to verify a status of the SFB during a next synthetic full backup cycle.
|
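
The two determinations and the copy operation recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes the previous backup and the source data can be modeled as in-memory dictionaries mapping paths to contents, and that the CSF is a marker entry with the hypothetical name `.copy_started`. The names `SnapshotDiff`, `plan_resume`, and `resume_copy` are likewise illustrative and do not come from the claim. The delete and rename lists are carried in the report but are not applied here, since the claim only recites how their status is inferred from the CSF.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical marker name for the copy started file (CSF); the claim does not name it.
CSF_NAME = ".copy_started"


@dataclass
class SnapshotDiff:
    """Snapshot difference report: delete list, rename list, and copy list."""
    delete_list: List[str] = field(default_factory=list)
    rename_list: List[Tuple[str, str]] = field(default_factory=list)
    copy_list: List[str] = field(default_factory=list)


def plan_resume(diff: SnapshotDiff, previous_backup: Dict[str, bytes]):
    """Return (csf_exists, already_copied, still_to_copy).

    First determination: whether the CSF exists in the previous backup.
    Second determination: which copy-list items the previous backup already holds.
    """
    csf_exists = CSF_NAME in previous_backup
    already_copied = [p for p in diff.copy_list if p in previous_backup]
    still_to_copy = [p for p in diff.copy_list if p not in previous_backup]
    return csf_exists, already_copied, still_to_copy


def resume_copy(diff: SnapshotDiff,
                source: Dict[str, bytes],
                previous_backup: Dict[str, bytes]) -> Dict[str, bytes]:
    """Copy only the missing copy-list items onto a copy of the previous backup."""
    _, _, still_to_copy = plan_resume(diff, previous_backup)
    new_backup = dict(previous_backup)  # work on a copy, leave the original untouched
    for path in still_to_copy:
        new_backup[path] = source[path]
    return new_backup


# Usage: the failed backup copied a.txt but not b.txt or c.txt.
diff = SnapshotDiff(copy_list=["a.txt", "b.txt", "c.txt"])
source = {"a.txt": b"A", "b.txt": b"B", "c.txt": b"C"}
previous = {"a.txt": b"A", "old.txt": b"O"}

csf_exists, done, pending = plan_resume(diff, previous)
sfb = resume_copy(diff, source, previous)
```

In this sketch the first portion of the copy list (`done`) is skipped and only the second portion (`pending`) is copied, so the resumed run does not repeat work the failed run had already completed.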