US 12,130,781 B2
Sparse files aware rolling checksum
Giuseppe Scrivano, Milan (IT)
Assigned to Red Hat, Inc., Raleigh, NC (US)
Filed by Red Hat, Inc., Raleigh, NC (US)
Filed on Apr. 28, 2022, as Appl. No. 17/731,955.
Prior Publication US 2023/0350852 A1, Nov. 2, 2023
Int. Cl. G06F 16/00 (2019.01); G06F 16/11 (2019.01); G06F 16/13 (2019.01); G06F 16/178 (2019.01)
CPC G06F 16/178 (2019.01) [G06F 16/113 (2019.01); G06F 16/137 (2019.01)] 20 Claims
 
1. A method, comprising:
passing, in sequence, each byte of an archival file stored on a source system to a hash function; and
detecting that a first sequence of bytes from the archival file produces outputs from the hash function of zero, wherein a first number of bytes in the first sequence of bytes satisfies a chunk-end threshold;
determining that the first sequence of bytes is located in a hole in the archival file of a greater number of bytes than the chunk-end threshold;
designating a hole-chunk of the archival file that includes first metadata for a first location and a first length of the hole in the archival file;
detecting that a second sequence of bytes from the archival file produces outputs from the hash function of non-zero, wherein a second number of bytes in the second sequence of bytes satisfies the chunk-end threshold;
designating a data-chunk that includes second metadata of a second location and a second length of the second sequence of bytes in the archival file and a hashed value of the second sequence of bytes, wherein the second metadata configures a destination system to ignore the hole-chunk and its additional metadata during synchronization of the archival file;
receiving a request to transmit the archival file to the destination system; and
synchronizing the archival file with respect to the destination system by transmitting the second metadata and the archival file to the destination system by transmitting, from the source system to the destination system, the data-chunk and not transmitting the hole-chunk and the additional metadata.
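
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Several things here are assumptions made only for illustration: the rolling checksum is a plain sum over a sliding window (`WINDOW`), the hole map is supplied by the caller as sorted, non-overlapping `(offset, length)` pairs rather than queried from the file system, the data-chunk digest is SHA-256, and the names `Chunk`, `zero_runs`, `chunk_archive`, and `chunks_to_transmit` are hypothetical.

```python
from dataclasses import dataclass
from hashlib import sha256
from typing import Iterator, List, Optional, Tuple

WINDOW = 4  # hypothetical rolling-window size in bytes


@dataclass
class Chunk:
    kind: str                     # "hole" or "data"
    offset: int                   # location in the archival file
    length: int                   # number of bytes covered
    digest: Optional[str] = None  # hashed value, data-chunks only


def rolling_sums(data: bytes, window: int = WINDOW) -> Iterator[Tuple[int, int]]:
    """Pass each byte in sequence to a rolling checksum (sum of the last `window` bytes)."""
    total = 0
    for i, b in enumerate(data):
        total += b
        if i >= window:
            total -= data[i - window]
        yield i, total


def zero_runs(data: bytes, threshold: int, window: int = WINDOW) -> List[Tuple[int, int]]:
    """Return (start, length) of zero-byte runs found via zero checksum outputs,
    keeping only runs whose length meets the chunk-end threshold."""
    runs, start = [], None
    for i, s in rolling_sums(data, window):
        if s == 0:
            if start is None:
                # A zero sum means every byte in the window is zero.
                start = max(0, i - window + 1)
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(data) - start))
    return [(s, n) for s, n in runs if n >= threshold]


def data_chunk(data: bytes, offset: int, length: int) -> Chunk:
    payload = data[offset:offset + length]
    return Chunk("data", offset, length, sha256(payload).hexdigest())


def chunk_archive(data: bytes, holes: List[Tuple[int, int]], threshold: int) -> List[Chunk]:
    """Designate hole-chunks for holes larger than the threshold that lie inside
    a detected zero run; everything else becomes data-chunks."""
    chunks, pos = [], 0
    for start, length in zero_runs(data, threshold):
        for h, n in holes:  # assumed sorted and non-overlapping
            if n > threshold and start <= h and h + n <= start + length:
                if h > pos:
                    chunks.append(data_chunk(data, pos, h - pos))
                chunks.append(Chunk("hole", h, n))
                pos = h + n
    if pos < len(data):
        chunks.append(data_chunk(data, pos, len(data) - pos))
    return chunks


def chunks_to_transmit(chunks: List[Chunk]) -> List[Chunk]:
    """Only data-chunks are sent to the destination; hole-chunks are skipped."""
    return [c for c in chunks if c.kind == "data"]


if __name__ == "__main__":
    archive = b"abcd" + bytes(16) + b"efgh"  # 16 zero bytes backed by a hole
    for c in chunk_archive(archive, holes=[(4, 16)], threshold=8):
        print(c.kind, c.offset, c.length)
```

In this sketch a run of zero bytes that is not backed by a hole (for example, zeros explicitly written as data) stays inside a data-chunk, which mirrors the claim's distinction between zero-valued checksum output and an actual hole in the file.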