US 11,734,117 B2
Data recovery in a storage system
Yogev Vaknin, Tel Aviv (IL); Lior Klipper, Tel Aviv (IL); and Alon Berger, Tel Aviv (IL)
Assigned to VAST DATA LTD., Tel Aviv (IL)
Filed by VAST DATA LTD., Tel Aviv (IL)
Filed on Apr. 29, 2021, as Appl. No. 17/302,318.
Prior Publication US 2022/0358017 A1, Nov. 10, 2022
Int. Cl. G06F 11/00 (2006.01); G06F 11/14 (2006.01); G06F 11/10 (2006.01)
CPC G06F 11/1435 (2013.01) [G06F 11/1096 (2013.01); G06F 2201/805 (2013.01)]
31 Claims
 
1. A method for recovering failed chunks, the method comprises:
obtaining a failure indication about a failure of a first number (X1) of failed chunks; wherein the failed chunks were stored in a group of disks, the group of disks is configured to store multiple (α) stripes, each stripe comprises multiple (N) chunks that comprises a first plurality (K) of data chunks and a second plurality (R) of parity chunks; wherein at least one parity chunk that belongs to a certain stripe is calculated based on one or more data chunks that belong to other stripes; and
performing at least one recovery iteration until fulfilling a stop condition; wherein each recovery iteration of the at least one recovery iteration comprises:
selecting valid chunks to provide selected valid chunks, wherein a number of selected chunks is smaller than a product of a multiplication of R by α, wherein the selecting includes arbitrarily selecting at least two stripes from the multiple stripes, wherein the at least two stripes are associated with a subset of equations, wherein the subset of equations includes a portion of multiple equations that need to be solved for recovering the first number of failed chunks, and wherein the selecting further includes selecting at least one extra data chunk that belongs to a non-selected stripe of the multiple stripes, in a case where the subset of equations dictates the at least one extra data chunk;
when it is determined that a number of failed chunks related to the subset of equations exceeds a number of equations in the subset of equations:
adding additional valid chunks to the selected valid chunks, wherein the adding includes arbitrarily selecting, from the multiple stripes, one or more additional stripes that were not previously selected, wherein the one or more additional stripes are associated with at least one additional equation that is added to the subset of equations;
wherein the step of adding is repeated if it is determined that the number of failed chunks related to the subset of equations still exceeds the number of equations in the subset of equations;
when it is determined that the number of failed chunks related to the subset of equations does not exceed the number of equations in the subset of equations:
retrieving valid data chunks that are relevant to the selected valid chunks; and reconstructing the failed chunks related to the subset of equations based on the retrieved chunks and the subset of equations.
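
The selection loop recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a hypothetical model in which each stripe contributes a list of parity equations, and each equation is represented only by the set of chunks it references. The names `plan_recovery_iteration`, `equations_by_stripe`, and the `(stripe, position)` chunk addressing are illustrative assumptions. The "arbitrary" stripe selection is simplified to ascending stripe order. The sketch does not enforce the claimed bound of fewer than R×α selected chunks, and it only plans which chunks to read; it does not solve the equations.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical chunk address: (stripe index, position within the stripe).
Chunk = Tuple[int, int]


def plan_recovery_iteration(
    equations_by_stripe: Dict[int, List[FrozenSet[Chunk]]],
    failed: Set[Chunk],
    initial_stripes: int = 2,
) -> Tuple[List[int], Set[Chunk], Set[Chunk], Set[Chunk]]:
    """Grow a subset of stripes until its equations are at least as many
    as the failed chunks those equations reference."""
    # Assumption: "arbitrary" selection is modeled as ascending stripe order.
    remaining = sorted(equations_by_stripe)
    selected: List[int] = []
    subset: List[FrozenSet[Chunk]] = []

    def add_stripe() -> None:
        stripe = remaining.pop(0)
        selected.append(stripe)
        subset.extend(equations_by_stripe[stripe])

    # Start with at least two stripes, as in the selecting step.
    for _ in range(initial_stripes):
        add_stripe()

    while True:
        referenced: Set[Chunk] = set().union(*subset)
        unknowns = referenced & failed
        if len(unknowns) <= len(subset):
            break  # enough equations for the failed chunks in this subset
        if not remaining:
            raise ValueError("not enough equations for the failed chunks")
        add_stripe()  # the repeated "adding" step

    # Valid chunks the equations need, including any extra chunks that
    # live in stripes that were not selected.
    to_read = referenced - failed
    extra = {chunk for chunk in to_read if chunk[0] not in selected}
    return selected, unknowns, to_read, extra
```

Comparing counts, as the claim does, is a necessary condition for solving the subset; an actual decoder would also need the selected equations to be linearly independent over the code's field, which this sketch does not check.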