US 12,346,570 B2
Data regeneration and storage in a raid storage system
David W Cosby, Raleigh, NC (US); Wilson Velez, Raleigh, NC (US); Patrick L Caporale, Cary, NC (US); and Gerald C Ushery, Jr., Raleigh, NC (US)
Assigned to Lenovo Global Technology (United States) Inc., Morrisville, NC (US)
Filed by Lenovo Enterprise Solutions (Singapore) Pte Ltd., Singapore (SG)
Filed on Mar. 31, 2023, as Appl. No. 18/193,921.
Prior Publication US 2024/0329853 A1, Oct. 3, 2024
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/0617 (2013.01) [G06F 3/0658 (2013.01); G06F 3/0689 (2013.01)]
18 Claims
 
1. A computer program product comprising a non-transitory computer readable medium and program instructions embodied therein, the program instructions being configured to be executable by a processor to cause the processor to perform operations comprising:
identifying, via communication with a RAID controller that manages operation of an array of drives as a RAID storage system, one of the drives that has been compromised and a failed component of the identified drive that compromised the identified drive;
identifying a failure domain associated with the failed component, wherein data stored within the failure domain associated with the failed component has become inaccessible;
instructing, in response to the failed component having a failure domain that satisfies a first condition, the RAID controller to perform a first recovery action that includes regenerating the inaccessible data using data from other drives within the array of drives and storing the regenerated data on available storage of the identified drive outside the failure domain associated with the failed component;
identifying a storage location on the identified drive that has a sufficient amount of available storage capacity to store the regenerated data, wherein instructing the RAID controller to perform the first recovery action includes instructing the RAID controller to store the regenerated data at the identified storage location;
maintaining a log of recovery actions performed on the array of drives;
updating the log of recovery actions to include a log entry for the first recovery action identifying the storage location of the failure domain associated with the failed component and the identified storage location; and
causing, in response to detecting that the identified drive has experienced a complete failure, the RAID controller to regenerate the data that was stored on the identified drive using data from other drives within the array of drives and move, according to a reverse of the log entry for the first recovery action, the data that was stored in the identified storage location to a storage location on a replacement drive or hot spare drive that corresponds with the storage location of the failure domain associated with the failed component.
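
The log-and-reverse steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `xor_regenerate`, `LogEntry`, `RecoveryLog`, and `restore_to_replacement` are hypothetical, block addresses are modeled as plain integers, and regeneration is assumed to be single-parity XOR, which the claim does not require.

```python
from dataclasses import dataclass
from functools import reduce


def xor_regenerate(surviving_blocks):
    """Rebuild a lost block by XOR-ing the equal-length blocks held by the
    other drives (assumes a single-parity layout, for illustration only)."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*surviving_blocks))


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record for one first recovery action."""
    drive_id: int
    failed_location: int      # address inside the failure domain
    relocated_location: int   # address where the regenerated data was stored


class RecoveryLog:
    """Hypothetical log of recovery actions performed on the array."""

    def __init__(self):
        self.entries = []

    def record(self, entry):
        self.entries.append(entry)

    def reverse_moves(self, drive_id):
        """Return (source, destination) pairs that undo each relocation
        recorded for the given drive, newest entry first."""
        return [(e.relocated_location, e.failed_location)
                for e in reversed(self.entries) if e.drive_id == drive_id]


def restore_to_replacement(rebuilt_image, log, drive_id):
    """Apply the reverse of the log to a rebuilt drive image so relocated
    data returns to the address it occupied before the component failed."""
    replacement = dict(rebuilt_image)  # leave the caller's image untouched
    for source, destination in log.reverse_moves(drive_id):
        replacement[destination] = replacement.pop(source)
    return replacement
```

In this sketch the rebuilt image is a dictionary from block address to bytes. After a complete failure, the data regenerated for the relocated address is moved back to the address recorded for the failure domain, so the replacement or hot spare drive ends up with the original layout.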
 
18. A computer program product comprising a non-transitory computer readable medium and program instructions embodied therein, the program instructions being configured to be executable by a processor to cause the processor to perform operations comprising:
identifying, via communication with a RAID controller that manages the operation of an array of drives as a RAID storage system, one of the drives that has been compromised and a failed component of the identified drive that compromised the identified drive;
identifying a failure domain associated with the failed component, wherein data stored within the failure domain associated with the failed component has become inaccessible;
instructing the RAID controller to regenerate the inaccessible data using data from other drives within the array of drives and store the regenerated data on a separate RAID stripe stored across a plurality of the drives in the array of drives;
maintaining a log of recovery actions performed on the array of drives;
updating the log of recovery actions to include a log entry identifying the storage location of the failure domain associated with the failed component and the storage location of the separate RAID stripe stored across a plurality of the drives in the array of drives; and
causing, in response to detecting that the identified drive has experienced a complete failure, the RAID controller to regenerate the data that was stored on the identified drive using data from other drives within the array of drives and move, according to a reverse of the log entry, the data that was stored in the separate RAID stripe to a storage location on a replacement drive or hot spare drive that corresponds with the storage location of the failure domain associated with the failed component.
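
Claim 18 differs from claim 1 in storing the regenerated data on a separate RAID stripe spread across several drives. A minimal Python sketch of that variant follows. The names `xor_blocks`, `write_stripe`, `read_stripe`, `StripeLogEntry`, and `restore_from_stripes` are hypothetical, and the layout (equal data chunks plus one XOR parity chunk) is an assumption made for illustration; the claim does not specify a RAID level.

```python
from dataclasses import dataclass
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def write_stripe(data, data_drives):
    """Split data into one equal chunk per data drive and append one XOR
    parity chunk (assumed single-parity layout)."""
    if data_drives < 1 or len(data) % data_drives:
        raise ValueError("data length must be a multiple of data_drives")
    size = len(data) // data_drives
    chunks = [data[i * size:(i + 1) * size] for i in range(data_drives)]
    return chunks + [xor_blocks(chunks)]


def read_stripe(stripe):
    """Reassemble the data chunks of a stripe, ignoring the parity chunk."""
    return b"".join(stripe[:-1])


@dataclass(frozen=True)
class StripeLogEntry:
    """Hypothetical log record linking a failure-domain address to the
    separate stripe that holds its regenerated data."""
    drive_id: int
    failed_location: int
    stripe_id: int


def restore_from_stripes(replacement, stripes, log, drive_id):
    """Reverse each logged relocation for the drive: move the stripe's data
    to the logged address on the replacement and release the stripe."""
    for entry in reversed(log):
        if entry.drive_id == drive_id:
            data = read_stripe(stripes.pop(entry.stripe_id))
            replacement[entry.failed_location] = data
    return replacement
```

Here `stripes` is a dictionary from a stripe identifier to its list of chunks, and `replacement` maps block addresses on the replacement or hot spare drive to bytes. Popping the stripe after the move reflects the idea that the temporary stripe is no longer needed once the data is back at the address corresponding to the failure domain.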