US 11,733,874 B2
Managing replication journal in a distributed replication system
Rivka Matosevich, Zichron Ya'acov (IL); Roman Spiegelman, Yokneam Illit (IL); German Goft, Pardess Hanna Karkur (IL); and Lior Zilpa, Holon (IL)
Assigned to EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on May 3, 2021, as Appl. No. 17/306,601.
Prior Publication US 2022/0350497 A1, Nov. 3, 2022
Int. Cl. G06F 3/06 (2006.01); G06F 16/18 (2019.01)
CPC G06F 3/0619 (2013.01) [G06F 3/065 (2013.01); G06F 3/0608 (2013.01); G06F 3/0634 (2013.01); G06F 3/0641 (2013.01); G06F 3/0683 (2013.01); G06F 16/1815 (2019.01)]
20 Claims
 
1. A method, comprising:
controlling, by a data replication system, data replication operations performed on a storage node of a data storage system, wherein the data replication system comprises a replication component layer comprising a plurality of replication components operating on the storage node, and a distribution layer operating on the storage node and configured to distribute a replication workload among the plurality of replication components, wherein controlling the data replication operations comprises:
assigning, by the data replication system, an associated replication journal volume to each replication component of the plurality of replication components operating on the storage node, wherein each replication component operating on the storage node is (i) assigned to handle a respective portion of a replication workload associated with replication input/output (I/O) requests directed to logical addresses which correspond to a respective block of logical addresses of a storage volume, and (ii) configured to write journal data, which is associated with I/O write operations handled by the replication component in response to the replication I/O requests, in the associated replication journal volume of the replication component;
distributing, by the distribution layer, the replication workload among the plurality of replication components by directing the replication I/O requests to respective replication components which are assigned to handle the replication workload associated with the respective logical addresses of the replication I/O requests; and
in response to detecting a failed replication component of the plurality of replication components, performing, by the data replication system, a recovery process which comprises:
designating at least one replication component of the plurality of replication components as a recovery replication component;
designating the associated replication journal volume of the failed replication component as a recovery journal volume; and
assigning the recovery journal volume to the recovery replication component to enable the recovery replication component to recover journal data in the recovery journal volume.
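
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (BLOCK_SIZE, JournalVolume, ReplicationComponent, DistributionLayer, recover_failed) is hypothetical, as are the modulo mapping of address blocks to components and the choice of the first healthy component as the recovery replication component; the claim specifies neither.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 1024  # assumption: number of logical addresses per block


@dataclass
class JournalVolume:
    """Stand-in for a replication journal volume (in-memory list here)."""
    name: str
    entries: list = field(default_factory=list)


@dataclass
class ReplicationComponent:
    """One replication component with its own associated journal volume."""
    name: str
    journal: JournalVolume
    failed: bool = False
    recovery_journals: list = field(default_factory=list)

    def handle_write(self, address, data):
        # Write journal data for this I/O into the component's own journal.
        self.journal.entries.append((address, data))

    def recover(self):
        # Read back journal data from every recovery journal assigned to us.
        recovered = []
        for volume in self.recovery_journals:
            recovered.extend(volume.entries)
        return recovered


class DistributionLayer:
    """Directs each replication I/O request to the component that owns its block."""

    def __init__(self, components):
        self.components = components

    def owner_of(self, address):
        # Assumption: blocks are mapped to components round-robin by block index.
        block = address // BLOCK_SIZE
        return self.components[block % len(self.components)]

    def submit_write(self, address, data):
        owner = self.owner_of(address)
        owner.handle_write(address, data)
        return owner


def recover_failed(components, failed):
    """Designate a recovery component and hand it the failed component's journal."""
    failed.failed = True
    # Assumption: the first healthy component becomes the recovery component.
    recovery = next(c for c in components if not c.failed)
    # The failed component's journal volume becomes the recovery journal volume.
    recovery.recovery_journals.append(failed.journal)
    return recovery


components = [ReplicationComponent(f"rc{i}", JournalVolume(f"jv{i}")) for i in range(3)]
layer = DistributionLayer(components)
layer.submit_write(0, b"a")      # block 0, owned by rc0
layer.submit_write(1024, b"b")   # block 1, owned by rc1
layer.submit_write(3072, b"c")   # block 3, owned by rc0
recovery = recover_failed(components, components[0])
print(recovery.name, recovery.recover())
```

In this sketch the journal volume outlives its component, so a surviving component can read it after a failure. That mirrors the claim's step of reassigning the recovery journal volume instead of copying its contents.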