| CPC G06F 16/2365 (2019.01) [G06F 16/2255 (2019.01)] | 20 Claims |

|
1. A method, comprising:
identifying, by a local data station comprising at least one processor, a group of source data blocks in a storage device of the local data station that satisfies a criterion for performance of partial garbage collection on the group of source data blocks, wherein each source data block of the group of source data blocks comprises respective garbage data and respective valid data;
performing, by the local data station, a partial garbage collection process on the group of source data blocks, wherein the partial garbage collection process comprises:
generating, by the local data station, a target data block in the storage device of the local data station, wherein the target data block comprises the respective valid data of each of the source data blocks and does not contain the respective garbage data of each of the source data blocks;
generating, by the local data station, data collection information based on the group of source data blocks and the target data block, wherein the data collection information comprises:
for each source data block of the group of source data blocks, a respective source identifier for the source data block and respective one or more physical location ranges of the respective valid data in the source data block, wherein the data collection information does not comprise the respective valid data of the source data blocks nor information about the respective garbage data of the source data blocks;
sending, by the local data station, the data collection information to a remote data station that stores replicated source data blocks matching the group of source data blocks, wherein the data collection information causes the remote data station to perform the partial garbage collection on the replicated source data blocks to generate a replicated target data block corresponding to the target data block of the local data station based on the data collection information;
receiving, by the local data station, a completion response from the remote data station, wherein the completion response indicates that the replicated target data block has been generated at the remote data station; and
in response to receiving the completion response, releasing, by the local data station, the group of source data blocks from the storage device of the local data station.
|
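For illustration only, the flow recited in claim 1 can be sketched as follows. This is a minimal in-memory sketch, not the claimed implementation. The names `Block`, `LocalStation`, `RemoteStation`, `compact`, and `handle_collection_info` are hypothetical, as are the garbage-ratio threshold used as the selection criterion and the passing of a target identifier alongside the data collection information. The claim itself requires only "a criterion" and does not specify how the target block is named.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    block_id: str
    data: bytes
    # (offset, length) ranges of valid data; all other bytes are garbage.
    valid_ranges: list = field(default_factory=list)

    def garbage_ratio(self) -> float:
        valid = sum(length for _, length in self.valid_ranges)
        return 1.0 - valid / len(self.data) if self.data else 0.0


def compact(blocks, info, target_id):
    """Build a target block holding only the valid ranges listed in `info`."""
    payload = b"".join(
        blocks[src_id].data[off:off + length]
        for src_id, ranges in info
        for off, length in ranges
    )
    return Block(target_id, payload, [(0, len(payload))] if payload else [])


class RemoteStation:
    def __init__(self, replicas):
        self.blocks = dict(replicas)

    def handle_collection_info(self, info, target_id):
        # Rebuild the target from local replicas; no valid data crossed the wire.
        self.blocks[target_id] = compact(self.blocks, info, target_id)
        return {"status": "complete", "target_id": target_id}


class LocalStation:
    def __init__(self, blocks, remote, threshold=0.5):
        self.blocks = dict(blocks)
        self.remote = remote
        self.threshold = threshold  # assumed criterion: minimum garbage ratio

    def identify_group(self):
        # Each selected block must hold both garbage and valid data.
        return [
            b for b in self.blocks.values()
            if self.threshold <= b.garbage_ratio() < 1.0
        ]

    def partial_gc(self):
        group = self.identify_group()
        if not group:
            return None
        # Data collection information: source identifiers and valid ranges only.
        info = [(b.block_id, list(b.valid_ranges)) for b in group]
        target_id = "target-" + "+".join(b.block_id for b in group)
        self.blocks[target_id] = compact(self.blocks, info, target_id)
        response = self.remote.handle_collection_info(info, target_id)
        # Release the source blocks only after the remote confirms completion.
        if response.get("status") == "complete":
            for src_id, _ in info:
                del self.blocks[src_id]
        return target_id
```

In this sketch the source blocks stay on the local station until the completion response arrives, so a failed or lost remote operation leaves both stations holding matching source data.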