CPC G06F 3/0664 (2013.01) [G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 3/0683 (2013.01)] | 20 Claims |
1. A system comprising:
a data storage platform comprising computer nodes, wherein each computer node among the computer nodes comprises a hardware processor and data storage devices; and
a computer server, which comprises a hardware processor and is in communication with at least one of the computer nodes of the data storage platform, wherein the computer server is configured to host a controller virtual machine, which is configured to intercept write requests issued by a software application hosted by the computer server;
wherein the data storage platform is configured with a system deduplication disk, wherein the system deduplication disk is configured as a virtual disk that is not presented by the data storage platform as an addressable target for the software application, wherein the system deduplication disk comprises storage locations distributed across one or more of the data storage devices of the computer nodes of the data storage platform; and
wherein the computer server is configured to:
intercept a first write request issued by the software application, wherein the first write request comprises a first data block and is addressed to a first virtual disk that is configured on the data storage platform, wherein the first virtual disk is distinct from the system deduplication disk,
based on determining that a first hash value for the first data block is not present in a first data structure at the controller virtual machine, determine whether the first hash value is present in a second data structure at a first computer node among the computer nodes of the data storage platform, wherein at least one of the first data structure and the second data structure includes hash values of data blocks that have been stored in the system deduplication disk of the data storage platform, and
based on determining that the first hash value is not present in either the first data structure or the second data structure: cause a second computer node among the computer nodes of the data storage platform to store the first data block at a first location of the system deduplication disk, wherein the first location is configured at the second computer node, and wherein the first data block is not stored into the first virtual disk addressed by the first write request.
|
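The write path recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The class names (`StorageNode`, `ControllerVM`), the use of SHA-256 as the hash function, the `vdisk_refs` mapping, and the handling of duplicate blocks are all assumptions introduced for illustration; the claim itself only recites the two-tier hash lookup and the storing of a new block in the system deduplication disk rather than in the addressed virtual disk.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical stand-in for one computer node of the data storage platform."""
    name: str
    # Plays the role of the "second data structure": hashes known at this node.
    hash_index: set = field(default_factory=set)
    # This node's share of the system deduplication disk, keyed by hash.
    dedup_blocks: dict = field(default_factory=dict)

    def store_in_dedup_disk(self, digest: str, block: bytes) -> None:
        self.dedup_blocks[digest] = block


@dataclass
class ControllerVM:
    """Hypothetical stand-in for the controller virtual machine on the computer server."""
    index_node: StorageNode    # "first computer node": holds the second data structure
    target_node: StorageNode   # "second computer node": stores new blocks
    # Plays the role of the "first data structure" at the controller VM.
    local_hashes: set = field(default_factory=set)
    # Assumption: the virtual disk keeps only a reference to the block's hash.
    vdisk_refs: dict = field(default_factory=dict)

    def intercept_write(self, vdisk: str, offset: int, block: bytes) -> str:
        digest = hashlib.sha256(block).hexdigest()  # hash function is an assumption
        if digest in self.local_hashes:
            outcome = "duplicate (controller)"
        elif digest in self.index_node.hash_index:
            # Found only at the node; remembering it locally is an assumption.
            self.local_hashes.add(digest)
            outcome = "duplicate (node)"
        else:
            # Absent from both structures: store the block in the system
            # deduplication disk, not in the virtual disk that was addressed.
            self.target_node.store_in_dedup_disk(digest, block)
            self.index_node.hash_index.add(digest)
            self.local_hashes.add(digest)
            outcome = "stored"
        self.vdisk_refs[(vdisk, offset)] = digest
        return outcome
```

In this sketch the controller-side lookup runs first, so the node is consulted only when the local lookup misses, which mirrors the order of the two determinations in the claim.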