US 12,093,575 B2
Global de-duplication of virtual disks in a storage platform
Avinash Lakshman, Fremont, CA (US); and Gaurav Yadav, Mountain View, CA (US)
Assigned to Commvault Systems, Inc., Tinton Falls, NJ (US)
Filed by Commvault Systems, Inc., Tinton Falls, NJ (US)
Filed on Jun. 2, 2023, as Appl. No. 18/205,448.
Application 18/205,448 is a continuation of application No. 17/707,077, filed on Mar. 29, 2022, granted, now 11,733,930.
Application 17/707,077 is a continuation of application No. 17/028,164, filed on Sep. 22, 2020, granted, now 11,314,458, issued on Apr. 26, 2022.
Application 17/028,164 is a continuation of application No. 15/155,838, filed on May 16, 2016, granted, now 10,846,024, issued on Nov. 24, 2020.
Prior Publication US 2023/0325124 A1, Oct. 12, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/0664 (2013.01) [G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 3/0683 (2013.01)]
20 Claims
 
1. A system comprising:
a data storage platform comprising computer nodes, wherein each computer node among the computer nodes comprises a hardware processor and data storage devices; and
a computer server, which comprises a hardware processor and is in communication with at least one of the computer nodes of the data storage platform, wherein the computer server is configured to host a controller virtual machine, which is configured to intercept write requests issued by a software application hosted by the computer server;
wherein the data storage platform is configured with a system deduplication disk, wherein the system deduplication disk is configured as a virtual disk that is not presented by the data storage platform as an addressable target for the software application, wherein the system deduplication disk comprises storage locations distributed across one or more of the data storage devices of the computer nodes of the data storage platform; and
wherein the computer server is configured to:
intercept a first write request issued by the software application, wherein the first write request comprises a first data block and is addressed to a first virtual disk that is configured on the data storage platform, wherein the first virtual disk is distinct from the system deduplication disk,
based on determining that a first hash value for the first data block is not present in a first data structure at the controller virtual machine, determine whether the first hash value is present in a second data structure at a first computer node among the computer nodes of the data storage platform, wherein at least one of the first data structure and the second data structure includes hash values of data blocks that have been stored in the system deduplication disk of the data storage platform, and
based on determining that the first hash value is not present in either the first data structure or the second data structure: cause a second computer node among the computer nodes of the data storage platform to store the first data block at a first location of the system deduplication disk, wherein the first location is configured at the second computer node, and wherein the first data block is not stored into the first virtual disk addressed by the first write request.
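The write path recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names (`StorageNode`, `ControllerVM`), the use of SHA-256 as the hash, the modulo-based choice of the first and second computer nodes, and the `vdisk_refs` map that records a hash reference for the addressed virtual disk are all assumptions introduced for illustration; the claim specifies none of them.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical computer node of the data storage platform."""
    name: str
    # "Second data structure": hash -> (node name, location). Layout is assumed.
    hash_index: dict = field(default_factory=dict)
    # This node's share of the system deduplication disk (assumed to be a list).
    dedup_slots: list = field(default_factory=list)

    def store(self, block: bytes) -> int:
        """Append the block to the dedup disk and return its location."""
        self.dedup_slots.append(block)
        return len(self.dedup_slots) - 1


@dataclass
class ControllerVM:
    """Hypothetical controller VM that intercepts application writes."""
    nodes: list
    # "First data structure": hashes known locally at the controller VM.
    local_hashes: dict = field(default_factory=dict)
    # Assumption: the addressed virtual disk keeps only a hash reference.
    vdisk_refs: dict = field(default_factory=dict)

    def _pick(self, digest: str, salt: int) -> StorageNode:
        # Assumed placement rule; the claim does not say how nodes are chosen.
        return self.nodes[(int(digest, 16) + salt) % len(self.nodes)]

    def write(self, vdisk: str, offset: int, block: bytes) -> str:
        digest = hashlib.sha256(block).hexdigest()
        self.vdisk_refs[(vdisk, offset)] = digest  # reference only, no data

        # Step 1: look in the first data structure at the controller VM.
        if digest in self.local_hashes:
            return "hit-local"

        # Step 2: look in the second data structure at a first computer node.
        index_node = self._pick(digest, 0)
        if digest in index_node.hash_index:
            self.local_hashes[digest] = index_node.hash_index[digest]
            return "hit-node"

        # Step 3: absent from both, so a second computer node stores the block
        # in the system deduplication disk, not in the addressed virtual disk.
        data_node = self._pick(digest, 1)
        location = data_node.store(block)
        index_node.hash_index[digest] = (data_node.name, location)
        self.local_hashes[digest] = (data_node.name, location)
        return "stored"
```

In this sketch, a second write of the same block through the same controller VM resolves in the local structure, while a write through a different controller VM that shares the nodes resolves at the node-level structure. In both cases the system deduplication disk holds a single copy of the block.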