| CPC G06F 11/1453 (2013.01) [G06F 16/128 (2019.01); G06F 16/2246 (2019.01); G06F 2201/84 (2013.01)] | 20 Claims |

|
1. A method comprising:
maintaining metadata of a file system in a distributed key value store running on multiple nodes of a cluster hosting the file system, the metadata comprising multiple namespaces, each namespace represented as a B+ tree having pages written to the distributed key value store;
defining key names to the pages as a tuple, the tuple comprising an mtree identifier, a snapshot identifier and a page number, wherein the file system is divided into multiple logical partitions and the mtree identifier identifies a mountable directory hierarchy of a logical partition of the file system;
storing keys, having the key names, to the pages in the distributed key value store, each key name thereby comprising the mtree identifier, the snapshot identifier, and the page number;
tracking, using the mtree and the snapshot identifiers present in the key names, pages that are shared between first and second snapshots of a namespace corresponding to the mountable directory hierarchy of the file system, the first and the second snapshots not being immutable because the key value store in which the namespace is stored is not immutable, wherein the first snapshot comprises a first snapshot identifier, and the second snapshot comprises a second snapshot identifier,
wherein the distributed key value store comprises a plurality of keys including a first key and a second key,
the first key identifies a first page and a first key name of the first key comprises a first mtree identifier, the first snapshot identifier, and a first page number,
the first page comprises a child pointer to a second page, the child pointer comprising an entry having the first mtree identifier, the first snapshot identifier, and a second page number, and has a key name comprising the first mtree identifier, the first snapshot identifier, and the second page number,
the second key identifies a third page and a second key name of the second key comprises the first mtree identifier, the second snapshot identifier, and the first page number, and
the third page comprises the child pointer to the second page;
receiving a request to write to the second page;
comparing a key name of the child pointer comprising the first mtree identifier and the first snapshot identifier in the third page and a key name of the second key comprising the first mtree identifier and the second snapshot identifier identifying the third page;
determining that the key name of the child pointer comprising the first mtree identifier and the first snapshot identifier in the third page is different from the key name of the second key comprising the first mtree identifier and the second snapshot identifier identifying the third page;
conducting a copy-on-write for the second page to create a new page; and
modifying the child pointer of the third page with a new entry to point to the new page, the new entry comprising a new key having a new key name including the first mtree identifier, the second snapshot identifier, and the second page number.
|
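The key naming and copy-on-write decision recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes a plain in-memory `dict` standing in for the distributed key value store, and the names `PageKey`, `Page`, and `write_child` are hypothetical, introduced only for illustration. A page is treated as shared when the mtree and snapshot identifiers in the child pointer differ from those in the key name of the page holding that pointer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, NamedTuple


class PageKey(NamedTuple):
    """Hypothetical key name: (mtree identifier, snapshot identifier, page number)."""
    mtree_id: int
    snapshot_id: int
    page_no: int


@dataclass
class Page:
    """Hypothetical B+ tree page: child pointers are stored as key names."""
    children: List[PageKey] = field(default_factory=list)
    data: str = ""


def write_child(store: Dict[PageKey, Page], parent_key: PageKey,
                child_index: int, new_data: str) -> PageKey:
    """Write to a child page, copying it first if it belongs to another snapshot."""
    parent = store[parent_key]
    child_key = parent.children[child_index]

    # Compare the identifiers in the child pointer with those in the parent's key name.
    child_owner = (child_key.mtree_id, child_key.snapshot_id)
    parent_owner = (parent_key.mtree_id, parent_key.snapshot_id)

    if child_owner != parent_owner:
        # Shared page: copy-on-write into a new page keyed under the parent's
        # snapshot identifier, keeping the same page number.
        new_key = PageKey(parent_key.mtree_id, parent_key.snapshot_id, child_key.page_no)
        old = store[child_key]
        store[new_key] = Page(children=list(old.children), data=new_data)
        parent.children[child_index] = new_key  # repoint the child pointer
        return new_key

    # Page is already owned by this snapshot: update it in place.
    store[child_key].data = new_data
    return child_key


# Layout from the claim: mtree 1, snapshots 1 and 2.
store: Dict[PageKey, Page] = {
    PageKey(1, 1, 1): Page(children=[PageKey(1, 1, 2)]),  # first page
    PageKey(1, 1, 2): Page(data="original"),              # second page (shared)
    PageKey(1, 2, 1): Page(children=[PageKey(1, 1, 2)]),  # third page
}

# A write that reaches the second page through the third page triggers the copy.
new_key = write_child(store, PageKey(1, 2, 1), 0, "updated")
```

After the call, the third page points to a new page keyed with the second snapshot identifier and the second page number, while the first page still points to the unmodified second page. A later write through the third page finds matching identifiers and updates the new page in place.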