US 11,921,704 B2
Version control interface for accessing data lakes
Abhishek Gupta, San Jose, CA (US); Richard P. Spillane, Palo Alto, CA (US); Christos Karamanolis, Los Gatos, CA (US); and Marin Nozhchev, Sofia (BG)
Assigned to VMware, Inc., Palo Alto, CA (US)
Filed by VMware LLC, Palo Alto, CA (US)
Filed on Dec. 28, 2021, as Appl. No. 17/564,206.
Prior Publication US 2023/0205757 A1, Jun. 29, 2023
Int. Cl. G06F 16/23 (2019.01); G06F 16/22 (2019.01)
CPC G06F 16/2379 (2019.01) [G06F 16/2246 (2019.01)]
20 Claims
 
1. A method comprising:
creating a private branch from a first master branch, the first master branch comprising a tree data structure having a plurality of leaf nodes referencing data objects stored in a data lake, wherein the private branch is configured to be written to by a writer and wherein the first master branch is configured to be read from by a reader; and
writing a new data object into the private branch, wherein writing the new data object into the private branch comprises:
queuing data for the new data object in a write ahead log (WAL);
reading data from the WAL; and
adding the read data from the WAL to the private branch as the new data object; and
generating a new master branch for the data objects stored in the data lake, wherein generating the new master branch comprises merging the private branch with the first master branch, wherein the new master branch references the new data object written to the data lake, and wherein the new master branch is configured to be read from by the reader.
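
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class and method names (VersionedLake, create_private_branch, queue_write, drain_wal_into, merge, read) are hypothetical, the data lake is assumed to be an in-memory dictionary, and a sorted tuple of (key, reference) pairs stands in for the leaf nodes of the tree data structure. The sketch only shows the ordering the claim describes: a writer works on a private branch fed from a write ahead log, and readers see the new data object only after the merge produces a new master branch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """Immutable snapshot. The (key, object_ref) pairs stand in for the
    leaf nodes of the tree; a flat tuple is an assumption of this sketch."""
    leaves: tuple = ()

    def as_dict(self):
        return dict(self.leaves)


class VersionedLake:
    """Hypothetical illustration of the branch / WAL / merge flow."""

    def __init__(self):
        self.data_lake = {}     # object_ref -> payload (assumed object store)
        self.master = Branch()  # the only branch readers consult
        self.wal = deque()      # write ahead log of pending (key, payload)

    def create_private_branch(self):
        # Copies leaf references only; data objects are not duplicated.
        return self.master.as_dict()

    def queue_write(self, key, payload):
        # Step 1 of the write: queue data for the new object in the WAL.
        self.wal.append((key, payload))

    def drain_wal_into(self, private):
        # Steps 2 and 3: read data from the WAL and add it to the
        # private branch as a new data object.
        while self.wal:
            key, payload = self.wal.popleft()
            ref = f"obj-{len(self.data_lake)}"
            self.data_lake[ref] = payload
            private[key] = ref

    def merge(self, private):
        # Merge the private branch with the current master and publish
        # the result as the new master branch.
        merged = self.master.as_dict()
        merged.update(private)
        self.master = Branch(tuple(sorted(merged.items())))
        return self.master

    def read(self, key):
        # Readers resolve keys through the master branch only.
        ref = self.master.as_dict().get(key)
        return None if ref is None else self.data_lake[ref]


lake = VersionedLake()
private = lake.create_private_branch()
lake.queue_write("sales/2021.parquet", b"rows")
lake.drain_wal_into(private)
before_merge = lake.read("sales/2021.parquet")  # object not yet visible
lake.merge(private)
after_merge = lake.read("sales/2021.parquet")   # object visible to readers
```

In this sketch the master branch is replaced by a single attribute assignment, so a reader observes either the old master or the new one, never a partially written state. A real system would need a concurrency-safe way to publish the new master and to resolve conflicting private branches, which the sketch does not attempt.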