CPC G06F 16/2246 (2019.01) [G06F 12/0253 (2013.01); G06F 16/2358 (2019.01); G06F 16/24552 (2019.01); G06F 16/24561 (2019.01); G06F 16/93 (2019.01)] | 20 Claims |
1. A computer-implemented method for maintaining data in a data management system, the computer-implemented method comprising:
storing a set of documents in a log-structured object store comprising sequence numbers and document values, wherein the log-structured object store maintains document sequence numbers and document values, the log-structured object store comprising a plurality of log segments;
storing a first log-structured merge-tree mapping keys to sequence numbers for accessing documents of the set of documents;
maintaining a delete list using a second log-structured merge-tree, the delete list comprising a list of stale document sequence numbers and corresponding sizes per log segment;
responsive to receiving a request to delete a document associated with a key,
identifying a sequence number of the deleted document from the first log-structured merge-tree based on the key value;
retrieving a size of the deleted document based on metadata of the deleted document stored in the log-structured object store based on the sequence number; and
recording the sequence number and the size of the deleted document in the second log-structured merge-tree;
for each log segment from the plurality of log segments, determining a measure of fragmentation of the log segment based on sizes of deleted documents of the log segment from the second log-structured merge-tree; and
responsive to the fragmentation exceeding a threshold, initiating a compaction operation for the log segment.
|
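The steps of claim 1 can be illustrated with a minimal in-memory sketch. It is not the patented implementation: plain dictionaries stand in for the two log-structured merge-trees, the byte length of a stored value stands in for the size metadata, and each key is assumed to be written only once. The names `Store`, `LogSegment`, `docs_per_segment`, `threshold`, and `segments_needing_compaction` are hypothetical, and the fragmentation measure (stale bytes divided by total bytes in the segment) is one possible choice, since the claim only requires a measure based on the sizes of deleted documents.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LogSegment:
    segment_id: int
    # seqno -> (key, value); len(value) stands in for the stored size metadata
    docs: dict = field(default_factory=dict)
    total_bytes: int = 0


class Store:
    def __init__(self, docs_per_segment=2, threshold=0.5):
        self.docs_per_segment = docs_per_segment  # hypothetical segment capacity
        self.threshold = threshold                # hypothetical fragmentation threshold
        self.segments = [LogSegment(0)]
        self.key_index = {}                       # stand-in for the first LSM tree: key -> seqno
        self.delete_list = defaultdict(dict)      # stand-in for the second LSM tree:
                                                  # segment_id -> {stale seqno: size}
        self.seq_to_segment = {}                  # seqno -> segment_id
        self.next_seqno = 1

    def put(self, key, value: bytes):
        seg = self.segments[-1]
        if len(seg.docs) >= self.docs_per_segment:
            # Current segment is full: open a new one at the tail of the log.
            seg = LogSegment(len(self.segments))
            self.segments.append(seg)
        seqno = self.next_seqno
        self.next_seqno += 1
        seg.docs[seqno] = (key, value)
        seg.total_bytes += len(value)
        self.seq_to_segment[seqno] = seg.segment_id
        self.key_index[key] = seqno
        return seqno

    def delete(self, key):
        # Identify the sequence number from the key index.
        seqno = self.key_index.pop(key)
        seg = self.segments[self.seq_to_segment[seqno]]
        # Retrieve the document size from what the object store holds for that seqno.
        size = len(seg.docs[seqno][1])
        # Record (seqno, size) in the delete list, grouped per log segment.
        self.delete_list[seg.segment_id][seqno] = size
        return seqno, size

    def fragmentation(self, segment_id):
        seg = self.segments[segment_id]
        if seg.total_bytes == 0:
            return 0.0
        stale_bytes = sum(self.delete_list.get(segment_id, {}).values())
        return stale_bytes / seg.total_bytes

    def segments_needing_compaction(self):
        # Segments whose fragmentation exceeds the threshold are compaction candidates.
        return [s.segment_id for s in self.segments
                if self.fragmentation(s.segment_id) > self.threshold]


store = Store(docs_per_segment=2, threshold=0.5)
store.put("a", b"xxxx")
store.put("b", b"yy")
store.put("c", b"zzz")
store.delete("a")
print(store.segments_needing_compaction())
```

Keeping the stale sizes in a structure separate from the key index means fragmentation can be computed per segment without scanning the segment's documents, which appears to be the motivation for the second merge-tree in the claim.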