| CPC G06F 16/1744 (2019.01) [G06F 16/116 (2019.01); G06F 16/1748 (2019.01)] | 20 Claims |

|
1. A method comprising:
receiving a request to convert a data-interchange file, comprising a hierarchy of nodes, into a binary file for use by a database management system;
generating a tree representation of the hierarchy of nodes, wherein the tree representation references a plurality of leaf values;
determining whether the binary file is to be compressed or uncompressed;
in response to determining that the binary file is to be compressed:
embedding relative node jump offsets when generating the tree representation to enable navigation of the hierarchy of nodes; and
storing the tree representation within a compressed container of the binary file; and
in response to determining that the binary file is to be uncompressed:
determining whether the data-interchange file is immutable or mutable;
in response to determining that the data-interchange file is immutable, deduplicating the plurality of leaf values in a space optimized manner such that at least a subset of the plurality of leaf values is unique; and
in response to determining that the data-interchange file is mutable, deduplicating the plurality of leaf values in a stream optimized manner such that adjacent leaf nodes with duplicate leaf values in the tree representation reference a single shared leaf value; and
storing the deduplicated plurality of leaf values in the binary file.
|
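
For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names (`flatten_with_jumps`, `collect_leaves`, `dedup_space_optimized`, `dedup_stream_optimized`), the entry layout, and the use of JSON-like Python objects as the data-interchange input are all hypothetical. It assumes a "relative node jump offset" can be modeled as the number of entries to skip to pass a node's subtree, that "space optimized" deduplication means a global value table, and that "stream optimized" deduplication means a single pass that merges only runs of adjacent equal values.

```python
from typing import Any, List, Tuple


def flatten_with_jumps(node: Any, out: List[list] = None) -> List[list]:
    """Flatten a tree in preorder. Each entry is [kind, payload, jump], where
    jump is the relative number of entries to skip to get past this node's
    subtree (an assumed model of a relative node jump offset)."""
    if out is None:
        out = []
    pos = len(out)
    if isinstance(node, dict):
        entry = ["object", None, 0]
        out.append(entry)
        for key, value in node.items():
            out.append(["key", key, 1])
            flatten_with_jumps(value, out)
    elif isinstance(node, list):
        entry = ["array", None, 0]
        out.append(entry)
        for value in node:
            flatten_with_jumps(value, out)
    else:
        entry = ["leaf", node, 0]
        out.append(entry)
    entry[2] = len(out) - pos  # distance to the entry after this subtree
    return out


def collect_leaves(node: Any, out: List[Any] = None) -> List[Any]:
    """Return every scalar leaf value in document order."""
    if out is None:
        out = []
    if isinstance(node, dict):
        for value in node.values():
            collect_leaves(value, out)
    elif isinstance(node, list):
        for value in node:
            collect_leaves(value, out)
    else:
        out.append(node)
    return out


def dedup_space_optimized(leaves: List[Any]) -> Tuple[List[Any], List[int]]:
    """Global deduplication: each distinct value is stored once.
    Returns (value table, one table index per leaf)."""
    table: List[Any] = []
    index = {}
    refs: List[int] = []
    for value in leaves:
        key = (type(value).__name__, value)  # keep 1, 1.0 and True distinct
        if key not in index:
            index[key] = len(table)
            table.append(value)
        refs.append(index[key])
    return table, refs


def dedup_stream_optimized(leaves: List[Any]) -> Tuple[List[Any], List[int]]:
    """Single-pass deduplication: only adjacent equal leaves share a value,
    so no global lookup structure is needed."""
    table: List[Any] = []
    refs: List[int] = []
    for value in leaves:
        same_as_last = (
            bool(table) and type(table[-1]) is type(value) and table[-1] == value
        )
        if not same_as_last:
            table.append(value)
        refs.append(len(table) - 1)
    return table, refs


doc = {"a": [1, 1, 2], "b": 1}  # hypothetical input document
entries = flatten_with_jumps(doc)
leaves = collect_leaves(doc)
space_table, space_refs = dedup_space_optimized(leaves)
stream_table, stream_refs = dedup_stream_optimized(leaves)
```

In this sketch the immutable path stores the repeated value `1` once for the whole document, while the mutable path stores it again when it reappears after `2`, trading some space for a single forward pass.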