CPC G06F 16/24557 (2019.01) [G06F 16/2272 (2019.01); G06F 16/283 (2019.01); G06F 16/9035 (2019.01); G06F 17/18 (2013.01)] | 30 Claims |
1. A system comprising:
at least one hardware processor; and
at least one memory storing instructions that cause the at least one hardware processor to perform operations comprising:
generating an index for a source table comprising a column of semi-structured data, the index indexing distinct values in each column of the source table, the generating of the index comprising:
identifying, based on a reassembly hook object, a first set of values corresponding to a first portion of the semi-structured data that is subcolumnarized, the reassembly hook object comprising a first data structure that represents the first portion of the semi-structured data; and
identifying, based on a residual object, a second set of values corresponding to a second portion of the semi-structured data that is not subcolumnarized, the residual object comprising a second data structure that represents at least a portion of the second portion of the semi-structured data;
storing the index with an association with the source table;
receiving a query directed at the column;
generating a set of search fingerprints based on a value in the query; and
processing the query by scanning a reduced scan set generated using the index and the set of search fingerprints.
|
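By way of non-limiting illustration, the operations recited in claim 1 can be sketched as follows. Every name here is hypothetical and does not come from the claim: `Partition`, `fingerprint`, `flatten`, `build_index`, `search_fingerprints`, `reduced_scan_set`, and `run_query`. The sketch makes several assumptions. The source table is split into partitions. The reassembly hook object and the residual object are modeled as plain per-row dictionaries. A fingerprint is a truncated SHA-256 hash of a path and value pair. The index is one set of fingerprints per partition. For brevity, only the semi-structured column is indexed.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(path, value):
    """Hypothetical fingerprint: first 8 bytes of SHA-256 over 'path=value'."""
    digest = hashlib.sha256(f"{path}={value!r}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


@dataclass
class Partition:
    """Assumed storage unit holding one semi-structured column."""
    hook_rows: list      # per row: the subcolumnarized portion (stand-in for the reassembly hook object)
    residual_rows: list  # per row: the portion that is not subcolumnarized (stand-in for the residual object)


def flatten(obj, prefix=""):
    """Yield (dotted_path, scalar_value) pairs from a nested dict."""
    for key, val in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(val, dict):
            yield from flatten(val, path)
        else:
            yield path, val


def build_index(partitions):
    """Return one set of fingerprints of distinct values per partition."""
    index = []
    for part in partitions:
        fps = set()
        # First set of values: identified from the subcolumnarized portion.
        for row in part.hook_rows:
            fps.update(fingerprint(p, v) for p, v in flatten(row))
        # Second set of values: identified from the residual portion.
        for row in part.residual_rows:
            fps.update(fingerprint(p, v) for p, v in flatten(row))
        index.append(fps)
    return index


def search_fingerprints(path, value):
    """Set of search fingerprints for a query value (a single one in this sketch)."""
    return {fingerprint(path, value)}


def reduced_scan_set(index, fps):
    """Indices of partitions whose fingerprint set contains every search fingerprint."""
    return [i for i, part_fps in enumerate(index) if fps <= part_fps]


def run_query(partitions, index, path, value):
    """Scan only the reduced scan set; return (partition, row) positions that match."""
    hits = []
    for i in reduced_scan_set(index, search_fingerprints(path, value)):
        part = partitions[i]
        for row_no, (hook, residual) in enumerate(zip(part.hook_rows, part.residual_rows)):
            merged = dict(flatten(hook))
            merged.update(flatten(residual))
            if merged.get(path) == value:
                hits.append((i, row_no))
    return hits


partitions = [
    Partition(hook_rows=[{"user": {"id": 1}}, {"user": {"id": 2}}],
              residual_rows=[{"tag": "a"}, {"tag": "b"}]),
    Partition(hook_rows=[{"user": {"id": 3}}],
              residual_rows=[{"tag": "c"}]),
]
index = build_index(partitions)
print(reduced_scan_set(index, search_fingerprints("tag", "c")))  # [1]
print(run_query(partitions, index, "user.id", 2))                # [(0, 1)]
```

In this sketch the index can only exclude partitions. A fingerprint match does not prove that the value is present, so `run_query` still compares the actual values while scanning the reduced scan set.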