CPC G06F 16/2228 (2019.01) [G06F 16/24545 (2019.01); G06F 16/24561 (2019.01)] | 21 Claims |
1. A computing system comprising:
at least one memory;
one or more hardware processor units coupled to the at least one memory; and
one or more computer readable storage media storing computer-executable instructions that, when executed, cause the computing system to perform operations comprising:
receiving a set of a plurality of numerical values that is at least partially unsorted, the set of a plurality of numerical values representing values in a dataset, the dataset being data stored in at least one computer-implemented data structure;
sorting the set of a plurality of numerical values to provide a sorted set of numerical values sorted according to a sorting criterion, wherein the sorting criterion defines how numerical values in the set of a plurality of numerical values will be ordered with respect to one another in the sorted set of numerical values, the sorting comprising:
for a second element in the set of a plurality of numerical values, the second element being a numerical value of the set of a plurality of numerical values at a specific position in the set of a plurality of numerical values, determining that the second element is out of order compared with a first element in the set of a plurality of numerical values, the first element in the set of a plurality of numerical values being at a specific position in the set of a plurality of numerical values that precedes the position of the second element in the set of a plurality of numerical values;
determining whether the second element would be sorted compared with the first element if a current numerical offset value was added to the second element, wherein a numerical offset value is incrementally adjusted during the sorting of the set of a plurality of numerical values, and the current numerical offset value is the offset value at the time of the determining; and
adding the current numerical offset value to the second element;
defining summary metadata for the dataset using the sorted set of values;
receiving a computer-implemented data retrieval request requesting information in the dataset, the computer-implemented data retrieval request specifying at least one value;
in response to the computer-implemented data retrieval request, receiving a request to determine whether the at least one value is or is likely to be in the dataset;
determining from the summary metadata whether the specified value is or is likely to be present in the dataset; and
when the summary metadata indicates the at least one value is or is likely to be present in the dataset, loading at least a portion of the dataset from the at least one computer-implemented data structure for use in processing the computer-implemented data retrieval request, wherein the at least a portion of the dataset is not loaded if the summary metadata indicates that the at least one value is not or is not likely to be present in the dataset, thereby improving computing efficiency by avoiding operations to load and process data sets that do not, or are not likely to, include the at least one value.
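As an illustration only, the sorting steps and the metadata check recited in claim 1 might be sketched as follows. This is one possible reading of the claim language, not the patented implementation. The names `offset_pass`, `Summary`, `summarize`, `may_contain`, and `lookup`, the `step` parameter, the ascending sorting criterion, and the choice of a minimum/maximum range as the summary metadata are all assumptions made for the example. The claim does not say here what form the summary metadata takes or how the offset-adjusted values feed into it, so the sketch keeps the two parts separate and builds the summary from an ordinary sort.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


def offset_pass(values: Sequence[int], step: int = 1) -> Tuple[List[int], int]:
    """One reading of the claimed sorting steps, for ascending order.

    Walk the values left to right. When an element (the "second element")
    is smaller than the adjusted element before it (the "first element"),
    grow a running offset by `step` until element + offset is no longer
    out of order, then add the current offset to the element.
    `step` is a hypothetical parameter: the claim only says the offset is
    "incrementally adjusted" during the sorting.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    adjusted: List[int] = []
    offset = 0
    for value in values:
        if adjusted and value < adjusted[-1]:      # out of order
            while value + offset < adjusted[-1]:   # would the offset fix it?
                offset += step                     # adjust incrementally
            value += offset                        # add the current offset
        adjusted.append(value)
    return adjusted, offset


@dataclass(frozen=True)
class Summary:
    """Assumed summary metadata: the value range of the dataset."""
    minimum: int
    maximum: int


def summarize(sorted_values: Sequence[int]) -> Summary:
    """Read the range off the two ends of an ascending sorted sequence."""
    if not sorted_values:
        raise ValueError("cannot summarize an empty dataset")
    return Summary(sorted_values[0], sorted_values[-1])


def may_contain(summary: Summary, value: int) -> bool:
    """False means the value is definitely absent; True means it may be present."""
    return summary.minimum <= value <= summary.maximum


def lookup(summary: Summary, value: int,
           load: Callable[[], Sequence[int]]) -> Optional[bool]:
    """Load the dataset only when the summary says the value may be present."""
    if not may_contain(summary, value):
        return None            # load skipped entirely
    return value in load()     # dataset loaded and searched


raw = [5, 3, 8, 6]
adjusted, final_offset = offset_pass(raw)
summary = summarize(sorted(raw))
```

With this input, a request for the value 42 falls outside the summarized range, so `lookup` returns `None` without calling `load`, while a request for 6 triggers the load and then searches the data. A range check of this kind can report "possibly present" for a value that is absent (4 in this example), which matches the claim's "is or is likely to be" wording.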