US 12,117,963 B2
System and method for efficient multi-stage querying of archived data
Damien Laurent Richard, Montreuil (FR); Markus Theodorus Hendrikus Polman, Rotterdam (NL); Conrado Eduardo Poole Siguero, San Francisco, CA (US); and Andreas Kalogeropoulos, Vincennes (FR)
Assigned to Open Text Holdings, Inc., Menlo Park, CA (US)
Filed by Open Text Holdings, Inc., San Mateo, CA (US)
Filed on Oct. 8, 2021, as Appl. No. 17/497,697.
Prior Publication US 2023/0109804 A1, Apr. 13, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/11 (2019.01); G06F 16/14 (2019.01); G06F 16/172 (2019.01)
CPC G06F 16/113 (2019.01) [G06F 16/148 (2019.01); G06F 16/172 (2019.01)]
17 Claims
 
1. A method for searching indexed packages, comprising:
ingesting records of data, the records of data ingested in packages, the ingesting comprising:
indexing the records of data based on a parameter, the indexing including storing an index file in each of the packages, the index file in each of the packages indexing records of data in that package using the parameter, wherein the parameter is characterized by a range of values;
generating indexed packages for the records of data based on the parameter, wherein each of the generated indexed packages corresponds to one of a plurality of subsets of the range of values;
generating metadata for the indexed packages, the metadata for each indexed package comprising a reference to that indexed package of records and a package key that characterizes that indexed package of records based on a value of the parameter, wherein the package key differs from the parameter; and
storing the indexed packages in a data repository;
querying the records of data based on a query defining a search value of the parameter, the querying comprising:
searching the metadata based on the search value of the parameter, wherein the search value is defined within one of the subsets;
identifying a package key for the metadata referencing the search value of the parameter;
loading, from a file-based cache, an indexed package based on the identified package key, when the indexed package is stored in the cache; and
loading, from the data repository which is an archive storage, the indexed package when the indexed package is not stored in the cache.
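
The two stages recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (BUCKET, package_key, ingest, load_package, query, the "day" parameter, the JSON package layout) is a hypothetical choice made for illustration. It assumes an integer parameter whose range is split into fixed-width subsets, local directories standing in for the archive storage and the file-based cache, and a cache that is populated on a miss, which the claim does not recite.

```python
import json
import shutil
import tempfile
from pathlib import Path

BUCKET = 100  # hypothetical width of each subset of the parameter's range


def package_key(value):
    """Derive a package key from a parameter value; the key is not the value itself."""
    return f"pkg-{value // BUCKET:04d}"


def ingest(records, param, archive_dir):
    """Group records into indexed packages, store them in the archive, return metadata."""
    groups = {}
    for rec in records:
        groups.setdefault(package_key(rec[param]), []).append(rec)

    metadata = []
    for key, recs in groups.items():
        # Index file stored inside the package: parameter value -> record positions.
        index = {}
        for pos, rec in enumerate(recs):
            index.setdefault(str(rec[param]), []).append(pos)
        path = Path(archive_dir) / f"{key}.json"
        path.write_text(json.dumps({"index": index, "records": recs}))
        low = (recs[0][param] // BUCKET) * BUCKET
        # Metadata entry: package key, reference to the package, and its subset bounds.
        metadata.append({"key": key, "ref": str(path), "low": low, "high": low + BUCKET - 1})
    return metadata


def load_package(entry, cache_dir, stats):
    """Load from the file-based cache when present, otherwise from the archive."""
    cached = Path(cache_dir) / f"{entry['key']}.json"
    if cached.exists():
        stats["cache_hits"] += 1
    else:
        stats["archive_loads"] += 1
        # Assumption: copy the package into the cache so later queries hit it.
        shutil.copy(entry["ref"], cached)
    return json.loads(cached.read_text())


def query(value, metadata, cache_dir, stats):
    """Stage 1: search the metadata. Stage 2: search the index of the matching package."""
    for entry in metadata:
        if entry["low"] <= value <= entry["high"]:
            package = load_package(entry, cache_dir, stats)
            positions = package["index"].get(str(value), [])
            return [package["records"][p] for p in positions]
    return []


# Usage: two packages are created; the second query for the same value hits the cache.
archive = tempfile.mkdtemp()
cache = tempfile.mkdtemp()
records = [{"day": 5, "msg": "a"}, {"day": 105, "msg": "b"}, {"day": 105, "msg": "c"}]
metadata = ingest(records, "day", archive)
stats = {"cache_hits": 0, "archive_loads": 0}
first = query(105, metadata, cache, stats)   # package read from the archive
second = query(105, metadata, cache, stats)  # package read from the cache
```

In this sketch only the small metadata list is scanned for every query, and a single package is opened per search value, so the slower archive storage is touched only when the cache does not already hold that package.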