US 11,947,511 B2
Indexing a data corpus to a set of multidimensional points
Volkmar Uhlig, Cupertino, CA (US); John Hayes, Mountain View, CA (US); Akash J. Sagar, Redwood City, CA (US); Faissal Sleiman, Austin, TX (US); David Stephenson, San Mateo, CA (US); Daniel J. Fillingham, Sunnyvale, CA (US); and Timothy Cerexhe, Mountain View, CA (US)
Assigned to GHOST AUTONOMY INC., Mountain View, CA (US)
Filed by GHOST AUTONOMY INC., Mountain View, CA (US)
Filed on May 10, 2022, as Appl. No. 17/740,888.
Prior Publication US 2023/0367755 A1, Nov. 16, 2023
Int. Cl. G06F 18/22 (2023.01); G06F 16/22 (2019.01); G06K 9/62 (2022.01)
CPC G06F 16/2264 (2019.01) [G06F 16/2272 (2019.01); G06F 18/22 (2023.01)]
20 Claims
 
1. A method of indexing a data corpus to a set of multidimensional points, the method comprising:
generating a set of points comprising a Sobol sequence in a multidimensional space;
identifying, for each sample in a plurality of samples in a data corpus, a nearest point in the set of points;
generating an index mapping each sample with the nearest point in the Sobol sequence;
receiving a request for a number of samples from the data corpus;
selecting a subset of points from the Sobol sequence, wherein the subset of points includes a number of points equal to the number of samples, and wherein the subset of points are sequential from a beginning of the Sobol sequence;
providing, in response to the request and based on mappings in the index to the subset of points, a subset of the plurality of samples corresponding to the subset of points; and
generating one or more models by training the one or more models using the subset of the plurality of samples.
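
The indexing and selection steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a two-dimensional space, an unscrambled Sobol sequence generated in Gray-code order, Euclidean distance with a brute-force nearest-point search, and at most one returned sample per selected point. The names sobol_2d, build_index, and select_samples are hypothetical, and the final model-training step is outside the sketch.

```python
import math

BITS = 30  # fixed-point resolution of the generated coordinates (assumption)


def sobol_2d(count):
    """Return the first `count` points of an unscrambled 2-D Sobol sequence."""
    # Direction numbers: dimension 1 is the van der Corput case; dimension 2
    # follows the recurrence for the degree-1 primitive polynomial x + 1.
    v1 = [1 << (BITS - k) for k in range(1, BITS + 1)]
    v2 = [1 << (BITS - 1)]
    for _ in range(BITS - 1):
        v2.append(v2[-1] ^ (v2[-1] >> 1))
    points, x, y = [], 0, 0
    scale = float(1 << BITS)
    for i in range(count):
        points.append((x / scale, y / scale))
        c = 0  # position of the lowest zero bit of i (Gray-code update)
        while (i >> c) & 1:
            c += 1
        x ^= v1[c]
        y ^= v2[c]
    return points


def build_index(samples, points):
    """Map each point's position in the sequence to the ids of the samples
    whose nearest point it is (brute-force Euclidean search)."""
    index = {}
    for sample_id, sample in enumerate(samples):
        nearest = min(range(len(points)),
                      key=lambda p: math.dist(sample, points[p]))
        index.setdefault(nearest, []).append(sample_id)
    return index


def select_samples(index, requested):
    """Walk the first `requested` points from the beginning of the sequence
    and collect the samples mapped to them."""
    chosen = []
    for point_id in range(requested):
        # Assumption: return at most one sample per point; points with no
        # mapped sample contribute nothing.
        chosen.extend(index.get(point_id, [])[:1])
    return chosen


# Hypothetical corpus: each sample is already embedded as a 2-D vector.
points = sobol_2d(8)
samples = [(0.1, 0.1), (0.52, 0.48), (0.7, 0.3), (0.2, 0.8), (0.05, 0.02)]
index = build_index(samples, points)
subset = select_samples(index, 4)  # sample ids that would be used for training
```

Because the selected points are always a prefix of the sequence, a request for more samples extends an earlier selection rather than replacing it, and any prefix of a Sobol sequence spreads fairly evenly over the space.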