US 11,836,136 B2
Distributed pseudo-random subset generation
Donko Donjerkovic, San Mateo, CA (US); Prateek Gaur, San Jose, CA (US); and Eric Musser, Redwood City, CA (US)
Assigned to ThoughtSpot, Inc., Mountain View, CA (US)
Filed by ThoughtSpot, Inc., Mountain View, CA (US)
Filed on Dec. 6, 2022, as Appl. No. 18/075,665.
Application 18/075,665 is a continuation of application No. 17/223,999, filed on Apr. 6, 2021, granted, now Pat. No. 11,580,111.
Prior Publication US 2023/0117794 A1, Apr. 20, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/2455 (2019.01); G06F 16/22 (2019.01); G06F 16/248 (2019.01)

CPC G06F 16/2456 (2019.01) [G06F 16/2282 (2019.01); G06F 16/248 (2019.01)]

20 Claims

1. A method comprising:

in response to receiving data expressing a usage intent with respect to a low-latency data analysis system, wherein the low-latency data analysis system includes a distributed in-memory database:

obtaining, by the distributed in-memory database, a portion of a data query responsive to the data expressing the usage intent, wherein the portion of the data query indicates:

a first table including a first column; and

a limit value;

obtaining, by the distributed in-memory database, results data, wherein obtaining the results data includes:

obtaining filtering criteria;

pseudo-random filtering the first table using the filtering criteria and using, as candidate data, data from the first table, which includes using the first column as a target column;

in response to the pseudo-random filtering of the first table, obtaining the candidate data as intermediate results data; and

obtaining, as the results data, rows from the intermediate results data such that a cardinality of rows of the results data is at most the limit value; and

outputting the results data as responsive to the portion of the data query.
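
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not state how the pseudo-random filtering is performed, so the approach below (hashing the target column value together with a seed and keeping rows whose hash falls under a threshold) is an assumption chosen for illustration, not the claimed implementation. All names (`sample_with_limit`, `pseudo_random_filter`, `seed`, `keep_buckets`, `buckets`) are hypothetical, and plain Python lists stand in for the shards of a distributed in-memory database.

```python
import hashlib
from typing import Any, Dict, List

Row = Dict[str, Any]


def _bucket(value: Any, seed: int, buckets: int) -> int:
    """Deterministically map a column value to a bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{seed}:{value!r}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def pseudo_random_filter(
    rows: List[Row], target_column: str, seed: int, keep_buckets: int, buckets: int
) -> List[Row]:
    """Keep rows whose target-column hash lands in the first `keep_buckets` buckets.

    Assumption: (seed, keep_buckets, buckets) play the role of the
    "filtering criteria" named in the claim.
    """
    return [
        row
        for row in rows
        if _bucket(row[target_column], seed, buckets) < keep_buckets
    ]


def sample_with_limit(
    shards: List[List[Row]],
    target_column: str,
    limit: int,
    seed: int = 0,
    keep_buckets: int = 100,
    buckets: int = 1000,
) -> List[Row]:
    """Filter each shard independently, then cap the combined result at `limit` rows."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    intermediate: List[Row] = []
    for shard in shards:
        # Every shard applies the same criteria, so no coordination is needed.
        intermediate.extend(
            pseudo_random_filter(shard, target_column, seed, keep_buckets, buckets)
        )
    # The cardinality of the results is at most the limit value.
    return intermediate[:limit]


if __name__ == "__main__":
    table_shards = [
        [{"id": i} for i in range(0, 500)],
        [{"id": i} for i in range(500, 1000)],
    ]
    print(sample_with_limit(table_shards, "id", limit=5, seed=42))
```

Because the hash depends only on the seed and the column value, each shard can filter its own rows without communicating with the others, and the same seed reproduces the same subset. A different seed would select a different subset.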