US 12,265,541 B2
Systems and methods for executing queries on tensor datasets
Sasun Hambardzumyan, Yerevan (AM); Ivo Stranic, Brooklyn, NY (US); Tatevik Hakobyan, Burlington, MA (US); and Davit Buniatyan, Mountain View, CA (US)
Assigned to Snark AI, Inc., San Francisco, CA (US)
Filed by Snark AI, Inc., San Francisco, CA (US)
Filed on Jan. 29, 2024, as Appl. No. 18/426,272.
Application 18/426,272 is a continuation of application No. 18/210,004, filed on Jun. 14, 2023, granted, now Pat. No. 11,886,435.
Claims priority of provisional application 63/437,546, filed on Jan. 6, 2023.
Prior Publication US 2024/0232201 A1, Jul. 11, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 16/2453 (2019.01); G06F 16/2455 (2019.01); G06F 16/248 (2019.01)
CPC G06F 16/24553 (2019.01) [G06F 16/24542 (2019.01); G06F 16/248 (2019.01)]
20 Claims
1. A method, comprising:
identifying, by one or more processors coupled to memory, a multi-dimensional sample dataset comprising a plurality of samples, each of the plurality of samples comprising a first tensor identified by a respective first identifier;
identifying, by the one or more processors, a query for the multi-dimensional sample dataset, the query specifying a sampling operation for the multi-dimensional sample dataset, the sampling operation of the query indicating an expression including the respective first identifier of the first tensor and a first weight for a probability distribution of query results to select from the plurality of samples of the multi-dimensional sample dataset;
parsing, by the one or more processors, the query to extract the sampling operation, the expression, and the first weight;
executing, by the one or more processors, the query based on the sampling operation to randomly select a subset of samples from the plurality of samples as a set of query results, the subset of samples selected to include a first number of samples that satisfy the expression for the first tensor, the first number of samples determined based on the first weight; and
providing, by the one or more processors, as output, the set of query results.
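
The steps recited in claim 1 (identify a dataset and a query, parse out the sampling operation, expression, and weight, then randomly select results) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `SAMPLE BY ... WEIGHT ... LIMIT ...` query syntax, the function names `parse_query` and `execute_query`, the representation of each sample as a dict mapping a tensor identifier to a flat list of numbers, and the reading of the weight as the fraction of results that should satisfy the expression.

```python
import operator
import random
import re

# Comparison operators supported by the hypothetical query syntax.
OPS = {"==": operator.eq, "!=": operator.ne, "<=": operator.le,
       ">=": operator.ge, "<": operator.lt, ">": operator.gt}

# Hypothetical syntax, invented for this sketch:
#   SAMPLE BY <tensor_id> <op> <value> WEIGHT <weight> LIMIT <n>
QUERY_RE = re.compile(
    r"SAMPLE BY (?P<tensor>\w+) (?P<op>==|!=|<=|>=|<|>) (?P<value>-?\d+(?:\.\d+)?)"
    r" WEIGHT (?P<weight>\d+(?:\.\d+)?) LIMIT (?P<limit>\d+)$"
)


def parse_query(query):
    """Extract the expression (as a predicate), the weight, and the result limit."""
    m = QUERY_RE.match(query.strip())
    if m is None:
        raise ValueError(f"unsupported query: {query!r}")
    weight = float(m["weight"])
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    tensor_id = m["tensor"]
    compare, value = OPS[m["op"]], float(m["value"])

    def expression(sample):
        # Assumption: a sample satisfies the expression if any element of
        # the identified tensor passes the comparison.
        return any(compare(x, value) for x in sample[tensor_id])

    return expression, weight, int(m["limit"])


def execute_query(dataset, query, rng=None):
    """Randomly select up to `limit` samples; the weight fixes how many match."""
    rng = rng or random.Random()
    expression, weight, limit = parse_query(query)
    matching = [s for s in dataset if expression(s)]
    rest = [s for s in dataset if not expression(s)]
    # Assumption: the weight is the target fraction of matching results.
    n_match = min(round(weight * limit), len(matching))
    n_rest = min(limit - n_match, len(rest))
    # Draw without replacement from each group, then mix the two draws.
    results = rng.sample(matching, n_match) + rng.sample(rest, n_rest)
    rng.shuffle(results)
    return results


if __name__ == "__main__":
    # Toy dataset: 20 samples, each with one tensor identified as "labels".
    dataset = [{"labels": [i % 2]} for i in range(20)]
    results = execute_query(dataset, "SAMPLE BY labels == 1 WEIGHT 0.75 LIMIT 8")
    print(len(results), sum(s["labels"] == [1] for s in results))
```

With this toy dataset, the query returns 8 samples, 6 of which satisfy `labels == 1`, because the count of matching samples is computed from the weight before the random draw. An actual system could equally treat the weight as a relative sampling probability; the claim language leaves that choice open, and this sketch picks only one reading.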