US 12,013,867 B2
Distributed data processing using embedded hermetic and deterministic language
Timothée Peignier, Vancouver (CA); and Edward Steel, Vancouver (CA)
Assigned to Treasure Data, Inc., Mountain View, CA (US)
Filed by Treasure Data, Inc., Mountain View, CA (US)
Filed on Jul. 27, 2022, as Appl. No. 17/815,234.
Prior Publication US 2024/0037114 A1, Feb. 1, 2024
Int. Cl. G06F 16/20 (2019.01); G06F 8/30 (2018.01); G06F 8/70 (2018.01); G06F 16/25 (2019.01)
CPC G06F 16/252 (2019.01) [G06F 8/31 (2013.01); G06F 8/70 (2013.01)] 11 Claims
 
1. A computer system comprising:
one or more central processing units (CPUs) that are communicatively coupled to a system clock, one or more network interfaces, and one or more database interfaces;
digital electronic main memory that is communicatively coupled to the one or more CPUs and storing one or more sequences of stored program instructions which, when executed using the one or more CPUs, cause the one or more CPUs to execute a plurality of different consumer services of a SaaS-based data analytics platform, each of the consumer services hosting an instance of a sandboxed runtime for a hermetic and deterministic programming language;
user function storage that is communicatively coupled to one of the database interfaces and storing a plurality of different user functions, each of the user functions having been programmed using the programming language, each of the user functions being stored in association with a reference to a destination table of a destination database;
each of the consumer services being programmed to:
initiate a data ingestion process;
load a copy of a user function from the user function storage to the sandboxed runtime that is associated with a particular consumer service from the plurality of different consumer services;
using the sandboxed runtime local to the particular consumer service, execute the user function over records directed to the destination table identified in the reference;
filter the records or write new records resulting from the function to the destination table;
each of the consumer services being programmed to:
asynchronously, with respect to the execution of the particular consumer service, using a plurality of other consumer services among the plurality of different consumer services executing in the SaaS-based data analytics platform, initiate a plurality of other data ingestion processes for a plurality of datasets;
as part of each data ingestion process among the plurality of other data ingestion processes, load a second copy of the user function using a second sandboxed runtime that is associated with each other consumer service;
using the second sandboxed runtime that is local to each other consumer service, execute, over records of the datasets, the second copy of the user function directed to the destination table identified in the reference and filter the records or write new records resulting from the function to the destination table.
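
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (USER_FUNCTIONS, SandboxedRuntime, ConsumerService, run_all, the "transform" entry point, the "customers" table) is hypothetical, and Python's exec with a reduced set of builtins is only a stand-in for a genuinely hermetic and deterministic language runtime, which it does not provide. The sketch assumes that a user function returns None to filter a record and returns a record to have it written.

```python
import asyncio

# Hypothetical stand-in for "user function storage": each user function is
# stored as source text together with a reference to its destination table.
USER_FUNCTIONS = {
    "drop_empty_email": {
        "source": (
            "def transform(record):\n"
            "    if not record.get('email'):\n"
            "        return None\n"  # assumption: None means "filter this record"
            "    out = dict(record)\n"
            "    out['email'] = out['email'].lower()\n"
            "    return out\n"
        ),
        "destination_table": "customers",
    }
}


class SandboxedRuntime:
    """Illustrative only: exec with reduced builtins is NOT a real sandbox."""

    def __init__(self):
        # The user function sees no I/O, clock, or randomness here.
        self._globals = {"__builtins__": {"dict": dict}}

    def load(self, source):
        # Each runtime compiles its own copy of the user function.
        exec(source, self._globals)
        return self._globals["transform"]


class ConsumerService:
    """One consumer service hosting its own local runtime instance."""

    def __init__(self, name, destination_db):
        self.name = name
        self.runtime = SandboxedRuntime()
        self.db = destination_db

    async def ingest(self, function_name, records):
        entry = USER_FUNCTIONS[function_name]
        fn = self.runtime.load(entry["source"])
        table = self.db.setdefault(entry["destination_table"], [])
        for record in records:
            result = fn(record)
            if result is not None:  # filtered records are simply not written
                table.append(result)
            await asyncio.sleep(0)  # yield so other consumers can interleave


async def run_all(datasets):
    db = {}  # hypothetical in-memory destination database
    services = [ConsumerService(f"consumer-{i}", db) for i in range(len(datasets))]
    await asyncio.gather(
        *(svc.ingest("drop_empty_email", data) for svc, data in zip(services, datasets))
    )
    return db


DATASETS = [
    [{"id": 1, "email": "A@Example.com"}, {"id": 2, "email": ""}],
    [{"id": 3, "email": "b@example.com"}, {"id": 4}],
]
result = asyncio.run(run_all(DATASETS))
```

In this sketch each consumer service loads its own copy of the same stored function and runs it over its own dataset, and all of them write to the single destination table named in the stored reference. The cooperative asyncio scheduling only approximates the asynchronous execution of independent services described in the claim.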