| CPC G06F 16/2379 (2019.01) [G06F 16/254 (2019.01)] | 22 Claims |

|
1. A computer-implemented method comprising:
storing a plurality of event records corresponding to a respective plurality of user interactions with a graphical user interface for a website or an application in a persistent storage, the storing comprising:
receiving the plurality of event records;
grouping the plurality of event records and compressing the plurality of event records for storage in a compressed format as a compressed set of event records, wherein the compressed set of event records is stored in association with a file path; and
writing the file path of the compressed set of event records to a publish-subscribe queue;
ingesting, from the publish-subscribe queue, a stream of data comprising the compressed set of event records, wherein the compressed set of event records is accessed using the file path as retrieved from a publish/subscribe message, wherein the ingesting decompresses the compressed set of event records back into the plurality of event records and is performed in parallel to the storing;
determining that a first record from the plurality of event records includes a first anonymous identifier and a first known identifier;
adding a mapping between the first anonymous identifier and the first known identifier to an identifier resolution database;
determining that a second record from the plurality of event records includes a second anonymous identifier and no known identifier;
using the identifier resolution database to identify a second known identifier that is mapped to the second anonymous identifier;
updating the second record to include the second known identifier; and
including the first record and the updated second record in a training matrix for training a machine learning model to compute causal inferences associated with the plurality of user interactions.
|
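The storing, ingesting, and identifier-resolution steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and everything concrete in it is an assumption made for illustration: the function names (`store_batch`, `ingest_batch`, `resolve_identifiers`), the record fields (`anon_id`, `known_id`), the choice of gzip-compressed JSON as the compressed format, an in-process `queue.Queue` standing in for the publish-subscribe queue, and a plain dictionary standing in for the identifier resolution database. The sketch also runs the steps sequentially, whereas the claim recites ingesting in parallel to storing, and it resolves identifiers in two passes over a batch, which the claim does not require. Building the training matrix is not shown.

```python
import gzip
import json
import queue
import tempfile
from pathlib import Path


def store_batch(records, directory, batch_id, pubsub):
    """Group records into one batch, compress it, and publish its file path."""
    path = Path(directory) / f"events-{batch_id}.json.gz"  # hypothetical naming
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(records, fh)
    # Only the file path travels through the queue, not the records themselves.
    pubsub.put(str(path))
    return path


def ingest_batch(pubsub):
    """Read a file path from the queue and decompress the batch it points to."""
    path = pubsub.get()
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def resolve_identifiers(records, id_map):
    """Fill in missing known identifiers using anonymous-to-known mappings."""
    # Pass 1: records carrying both identifiers contribute a mapping.
    for rec in records:
        if rec.get("anon_id") and rec.get("known_id"):
            id_map[rec["anon_id"]] = rec["known_id"]

    # Pass 2: records with only an anonymous identifier are updated
    # when a mapping exists; otherwise they pass through unchanged.
    resolved = []
    for rec in records:
        if rec.get("anon_id") and not rec.get("known_id"):
            known = id_map.get(rec["anon_id"])
            if known is not None:
                rec = {**rec, "known_id": known}
        resolved.append(rec)
    return resolved


if __name__ == "__main__":
    events = [
        {"event": "login", "anon_id": "a1", "known_id": "u1"},
        {"event": "click", "anon_id": "a1", "known_id": None},
        {"event": "click", "anon_id": "a2", "known_id": None},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        pubsub = queue.Queue()  # stand-in for a publish-subscribe queue
        store_batch(events, tmp, 1, pubsub)
        ingested = ingest_batch(pubsub)
        id_map = {}  # stand-in for the identifier resolution database
        for rec in resolve_identifiers(ingested, id_map):
            print(rec)
```

In this sketch the second record gains the known identifier mapped from its anonymous identifier, while the third record, whose anonymous identifier has no mapping, is left as it was. A real deployment would presumably replace the in-process queue and dictionary with networked services and run the storing and ingesting paths concurrently.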