| CPC G06F 21/62 (2013.01) | 20 Claims |

|
1. A system for tokenizing data, the system comprising:
a key-value database configured to store a mapping;
an application server, the application server comprising an application processor and application memory, the application processor configured to execute: a requesting application to send an unscheduled tokenization request, and a predictive model to obtain a prediction based on a tokenization response to the unscheduled tokenization request, wherein the predictive model is pre-trained using a machine learning training dataset;
a batch processing computer operatively coupled to the key-value database, the batch processing computer comprising a memory and a processor configured to perform a set of batch operations as part of a scheduled batch to produce a batch output, the batch output comprising the machine learning training dataset for pre-training the predictive model, wherein the set of batch operations comprises:
obtaining a payload to be tokenized as part of the scheduled batch and metadata associated with the payload;
identifying a data type of the payload from the metadata associated with the payload; and
processing the payload as part of the scheduled batch to generate a key having a standard format based on the data type of the payload;
locking the key-value database for access by the batch processing computer;
while the key-value database is locked, conducting a search, as part of the scheduled batch, within the key-value database for a value corresponding to the key;
returning the value as the token in the batch output when the search is successful and otherwise:
generating, as part of the scheduled batch, a new value based on the payload, the new value generated based on a universally unique identifier;
appending, as part of the scheduled batch, an entry to the key-value database, where the entry comprises the key and the new value; and
returning the new value as the token in the batch output; and
in response to returning the token in the batch output, unlocking the key-value database; and
an on-demand tokenization server operatively coupled to the key-value database, the on-demand tokenization server comprising a tokenization server memory and a tokenization server processor, the tokenization server processor configured to:
receive the unscheduled tokenization request from the requesting application during a period in which the key-value database is locked by the batch processing computer, the unscheduled tokenization request comprising a second payload to be tokenized for input to the predictive model to obtain the prediction and metadata associated with the second payload;
identify a data type of the second payload from the metadata associated with the second payload;
process the second payload, individually and in real-time in response to the unscheduled tokenization request, to generate a second key having a standard format based on the data type of the second payload, the second key identical to the key;
conduct a search within the key-value database for the value corresponding to the key and the second key; and
respond to the unscheduled tokenization request individually with the tokenization response comprising the value as the token.
|
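For illustration only, the batch path and the on-demand path of claim 1 might be sketched as follows. This is a minimal, non-authoritative sketch and not the claimed implementation. The names `KeyValueStore`, `make_key`, `run_batch`, and `tokenize_on_demand` are hypothetical, as are the `data_type` metadata field and the `email` and `phone` normalization rules. An in-memory dictionary stands in for the key-value database, and a random `uuid.uuid4()` stands in for the UUID-based value, since the claim does not name a UUID variant. The sketch also assumes that the on-demand lookup is a read that does not wait on the batch lock, and that a missed lookup returns `None`; the claim does not specify either behavior.

```python
import threading
import uuid


class KeyValueStore:
    """Hypothetical in-memory stand-in for the key-value database."""

    def __init__(self):
        self._entries = {}
        # Lock taken by the batch processor around search-and-append.
        self.batch_lock = threading.Lock()

    def get(self, key):
        return self._entries.get(key)

    def append(self, key, value):
        self._entries[key] = value


def make_key(payload, metadata):
    """Build a standard-format key from the payload and its data type.

    The normalization rules here are assumptions for illustration.
    """
    data_type = metadata["data_type"]
    text = payload.strip()
    if data_type == "email":
        normalized = text.lower()
    elif data_type == "phone":
        normalized = "".join(ch for ch in text if ch.isdigit())
    else:
        normalized = text
    return f"{data_type}:{normalized}"


def run_batch(store, records):
    """Tokenize (payload, metadata) pairs as one scheduled batch."""
    batch_output = []
    for payload, metadata in records:
        key = make_key(payload, metadata)
        with store.batch_lock:  # lock, search, append if missing
            token = store.get(key)
            if token is None:
                token = uuid.uuid4().hex  # assumed UUID variant
                store.append(key, token)
            batch_output.append(token)
        # Leaving the with-block unlocks after the token is returned.
    return batch_output


def tokenize_on_demand(store, payload, metadata):
    """Answer a single unscheduled request with a lookup only.

    Assumes the read does not wait on the batch lock; returns None
    when no entry exists for the key.
    """
    key = make_key(payload, metadata)
    return store.get(key)
```

Because both paths derive the key through the same `make_key` function, a payload submitted on demand resolves to the same token the batch produced for it, which is the property the claim relies on when it requires the second key to be identical to the key.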