US 12,216,680 B1
High-throughput data replication
Moied Mohammed Abdul Wahid, Cupertino, CA (US); Xun Lai, San Jose, CA (US); and Binh Nguyen, San Jose, CA (US)
Assigned to Experian Information Solutions, Inc., Costa Mesa, CA (US)
Filed by Experian Information Solutions, Inc., Costa Mesa, CA (US)
Filed on Nov. 27, 2023, as Appl. No. 18/520,304.
Int. Cl. G06F 16/27 (2019.01); G06F 16/23 (2019.01); G06F 21/62 (2013.01)
CPC G06F 16/273 (2019.01) [G06F 16/2358 (2019.01); G06F 21/6245 (2013.01)]
18 Claims
 
1. A computer-implemented method for securely replicating data to remote network-accessible storage, the computer-implemented method comprising:
receiving, by an encryption application, a plurality of messages published by a first persistent message queue service, wherein the plurality of messages each relate to changes made to an on-premises database located remotely from a corresponding network-accessible database service, wherein the network-accessible database service stores at least a portion of data matching corresponding data that was stored in the on-premises database prior to the changes made to the on-premises database;
encrypting, by the encryption application, at least personally identifiable information within the plurality of messages at a field level, such that the encryption application separately encrypts individual fields of the personally identifiable information;
publishing, via a second persistent message queue service, a plurality of encrypted messages that correspond to the plurality of messages, wherein the plurality of encrypted messages include personally identifiable information encrypted at a field level as output by the encryption application;
receiving, at a replication application, at least (a) a first batch of encrypted messages of the plurality of encrypted messages as published by the second persistent message queue service and (b) a second batch of encrypted messages of the plurality of encrypted messages as published by the second persistent message queue service, wherein the first and second batches of encrypted messages each include hundreds of encrypted messages;
grouping the first batch of encrypted messages into a first plurality of sub-batches;
grouping the second batch of encrypted messages into a second plurality of sub-batches;
providing encrypted messages from at least the first plurality of sub-batches to a plurality of queues for parallel processing, wherein a different instance of a worker application or service is assigned to process each queue of encrypted messages;
prior to processing completing for the first batch of encrypted messages, beginning to provide encrypted messages from at least the second plurality of sub-batches to at least one of the plurality of queues for processing;
subsequent to beginning to provide the encrypted messages from the at least the second plurality of sub-batches to the at least one of the plurality of queues for processing, determining that the first batch of encrypted messages have been fully processed;
in response to the determining that the first batch of encrypted messages have been fully processed, committing changes related to the first batch of encrypted messages to corresponding data stored by the network-accessible database service; and
in response to the determining that the second batch of encrypted messages have been fully processed, committing changes related to the second batch of encrypted messages to corresponding data stored by the network-accessible database service.
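
The batching steps recited above (grouping each batch into sub-batches, feeding the sub-batches to parallel queues with one worker per queue, starting on the second batch before the first has finished, and committing each batch only once it is fully processed) can be illustrated with a minimal Python sketch. This is not the patented implementation. The names split_into_sub_batches, BatchTracker, worker, replicate, and the commit callback are hypothetical; the round-robin assignment of sub-batches to queues is an assumption; and messages are treated as opaque, already-encrypted values.

```python
import queue
import threading
from collections import defaultdict


def split_into_sub_batches(messages, size):
    """Group one batch into consecutive sub-batches of at most `size` messages."""
    return [messages[i:i + size] for i in range(0, len(messages), size)]


class BatchTracker:
    """Tracks outstanding messages per batch; commits a batch when none remain."""

    def __init__(self, commit):
        self._lock = threading.Lock()
        self._remaining = {}
        self._staged = defaultdict(list)
        self._commit = commit  # hypothetical callback: commit(batch_id, changes)

    def register(self, batch_id, count):
        with self._lock:
            self._remaining[batch_id] = count

    def mark_processed(self, batch_id, change):
        with self._lock:
            self._staged[batch_id].append(change)
            self._remaining[batch_id] -= 1
            if self._remaining[batch_id] > 0:
                return
            del self._remaining[batch_id]
            changes = self._staged.pop(batch_id)
        # Commit outside the lock, only after the whole batch is processed.
        self._commit(batch_id, changes)


def worker(q, tracker):
    """One worker instance per queue; a None item is the shutdown sentinel."""
    while True:
        item = q.get()
        if item is None:
            return
        batch_id, message = item
        change = ("upsert", message)  # placeholder for real message processing
        tracker.mark_processed(batch_id, change)


def replicate(batches, commit, num_queues=4, sub_batch_size=50):
    """Feed batches to parallel queues; each batch commits when fully processed."""
    tracker = BatchTracker(commit)
    queues = [queue.Queue() for _ in range(num_queues)]
    threads = [threading.Thread(target=worker, args=(q, tracker)) for q in queues]
    for t in threads:
        t.start()
    for batch_id, messages in enumerate(batches):
        if not messages:
            continue  # nothing to process or commit for an empty batch
        tracker.register(batch_id, len(messages))
        # The next batch is enqueued without waiting for earlier batches to finish.
        sub_batches = split_into_sub_batches(messages, sub_batch_size)
        for i, sub_batch in enumerate(sub_batches):
            target = queues[i % num_queues]  # assumed round-robin assignment
            for message in sub_batch:
                target.put((batch_id, message))
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
```

In this sketch, calling replicate with two batches of a few hundred messages each and a commit callback that records its arguments results in one commit per batch. The callback runs on worker threads, so it is assumed to be thread-safe. Because the commit is triggered by a per-batch counter reaching zero rather than by a barrier between batches, workers can pick up messages from the second batch while stragglers from the first are still in flight.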