US 11,768,954 B2
System, method and computer-accessible medium for capturing data changes
Mayur Jagtap, Frisco, TX (US); Naga Venkata Sriram Vadakattu, Frisco, TX (US); Abhijit Chitnis, Frisco, TX (US); Janardhan Deepak Prabhakara, Allen, TX (US); Anurag Jain, Murphy, TX (US); Parvesh Kumar, Plano, TX (US); Rahul Surendra Nath, Frisco, TX (US); Behdad Forghani, Allen, TX (US); and Mark Assousa, Plano, TX (US)
Assigned to CAPITAL ONE SERVICES, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Jun. 16, 2020, as Appl. No. 16/902,535.
Prior Publication US 2021/0390204 A1, Dec. 16, 2021
Int. Cl. G06F 21/62 (2013.01); G06F 21/31 (2013.01); G06F 16/17 (2019.01); G06F 16/22 (2019.01); G06F 16/18 (2019.01); G06F 16/21 (2019.01)
CPC G06F 21/6245 (2013.01) [G06F 16/1734 (2019.01); G06F 16/1865 (2019.01); G06F 16/211 (2019.01); G06F 16/2282 (2019.01); G06F 21/31 (2013.01)]
18 Claims
1. A non-transitory computer-accessible medium having stored thereon computer-executable instructions for optimal batch processing of transaction streams in real-time, wherein, when a computer arrangement executes the instructions, the computer arrangement is configured to perform procedures comprising:
decomposing each transaction in a transaction stream into one or more discrete events, wherein the transaction stream is decomposed into a plurality of discrete events;
determining a fingerprint for each of the plurality of discrete events by mapping one or more event data items into a shorter bit string that uniquely identifies a corresponding event data item;
receiving a plurality of schemas in a schema stream transmitted via a distinct schema channel, wherein the plurality of schemas are transmitted once and stored in a cache memory;
determining at least one schema for each of the plurality of discrete events based on a corresponding fingerprint, wherein a change in one or more data values associated with a discrete event results in a new schema;
compressing two or more discrete events corresponding to a common row into a single discrete event, the common row being identified based on the at least one schema, wherein the at least one schema is enhanced with one or more items of supplemental metadata provided by a metadata registry;
storing the plurality of discrete events across a parallel arrangement of one or more source tables, based on a schema hash value associated with the at least one schema;
scanning, in parallel, the one or more source tables for one or more untokenized data elements corresponding to a sensitive information record, wherein each of the one or more source tables is associated with a distinct schema hash value of one or more discrete events;
tokenizing the one or more untokenized data elements in parallel across the one or more source tables; and
writing back the one or more tokenized data elements to the one or more corresponding source tables in parallel.
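
The procedures recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`fingerprint`, `decompose`, `compress`, `tokenize_table`, `run_pipeline`, `SENSITIVE_FIELDS`, the `tok_` prefix) is hypothetical, and several simplifications are assumed: the fingerprint is a truncated SHA-256 over an event's field names, the schema cache is a plain dictionary filled once up front, source tables are in-memory lists keyed by that fingerprint, and tokenization is a hash stand-in rather than a token vault.

```python
import hashlib
import json
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

SENSITIVE_FIELDS = {"ssn"}  # assumption: which fields count as sensitive


def fingerprint(field_names):
    """Map an event's field names to a short (64-bit) hex string.

    Truncating SHA-256 makes collisions unlikely, not impossible.
    """
    canonical = json.dumps(sorted(field_names)).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:16]


def decompose(transaction):
    """Split one transaction into its discrete row-level events."""
    return list(transaction["changes"])


def compress(events, schema_cache):
    """Merge events that touch the same row; later values win."""
    merged = {}
    for event in events:
        fp = fingerprint(event.keys())
        key_cols = schema_cache[fp]["key"]  # schema looked up by fingerprint
        row_id = (fp, tuple(event[c] for c in key_cols))
        merged.setdefault(row_id, {}).update(event)
    return merged


def tokenize_value(value):
    """Stand-in tokenizer (assumption): a prefixed, truncated hash."""
    return "tok_" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def tokenize_table(rows):
    """Scan one table for untokenized sensitive values and replace them."""
    out = []
    for row in rows:
        row = dict(row)
        for field in row.keys() & SENSITIVE_FIELDS:
            if not str(row[field]).startswith("tok_"):
                row[field] = tokenize_value(str(row[field]))
        out.append(row)
    return out


def run_pipeline(transactions, schema_cache):
    """Decompose, compress, store by schema hash, then tokenize in parallel."""
    events = [e for t in transactions for e in decompose(t)]
    tables = defaultdict(list)  # one source table per schema fingerprint
    for (fp, _key), row in compress(events, schema_cache).items():
        tables[fp].append(row)
    with ThreadPoolExecutor() as pool:  # one scan/tokenize task per table
        results = pool.map(tokenize_table, tables.values())
        return dict(zip(tables.keys(), results))  # "write back" per table


# Schemas arrive once on their own channel and are cached by fingerprint.
schema_cache = {
    fingerprint(["account_id", "ssn"]): {"key": ["account_id"]},
    fingerprint(["order_id", "amount"]): {"key": ["order_id"]},
}
transactions = [
    {"changes": [
        {"account_id": 1, "ssn": "123-45-6789"},
        {"account_id": 1, "ssn": "987-65-4321"},  # same row: compressed
        {"order_id": 7, "amount": 10},
    ]},
]
result = run_pipeline(transactions, schema_cache)
```

In this sketch the two account events collapse into a single row before storage, so only the latest value is tokenized. Because each table is handled by its own task, the scan, tokenize, and write-back steps for different schemas do not block one another.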