US 12,007,983 B2
Optimization of application of transactional information for a hybrid transactional and analytical processing architecture
Ippokratis Pandis, Menlo Park, CA (US); Gokul Soundararajan, San Jose, CA (US); Gopal Paliwal, Milpitas, CA (US); Vadim Skipin, Berlin (DE); and Sanuj Basu, San Mateo, CA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jun. 30, 2022, as Appl. No. 17/810,318.
Prior Publication US 2024/0004867 A1, Jan. 4, 2024
Int. Cl. G06F 16/23 (2019.01); G06F 16/25 (2019.01)
CPC G06F 16/2379 (2019.01) [G06F 16/2358 (2019.01); G06F 16/254 (2019.01)]
20 Claims
 
1. A system, comprising:
one or more compute nodes organized into a node cluster, wherein the one or more compute nodes are configured to:
implement an analytical database; and
maintain, at the analytical database, a representation of at least a portion of a table of a separate transactional database implemented via one or more computing devices, wherein to maintain the representation of the at least a portion of the table, the one or more compute nodes of the node cluster are further configured to:
receive respective snapshots of segments of the at least a portion of the table of the separate transactional database and receive checkpoints relative to the respective snapshots, wherein:
the checkpoints comprise transactional changes that have been applied at the separate transactional database;
the transactional changes of a given checkpoint comprise two or more delete events and one or more insert events; and
the transactional changes are labeled with respective primary keys corresponding to respective rows that the transactional changes occur at in the separate transactional database; and
implement the transactional changes of the given checkpoint to its corresponding snapshot, wherein to implement comprises:
commit the one or more insert events to the corresponding snapshot;
commit the two or more delete events to a shadow table;
responsive to the shadow table having a given threshold of committed delete events of the two or more delete events, commit the committed delete events of the shadow table to the corresponding snapshot; and
responsive to the commit the committed delete events in the shadow table to the corresponding snapshot, remove the committed delete events from the shadow table; and
wherein, responsive to receiving an incoming analytical query, the one or more compute nodes provide the results of the incoming analytical query based, at least in part, on the committed delete events in the shadow table.
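The checkpoint-application steps recited in claim 1 can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the class name SnapshotReplica, the event tuple layout, the use of a dictionary keyed by primary key to stand in for a snapshot, and the count-based threshold are all assumptions made for illustration. Insert events are committed directly to the snapshot, delete events are first committed to a shadow table, the shadow table is flushed to the snapshot once it holds the threshold number of deletes, and queries filter out rows whose deletes are still pending in the shadow table.

```python
from dataclasses import dataclass, field


@dataclass
class SnapshotReplica:
    """Hypothetical stand-in for one snapshot and its shadow table."""

    delete_threshold: int = 2                          # assumed count-based threshold
    rows: dict = field(default_factory=dict)           # primary key -> row (the snapshot)
    shadow_deletes: set = field(default_factory=set)   # primary keys with pending deletes

    def apply_checkpoint(self, events):
        """Apply events shaped as (kind, primary_key, row), kind in {"insert", "delete"}."""
        for kind, pk, row in events:
            if kind == "insert":
                # Inserts are committed straight to the snapshot.
                self.rows[pk] = row
                # Simplification (assumption): a fresh insert supersedes a
                # pending delete of the same key, so the new row stays visible.
                self.shadow_deletes.discard(pk)
            elif kind == "delete":
                # Deletes are committed to the shadow table first.
                self.shadow_deletes.add(pk)
                if len(self.shadow_deletes) >= self.delete_threshold:
                    self._flush_shadow_table()
            else:
                raise ValueError(f"unknown event kind: {kind!r}")

    def _flush_shadow_table(self):
        """Commit the batched deletes to the snapshot, then empty the shadow table."""
        for pk in self.shadow_deletes:
            self.rows.pop(pk, None)
        self.shadow_deletes.clear()

    def query(self):
        """Return visible rows: snapshot rows minus deletes pending in the shadow table."""
        return {pk: row for pk, row in self.rows.items()
                if pk not in self.shadow_deletes}


# Usage: one delete stays in the shadow table; a second one triggers the flush.
replica = SnapshotReplica(delete_threshold=2, rows={1: "a", 2: "b", 3: "c"})
replica.apply_checkpoint([("insert", 4, "d"), ("delete", 1, None)])
visible_before_flush = replica.query()
replica.apply_checkpoint([("delete", 2, None)])
visible_after_flush = replica.query()
```

Batching deletes in this way lets the snapshot be rewritten once per threshold rather than once per delete, while the filter in query keeps results consistent with the transactional source in the meantime.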