US 11,940,990 B1
Global clock values for consistent queries to replicated data
Sharatkumar Nagesh Kuppahally, Bellevue, WA (US); Ravi Math, Redmond, WA (US); Adam Douglas Morley, Seattle, WA (US); Ming-chuan Wu, Bellevue, WA (US); Wei Xiao, Bellevue, WA (US); and Rajaprabhu Thiruchi Loganathan, Issaquah, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jun. 16, 2017, as Appl. No. 15/625,976.
Int. Cl. G06F 16/23 (2019.01); G06F 16/245 (2019.01); G06F 16/27 (2019.01)
CPC G06F 16/2379 (2019.01) [G06F 16/245 (2019.01); G06F 16/27 (2019.01)]
20 Claims
1. A system, comprising:
a memory to store program instructions which, if performed by at least one processor, cause the at least one processor to perform a method to at least:
receive, at a first node, a query directed to a replicated portion of a data set, wherein the data set is stored at a plurality of other nodes and the replicated portion is stored at the first node;
identify, by the first node, a global time value for the query according to a global clock, wherein the global time value for the query is determined based on a local time when the query is received at the first node;
receive, at the first node from different ones of the plurality of other nodes, respective global time values identified for respective ones of a plurality of updates to the replicated portion, the updates received at the first node after being:
received at respective ones of the plurality of other nodes that store respective subsets of the data set;
identified with different respective local times at which the respective update is received at the respective node of the plurality of other nodes;
performed independently at the respective ones of the plurality of other nodes that store respective subsets of the data set; and
mapped to respective ones of the global time values according to the global clock based on the determined different respective local time;
wherein at least two of the respective local times of respective ones of at least two different updates received at the first node differ from one another and map to a same global time value of the respective global time values;
delay performance of the query at the first node until a determination that the replicated portion at the first node is consistent with the data set at the global time value identified for the query, wherein the determination:
compares the respective global time values received from the plurality of other nodes at the first node and that are identified for respective ones of the plurality of received updates, to the global time value identified for the query; and
based on the comparison, determines that respective performance of the plurality of updates to the replicated portion of the data set at the first node makes the replicated portion of the data set consistent with the data set at the global time value identified for the query;
perform the received updates to the replicated portion of the data set stored at the first node;
subsequent to performance of the received updates to the replicated portion of the data set stored at the first node, in response to the determination that the replicated portion at the first node is consistent with the data set, perform, by the first node, the query with respect to the replicated portion of the data set stored at the first node; and
return, by the first node, a result for the query performed by the first node.
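
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (EPOCH_MS, to_global_time, Update, ReplicaNode, try_query) is hypothetical, and it rests on three assumptions the claim does not state: the global clock is modeled as coarse ticks of fixed width, each source node delivers its updates in global time order, and the replica is treated as consistent at the query's global time once every source node has reported a strictly later global time value.

```python
from dataclasses import dataclass, field

# Assumption: one global clock tick spans 100 units of node-local time.
EPOCH_MS = 100


def to_global_time(local_ms: int) -> int:
    """Map a node-local timestamp to a coarse global clock value.

    Different local times inside the same tick map to the same global value.
    """
    return local_ms // EPOCH_MS


@dataclass
class Update:
    source: str        # node that performed the update independently
    key: str
    value: object
    global_time: int   # assigned by the source node from its local time


@dataclass
class ReplicaNode:
    sources: tuple                               # nodes storing the data set
    data: dict = field(default_factory=dict)     # the replicated portion
    pending: list = field(default_factory=list)  # received, not yet applied
    latest: dict = field(default_factory=dict)   # last global time per source

    def __post_init__(self):
        self.latest = {s: -1 for s in self.sources}

    def receive_update(self, update: Update) -> None:
        """Buffer an update and record the global time its source reported."""
        self.pending.append(update)
        self.latest[update.source] = max(
            self.latest[update.source], update.global_time
        )

    def is_consistent_at(self, query_time: int) -> bool:
        """Compare the received global time values to the query's value.

        Assumption: in-order delivery per source, so a source that has
        reported a later tick has no more updates at or before query_time.
        """
        return all(t > query_time for t in self.latest.values())

    def try_query(self, key: str, local_ms: int):
        """Return (True, result) if the query ran, (False, None) if delayed."""
        query_time = to_global_time(local_ms)
        if not self.is_consistent_at(query_time):
            return (False, None)  # caller retries after more updates arrive
        # Apply buffered updates up to the query's global time. The sort is
        # stable, so updates sharing a tick are applied in arrival order.
        self.pending.sort(key=lambda u: u.global_time)
        remaining = []
        for u in self.pending:
            if u.global_time <= query_time:
                self.data[u.key] = u.value
            else:
                remaining.append(u)
        self.pending = remaining
        return (True, self.data.get(key))
```

In this sketch, a caller that receives (False, None) would call try_query again with the same local time after further updates arrive. Applying same-tick updates in arrival order is a simplification chosen for the example; the claim does not specify how updates that share a global time value are ordered.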