US 12,216,592 B1
Enhancing I/O performance using in-memory reservation state caching at block storage services
Barak Pinhas, Ganei Tikva (IL); Hen Guetta, Netanya (IL); and Alex Friedman, Hadera (IL)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Sep. 29, 2023, as Appl. No. 18/478,349.
Int. Cl. G06F 12/14 (2006.01); G06F 12/12 (2016.01)
CPC G06F 12/1466 (2013.01) [G06F 12/12 (2013.01)]
20 Claims
 
1. A system, comprising:
a first storage server of a block storage service of a cloud computing environment, wherein the first storage server comprises a processor, a main memory and a set of storage devices, wherein a plurality of worker threads run at the processor including a first worker thread and a second worker thread, wherein the set of storage devices comprises a portion of a logical volume, and wherein the set of storage devices comprises a first replica of an operations journal;
a metadata store comprising a reservation record which indicates a set of access permissions granted, with respect to the logical volume, to one or more compute instances of a computing service which are programmatically attached to the logical volume; and
a commit propagator;
wherein, in response to determining that a command to modify the reservation record has been received at the first storage server, the first worker thread is configured to:
acquire, in exclusive mode, a change-sequencing lock associated with the reservation record, wherein a first version of the reservation record is stored in a cache in the main memory of the first storage server, wherein the cache is populated by copying the first version from the metadata store, and wherein, while the change-sequencing lock is held in exclusive mode by the first worker thread, other worker threads are prevented from modifying the reservation record;
acquire, in shared mode after acquiring the change-sequencing lock in exclusive mode, a reading-writing lock associated with the reservation record;
release the reading-writing lock after generating, in accordance with the command to modify the reservation record, a second version of the reservation record, wherein the reading-writing lock is released without storing the second version of the reservation record in the cache;
after verifying that respective entries comprising the second version of the reservation record have been stored in the first replica of the operations journal and in at least a second replica of the operations journal, provide an indication that the command to modify the reservation record has succeeded, without verifying that the second version of the reservation record has been stored in the metadata store;
acquire the reading-writing lock in exclusive mode, wherein, while the reading-writing lock is held in exclusive mode by the first worker thread, other worker threads are prevented from reading the reservation record from the cache;
store, while holding the reading-writing lock in exclusive mode, the second version of the reservation record in the cache;
release the reading-writing lock and the change-sequencing lock; and
wherein the second worker thread is configured to:
determine that a command to perform an input/output (I/O) operation directed to the portion of the logical volume from a particular compute instance of the one or more compute instances has been received at the first storage server; and
in response to (a) acquiring the reading-writing lock in shared mode while the change-sequencing lock is held in exclusive mode by the first worker thread and (b) determining that an access permission indicated in a particular version of the reservation record which is present in the cache permits the I/O operation, release the reading-writing lock and initiate the I/O operation; and
wherein the commit propagator is configured to:
propagate, asynchronously with respect to storage of the second version of the reservation record in the cache at the first storage server, contents of the second version of the reservation record from a selected replica of the operations journal to the metadata store.
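
The two-lock ordering recited in claim 1 can be illustrated with a minimal single-process sketch. It is not the patented implementation: the names `RWLock`, `ReservationCache`, `modify`, `io_permitted`, and `propagate` are hypothetical, the reservation record is assumed to be a dictionary mapping a compute instance identifier to a set of permitted operations, and the journal replicas and metadata store are modeled as in-memory Python objects rather than storage devices or a remote service.

```python
import threading
from contextlib import contextmanager


class RWLock:
    """Minimal reader-writer lock (hypothetical helper, no fairness guarantees)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    @contextmanager
    def shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


class ReservationCache:
    """Assumed model: record maps instance id -> set of permitted operations."""

    def __init__(self, first_version):
        self.change_seq = threading.Lock()        # change-sequencing lock
        self.rw = RWLock()                        # reading-writing lock
        self.record = dict(first_version)         # cached first version
        self.journal_replicas = [[], []]          # stand-ins for two journal replicas
        self.metadata_store = dict(first_version)

    def modify(self, changes):
        # First worker thread: serialize modifications for the whole sequence.
        with self.change_seq:
            # Shared mode: readers may still consult the cache while the
            # second version is generated; the cache is not updated yet.
            with self.rw.shared():
                second = {**self.record, **changes}
            # Journal entries are stored in both replicas before success is
            # indicated; the metadata store is deliberately not touched here.
            for replica in self.journal_replicas:
                replica.append(second)
            acked = True
            # Exclusive mode: briefly block readers while swapping the cache.
            with self.rw.exclusive():
                self.record = second
        return acked

    def io_permitted(self, instance_id, op):
        # Second worker thread: needs only the reading-writing lock in shared
        # mode, so it can proceed while change_seq is held by a modifier.
        with self.rw.shared():
            return op in self.record.get(instance_id, ())

    def propagate(self):
        # Commit propagator: copy the latest journal entry to the metadata
        # store, decoupled from the cache update above.
        if self.journal_replicas[0]:
            self.metadata_store = dict(self.journal_replicas[0][-1])
```

In this sketch a reader is blocked only during the short exclusive section that swaps the cached version, not during journal replication, which is the property the claim appears to rely on for I/O performance. A real commit propagator would presumably run on its own schedule; here `propagate` is called explicitly to keep the example deterministic.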