US 11,714,573 B1
Storage optimization in a distributed object store
Shashank Bhardwaj, Jersey City, NJ (US); Roman Gavrilov, Marlboro, NJ (US); Brian Scott Ross, Woodmere, NY (US); Mehul A. Shah, Saratoga, CA (US); Benjamin Sowell, San Mateo, CA (US); Anthony A. Virtuoso, Hawthorne, NJ (US); and Linan Zheng, Jersey City, NJ (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 29, 2021, as Appl. No. 17/216,373.
Int. Cl. G06F 3/06 (2006.01)
CPC G06F 3/0659 (2013.01) [G06F 3/067 (2013.01); G06F 3/0613 (2013.01); G06F 3/0614 (2013.01); G06F 3/0653 (2013.01)]
20 Claims
1. A computer-implemented method comprising:
receiving, at a storage optimization service of a provider network, a message indicating a request to monitor a table for storage optimization, wherein the table is a logical table implemented at least in part using a table index identifying data objects storing records of the table;
receiving, at the storage optimization service from another service of the provider network, one or more messages indicating that a first set of data objects associated with the table have been added or removed, wherein the first set of data objects were or are stored by an object storage service of the provider network;
determining that the table meets one or more optimization criteria;
inserting an entry for the table into a queue of an optimization scheduler of the storage optimization service;
generating a priority score for the table based on determining a benefit that would result from optimizing the table;
updating a priority value of the entry for the table within the queue based on the priority score for the table, wherein the updating is performed asynchronously to the inserting of the entry for the table into the queue;
selecting the entry for the table from the queue based on the priority value of the entry for the table;
obtaining, from the object storage service, a second set of data objects referenced by the table;
performing an optimization involving the second set of data objects, comprising generating a third set of data objects based on the second set of data objects;
transmitting the third set of data objects to be stored by the object storage service; and
causing the table index for the table to be modified, via an atomic transactional update, to reference the third set of data objects.
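
The scheduling and commit steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`OptimizationQueue`, `TableIndex`, `priority_score`, `compact`) is hypothetical, an in-memory dict stands in for the object storage service, compaction of small objects is assumed as the optimization, and a compare-and-swap under a lock stands in for the atomic transactional update of the table index.

```python
import heapq
import itertools
import threading

_ids = itertools.count()  # hypothetical source of unique object keys


class OptimizationQueue:
    """Priority queue whose entries can be re-prioritized after insertion."""

    def __init__(self):
        self._heap = []   # items are (negated priority, sequence number, table)
        self._live = {}   # table -> sequence number of its current heap item
        self._seq = itertools.count()
        self._lock = threading.Lock()

    def insert(self, table, priority=0.0):
        with self._lock:
            seq = next(self._seq)
            self._live[table] = seq  # any older heap item for this table is now stale
            heapq.heappush(self._heap, (-priority, seq, table))

    def update_priority(self, table, priority):
        # May be called later, e.g. from a separate scoring thread, which is
        # how this sketch models an update asynchronous to the insertion.
        self.insert(table, priority)

    def pop(self):
        """Return the table with the highest priority, or None if empty."""
        with self._lock:
            while self._heap:
                _, seq, table = heapq.heappop(self._heap)
                if self._live.get(table) == seq:  # skip stale items
                    del self._live[table]
                    return table
            return None


class TableIndex:
    """Holds the object keys that currently make up a logical table."""

    def __init__(self, object_keys):
        self._keys = tuple(object_keys)
        self._lock = threading.Lock()

    def snapshot(self):
        return self._keys

    def swap(self, expected, replacement):
        """Replace the keys only if the index still references `expected`."""
        with self._lock:
            if self._keys != tuple(expected):
                return False  # the table changed concurrently; caller may retry
            self._keys = tuple(replacement)
            return True


def priority_score(object_sizes, target_size):
    """Assumed benefit estimate: number of objects compaction would eliminate."""
    objects_after = -(-sum(object_sizes) // target_size)  # ceiling division
    return max(0, len(object_sizes) - objects_after)


def compact(store, index, target_size):
    """Rewrite the table's records into objects of up to `target_size` records."""
    old_keys = index.snapshot()                                 # second set
    records = [rec for key in old_keys for rec in store[key]]
    new_keys = []
    for start in range(0, len(records), target_size):
        key = f"compacted-{next(_ids)}"
        store[key] = records[start:start + target_size]         # third set
        new_keys.append(key)
    # Old objects are left in the store; only the index reference changes.
    return index.swap(old_keys, new_keys)
```

In this sketch a table would be inserted into the queue as soon as it meets the optimization criteria, with `update_priority` called once `priority_score` has been computed. Pushing a fresh heap item and discarding stale ones at pop time avoids searching the heap for the existing entry. If `swap` returns `False`, the newly written objects are simply never referenced, so readers of the table never observe a partially applied optimization.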