US 11,704,216 B2
Dynamically adjusting statistics collection time in a database management system
Rafal P. Konik, Oronoco, MN (US); Roger A. Mittelstadt, Byron, MN (US); Brian R. Muras, Otsego, MN (US); and Chad A. Olstad, Rochester, MN (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on May 23, 2019, as Appl. No. 16/420,239.
Application 16/420,239 is a continuation of application No. 14/672,858, filed on Mar. 30, 2015, granted, now 10,372,578.
Prior Publication US 2019/0278682 A1, Sep. 12, 2019
Int. Cl. G06F 16/00 (2019.01); G06F 11/30 (2006.01); G06F 16/21 (2019.01); G06F 16/23 (2019.01); G06F 16/2453 (2019.01); G06F 11/34 (2006.01)
CPC G06F 11/302 (2013.01) [G06F 11/3409 (2013.01); G06F 11/3452 (2013.01); G06F 16/217 (2019.01); G06F 16/2379 (2019.01); G06F 16/24545 (2019.01)] 10 Claims
1. A computing device for determining a statistics collection time for a database table, comprising:
a memory to store a database and a database management system, the database management system having a statistics collection component, the statistics collection component including a data structure to store data associated with one or more commit cycles;
a processor to cause the statistics collection component to perform operations comprising:
receiving one or more commit cycles of the database, wherein each of the one or more commit cycles includes an associated data signature;
generating, based on the associated data signature for each of the one or more commit cycles, a commit cycle queue;
identifying, based on the commit cycle queue, uncommitted updates of a first commit cycle of the one or more commit cycles;
estimating a sum of predicted updates included in the one or more commit cycles before a second phase of the first commit cycle has begun, wherein each of the commit cycles is associated with a predicted number of updates;
determining whether the estimated sum of predicted updates is greater than a first threshold;
determining a progress point for the first commit cycle of the commit cycles,
wherein the progress point of the first commit cycle includes identifying a point during the life cycle of the first commit cycle, and the determining the progress point of the first commit cycle includes identifying the first commit cycle is at a middle of its life cycle;
comparing an age of a first set of statistics to a staleness threshold;
in response to the age of the first set of statistics exceeding the staleness threshold, collecting a second set of statistics using sampling until the first commit cycle finishes, wherein the collection of the second set of statistics does not require reading all of the data in the database table;
selecting a future time to collect statistics based on the progress point of the first commit cycle; and
collecting a third set of statistics when the future time is reached, wherein the collection of the third set of statistics requires reading all of the data in the database table.
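
The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `CommitCycle`, `Plan`, `progress_point`, and `plan_collection` are hypothetical, as are the field names and the numeric thresholds. The sketch makes three assumptions: the commit cycle queue is ordered by start time, with the data signature kept only as an identifier; the progress point is the fraction of predicted updates already applied; and the future collection time comes from a linear extrapolation of that fraction.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommitCycle:
    signature: str          # hypothetical label standing in for the data signature
    predicted_updates: int  # predicted number of updates for this cycle
    applied_updates: int    # uncommitted updates observed so far
    started_at: float       # start time in seconds


@dataclass
class Plan:
    sample_now: bool               # collect sampled statistics immediately
    full_scan_at: Optional[float]  # time for the full-table collection, if any


def progress_point(cycle: CommitCycle) -> float:
    """Fraction of predicted updates already applied, clamped to [0, 1]."""
    if cycle.predicted_updates <= 0:
        return 1.0
    return min(1.0, cycle.applied_updates / cycle.predicted_updates)


def plan_collection(cycles, stats_age, now, update_threshold, staleness_threshold):
    # Build the commit cycle queue (assumption: oldest cycle first).
    queue = deque(sorted(cycles, key=lambda c: c.started_at))
    if not queue:
        return Plan(False, None)

    # Estimate the sum of predicted updates and compare it to the first threshold.
    total_predicted = sum(c.predicted_updates for c in queue)
    if total_predicted <= update_threshold:
        return Plan(False, None)

    # Determine the progress point of the first commit cycle.
    first = queue[0]
    progress = progress_point(first)

    # Stale statistics trigger sampled collection while the cycle is running.
    sample_now = stats_age > staleness_threshold

    # Select a future time for the full collection (assumption: linear
    # extrapolation; with no progress yet there is nothing to extrapolate).
    if progress <= 0.0:
        return Plan(sample_now, None)
    elapsed = now - first.started_at
    remaining = elapsed * (1.0 - progress) / progress
    return Plan(sample_now, now + remaining)


cycles = [
    CommitCycle("A", predicted_updates=1000, applied_updates=500, started_at=0.0),
    CommitCycle("B", predicted_updates=200, applied_updates=0, started_at=50.0),
]
plan = plan_collection(cycles, stats_age=3600.0, now=100.0,
                       update_threshold=1000, staleness_threshold=1800.0)
print(plan)
```

In this example the first cycle is at the middle of its life cycle (500 of 1000 predicted updates after 100 seconds), so the sketch schedules the full-table collection for 100 seconds later and requests sampled statistics in the meantime because the existing statistics are older than the staleness threshold.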