US 12,141,072 B2
Method and system for evicting and reloading a cache for machine learning training data streams
John Thomas Cardente, Milford, MA (US); and Qi Bao, Acton, MA (US)
Assigned to Dell Products, L.P., Round Rock, TX (US)
Filed by Dell Products L.P., Round Rock, TX (US)
Filed on Mar. 31, 2023, as Appl. No. 18/193,814.
Prior Publication US 2024/0330192 A1, Oct. 3, 2024
Int. Cl. G06F 12/0891 (2016.01); G06F 12/0871 (2016.01)
CPC G06F 12/0891 (2013.01) [G06F 12/0871 (2013.01)] 20 Claims
 
1. A method for managing training data, comprising:
    monitoring, by a training data stream manager (TDSM), a cache comprising a plurality of training data examples, each respectively associated with streams of mini-batch sequences scheduled to be transmitted to a machine learning training environment;
    making a first determination that a cache eviction is required;
    in response to the first determination:
        selecting a training data example of the plurality of training data examples;
        making a second determination that the training data example is eligible for cache eviction;
        in response to the second determination:
            evicting the training data example from the cache; and
            updating a training data example database entry to indicate that the training data example is evicted from the cache.
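
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class name, the in-memory dictionaries standing in for the cache and the training data example database, the capacity-based trigger for the first determination, and the lookahead-based eligibility rule for the second determination are all assumptions made for illustration; the claim itself does not specify them. The sketch also repeats the select/check/evict steps until the assumed trigger clears, which goes beyond the single example recited in the claim.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingDataStreamManager:
    """Hypothetical TDSM sketch; every name and policy here is an assumption."""
    capacity: int                                  # assumed maximum number of cached examples
    cache: dict = field(default_factory=dict)      # example_id -> payload
    database: dict = field(default_factory=dict)   # example_id -> {"cached": bool}
    pending: list = field(default_factory=list)    # upcoming mini-batches (lists of example ids)
    lookahead: int = 1                             # assumed eligibility window, in mini-batches

    def load(self, example_id, payload):
        """Place an example in the cache and record that in the database."""
        self.cache[example_id] = payload
        self.database[example_id] = {"cached": True}

    def eviction_required(self):
        # First determination (assumed trigger): the cache holds more than `capacity` examples.
        return len(self.cache) > self.capacity

    def eligible(self, example_id):
        # Second determination (assumed rule): the example is not needed
        # by any of the next `lookahead` scheduled mini-batches.
        soon = {e for batch in self.pending[: self.lookahead] for e in batch}
        return example_id not in soon

    def monitor_once(self):
        """Run one monitoring pass and return the ids that were evicted."""
        evicted = []
        for example_id in list(self.cache):        # select a candidate example
            if not self.eviction_required():
                break
            if self.eligible(example_id):
                del self.cache[example_id]                     # evict from the cache
                self.database[example_id]["cached"] = False    # update the database entry
                evicted.append(example_id)
        return evicted


# Usage: three cached examples, capacity of two; "a" and "b" are needed by the next mini-batch.
tdsm = TrainingDataStreamManager(capacity=2, pending=[["a", "b"], ["c"]])
for ex in ("a", "b", "c"):
    tdsm.load(ex, payload=b"example-bytes")
print(tdsm.monitor_once())
```

In this usage, only "c" falls outside the assumed one-batch lookahead window, so it is the only candidate that passes the second determination. Its database entry is marked as no longer cached, which is what a later reload step would consult.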