CPC G06N 20/00 (2019.01) [G06N 5/04 (2013.01)] | 20 Claims |
1. A method, comprising:
performing, at one or more computing devices:
providing access to a respective partition of a training data set of a machine learning model to a plurality of computing resources, including a first computing resource and a second computing resource, wherein the first computing resource is assigned to perform operations of a training technique on a first partition of the training data set, and wherein the second computing resource is assigned to perform operations of the training technique on a second partition of the training data set;
executing a training phase of the machine learning model on the first computing resource and the second computing resource according to the training technique;
detecting, during the training phase of the machine learning model, that a measure of progress of operations of the training technique through the first partition at the first computing resource exceeds a measure of progress of operations of the training technique through the second partition at the second computing resource;
configuring, during the training phase, based at least in part on said detecting, one or more additional computing resources to perform at least a subset of remaining operations of the training technique on the second partition; and
executing the at least a subset of remaining operations of the training technique on the second partition on the one or more additional computing resources.
|
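The steps recited in claim 1 can be illustrated with a small, single-process simulation. This is only a sketch under stated assumptions, not the claimed implementation: the `Worker` class, the `find_straggler`, `add_helpers`, and `train` functions, the per-tick `rate`, and the `gap` threshold are all hypothetical names and parameters introduced here for illustration. The claim itself only requires that one measure of progress exceed the other; the threshold is an added assumption to avoid reacting to trivial differences.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical stand-in for a computing resource assigned one partition."""
    name: str
    pending: list   # remaining operations (here: record ids) on its partition
    rate: int       # operations completed per tick (assumed constant)
    total: int = 0
    done: list = field(default_factory=list)

    def __post_init__(self):
        self.total = len(self.pending)

    def progress(self) -> float:
        """Measure of progress: fraction of the assigned work completed."""
        return len(self.done) / self.total if self.total else 1.0

    def step(self):
        """Process up to `rate` pending operations."""
        batch, self.pending = self.pending[:self.rate], self.pending[self.rate:]
        self.done.extend(batch)


def find_straggler(workers, gap=0.25):
    """Return (leader, straggler) if their progress differs by more than `gap`."""
    leader = max(workers, key=Worker.progress)
    straggler = min(workers, key=Worker.progress)
    if leader.progress() - straggler.progress() > gap:
        return leader, straggler
    return None


def add_helpers(straggler, n_helpers, rate):
    """Move a subset of the straggler's remaining operations to new workers."""
    keep = len(straggler.pending) // (n_helpers + 1)
    handoff = straggler.pending[keep:]
    straggler.pending = straggler.pending[:keep]
    straggler.total -= len(handoff)
    # Split the handed-off operations round-robin across the helpers.
    return [
        Worker(f"{straggler.name}-helper{i}", handoff[i::n_helpers], rate)
        for i in range(n_helpers)
    ]


def train(partitions, rates, n_helpers=2, helper_rate=4):
    """Run the simulated training phase, rebalancing at most once."""
    workers = [
        Worker(f"w{i}", list(p), r)
        for i, (p, r) in enumerate(zip(partitions, rates))
    ]
    rebalanced = False
    while any(w.pending for w in workers):
        for w in workers:
            w.step()
        if not rebalanced:
            found = find_straggler(workers)
            if found:
                _, straggler = found
                workers.extend(add_helpers(straggler, n_helpers, helper_rate))
                rebalanced = True
    return workers


if __name__ == "__main__":
    result = train([range(0, 20), range(20, 40)], rates=[5, 1])
    for w in result:
        print(w.name, len(w.done))
```

In this sketch the lagging worker keeps a share of its remaining operations and the rest are divided among the added workers, which corresponds to the "at least a subset" language of the claim. A real system would also need to merge model state or gradients from the additional resources, which is outside what this simulation models.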