US 12,118,400 B2
Performing batched training for machine-learning pipelines
Martin Hirzel, Chappaqua, NY (US); Kiran A. Kate, Chappaqua, NY (US); and Avraham Ever Shinnar, Westchester, NY (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Nov. 29, 2021, as Appl. No. 17/537,258.
Prior Publication US 2023/0168938 A1, Jun. 1, 2023
Int. Cl. G06F 9/50 (2006.01); G06F 9/48 (2006.01); G06N 20/00 (2019.01)
CPC G06F 9/5038 (2013.01) [G06F 9/4881 (2013.01); G06F 9/505 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A computer-implemented method, comprising:
identifying a machine learning pipeline and a plurality of training data batches;
creating a plurality of tasks, based on the machine learning pipeline, the tasks including a transform task that includes an instance where an operator takes training data of one of the training data batches as input and outputs computed data, the tasks including a partial-fit task that includes an instance where an operator is trained partially using a single batch of the plurality of training data batches; and
determining an order in which the plurality of tasks is executed, utilizing a resource usage-aware approach, wherein the partial-fit task is prioritized over the transform task such that the partial-fit task is placed before the transform task within the determined order.