US 11,836,576 B2
Distributed machine learning at edge nodes
Shiqiang Wang, White Plains, NY (US); Tiffany Tuor, London (GB); Theodoros Salonidis, Boston, MA (US); Christian Makaya, Summit, NJ (US); and Bong Jun Ko, Harrington Park, NJ (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 13, 2018, as Appl. No. 15/952,625.
Prior Publication US 2019/0318268 A1, Oct. 17, 2019
Int. Cl. G06Q 30/00 (2023.01); G06N 20/00 (2019.01); H04L 67/10 (2022.01); H04L 67/12 (2022.01)
CPC G06N 20/00 (2019.01) [H04L 67/10 (2013.01); H04L 67/12 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A computer-implemented method for distributed machine learning at an edge node of a distributed computer system comprising a plurality of edge nodes and a synchronization node, the method comprising:
executing, by the edge node, a training process of a machine learning model for a number of iterations to generate a model parameter based at least in part on a local dataset of the edge node and a global model parameter of the distributed computer system;
estimating, by the edge node, a resource parameter set indicative of a plurality of resources available at the edge node;
sending the model parameter and the resource parameter set to the synchronization node of the distributed computer system;
receiving, at the edge node, updates to the global model parameter and the number of iterations from the synchronization node based at least in part on the model parameter and the resource parameter set of the plurality of edge nodes of the distributed computer system, wherein updates to the global model parameter and updates to the number of iterations to perform the training process of the machine learning model are broadcast to the plurality of edge nodes by the synchronization node, and wherein updates to the number of iterations define a number of training steps to be performed between two global synchronizations for each of the plurality of edge nodes; and
repeating the training process of the machine learning model at the edge node to determine an update to the model parameter based at least in part on the local dataset and updates to the global model parameter and the number of iterations from the synchronization node.
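The loop recited in claim 1 — local training for a number of iterations, estimation of available resources, upload to a synchronization node, global aggregation with an adapted iteration count, and repetition — can be sketched in simulation as below. This is an illustrative sketch only, not the patented method: the function names (`local_training`, `synchronize`), the linear-regression task, the single `step_time` resource estimate, and the budget-based rule for adapting the iteration count `tau` are all assumptions introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_training(w_global, X, y, tau, lr=0.1):
    """Run tau local gradient steps on the edge node's local dataset,
    starting from the global model parameter."""
    w = w_global.copy()
    for _ in range(tau):
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w

def estimate_resources():
    """Hypothetical resource parameter set: here, just a per-step time."""
    return {"step_time": rng.uniform(0.01, 0.05)}

def synchronize(local_params, resources, tau, budget=0.5):
    """Aggregate edge parameters and adapt the number of local iterations
    so the slowest node fits within a (hypothetical) time budget per round."""
    w_global = np.mean(local_params, axis=0)
    slowest = max(r["step_time"] for r in resources)
    tau_new = max(1, min(tau, int(budget / slowest)))
    return w_global, tau_new

# Simulated distributed computer system: 3 edge nodes, each with a local dataset.
d, n = 5, 40
w_true = rng.normal(size=d)
datasets = []
for _ in range(3):
    X = rng.normal(size=(n, d))
    datasets.append((X, X @ w_true + 0.01 * rng.normal(size=n)))

w_global, tau = np.zeros(d), 10
for _ in range(20):  # global synchronization rounds
    results = [(local_training(w_global, X, y, tau), estimate_resources())
               for X, y in datasets]
    local_params = np.stack([w for w, _ in results])
    resources = [r for _, r in results]
    # The synchronization node broadcasts both the updated global
    # parameter and the updated iteration count to all edge nodes.
    w_global, tau = synchronize(local_params, resources, tau)

print("final error:", np.linalg.norm(w_global - w_true))
```

In this sketch the synchronization step plays both roles recited in the claim: it produces the updated global model parameter (here, a simple average) and the updated number of iterations, i.e., the number of local training steps each edge node performs between two global synchronizations.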