US 11,715,040 B1
Network switch with integrated gradient aggregation for distributed machine learning
William Brad Matthews, San Jose, CA (US); and Puneet Agarwal, Cupertino, CA (US)
Assigned to Innovium, Inc., San Jose, CA (US)
Filed by Innovium, Inc., San Jose, CA (US)
Filed on May 10, 2022, as Appl. No. 17/741,371.
Application 17/741,371 is a continuation of application No. 16/409,703, filed on May 10, 2019, granted, now 11,328,222.
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 20/00 (2019.01); H04L 67/10 (2022.01); H04L 47/2441 (2022.01); H04L 49/00 (2022.01); H04L 47/32 (2022.01); H04L 49/25 (2022.01)
CPC G06N 20/00 (2019.01) [H04L 47/2441 (2013.01); H04L 47/32 (2013.01); H04L 49/3027 (2013.01); H04L 67/10 (2013.01); H04L 49/252 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A network switching apparatus, comprising:
a plurality of communication interfaces configured to connect to specific computing devices in a network, including compute devices of a distributed learning system;
packet-switching logic configured to:
receive data units via the communication interfaces; and
forward first data units of the data units to destination devices identified for the first data units over the communication interfaces;
machine learning logic configured to:
recognize, in the data units, second data units carrying gradients of parameters in a machine learning model being trained against a training data set, each of the second data units carrying at least a portion of one of the gradients;
based on the second data units, aggregate sets of the gradients by performing one or more reduction operations on data sets within the gradients, the one or more reduction operations selected based on an attribute of the data sets; and
return the aggregated gradients to the compute devices.
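The claimed machine learning logic groups gradient-carrying data units, reduces them element-wise, and returns the aggregate to the compute devices. The following is an illustrative sketch only, not the patented implementation: it models "second data units" as (device, gradient id, values) tuples and uses summation as the reduction operation, one of several reductions the claim leaves open (the claim selects the operation based on an attribute of the data sets, a detail omitted here).

```python
# Toy model of in-network gradient aggregation per claim 1 (sketch only).
# Names such as `aggregate_gradients` and the tuple layout are illustrative
# assumptions, not taken from the patent.
from collections import defaultdict

def aggregate_gradients(data_units, reduce_fn=sum):
    """data_units: iterable of (device_id, gradient_id, values) tuples,
    where `values` is a list of floats carrying a portion of one gradient.
    Returns {gradient_id: aggregated_values}: the element-wise reduction
    across all contributing compute devices."""
    buckets = defaultdict(list)
    for device_id, gradient_id, values in data_units:
        buckets[gradient_id].append(values)
    aggregated = {}
    for gradient_id, contributions in buckets.items():
        # Element-wise reduction across the devices' contributions,
        # emulating the switch's reduction operation on the data sets.
        aggregated[gradient_id] = [reduce_fn(col) for col in zip(*contributions)]
    return aggregated

# Two workers each contribute a portion of gradient "g0"; the switch
# would return the element-wise sum to both.
units = [
    ("worker-a", "g0", [1.0, 2.0]),
    ("worker-b", "g0", [3.0, 4.0]),
]
result = aggregate_gradients(units)
```

In a real switch this reduction runs in the data plane at line rate, which is the point of the claim: aggregation happens in the network rather than at a parameter server.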