| CPC G06N 3/082 (2013.01) [G06N 3/045 (2023.01)] | 19 Claims |

1. A method performed by a central computing device for training a machine learning (ML) model comprising a neural network using federated learning performed by a plurality of client devices, the method comprising:
determining a computation capability of each client device of the plurality of client devices;
associating each client device with a dropout rate defining how much of each neural network layer of the ML model is to be included in a target submodel to be trained by the client device, based on the determined computation capability;
generating different sized submodels of the ML model that are nested within the ML model such that a largest submodel of the different sized submodels is included in the ML model and each increasingly smaller submodel of the different sized submodels is included in a next larger submodel of the different sized submodels, the different sized submodels being generated by using the dropout rate associated with each client device to perform pruning of at least one neuron of at least one neural network layer of the ML model according to a predefined ordering such that the at least one neuron that is pruned is not used by any submodel of the different sized submodels for which the at least one neuron is pruned; and
distributing, during each federated learning training round, at least two submodels of the different sized submodels to at least one client device of the plurality of client devices based on the dropout rate associated with the at least one client device, the at least two submodels including a submodel and at least one submodel nested within the submodel, from among the different sized submodels,
wherein the generating of the different sized submodels of the ML model that are nested within the ML model comprises:
using the dropout rate associated with each client device to perform the pruning, according to the predefined ordering, of one neural network layer of the ML model; and
using at least one further dropout rate to perform the pruning of at least one further neural network layer of the ML model.
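The nested-submodel generation recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation; it assumes dense layers represented as NumPy weight matrices, takes neuron index as the "predefined ordering", and uses hypothetical helper names (`submodel_mask`, `extract_submodel`). Because every submodel keeps a prefix of the same fixed ordering, each smaller submodel is nested inside the next larger one, and a pruned neuron is absent from every submodel pruned at that rate or higher.

```python
import numpy as np

def submodel_mask(layer_width, dropout_rate):
    """Boolean mask of neurons kept under a given dropout rate.
    Kept neurons are taken from the front of a fixed (predefined)
    ordering, so smaller submodels nest inside larger ones."""
    keep = max(1, int(np.ceil((1.0 - dropout_rate) * layer_width)))
    mask = np.zeros(layer_width, dtype=bool)
    mask[:keep] = True  # predefined ordering = neuron index (an assumption)
    return mask

def extract_submodel(weights, dropout_rates):
    """weights: list of (out_dim, in_dim) dense-layer matrices.
    dropout_rates: one rate per layer (a further rate for each
    further layer, as in the claim). Returns pruned matrices."""
    sub = []
    in_mask = np.ones(weights[0].shape[1], dtype=bool)  # keep all model inputs
    for W, p in zip(weights, dropout_rates):
        out_mask = submodel_mask(W.shape[0], p)
        sub.append(W[out_mask][:, in_mask])  # drop pruned rows and stale inputs
        in_mask = out_mask  # surviving outputs feed the next layer
    return sub

# Demo: two clients with different computation capabilities receive
# different sized submodels of the same full model.
rng = np.random.default_rng(0)
full = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
strong = extract_submodel(full, [0.25, 0.0])  # keeps 6 of 8 hidden neurons
weak = extract_submodel(full, [0.75, 0.0])    # keeps 2 of 8 hidden neurons

# Nesting property: the weak client's hidden layer is a leading
# sub-block of the strong client's hidden layer.
assert np.allclose(weak[0], strong[0][:2])
```

A central device following this pattern could distribute `weak` alongside the submodels nested within it to a low-capability client, while a high-capability client receives `strong` and its nested submodels, matching the per-client dropout rates.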