CPC H04L 9/008 (2013.01) [G06N 20/00 (2019.01)]
20 Claims
1. A computer-implemented method comprising:
receiving, at a first data owner from a second data owner, a second set of data identifiers, the second set of data identifiers comprising identifiers of data usable in federated model training by the second data owner;
determining, at the first data owner by comparing the second set of data identifiers with a first set of data identifiers, an intersection set of data identifiers, the first set of data identifiers comprising identifiers of data usable in federated model training by the first data owner, the intersection set of data identifiers consisting of data identifiers present in both the first set of data identifiers and the second set of data identifiers, wherein the intersection set of data identifiers is stored in ascending numerical order;
rearranging, at the first data owner according to the intersection set of data identifiers, the data usable in federated model training by the first data owner, the rearranging resulting in a first training dataset sorted into ascending order of data identifiers in the intersection set of data identifiers, the first training dataset comprising a set of labels;
performing a training iteration of a model by computing, at the first data owner, a first partial set of model weights, the first partial set of model weights computed using the intersection set of data identifiers, the first training dataset, and a previous iteration of an aggregated set of model weights computed by the first data owner and the second data owner in a previous training iteration of the model;
receiving, at the first data owner from an aggregator, an updated aggregated set of model weights, the updated aggregated set of model weights comprising the first partial set of model weights and a second partial set of model weights received at the aggregator from the second data owner, the second partial set of model weights computed at the second data owner during the training iteration of the model, wherein the updated aggregated set of model weights comprises a result of the training iteration of the model; and
receiving, at the first data owner from the aggregator subsequent to performance of a plurality of training iterations including the training iteration, a trained version of the model, the trained version of the model comprising a final aggregated set of model weights computed at the first data owner and the second data owner.
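The identifier-alignment steps of claim 1 (receiving the second set of identifiers, determining the intersection in ascending numerical order, and rearranging the first owner's data to match) can be illustrated with a minimal sketch. It is not the claimed implementation. The names `intersect_ids`, `build_training_dataset`, and `aggregate`, the `(features, label)` record layout, and the choice to represent the aggregated weights as a per-owner mapping of partial weight lists are all assumptions made for illustration; the claim does not specify how the partial weights are computed or combined.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical record layout: (feature values, label). Not taken from the claim.
Record = Tuple[List[float], float]


def intersect_ids(first_ids: Iterable[int], second_ids: Iterable[int]) -> List[int]:
    """Return the identifiers present in both sets, in ascending numerical order."""
    return sorted(set(first_ids) & set(second_ids))


def build_training_dataset(
    data: Dict[int, Record], intersection: Sequence[int]
) -> List[Tuple[int, Record]]:
    """Rearrange one owner's data into the order of the intersection set.

    Records whose identifier is not shared with the other owner are dropped.
    """
    return [(identifier, data[identifier]) for identifier in intersection]


def aggregate(partials: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Assumed aggregator: the aggregated set simply comprises each owner's
    partial weights, keyed by owner name. A real aggregator may combine them
    differently."""
    return {owner: list(weights) for owner, weights in partials.items()}


# Example data held by the first data owner, keyed by identifier.
first_data: Dict[int, Record] = {
    7: ([0.5], 1.0),
    3: ([0.1], 0.0),
    12: ([0.9], 1.0),
}
# Identifiers received from the second data owner.
second_ids = [12, 5, 3]

shared_ids = intersect_ids(first_data.keys(), second_ids)
first_training_dataset = build_training_dataset(first_data, shared_ids)
aggregated = aggregate({"first": [0.2], "second": [0.4, -0.1]})
```

Because both owners sort by the same intersection set, row k of each owner's training dataset refers to the same identifier without any further exchange of identifiers during training.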