US 12,381,609 B2
Wireless federated learning framework and resource optimization method
Hui Tian, Beijing (CN); Wanli Ni, Beijing (CN); Ping Zhang, Beijing (CN); Shaoshuai Fan, Beijing (CN); Gaofeng Nie, Beijing (CN); and Shilin Tao, Beijing (CN)
Assigned to BEIJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, Beijing (CN)
Filed by Beijing University of Posts and Telecommunications, Beijing (CN)
Filed on Nov. 6, 2023, as Appl. No. 18/387,068.
Claims priority of application No. 202310182495.3 (CN), filed on Mar. 1, 2023.
Prior Publication US 2024/0297700 A1, Sep. 5, 2024
Int. Cl. H04L 5/12 (2006.01); H04B 7/06 (2006.01)
CPC H04B 7/0626 (2013.01) [H04B 7/0639 (2013.01)] 17 Claims
OG exemplary drawing
 
1. A wireless federated learning (FL) framework, comprising:
N centralized learning (CL) users with limited computing resources, wherein the CL users send their training data to a base station, which performs CL on their behalf so that they participate in FL;
K FL users with sufficient computing resources, wherein the FL users obtain local models through local training data, and upload local model parameters as an aggregation model to the base station; and
the base station serving as an FL server and configured to compute a global model, wherein the base station performs CL on the training data accumulated by the CL users to obtain a CL model, and performs weighted summation on the CL model and the aggregation model based on a data amount to obtain the global model,
wherein it is assumed that there are T FL cycles in total, represented as a set T = {1, 2, …, T}; and in the tth FL cycle, a local model update formula of the kth FL user is as follows:
$w_k(t+1) = w(t) - \eta g_k(t), \qquad g_k(t) = \nabla F_k\big(w(t); D_k(t)\big), \quad \forall k$
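The local update above is a single gradient step on the user's own data. A minimal sketch of that step, assuming a squared-error loss for F_k purely for concreteness (the patent does not fix a particular loss, and the function name `local_update` is illustrative):

```python
import numpy as np

def local_update(w, eta, X_k, y_k):
    """One local step of FL user k: w_k(t+1) = w(t) - eta * g_k(t)."""
    residual = X_k @ w - y_k            # prediction error on the local data D_k(t)
    g_k = X_k.T @ residual / len(y_k)   # gradient g_k(t) of the assumed squared loss F_k
    return w - eta * g_k                # updated local model w_k(t+1)
```

Each FL user runs this step independently on its own (X_k, y_k) and uploads only the resulting parameters, never the raw data.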
a centralized update formula of the base station is as follows:
$w(t+1) = w(t) - \eta g(t), \qquad g(t) = \nabla F\big(w(t); D(t)\big)$
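The centralized update has the same form, but the gradient is taken over D(t), the training data accumulated at the base station from the CL users over cycles 1..t. A sketch under the same squared-loss assumption as above (`centralized_update` and the batch-list convention are illustrative, not from the patent):

```python
import numpy as np

def centralized_update(w, eta, accumulated):
    """Base-station CL step: w(t+1) = w(t) - eta * g(t), over accumulated data D(t).

    `accumulated` is a list of (X_n, y_n) batches uploaded by the CL users
    across cycles; a squared loss is assumed for illustration.
    """
    X = np.concatenate([X_n for X_n, _ in accumulated])
    y = np.concatenate([y_n for _, y_n in accumulated])
    g = X.T @ (X @ w - y) / len(y)   # gradient g(t) of F(w(t); D(t))
    return w - eta * g               # centralized model update
```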
a global model aggregation formula is as follows:

OG Complex Work Unit Math
wherein η is the learning rate of the stochastic gradient descent method, $D(t) = \sum_{i=1}^{t} \sum_{n=1}^{N} D_n(i)$ is the training data set accumulated at the base station over t cycles, $F_k(\cdot)$ and $g_k(t)$ are respectively the local loss function and the gradient of the kth FL user, $F(\cdot)$ and $g(t)$ are respectively the loss function and the gradient of the CL at the base station, and w(t+1) is the global model aggregated at the base station.
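The aggregation formula itself is printed as an image in the Gazette entry, but the claim states that the base station performs a weighted summation of the CL model and the FL users' models based on data amount. A hedged sketch of one plausible reading, with weights proportional to each model's data set size (the function name and weighting convention are assumptions, not the patent's exact formula):

```python
import numpy as np

def aggregate_global(local_models, local_sizes, w_cl, cl_size):
    """Data-amount-weighted summation producing the global model w(t+1).

    local_models: list of FL users' parameter vectors w_k(t+1)
    local_sizes:  list of the users' local data amounts |D_k|
    w_cl:         CL model computed at the base station
    cl_size:      amount of data accumulated at the base station |D(t)|
    """
    total = float(sum(local_sizes) + cl_size)
    w = (cl_size / total) * w_cl                   # CL model's share
    for w_k, d_k in zip(local_models, local_sizes):
        w += (d_k / total) * w_k                   # each FL user's share
    return w
```

Since the weights sum to one, aggregating identical models returns that same model, a quick sanity check on any weighting scheme of this form.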