US 12,381,609 B2
Wireless federated learning framework and resource optimization method
Hui Tian, Beijing (CN); Wanli Ni, Beijing (CN); Ping Zhang, Beijing (CN); Shaoshuai Fan, Beijing (CN); Gaofeng Nie, Beijing (CN); and Shilin Tao, Beijing (CN)
Assigned to BEIJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, Beijing (CN)
Filed by Beijing University of Posts and Telecommunications, Beijing (CN)
Filed on Nov. 6, 2023, as Appl. No. 18/387,068.
Claims priority of application No. 202310182495.3 (CN), filed on Mar. 1, 2023.
Prior Publication US 2024/0297700 A1, Sep. 5, 2024
Int. Cl. H04L 5/12 (2006.01); H04B 7/06 (2006.01)
CPC H04B 7/0626 (2013.01) [H04B 7/0639 (2013.01)] 17 Claims
OG exemplary drawing
 
1. A wireless federated learning (FL) framework, comprising:
N centralized learning (CL) users with limited computing resources, wherein the CL users send their training data to a base station, which performs CL on their behalf so that they participate in FL;
K FL users with sufficient computing resources, wherein the FL users obtain local models through local training data, and upload local model parameters as an aggregation model to the base station; and
the base station serving as an FL server and configured to compute a global model, wherein the base station performs CL on the training data accumulated by the CL users to obtain a CL model, and performs weighted summation on the CL model and the aggregation model based on a data amount to obtain the global model,
wherein it is assumed that there are T FL cycles in total, represented as a set T = {1, 2, …, T}; and in the tth FL cycle, a local model update formula of the kth FL user is as follows:
$w_k(t+1) = w(t) - \eta g_k(t), \qquad g_k(t) = \nabla F_k\big(w(t); D_k(t)\big), \quad \forall k$
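The local update above is a single gradient step on the user's own data. A minimal sketch of that step, assuming a squared-error loss for F_k purely for concreteness (the patent does not fix a particular loss, and the function name `local_update` is illustrative):

```python
import numpy as np

def local_update(w, eta, X_k, y_k):
    """One local step of FL user k: w_k(t+1) = w(t) - eta * g_k(t)."""
    residual = X_k @ w - y_k            # prediction error on the local data D_k(t)
    g_k = X_k.T @ residual / len(y_k)   # gradient g_k(t) of the assumed squared loss F_k
    return w - eta * g_k                # updated local model w_k(t+1)
```

Each FL user runs this step independently on its own (X_k, y_k) and uploads only the resulting parameters, never the raw data.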
a centralized update formula of the base station is as follows:
$w(t+1) = w(t) - \eta g(t), \qquad g(t) = \nabla F\big(w(t); D(t)\big)$
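The centralized update has the same form, but the gradient is taken over D(t), the training data accumulated at the base station from the CL users over cycles 1..t. A sketch under the same squared-loss assumption as above (`centralized_update` and the batch-list convention are illustrative, not from the patent):

```python
import numpy as np

def centralized_update(w, eta, accumulated):
    """Base-station CL step: w(t+1) = w(t) - eta * g(t), over accumulated data D(t).

    `accumulated` is a list of (X_n, y_n) batches uploaded by the CL users
    across cycles; a squared loss is assumed for illustration.
    """
    X = np.concatenate([X_n for X_n, _ in accumulated])
    y = np.concatenate([y_n for _, y_n in accumulated])
    g = X.T @ (X @ w - y) / len(y)   # gradient g(t) of F(w(t); D(t))
    return w - eta * g               # centralized model update
```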
a global model aggregation formula is as follows:

OG Complex Work Unit Math
wherein η is the learning rate of the stochastic gradient descent method, $D(t) = \sum_{i=1}^{t} \sum_{n=1}^{N} D_n(i)$ is the training data set accumulated at the base station over t cycles, $F_k(\cdot)$ and $g_k(t)$ are respectively the local loss function and the gradient of the kth FL user, $F(\cdot)$ and $g(t)$ are respectively the loss function and the gradient of the CL at the base station, and w(t+1) is the global model aggregated at the base station.
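The aggregation formula itself is printed as an image in the Gazette entry, but the claim states that the base station performs a weighted summation of the CL model and the FL users' models based on data amount. A hedged sketch of one plausible reading, with weights proportional to each model's data set size (the function name and weighting convention are assumptions, not the patent's exact formula):

```python
import numpy as np

def aggregate_global(local_models, local_sizes, w_cl, cl_size):
    """Data-amount-weighted summation producing the global model w(t+1).

    local_models: list of FL users' parameter vectors w_k(t+1)
    local_sizes:  list of the users' local data amounts |D_k|
    w_cl:         CL model computed at the base station
    cl_size:      amount of data accumulated at the base station |D(t)|
    """
    total = float(sum(local_sizes) + cl_size)
    w = (cl_size / total) * w_cl                   # CL model's share
    for w_k, d_k in zip(local_models, local_sizes):
        w += (d_k / total) * w_k                   # each FL user's share
    return w
```

Since the weights sum to one, aggregating identical models returns that same model, a quick sanity check on any weighting scheme of this form.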