US 12,342,221 B2
Multi-base station queued preambles allocation method based on collaboration between multiple agents
Jun Sun, Jiangsu (CN); Mengzhu Guo, Jiangsu (CN); and Yin Lu, Jiangsu (CN)
Assigned to Nanjing University Of Posts And Telecommunications, Jiangsu (CN)
Appl. No. 18/031,866
Filed by Nanjing University Of Posts And Telecommunications, Jiangsu (CN)
PCT Filed Jul. 22, 2022, PCT No. PCT/CN2022/107420 § 371(c)(1), (2) Date Apr. 13, 2023, PCT Pub. No. WO2023/226183, PCT Pub. Date Nov. 30, 2023.
Claims priority of application No. 202210570855.2 (CN), filed on May 24, 2022.
Prior Publication US 2025/0088909 A1, Mar. 13, 2025
Int. Cl. H04W 28/08 (2023.01); H04W 28/086 (2023.01)

CPC H04W 28/0975 (2020.05) [H04W 28/0861 (2023.05)]

6 Claims

1. A multi-base station queued preambles allocation method based on collaboration between multiple agents, wherein a target area includes a network composed of at least two base stations and each base station includes a preambles pool; and for each agent that accesses the network, the following steps S1 to S3 are performed to complete preambles allocation to each agent; the method comprising:

S1. grouping the agent that accesses the network according to a service type of each agent, calculating an average delay tolerance for each group of agent, and arranging the average delay tolerances of all the groups of agent in an ascending order, to obtain a priority set;

S2. for each group of agent, conducting preambles allocation to each agent in each group based on a reinforcement learning algorithm;

wherein each preamble corresponds to a queue, wherein a status space S is established based on a maximum queuing number in each queue and an action space A is established based on an action of the agent selecting a preamble for queuing, wherein with the status space S as an input, by means of a deep neural network and a Q-learning method, the agent selects an action executable in the action space A by the agent based on a greedy strategy and with a goal of benefit maximization, and wherein with a Q-value of the action executable by the agent as an output, a local agent preambles allocation model is established; and

S3. establishing a global agent preambles allocation model based on the local agent preambles allocation model corresponding to each agent and a federal agent, and training the global agent preambles allocation model by means of a federal learning method, to obtain a trained global agent preambles allocation model; and completing preambles allocation to each agent that accesses the network by applying the global agent preambles allocation model.
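
Steps S1 to S3 can be illustrated with a minimal sketch. It is not the patented implementation and rests on several assumptions that the claim does not state. A plain Q-table stands in for the deep neural network. The greedy strategy is written as epsilon-greedy, which becomes purely greedy when epsilon is 0. The reward is supplied by the caller, because the claim only names a goal of benefit maximization. The "federal learning method" (commonly called federated learning) is reduced to an unweighted average of the local Q-tables. All names (build_priority_set, LocalPreambleAgent, federated_average) and the record layout are hypothetical.

```python
import random
from collections import defaultdict


def build_priority_set(agents):
    """S1 sketch: group agents by service type, rank groups by mean delay tolerance.

    `agents` is a list of (agent_id, service_type, delay_tolerance) tuples
    (a hypothetical record layout). Returns [(service_type, mean_tolerance), ...]
    in ascending order, so the least delay-tolerant group comes first.
    """
    groups = defaultdict(list)
    for _agent_id, service_type, delay_tolerance in agents:
        groups[service_type].append(delay_tolerance)
    averages = {s: sum(v) / len(v) for s, v in groups.items()}
    return sorted(averages.items(), key=lambda item: item[1])


class LocalPreambleAgent:
    """S2 sketch: one agent choosing a preamble queue with Q-learning.

    Assumption: a Q-table replaces the deep neural network named in the claim.
    A state is a tuple of queue lengths, one entry per preamble; an action is
    the index of the preamble whose queue the agent joins.
    """

    def __init__(self, num_preambles, epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
        self.num_preambles = num_preambles
        self.epsilon = epsilon  # exploration probability (0 means purely greedy)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * num_preambles)

    def select_action(self, state):
        # Explore with probability epsilon, otherwise take the highest Q-value.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_preambles)
        q_values = self.q[state]
        return max(range(self.num_preambles), key=lambda a: q_values[a])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + gamma * max_a' Q(s', a').
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


def federated_average(local_agents):
    """S3 sketch: merge local Q-tables into one global table.

    Assumption: an unweighted mean over agents; states an agent has never
    visited count as all-zero Q-values for that agent.
    """
    if not local_agents:
        raise ValueError("at least one local agent is required")
    n = len(local_agents)
    num_preambles = local_agents[0].num_preambles
    zeros = [0.0] * num_preambles
    states = set()
    for agent in local_agents:
        states.update(agent.q.keys())
    return {
        state: [
            sum(agent.q.get(state, zeros)[a] for agent in local_agents) / n
            for a in range(num_preambles)
        ]
        for state in states
    }
```

In this sketch, the groups returned by build_priority_set would be served in list order, each agent in a group would alternate select_action and update against its observed queue lengths, and federated_average would be called periodically to produce the global table that is then shared back with the agents.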