US 12,457,629 B2
Apparatus and method for frequency allocation using reinforced learning for low earth orbit satellite network
Bon Jun Ku, Daejeon (KR); Dae Sub Oh, Daejeon (KR); Chang Hee Joo, Seoul (KR); Ji Hyeon Yun, Seoul (KR); Tae Gun An, Seoul (KR); and Hae Sung Jo, Seoul (KR)
Assigned to ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, Daejeon (KR)
Filed by Electronics and Telecommunications Research Institute, Daejeon (KR)
Filed on Nov. 1, 2022, as Appl. No. 17/978,653.
Claims priority of application No. 10-2021-0148166 (KR), filed on Nov. 1, 2021; and application No. 10-2022-0116858 (KR), filed on Sep. 16, 2022.
Prior Publication US 2023/0132791 A1, May 4, 2023
Int. Cl. H04W 72/541 (2023.01); G06N 3/006 (2023.01); G06N 20/00 (2019.01); H04B 7/185 (2006.01); H04W 72/0453 (2023.01)

CPC H04W 72/541 (2023.01) [G06N 3/006 (2013.01); G06N 20/00 (2019.01); H04B 7/1851 (2013.01); H04B 7/18539 (2013.01); H04W 72/0453 (2013.01)]

14 Claims

8. A frequency resource allocation method executed by a processor of a computing system comprising the processor that electrically communicates with a reinforcement learning model trained based on a reinforcement learning technique with a modeling for resources allocation process to determine resources to be allocated for transmitting a signal to a user within a target satellite network, the method comprising:

controlling the reinforcement learning model to output an action;

selecting resources for transmitting the signal to the user based on the action output by the reinforcement learning model;

allocating the selected resources to the user;

determining whether to transmit the signal to the user, before transmitting the signal to the user, based on a collision probability of when the selected resources are used for transmitting the signal to the user so that a probability of actually transmitting the signal to the user using the selected resources does not exceed a first threshold and a second threshold obtained based on a probability that the resources are selected, wherein the collision probability is a probability of collision between the target satellite network and an adjacent existing satellite network operated independently of the target satellite network and utilizing identical resources to the target satellite network;

transmitting the signal to the user using the selected resources, when it is determined to transmit the signal to the user;

receiving information about whether the transmission of the signal is successful or not from the user via a feedback channel after a delayed time; and

updating an internal parameter of the reinforcement learning model with respect to the resources used for transmitting the signal.
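
For illustration only, the steps recited in claim 8 can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the claim does not specify the learning algorithm, the threshold formulas, or how the collision probability is obtained. The softmax value-based agent, the names SoftmaxChannelAgent, transmit_probability, and run, the parameter budget, the rule second_threshold = min(1, budget / p_select), and the per-channel collision probabilities are all hypothetical choices made so that the example runs.

```python
import math
import random
from collections import deque


class SoftmaxChannelAgent:
    """Toy stand-in for the reinforcement learning model (hypothetical)."""

    def __init__(self, n_channels, lr=0.1, temperature=0.5):
        self.q = [0.0] * n_channels  # internal parameters: one value per channel
        self.lr = lr
        self.temperature = temperature

    def probabilities(self):
        # Numerically stable softmax over the per-channel values.
        m = max(self.q)
        w = [math.exp((v - m) / self.temperature) for v in self.q]
        total = sum(w)
        return [x / total for x in w]

    def act(self, rng):
        # Output an action (a channel index) and the probability it was selected.
        probs = self.probabilities()
        channel = rng.choices(range(len(probs)), weights=probs)[0]
        return channel, probs[channel]

    def update(self, channel, reward):
        # Move the value of the channel that was used toward the observed reward.
        self.q[channel] += self.lr * (reward - self.q[channel])


def transmit_probability(collision_prob, p_select, first_threshold, budget):
    """Assumed gating rule: the transmit probability never exceeds either threshold.

    The second threshold is derived from the selection probability
    (assumption: budget / p_select, clipped to 1).
    """
    second_threshold = min(1.0, budget / p_select)
    return max(0.0, min(1.0 - collision_prob, first_threshold, second_threshold))


def run(steps=2000, feedback_delay=3, seed=0):
    rng = random.Random(seed)
    # Hypothetical collision probabilities with the adjacent existing network.
    collision_prob = [0.8, 0.5, 0.1]
    agent = SoftmaxChannelAgent(len(collision_prob))
    pending = deque()  # (delivery_step, channel, success) awaiting feedback
    sent = [0] * len(collision_prob)
    for t in range(steps):
        # Receive delayed feedback and update the model for the channel used.
        while pending and pending[0][0] <= t:
            _, ch, ok = pending.popleft()
            agent.update(ch, 1.0 if ok else 0.0)
        # Select and allocate a resource based on the model's action.
        ch, p_sel = agent.act(rng)
        # Decide, before transmitting, whether to transmit at all.
        gate = transmit_probability(collision_prob[ch], p_sel, 0.9, 0.6)
        if rng.random() < gate:
            ok = rng.random() >= collision_prob[ch]  # simulated outcome
            pending.append((t + feedback_delay, ch, ok))
            sent[ch] += 1
    return agent, sent
```

Calling run() returns the trained toy agent and the number of transmissions per channel. In this sketch the agent's value for the low-collision channel tends to rise above the others, so that channel is selected more often, while the gate keeps the transmit probability at or below both assumed thresholds.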