US 12,382,342 B2
Traffic scenario clustering and load balancing with distilled reinforcement learning policies
Jimmy Li, Longueuil (CA); Di Wu, Saint Laurent (CA); Yi Tian Xu, Montreal (CA); Tianyu Li, Montreal (CA); Seowoo Jang, Seoul (KR); Xue Liu, Montreal (CA); and Gregory Lewis Dudek, Westmount (CA)
Assigned to SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed by SAMSUNG ELECTRONICS CO., LTD., Suwon-si (KR)
Filed on Jul. 21, 2022, as Appl. No. 17/870,212.
Claims priority of provisional application 63/256,963, filed on Oct. 18, 2021.
Prior Publication US 2023/0117162 A1, Apr. 20, 2023
Int. Cl. H04W 28/086 (2023.01); G06N 20/00 (2019.01); H04W 28/08 (2023.01)

CPC H04W 28/0861 (2023.05) [G06N 20/00 (2019.01); H04W 28/0925 (2020.05)]

20 Claims

1. A method for traffic scenario clustering and load balancing by a network device, comprising:

training a plurality of learning agents to load balance a respective plurality of traffic scenarios to obtain a plurality of control policies;

performing at least one clustering iteration, each clustering iteration comprising:

selecting, from the plurality of control policies, a pair of control policies; and

merging the pair of control policies into a clustered control policy that replaces the pair of control policies from the plurality of control policies;

determining to stop the performing of the at least one clustering iteration when a quantity of control policies remaining in the plurality of control policies meets a predetermined value; and

deploying to each base station of a plurality of base stations a corresponding control policy from the plurality of control policies.
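
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: each trained control policy is reduced to a small table of action probabilities per state, the pair to merge is chosen as the two closest policies under a mean absolute difference, and the merge is an element-wise average standing in for policy distillation. The names `policy_distance`, `merge_policies`, `cluster_policies`, and `deploy` are hypothetical.

```python
from itertools import combinations


def policy_distance(p, q):
    """Mean absolute difference between two tabular policies (assumed metric)."""
    total = sum(abs(a - b) for rp, rq in zip(p, q) for a, b in zip(rp, rq))
    return total / sum(len(row) for row in p)


def merge_policies(p, q):
    """Placeholder for distillation: element-wise average of the two tables."""
    return [[(a + b) / 2 for a, b in zip(rp, rq)] for rp, rq in zip(p, q)]


def cluster_policies(policies, target_count):
    """Merge policies pairwise until only target_count remain.

    policies: dict mapping a scenario id to its trained policy table.
    Returns a list of (set of scenario ids, policy) tuples.
    """
    if target_count < 1:
        raise ValueError("target_count must be at least 1")
    clusters = [({sid}, pol) for sid, pol in policies.items()]
    # Stop once the number of remaining policies meets the predetermined value.
    while len(clusters) > target_count:
        # Select a pair of policies (assumption: the two most similar ones).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: policy_distance(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        # The clustered policy replaces the selected pair.
        merged = (
            clusters[i][0] | clusters[j][0],
            merge_policies(clusters[i][1], clusters[j][1]),
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def deploy(clusters, station_scenarios):
    """Map each base station to the policy whose cluster covers its scenario."""
    return {
        station: next(pol for members, pol in clusters if sid in members)
        for station, sid in station_scenarios.items()
    }
```

In this sketch, base stations whose traffic scenarios fall in the same cluster receive the same policy, so the number of distinct policies to maintain equals `target_count` rather than the number of scenarios.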