| CPC H04W 28/0861 (2023.05) [G06N 20/00 (2019.01); H04W 28/0925 (2020.05)] | 20 Claims |

|
1. A method for traffic scenario clustering and load balancing by a network device, comprising:
training a plurality of learning agents to load balance a respective plurality of traffic scenarios to obtain a plurality of control policies;
performing at least one clustering iteration, each clustering iteration comprising:
selecting, from the plurality of control policies, a pair of control policies; and
merging the pair of control policies into a clustered control policy that replaces the pair of control policies from the plurality of control policies;
determining to stop the performing of the at least one clustering iteration when a quantity of control policies remaining in the plurality of control policies meets a predetermined value; and
deploying to each base station of a plurality of base stations a corresponding control policy from the plurality of control policies.
|
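The clustering loop recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify how the pair of control policies is selected or how the pair is merged, so everything concrete here is an assumption made for illustration only: each control policy is represented as a flat list of parameters, the pair with the smallest Euclidean distance is selected, and the pair is merged by a member-weighted average. The training step is omitted and replaced by hypothetical pre-trained parameter vectors, and the names `cluster_policies`, `assign_policies`, and `target_count` are hypothetical.

```python
import math
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two parameter vectors (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_policies(policies, target_count):
    """Merge pairs of policies until only `target_count` policies remain.

    `policies` is a list of parameter vectors, one per traffic scenario.
    Returns a list of (scenario_indices, parameters) tuples.
    """
    if target_count < 1:
        raise ValueError("target_count must be at least 1")

    # Each cluster tracks which scenarios it covers and its current parameters.
    clusters = [([i], list(p)) for i, p in enumerate(policies)]

    # Stop once the quantity of remaining policies meets the predetermined value.
    while len(clusters) > target_count:
        # Select a pair: here, the two closest policies (an assumption).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: distance(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        members_i, params_i = clusters[i]
        members_j, params_j = clusters[j]
        n_i, n_j = len(members_i), len(members_j)

        # Merge the pair: here, a member-weighted average (an assumption).
        merged_params = [
            (n_i * x + n_j * y) / (n_i + n_j) for x, y in zip(params_i, params_j)
        ]
        merged = (members_i + members_j, merged_params)

        # The clustered policy replaces the pair in the set of policies.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

    return clusters


def assign_policies(clusters):
    """Map each scenario index (assumed one per base station) to its policy."""
    return {s: params for members, params in clusters for s in members}


# Hypothetical pre-trained policies for four traffic scenarios.
trained = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 5.0]]
clusters = cluster_policies(trained, target_count=2)
deployment = assign_policies(clusters)
```

In this sketch `deployment` maps each scenario index to the parameters of the clustered policy that covers it, standing in for the deploying step under the assumption that each base station corresponds to one traffic scenario.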