US 12,355,524 B2
Multi-agent policy machine learning
Heunchul Lee, Täby (SE); Maksym Girnyk, Solna (SE); and Jaeseong Jeong, Solna (SE)
Assigned to Telefonaktiebolaget LM Ericsson (publ), Stockholm (SE)
Appl. No. 18/261,928
Filed by Telefonaktiebolaget LM Ericsson (publ), Stockholm (SE)
PCT Filed Jan. 19, 2021, PCT No. PCT/SE2021/050027 § 371(c)(1), (2) Date Jul. 18, 2023, PCT Pub. No. WO2022/159008, PCT Pub. Date Jul. 28, 2022.
Prior Publication US 2024/0088959 A1, Mar. 14, 2024
Int. Cl. H04B 7/06 (2006.01); G06N 3/08 (2023.01)

CPC H04B 7/0617 (2013.01) [G06N 3/08 (2013.01); H04B 7/0619 (2013.01)]

20 Claims

1. A method of operating a beam-forming wireless communication system, the system comprising a plurality of radio nodes, an actor neural network being associated to each radio node, further to each actor neural network, there is associated a critic network, the method comprising:

training each actor neural network, for controlling at least one associated radio node, based on learning feedback provided by the associated critic network, the learning feedback being based on operation information provided by one or both of the actor neural network and the radio node associated thereto for the critic network.
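
The training loop recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not specify network architectures, reward signals, or update rules, so everything below is an assumption made for illustration. Each "actor" is reduced to a softmax policy over a small set of beam indices (a stand-in for an actor neural network), each "critic" to a scalar running baseline, the "learning feedback" to an advantage value (reward minus baseline), and the "operation information" to the chosen beam plus a toy reward reported by a hypothetical `RadioNode`. All class and function names are hypothetical.

```python
import math
import random


class RadioNode:
    """Hypothetical radio node: reports a toy reward for a chosen beam."""

    def __init__(self, best_beam):
        self.best_beam = best_beam

    def transmit(self, beam):
        # Assumed reward model: 1.0 for the best beam, 0.0 otherwise.
        return 1.0 if beam == self.best_beam else 0.0


class Actor:
    """Stand-in for an actor network: softmax policy over beam indices."""

    def __init__(self, n_beams, lr=0.1):
        self.logits = [0.0] * n_beams
        self.lr = lr

    def probs(self):
        m = max(self.logits)
        exps = [math.exp(x - m) for x in self.logits]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, rng):
        return rng.choices(range(len(self.logits)), weights=self.probs())[0]

    def update(self, beam, feedback):
        # Policy-gradient step: d log pi(beam) / d logit_k = 1[k == beam] - pi_k
        p = self.probs()
        for k in range(len(self.logits)):
            grad = (1.0 if k == beam else 0.0) - p[k]
            self.logits[k] += self.lr * feedback * grad


class Critic:
    """Stand-in for a critic network: scalar running estimate of the reward."""

    def __init__(self, lr=0.1):
        self.value = 0.0
        self.lr = lr

    def feedback(self, reward):
        # Learning feedback for the actor: how much better the outcome was
        # than the critic expected. The critic then updates its estimate.
        advantage = reward - self.value
        self.value += self.lr * advantage
        return advantage


def train(n_nodes=3, n_beams=4, steps=2000, seed=0):
    rng = random.Random(seed)
    nodes = [RadioNode(best_beam=i % n_beams) for i in range(n_nodes)]
    actors = [Actor(n_beams) for _ in nodes]    # one actor per radio node
    critics = [Critic() for _ in actors]        # one critic per actor
    for _ in range(steps):
        for node, actor, critic in zip(nodes, actors, critics):
            beam = actor.act(rng)               # operation info from the actor
            reward = node.transmit(beam)        # operation info from the node
            actor.update(beam, critic.feedback(reward))
    return nodes, actors
```

Calling `train()` returns the nodes and their trained actors; `actor.probs()` then gives each node's learned beam-selection distribution. Pairing one critic with each actor mirrors the one-to-one association in the claim. A practical system would presumably use neural networks conditioned on channel state rather than the tabular stand-ins used here.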