US 12,439,431 B2
Methods and systems for scheduling mmWave communications using reinforcement learning
Hongsheng Lu, San Jose, CA (US); Chenyuan He, Jiangsu (CN); Bin Cheng, New York, NY (US); and Takayuki Shimizu, Santa Clara, CA (US)
Assigned to Toyota Motor Engineering & Manufacturing North America, Inc., Plano, TX (US)
Filed by Toyota Motor Engineering & Manufacturing North America, Inc., Plano, TX (US)
Filed on Jun. 22, 2021, as Appl. No. 17/354,045.
Prior Publication US 2022/0408408 A1, Dec. 22, 2022
Int. Cl. H04W 72/30 (2023.01); G05B 13/02 (2006.01); H04W 4/40 (2018.01)

CPC H04W 72/30 (2023.01) [G05B 13/0265 (2013.01); H04W 4/40 (2018.02)]

13 Claims

1. A controller which is configured to:

receive an intent to communicate from one or more of a plurality of nodes;

identify a plurality of states based at least in part on the one or more intents to communicate, each of the plurality of states indicating status of mmWave communication links among the plurality of nodes;

calculate an updated value for each of the plurality of states iteratively based on a value iteration algorithm and a previous value for each of the plurality of states until the updated value for each of the plurality of states converges; and

select one of the plurality of states based on the converged values for the plurality of states, wherein:

the updated value for each of the plurality of states is calculated at least based on a reward value and a transition probability from a first state to a second state; and

the reward value is calculated based on a weight value related to a link to be added or removed and a number of conflicts due to an addition or a removal of the link.
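
The iterative value update and state selection recited in claim 1 can be illustrated with a minimal sketch. The claim does not give a formula, so everything here is an assumption made for illustration: the reward is taken as the link weight minus a fixed penalty per conflict, the update is the standard Bellman value iteration update with a discount factor, holding a configuration is treated as an action earning that configuration's reward, and selection picks the state with the largest converged value. The names link_reward, value_iteration, select_state, and toy_model, the three-state model, and all numeric values are hypothetical.

```python
from typing import Dict, List, Tuple

State = str
# For each state: action name -> list of (probability, next_state, reward).
Transitions = Dict[State, Dict[str, List[Tuple[float, State, float]]]]


def link_reward(weight: float, conflicts: int, penalty: float = 1.0) -> float:
    """Assumed reward form: link weight minus a penalty for each conflict."""
    return weight - penalty * conflicts


def value_iteration(states: List[State], transitions: Transitions,
                    gamma: float = 0.9, tol: float = 1e-6,
                    max_iters: int = 10_000) -> Dict[State, float]:
    """Update every state's value from its previous value until convergence."""
    values = {s: 0.0 for s in states}
    for _ in range(max_iters):
        updated = {}
        for s in states:
            # Expected return of each action: sum over outcomes of
            # probability * (reward + discounted previous value).
            action_values = [
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in transitions.get(s, {}).values()
            ]
            updated[s] = max(action_values) if action_values else 0.0
        delta = max(abs(updated[s] - values[s]) for s in states)
        values = updated
        if delta < tol:  # all values have stopped changing
            break
    return values


def select_state(values: Dict[State, float]) -> State:
    """Assumed selection rule: the state with the largest converged value."""
    return max(values, key=values.get)


def toy_model() -> Tuple[List[State], Transitions]:
    """Hypothetical three-state model of link configurations."""
    states = ["no_links", "link_ab", "link_ab_cd"]
    transitions: Transitions = {
        "no_links": {
            # Adding link A-B succeeds with probability 0.8.
            "add_ab": [(0.8, "link_ab", link_reward(2.0, 0)),
                       (0.2, "no_links", 0.0)],
        },
        "link_ab": {
            "hold": [(1.0, "link_ab", link_reward(2.0, 0))],
            # Adding link C-D causes two conflicts.
            "add_cd": [(1.0, "link_ab_cd", link_reward(1.0, 2))],
        },
        "link_ab_cd": {
            "hold": [(1.0, "link_ab_cd", link_reward(3.0, 2))],
        },
    }
    return states, transitions
```

Calling value_iteration(*toy_model()) and passing the result to select_state picks "link_ab" in this toy model, because adding the second link incurs two conflicts that outweigh its weight. A different reward form or selection rule would change that outcome.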