US 12,113,680 B2
Reinforcement learning for jitter buffer control
Xiulian Peng, Beijing (CN); Vinod Prakash, Redmond, WA (US); Xiangyu Kong, Beijing (CN); Sriram Srinivasan, Sammamish, WA (US); and Yan Lu, Beijing (CN)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Dec. 30, 2022, as Appl. No. 18/091,992.
Application 18/091,992 is a continuation of application No. 16/877,257, filed on May 18, 2020, granted, now 11,558,275.
Claims priority of provisional application 62/976,047, filed on Feb. 13, 2020.
Prior Publication US 2023/0138038 A1, May 4, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. H04L 47/283 (2022.01); G06F 18/21 (2023.01); G06F 18/214 (2023.01); G06N 20/00 (2019.01); H04L 41/14 (2022.01); H04L 41/16 (2022.01); H04L 43/087 (2022.01)

CPC H04L 41/16 (2013.01) [G06F 18/214 (2023.01); G06F 18/217 (2023.01); G06N 20/00 (2019.01); H04L 41/145 (2013.01); H04L 43/087 (2013.01); H04L 47/283 (2013.01)]

20 Claims

1. A device for controlling jitter-buffer delay in a media streaming session over a network, the device comprising:

a computer processor;

a memory storing instructions which, when executed by the computer processor, cause the computer processor to perform operations comprising:

identifying a jitter buffer state of a jitter buffer, the jitter buffer storing media frames of a media streaming session, the jitter buffer state comprising a current jitter buffer delay, current received frames in the jitter buffer, total delay of a current media frame of the media frames, and an immediately previous action;

identifying a network delay of the network;

determining an action for a media frame of media data in the jitter buffer based upon the jitter buffer state, the network delay, and a reward; and

determining a playback duration of a next media frame based upon the action.
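
The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The action set (compress, hold, stretch), the 20 ms nominal frame duration, the 10 ms adjustment step, the reward shape, and every identifier below are assumptions made for illustration. A greedy one-step lookahead over a hand-written reward stands in for a trained reinforcement learning policy, which the claim does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Assumed action set; the claim does not enumerate the actions."""
    COMPRESS = -1  # play the next frame faster, shrinking buffer delay
    HOLD = 0       # play the next frame at its nominal duration
    STRETCH = 1    # play the next frame slower, growing buffer delay


@dataclass(frozen=True)
class JitterBufferState:
    """The four state components recited in claim 1."""
    current_delay_ms: float   # current jitter buffer delay
    frames_in_buffer: int     # current received frames in the jitter buffer
    total_delay_ms: float     # total delay of the current media frame
    previous_action: Action   # immediately previous action


NOMINAL_FRAME_MS = 20.0  # assumed nominal playback duration of one frame
STEP_MS = 10.0           # assumed change in duration per stretch/compress


def predict_next_state(state: JitterBufferState, action: Action) -> JitterBufferState:
    """Hypothetical transition model: the action shifts both delays by one step."""
    shift = action.value * STEP_MS
    return JitterBufferState(
        current_delay_ms=state.current_delay_ms + shift,
        frames_in_buffer=state.frames_in_buffer,
        total_delay_ms=state.total_delay_ms + shift,
        previous_action=action,
    )


def reward(state: JitterBufferState, network_delay_ms: float) -> float:
    """Hypothetical reward: penalize latency, and penalize a buffer delay
    below the network delay (a proxy for the risk of running dry)."""
    latency_penalty = state.total_delay_ms / 100.0
    underrun_penalty = 5.0 if state.current_delay_ms < network_delay_ms else 0.0
    return -(latency_penalty + underrun_penalty)


def choose_action(state: JitterBufferState, network_delay_ms: float) -> Action:
    """Pick the action whose predicted next state earns the highest reward.
    This greedy lookahead is a stand-in for a learned policy."""
    return max(
        Action,
        key=lambda a: reward(predict_next_state(state, a), network_delay_ms),
    )


def playback_duration_ms(action: Action) -> float:
    """Map the chosen action to the playback duration of the next frame."""
    return NOMINAL_FRAME_MS + action.value * STEP_MS


if __name__ == "__main__":
    state = JitterBufferState(40.0, 2, 100.0, Action.HOLD)
    action = choose_action(state, network_delay_ms=45.0)
    print(action.name, playback_duration_ms(action))
```

In this sketch, a buffer delay below the measured network delay makes stretching the most rewarding choice, so the next frame plays for longer than nominal. A buffer delay well above the network delay makes compressing the most rewarding choice, which reduces latency.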