US 12,463,893 B2
Avoiding user experience disruptions using contextual multi-armed bandits
Pierre-André Savalle, Rueil-Malmaison (FR); Jean-Philippe Vasseur, Saint Martin d'Uriage (FR); Grégory Mermoud, Venthône (CH); and Vinay Kumar Kolar, San Jose, CA (US)
Assigned to Cisco Technology, Inc., San Jose, CA (US)
Filed by Cisco Technology, Inc., San Jose, CA (US)
Filed on Jul. 30, 2021, as Appl. No. 17/389,823.
Prior Publication US 2023/0035691 A1, Feb. 2, 2023
Int. Cl. H04L 45/00 (2022.01); H04L 45/02 (2022.01)
CPC H04L 45/22 (2013.01) [H04L 45/08 (2013.01)]
16 Claims
 
1. A method comprising:
using, by a device, a multi-armed bandit model to select different network paths over time via which traffic associated with an online application is routed;
obtaining, by the device, application experience metrics associated with the different network paths as previously selected by the multi-armed bandit model for use at a particular time, the application experience metrics indicative of user satisfaction with the online application over the particular time;
learning, by the device and by the multi-armed bandit model, which of the different network paths will provide satisfactory application experience metrics, based on the application experience metrics associated with the different network paths as previously selected by the multi-armed bandit model being used as payoffs in the multi-armed bandit model for each of the different network paths, wherein similar paths are identified from among the different network paths for the multi-armed bandit model based on estimating a similarity function between the different network paths based on past performance of the application experience metrics and path metrics of the different network paths, and wherein the similarity function is adjusted based on the payoffs; and
causing, by the device, the traffic associated with the online application to be routed via a set of one or more paths expected by the multi-armed bandit model to provide satisfactory application experience metrics for the online application based on the payoffs.
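
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way the select, observe, learn, and route loop could fit together. Everything named here is an assumption made for illustration: the class PathBandit, the epsilon-greedy selection rule, the Gaussian similarity kernel, the prior, path metrics normalized to the range 0 to 1, and payoffs expressed as the fraction of satisfied users. The similarity between two paths combines their path metrics with the gap between their observed mean payoffs, so it shifts as new payoffs arrive.

```python
import math
import random


class PathBandit:
    """Hypothetical similarity-aware epsilon-greedy bandit over network paths."""

    def __init__(self, path_metrics, epsilon=0.1, bandwidth=0.2,
                 prior_mean=0.5, prior_weight=1.0, seed=0):
        # path_metrics: path name -> tuple of normalized path metrics
        # (for example latency and loss); the names and scaling are assumed.
        self.path_metrics = dict(path_metrics)
        self.paths = sorted(self.path_metrics)
        self.epsilon = epsilon            # exploration rate
        self.bandwidth = bandwidth        # width of the similarity kernel
        self.prior_mean = prior_mean      # payoff assumed before any data
        self.prior_weight = prior_weight  # strength of that prior
        self.counts = {p: 0 for p in self.paths}   # times each path was used
        self.sums = {p: 0.0 for p in self.paths}   # total payoff per path
        self.rng = random.Random(seed)

    def similarity(self, a, b):
        """Gaussian kernel on path metrics plus the observed payoff gap."""
        d2 = sum((x - y) ** 2
                 for x, y in zip(self.path_metrics[a], self.path_metrics[b]))
        if self.counts[a] and self.counts[b]:
            # Past payoffs adjust the similarity once both paths have data.
            gap = self.sums[a] / self.counts[a] - self.sums[b] / self.counts[b]
            d2 += gap ** 2
        return math.exp(-d2 / (2 * self.bandwidth ** 2))

    def estimate(self, path):
        """Prior-smoothed, similarity-weighted mean payoff for a path."""
        num = self.prior_mean * self.prior_weight
        den = self.prior_weight
        for q in self.paths:
            w = self.similarity(path, q)
            num += w * self.sums[q]
            den += w * self.counts[q]
        return num / den

    def select(self):
        """Pick a path: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.paths)
        return max(self.paths, key=self.estimate)

    def update(self, path, payoff):
        """Record an application experience metric as the payoff of a path."""
        self.counts[path] += 1
        self.sums[path] += payoff

    def best_paths(self, k=1):
        """Return the k paths with the highest estimated payoff."""
        return sorted(self.paths, key=self.estimate, reverse=True)[:k]


if __name__ == "__main__":
    # Made-up paths and payoffs, purely for demonstration.
    metrics = {"mpls": (0.2, 0.1), "inet-a": (0.6, 0.5), "inet-b": (0.62, 0.55)}
    satisfied = {"mpls": 0.9, "inet-a": 0.4, "inet-b": 0.4}
    bandit = PathBandit(metrics)
    for _ in range(200):
        chosen = bandit.select()
        bandit.update(chosen, satisfied[chosen])
    print(bandit.best_paths(1))
```

Because estimates are pooled across similar paths, a payoff observed on one path also informs its neighbors. This is the motivation for using a similarity function: the bandit does not have to try every path separately before forming an opinion about it.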