US 12,001,835 B2
In-service software upgrade with active service monitoring
Francisco José Rojas Fonseca, Alajuela (CR); Jorge Arturo Sauma Vargas, San Jose (CR); Eduardo Francisco Ramirez Acosta, Heredia (CR); and Pablo Cesar Barrantes Chaves, Heredia (CR)
Assigned to Hewlett Packard Enterprise Development LP, Spring, TX (US)
Filed by Hewlett Packard Enterprise Development LP, Houston, TX (US)
Filed on Dec. 14, 2021, as Appl. No. 17/551,136.
Prior Publication US 2023/0185567 A1, Jun. 15, 2023
Int. Cl. G06F 8/656 (2018.01)
CPC G06F 8/656 (2018.02)
18 Claims
 
1. A computer-executed method for performing an in-service software upgrade on a network device, the method comprising:
in response to a software-upgrade command, generating an upgrade database based on a state database storing both a data-plane state and a control-plane state associated with the network device, wherein the network device is managed by a management unit comprising a data-plane-management sub-unit for configuring hardware devices according to the data-plane state and a control-plane sub-unit for monitoring the control-plane state, and wherein the upgrade database stores at least a copy of the data-plane state prior to the upgrade;
upgrading the management unit by separately upgrading the data-plane-management sub-unit and the control-plane sub-unit, without interrupting services provided by the network device, wherein upgrading the data-plane-management sub-unit comprises:
terminating the data-plane-management sub-unit to allow the hardware devices to operate in an autonomous mode;
restarting a newer version of the data-plane-management sub-unit; and
reattaching the newer version of the data-plane-management sub-unit to the hardware devices based on the copy of the data-plane state prior to the upgrade stored in the upgrade database;
monitoring the control-plane state in the state database and the data-plane state in the upgrade database to detect an event associated with the network device during the upgrading of the management unit; and
in response to determining, based on the detected event and a set of pre-defined criteria, that a triggering condition is met, performing an action to prevent a network outage or error.
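The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`Hardware`, `DataPlaneManager`, `generate_upgrade_db`, `upgrade_data_plane_manager`, `detect_events`, `in_service_upgrade`, the `during_upgrade` hook, and the dictionary layout of the databases) is a hypothetical stand-in chosen for illustration. The sketch models the state database and upgrade database as in-memory dictionaries, treats any change from a pre-upgrade baseline as an "event", and omits the separate upgrade of the control-plane sub-unit.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hardware:
    """Hypothetical stand-in for the hardware devices."""
    autonomous: bool = False
    attached_state: Optional[dict] = None


@dataclass
class DataPlaneManager:
    """Hypothetical stand-in for the data-plane-management sub-unit."""
    version: str
    running: bool = True


def generate_upgrade_db(state_db: dict) -> dict:
    # The upgrade database holds at least a copy of the pre-upgrade data-plane state.
    return {"data_plane": copy.deepcopy(state_db["data_plane"])}


def upgrade_data_plane_manager(old, hardware, upgrade_db, new_version):
    # Terminate the old sub-unit; the hardware keeps forwarding on its own.
    old.running = False
    hardware.autonomous = True
    # Restart a newer version of the sub-unit.
    new = DataPlaneManager(version=new_version)
    # Reattach using the data-plane state saved in the upgrade database.
    hardware.attached_state = copy.deepcopy(upgrade_db["data_plane"])
    hardware.autonomous = False
    return new


def detect_events(current: dict, baseline: dict, plane: str) -> list:
    # Assumption: an "event" is any key whose value differs from the baseline.
    return [(plane, key, value) for key, value in current.items()
            if baseline.get(key) != value]


def in_service_upgrade(state_db, hardware, manager, new_version, criteria,
                       during_upgrade=None):
    upgrade_db = generate_upgrade_db(state_db)
    control_baseline = copy.deepcopy(state_db["control_plane"])
    data_baseline = copy.deepcopy(upgrade_db["data_plane"])

    new_manager = upgrade_data_plane_manager(manager, hardware, upgrade_db,
                                             new_version)

    # Test hook that simulates state changes arriving while the upgrade runs.
    if during_upgrade is not None:
        during_upgrade(state_db, upgrade_db)

    # Monitor control-plane state in the state database and data-plane state
    # in the upgrade database.
    events = detect_events(state_db["control_plane"], control_baseline,
                           "control_plane")
    events += detect_events(upgrade_db["data_plane"], data_baseline,
                            "data_plane")

    # The triggering condition is met if any pre-defined criterion matches an event.
    if any(rule(event) for event in events for rule in criteria):
        return new_manager, "action_taken"
    return new_manager, "completed"
```

In this sketch the criteria are plain predicates over `(plane, key, value)` tuples, and the "action" is reduced to a returned status string; a real device would presumably take a concrete corrective step at that point, which the claim leaves open.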