US 11,868,855 B2
Resiliency for machine learning workloads
Sai Rahul Chalamalasetti, Palo Alto, CA (US); Sergey Serebryakov, San Jose, CA (US); and Dejan S. Milojicic, Palo Alto, CA (US)
Assigned to Hewlett Packard Enterprise Development LP, Spring, TX (US)
Filed by HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP, Houston, TX (US)
Filed on Nov. 4, 2019, as Appl. No. 16/673,868.
Prior Publication US 2021/0133624 A1, May 6, 2021
Int. Cl. G06N 20/00 (2019.01); G06F 16/901 (2019.01); G06F 21/60 (2013.01)
CPC G06N 20/00 (2019.01) [G06F 16/901 (2019.01); G06F 21/602 (2013.01)]
19 Claims
1. A system comprising:
one or more processors, and
at least one memory communicatively coupled to the one or more processors, the at least one memory storing (i) a golden dataset, and (ii) instructions that, when executed by the one or more processors, cause the one or more processors to:
provide data points of golden data candidate datasets as input to a first machine learning (ML) model,
wherein outputs of the first ML model identify sensitive data points within the provided data points,
and wherein the outputs that correspond to the sensitive data points are more impacted by changes to weights of the first ML model relative to outputs that correspond to other data points within the provided data points;
generate a golden dataset using the sensitive data points;
receive a machine learning (ML) operation request comprising first input data on which the ML operation is to be performed using a second ML model;
retrieve, from the at least one memory, the golden dataset comprising golden input data and golden output data;
generate an input batch comprising the first input data and the golden input data;
run the second ML model using the input batch as inputs, causing the second ML model to generate output data comprising output data points;
determine a defined deviation threshold used in validating the output data from the second ML model, the defined deviation threshold being determined using a mean squared error metric associated with numerical elements generated in the output data from the second ML model;
compare the golden output data to the output data from the second ML model;
when the golden output data matches the output data within the defined deviation threshold, validate weights of the second ML model based on at least the output data points corresponding to the golden input data; and
wherein the weights of the second ML model are determined to be faulty if the output data points corresponding to the golden input data deviate beyond a permitted threshold from the respective golden output data points corresponding to the golden output data.
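
The validation flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows the general idea under stated assumptions. The tiny linear "model", the helper names (`make_linear_model`, `select_sensitive_points`, `run_with_golden_check`), and the threshold value are all hypothetical. In particular, the claim uses a first ML model to identify sensitive data points, whereas this sketch substitutes a direct weight-perturbation comparison as a stand-in for that step.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Model = Callable[[List[Vector]], List[Vector]]


def make_linear_model(weights: List[Vector]) -> Model:
    """Hypothetical stand-in for an ML model: one output per weight row."""
    def model(batch: List[Vector]) -> List[Vector]:
        return [[sum(w * x for w, x in zip(row, sample)) for row in weights]
                for sample in batch]
    return model


def mean_squared_error(expected: List[Vector], actual: List[Vector]) -> float:
    """MSE over all numerical elements of two equally shaped outputs."""
    total, count = 0.0, 0
    for e_row, a_row in zip(expected, actual):
        for e, a in zip(e_row, a_row):
            total += (e - a) ** 2
            count += 1
    return total / count


def select_sensitive_points(candidates: List[Vector], weights: List[Vector],
                            perturbed: List[Vector], k: int) -> List[Vector]:
    """Assumption: rank candidates by how much their output moves when the
    weights are perturbed, and keep the k most affected ones."""
    base = make_linear_model(weights)(candidates)
    moved = make_linear_model(perturbed)(candidates)
    scores = [mean_squared_error([b], [m]) for b, m in zip(base, moved)]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in ranked[:k]]


def run_with_golden_check(model: Model, request_inputs: List[Vector],
                          golden_inputs: List[Vector],
                          golden_outputs: List[Vector],
                          mse_threshold: float) -> Tuple[List[Vector], bool]:
    """Batch request and golden inputs, run the model once, then compare the
    golden part of the output against the stored golden outputs."""
    batch = list(request_inputs) + list(golden_inputs)
    outputs = model(batch)
    n = len(request_inputs)
    request_outputs, golden_actual = outputs[:n], outputs[n:]
    weights_ok = mean_squared_error(golden_outputs, golden_actual) <= mse_threshold
    return request_outputs, weights_ok


# Build a golden dataset from the most weight-sensitive candidates.
good_weights = [[1.0, 2.0], [0.5, -1.0]]
bad_weights = [[1.0, 2.0], [0.5, 7.0]]   # simulates one corrupted weight
candidates = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
golden_inputs = select_sensitive_points(candidates, good_weights, bad_weights, k=2)
golden_outputs = make_linear_model(good_weights)(golden_inputs)

# Serve one request with a healthy model and with a corrupted one.
request = [[0.0, 1.0]]
THRESHOLD = 1e-6  # hypothetical deviation threshold
healthy_out, healthy_ok = run_with_golden_check(
    make_linear_model(good_weights), request, golden_inputs, golden_outputs, THRESHOLD)
faulty_out, faulty_ok = run_with_golden_check(
    make_linear_model(bad_weights), request, golden_inputs, golden_outputs, THRESHOLD)
```

In this sketch the golden inputs ride along in the same batch as the request, so a weight fault that affects the request is likely to affect the golden outputs too, provided the golden points are sensitive to the corrupted weight. A candidate such as `[2.0, 0.0]` would not reveal this particular corruption, which is why the selection step favors points whose outputs move most.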