US 11,934,932 B1
Error aware module redundancy for machine learning
Giulio Gambardella, Portmarnock (IE); Nicholas Fraser, County Dublin (IE); Ussama Zahid, Saggart (IE); Michaela Blott, Dublin (IE); and Kornelis A. Vissers, Sunnyvale, CA (US)
Assigned to XILINX, INC., San Jose, CA (US)
Filed by XILINX, INC., San Jose, CA (US)
Filed on Nov. 10, 2020, as Appl. No. 17/094,598.
Int. Cl. G06N 20/20 (2019.01); G06F 11/16 (2006.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01)
CPC G06N 20/20 (2019.01) [G06F 11/16 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)]
20 Claims
 
1. A computing system, comprising:
a processor; and
memory comprising a machine learning (ML) training application, wherein, when executed by the processor, the ML training application performs an operation, the operation comprising:
training a first ML model using first training data;
evaluating the first ML model while injecting a first hardware fault in a first hardware system executing the first ML model;
generating an error characterization matrix indicating identification accuracy of the first ML model in identifying a plurality of classifications in the first training data when the first hardware fault is present;
adjusting second training data using the error characterization matrix by emphasizing a first set of classifications of the plurality of classifications that the first ML model misclassified when the first hardware fault was present and deemphasizing a second set of classifications of the plurality of classifications that the first ML model correctly classified when the first hardware fault was present; and
training a second ML model using the adjusted, second training data.
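
The operation recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it rests on several assumptions that do not appear in the claim. The "first ML model" is stood in for by a toy linear classifier whose weights are fixed rather than trained. The "hardware fault" is simulated in software as a stuck-at-zero fault on one row of weights. The "error characterization matrix" is taken to be a row-normalized confusion matrix. Emphasis and de-emphasis are expressed as per-class sampling weights with a hypothetical `floor` parameter. All function names are illustrative.

```python
import numpy as np

def predict(weights, x):
    """Toy linear classifier: pick the class with the highest score."""
    return np.argmax(x @ weights.T, axis=1)

def inject_stuck_at_zero(weights, row):
    """Assumed fault model: every weight feeding one output is stuck at 0."""
    faulty = weights.copy()
    faulty[row, :] = 0.0
    return faulty

def error_characterization_matrix(y_true, y_pred, num_classes):
    """Row-normalized confusion matrix.

    Entry [i, j] is the fraction of class-i samples predicted as class j
    while the fault is present.
    """
    counts = np.zeros((num_classes, num_classes), dtype=float)
    np.add.at(counts, (y_true, y_pred), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

def class_weights_from_matrix(ecm, floor=0.05):
    """Emphasize classes misclassified under the fault, deemphasize the rest.

    `floor` (hypothetical) keeps correctly classified classes from
    vanishing entirely from the adjusted training data.
    """
    per_class_error = 1.0 - np.diag(ecm)
    raw = np.maximum(per_class_error, floor)
    return raw / raw.sum()

# Stand-in for the trained first model: identity weights over 3 classes.
num_classes = 3
first_model = np.eye(num_classes)
y_first = np.array([0, 0, 1, 1, 2, 2])
x_first = np.eye(num_classes)[y_first]  # one-hot features

# Evaluate the first model while the simulated fault is present.
faulty_model = inject_stuck_at_zero(first_model, row=2)
y_pred_faulty = predict(faulty_model, x_first)

# Build the matrix and derive per-class emphasis from it.
ecm = error_characterization_matrix(y_first, y_pred_faulty, num_classes)
class_w = class_weights_from_matrix(ecm)

# Adjust the second training data by weighted resampling of its indices.
y_second = np.array([0, 1, 2, 0, 1, 2])
sample_p = class_w[y_second] / class_w[y_second].sum()
rng = np.random.default_rng(0)
resampled_idx = rng.choice(len(y_second), size=len(y_second), p=sample_p)
# A second model would then be trained on the rows selected by resampled_idx.
```

In this sketch the fault makes every class-2 sample resolve to class 0, so class 2 receives most of the sampling weight while classes 0 and 1 are reduced to the floor. Resampling is only one way to adjust the data; per-sample loss weights would serve the same purpose.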