CPC G06N 3/008 (2013.01) [G06F 40/232 (2020.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)] | 17 Claims |
1. A method for evaluating robustness of one or more target neural network models, comprising:
receiving one or more natural typo generation rules associated with a first task associated with a first input document type;
receiving a first target neural network model;
receiving a first document and corresponding ground truth labels;
generating one or more natural typos for the first document based on the one or more natural typo generation rules, using gradient information of components of the first document, wherein the generating the one or more natural typos includes:
segmenting the first document to generate a plurality of sub-word components;
generating gradient information for the plurality of sub-word components; and
generating the one or more natural typos based on the gradient information;
providing, to the first target neural network model, a test document generated based on the first document and the one or more natural typos as an input document to generate a first output; and
generating a robustness evaluation result of the first target neural network model based on a comparison between the first output and the ground truth labels.
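The steps recited in claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the claim: the `KEYBOARD_NEIGHBORS` table stands in for the natural typo generation rules, a logistic regression over fixed `WEIGHTS` stands in for the target neural network model, sub-word components are taken to be 3-character chunks, and the component chosen for a typo is the one whose removal raises the loss most to first order.

```python
import math

# Hypothetical typo rule: substitute a character with a keyboard neighbor.
KEYBOARD_NEIGHBORS = {"o": "p", "a": "s", "e": "r"}

# Hypothetical toy target model: logistic regression over sub-word pieces.
# Pieces missing from this table contribute a weight of 0.
WEIGHTS = {"goo": 2.0, "bad": -2.0, "mov": 0.1}


def segment(document, size=3):
    """Split each whitespace token into sub-word pieces of at most `size` chars."""
    return [[w[i:i + size] for i in range(0, len(w), size)]
            for w in document.split()]


def predict(pieces_by_word):
    """Probability of the positive label under the toy model."""
    score = sum(WEIGHTS.get(p, 0.0) for word in pieces_by_word for p in word)
    return 1.0 / (1.0 + math.exp(-score))


def piece_gradients(pieces_by_word, label):
    """Gradient of the cross-entropy loss w.r.t. each piece's presence feature."""
    p = predict(pieces_by_word)
    return [[(p - label) * WEIGHTS.get(piece, 0.0) for piece in word]
            for word in pieces_by_word]


def apply_typo(piece):
    """Apply the substitution rule to the first eligible character, if any."""
    for i, ch in enumerate(piece):
        if ch in KEYBOARD_NEIGHBORS:
            return piece[:i] + KEYBOARD_NEIGHBORS[ch] + piece[i + 1:]
    return piece


def generate_test_document(document, label):
    """Insert one typo into the piece selected by the gradient information."""
    pieces = segment(document)
    grads = piece_gradients(pieces, label)
    positions = [(w, p) for w in range(len(pieces))
                 for p in range(len(pieces[w]))]
    # Most negative gradient: removing that piece raises the loss most
    # under a first-order approximation.
    wi, pi = min(positions, key=lambda ix: grads[ix[0]][ix[1]])
    pieces[wi][pi] = apply_typo(pieces[wi][pi])
    return " ".join("".join(word) for word in pieces)


def evaluate_robustness(documents, labels):
    """Accuracy of the toy model on the typo-perturbed test documents."""
    correct = 0
    for doc, label in zip(documents, labels):
        test_doc = generate_test_document(doc, label)
        predicted = 1 if predict(segment(test_doc)) >= 0.5 else 0
        correct += predicted == label
    return correct / len(documents)
```

In this sketch, `evaluate_robustness` reports accuracy on the perturbed documents as the robustness evaluation result; a real evaluation would substitute an actual neural network, its tokenizer, and gradients obtained by backpropagation.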