US 11,669,712 B2
Robustness evaluation via natural typos
Lichao Sun, Chicago, IL (US); Kazuma Hashimoto, Menlo Park, CA (US); Jia Li, Mountain View, CA (US); Richard Socher, Menlo Park, CA (US); and Caiming Xiong, Menlo Park, CA (US)
Assigned to salesforce.com, inc., San Francisco, CA (US)
Filed by salesforce.com, inc., San Francisco, CA (US)
Filed on Sep. 3, 2019, as Appl. No. 16/559,196.
Claims priority of provisional application 62/851,073, filed on May 21, 2019.
Prior Publication US 2020/0372319 A1, Nov. 26, 2020
Int. Cl. G06N 3/08 (2023.01); G06F 40/232 (2020.01); G06N 3/045 (2023.01); G06N 3/008 (2023.01); G06N 3/044 (2023.01)
CPC G06N 3/008 (2013.01) [G06F 40/232 (2020.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)]
17 Claims
 
1. A method for evaluating robustness of one or more target neural network models, comprising:
receiving one or more natural typo generation rules associated with a first task associated with a first input document type;
receiving a first target neural network model;
receiving a first document and corresponding ground truth labels;
generating one or more natural typos for the first document based on the one or more natural typo generation rules, using gradient information of components of the first document, wherein the generating the one or more natural typos includes:
segmenting the first document to generate a plurality of sub-word components;
generating gradient information for the plurality of sub-word components; and
generating the one or more natural typos based on the gradient information;
providing, to the first target neural network model, a test document generated based on the first document and the one or more natural typos as an input document to generate a first output; and
generating a robustness evaluation result of the first target neural network model based on a comparison between the first output and the ground truth labels.
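
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the patented implementation: the vocabulary `VOCAB`, its scalar weights, the keyboard-neighbor table `NEIGHBORS`, and every function name are hypothetical. The "model" is a logistic classifier that sums one scalar weight per sub-word, and the "gradient information" is approximated by a gradient-times-input saliency score, which is a swapped-in simplification of whatever gradient signal a real target model would supply.

```python
import math

# Hypothetical toy model: one scalar weight per known sub-word (unknown pieces weigh 0).
# Positive weights push toward label 1, negative weights toward label 0.
VOCAB = {"the": 0.0, "movie": -0.1, "was": 0.0, "great": 2.0, "bor": -1.0, "ing": -0.5}

# Hypothetical natural typo rule: replace a character with an adjacent QWERTY key.
NEIGHBORS = {"g": "f", "r": "t", "e": "w", "a": "s", "t": "r", "m": "n", "o": "p"}


def segment(document):
    """Split each word into sub-word pieces by greedy longest match against VOCAB."""
    words = []
    for word in document.lower().split():
        pieces, start = [], 0
        while start < len(word):
            end = len(word)
            # Shrink the candidate until it is in VOCAB or is a single character.
            while end > start + 1 and word[start:end] not in VOCAB:
                end -= 1
            pieces.append(word[start:end])
            start = end
        words.append(pieces)
    return words


def predict_prob(words):
    """Probability of label 1 under the toy logistic model."""
    logit = sum(VOCAB.get(p, 0.0) for pieces in words for p in pieces)
    return 1.0 / (1.0 + math.exp(-logit))


def saliency(words, label):
    """Gradient-times-input score per piece; d(loss)/d(logit) is (prob - label)."""
    err = predict_prob(words) - label
    return [[abs(err * VOCAB.get(p, 0.0)) for p in pieces] for pieces in words]


def apply_typo(piece):
    """Apply the typo rule to the first character that has a listed neighbor."""
    for i, ch in enumerate(piece):
        if ch in NEIGHBORS:
            return piece[:i] + NEIGHBORS[ch] + piece[i + 1:]
    return piece  # no applicable rule: leave the piece unchanged


def make_test_document(document, label):
    """Insert one typo into the highest-saliency sub-word of a non-empty document."""
    words = segment(document)
    scores = saliency(words, label)
    positions = [(w, p) for w, row in enumerate(scores) for p in range(len(row))]
    w, p = max(positions, key=lambda wp: scores[wp[0]][wp[1]])
    words[w][p] = apply_typo(words[w][p])
    return " ".join("".join(pieces) for pieces in words)


def evaluate(documents, labels):
    """Accuracy of the toy model on typo-perturbed documents vs. ground truth labels."""
    correct = 0
    for doc, label in zip(documents, labels):
        test_doc = make_test_document(doc, label)
        predicted = int(predict_prob(segment(test_doc)) >= 0.5)
        correct += int(predicted == label)
    return correct / len(documents)
```

Calling `evaluate(["the movie was great", "the movie was boring"], [1, 0])` perturbs the most salient sub-word of each document and reports the fraction of perturbed documents the toy model still labels correctly. A real evaluation would substitute the target neural network's tokenizer and back-propagated gradients for the hypothetical pieces used here.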