US 12,265,615 B2
Systems and methods for binary code analysis
Matthew W Anderson, Idaho Falls, ID (US); Matthew R Sgambati, Rigby, ID (US); and Brandon S Biggs, Idaho Falls, ID (US)
Assigned to Battelle Energy Alliance, LLC, Idaho Falls, ID (US)
Filed by Battelle Energy Alliance, LLC, Idaho Falls, ID (US)
Filed on May 4, 2021, as Appl. No. 17/308,006.
Prior Publication US 2022/0358214 A1, Nov. 10, 2022
Int. Cl. G06F 21/56 (2013.01); G06N 20/00 (2019.01)
CPC G06F 21/56 (2013.01) [G06N 20/00 (2019.01); G06F 2221/034 (2013.01)] 23 Claims
 
1. A method for binary code analysis, comprising:
generating human-readable code for a binary, the binary configured for execution on a high-performance computing system, wherein the human-readable code comprises a plurality of functional units, each functional unit comprising a respective instruction sequence, each instruction sequence comprising human-readable instructions derived from a respective function of the binary;
utilizing a machine-learned translation (MLT) model to translate instruction sequences of the human-readable code generated for the binary to semantic labels, wherein the MLT model is trained to translate human-readable code derived from functions of training binaries having known functional behaviors to semantic labels of a function classification language comprising a plurality of semantic labels, the function classification language comprising one or more semantic labels configured to characterize binary functions configured to implement unauthorized functionality; and
blocking execution of the binary on the high-performance computing system in response to translation of an instruction sequence of the human-readable code generated for the binary to a semantic label of the one or more semantic labels configured to characterize binary functions configured to implement unauthorized functionality.
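
The three steps of claim 1 (generating per-function instruction sequences, translating each sequence to a semantic label, and blocking on an unauthorized label) can be illustrated with a minimal sketch. This is not the patented implementation: `FunctionalUnit`, `toy_translate`, `analyze`, `should_block`, and the label names are all hypothetical, and the keyword rule in `toy_translate` is only a stand-in for a trained MLT model, which the claim does not specify in detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical subset of a function classification language.
# The claim does not enumerate labels; these names are assumptions.
UNAUTHORIZED_LABELS: Set[str] = {"crypto_mining"}


@dataclass
class FunctionalUnit:
    """Human-readable instructions derived from one function of a binary."""
    name: str
    instructions: List[str]


def toy_translate(unit: FunctionalUnit) -> str:
    """Stand-in for the MLT model: map an instruction sequence to a label.

    A real system would run a trained sequence translation model here;
    this keyword rule exists only to make the sketch runnable.
    """
    text = " ".join(unit.instructions)
    if "sha256rnds2" in text:
        return "crypto_mining"
    if "vfmadd" in text:
        return "linear_algebra"
    return "unknown"


def analyze(
    units: List[FunctionalUnit],
    translate: Callable[[FunctionalUnit], str] = toy_translate,
) -> Dict[str, str]:
    """Translate every functional unit to a semantic label."""
    return {unit.name: translate(unit) for unit in units}


def should_block(labels: Dict[str, str]) -> bool:
    """Block if any function was translated to an unauthorized label."""
    return any(label in UNAUTHORIZED_LABELS for label in labels.values())


if __name__ == "__main__":
    units = [
        FunctionalUnit("f1", ["vfmadd231pd ymm0, ymm1, ymm2", "ret"]),
        FunctionalUnit("f2", ["sha256rnds2 xmm1, xmm2", "ret"]),
    ]
    labels = analyze(units)
    print(labels, "block" if should_block(labels) else "allow")
```

Passing the translator as a parameter keeps the blocking decision independent of the model, so the keyword rule could be replaced by an actual trained model without changing `analyze` or `should_block`.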