US 12,093,399 B1
Vulnerability detection method and device for smart contract, and storage medium
Xiaoqi Li, Haikou (CN); Wenkai Li, Haikou (CN); Zekai Liu, Haikou (CN); and Hailu Kuang, Haikou (CN)
Assigned to HAINAN UNIVERSITY, Haikou (CN)
Filed by Hainan University, Haikou (CN)
Filed on Feb. 7, 2024, as Appl. No. 18/435,717.
Claims priority of application No. 202310584937.7 (CN), filed on May 23, 2023.
Int. Cl. G06F 21/57 (2013.01); G06F 21/64 (2013.01); G06N 3/045 (2023.01); G06N 3/084 (2023.01)
CPC G06F 21/577 (2013.01) [G06F 21/64 (2013.01); G06N 3/045 (2023.01); G06N 3/084 (2013.01); G06F 2221/034 (2013.01)]
5 Claims
 
1. A vulnerability detection method for a smart contract, comprising:
S1: constructing a control flowchart for the smart contract and collecting opcodes, operands, and opcodes in a static single assignment (SSA) form based on a call flow thereof;
S2: crawling application binary interfaces in a blockchain browser based on an address of the smart contract;
S3: using the opcodes and the operands as an input, and outputting function parameters;
S4: monitoring whether there are specified actions in functions to determine function attributes; and
S5: fusing the opcodes in the SSA form and the application binary interfaces or a concatenation form of the function parameters and the function attributes by an encoder, and obtaining existent vulnerability types by a decoder;
wherein in step S3, after the opcodes and the operands of function bodies are collected, the function parameters are inferred from the opcodes and the operands by using a sequence-to-sequence model and are provided to the encoder;
step S3 comprises:
step 31: performing one-hot encoding on all the opcodes and operands in the function bodies to serve as semantic information, using the semantic information as an input, and extracting semantic feature information through a first bidirectional long short-term memory (LSTM) network having a structure;
step 32: searching for weight information capable of contributing the most to a current output in a current hidden layer state through an attention mechanism, and outputting a feature hidden layer state after all time steps; and
step 33: constructing a second bidirectional LSTM network with the same structure as the first bidirectional LSTM network in step 31 to decode the feature hidden layer state, and obtaining the function parameters possibly occurring in an input function; and
step S5 comprises:
step 51: extracting the opcodes corresponding to the smart contract, then removing stack operation instructions, and finally obtaining the opcodes in the SSA form as a semantic expression of the smart contract;
step 52: obtaining hidden-layer semantic features of the smart contract from the semantic expression of the smart contract through a bidirectional gated recurrent unit (GRU) model; determining whether the smart contract makes the application binary interfaces publicly available; if the application binary interfaces are not publicly available, directly jumping to step 53; otherwise, obtaining the publicly available application binary interfaces from an Ethereum browser Etherscan, and jumping to step 56;
step 53: obtaining the function parameters of the functions in the smart contract by using a function parameter inference method, and summarizing the function attributes corresponding to each of the functions;
step 54: concatenating the function parameters and the function attributes of each of the functions in the smart contract to serve as function interface features of each of the functions;
step 55: combining the function interface features of all the functions in a position order of each of the functions in an opcode sequence to serve as function interface features of the smart contract, and jumping to step 57;
step 56: obtaining hidden-layer function interface features from the application binary interfaces and inferred function signature information, and if the function interface features in the feature hidden layer are obtained from the application binary interfaces, jumping to step 561, otherwise, jumping to step 562;
step 561: extracting the hidden-layer function interface features of the application binary interfaces from a two-dimensional perspective by using a convolutional neural network (CNN), and jumping to step 57;
step 562: for the interface features of each of the functions in a single smart contract, extracting local features of the single function in the single smart contract by using a one-dimensional CNN;
step 563: for the interface features of all the functions in the single smart contract, first compressing all function interface information by using a global average pooling layer, and then extracting global features of all the functions in the single smart contract by using the one-dimensional CNN;
step 564: adding the local features and the global features, then multiplying a result by the function interface features in the single smart contract after passing through an activation function to obtain the function interface features of the hidden layer, and jumping to step 57;
step 57: fusing the hidden-layer semantic features of the smart contract that are obtained in step 52 and the hidden-layer function interface features obtained in step 56 to serve as hidden-layer features of the smart contract; and
step 58: decoding the hidden-layer features of the smart contract that are obtained in step 57 by using the bidirectional GRU model, and obtaining the existent vulnerability types.
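
The feature fusion recited in steps 562 to 564 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not name the activation function, the kernel weights, or the axis along which the one-dimensional CNN runs, so the sketch assumes a sigmoid activation, fixed illustrative kernels, and convolution along the feature axis. The names conv1d_same, sigmoid, and fuse_interface_features are hypothetical, and a trained model would learn its kernels rather than take them as arguments.

```python
import math


def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding (output length == input length)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def sigmoid(x):
    # Assumed activation; the claim only says "an activation function".
    return 1.0 / (1.0 + math.exp(-x))


def fuse_interface_features(features, local_kernel, global_kernel):
    """features: one row of interface features per function in the contract."""
    # Step 562: local features of each single function via a 1-D convolution.
    local = [conv1d_same(row, local_kernel) for row in features]

    # Step 563: global average pooling over all functions, then a 1-D convolution.
    n_functions = len(features)
    pooled = [sum(column) / n_functions for column in zip(*features)]
    global_features = conv1d_same(pooled, global_kernel)

    # Step 564: add local and global features, apply the activation, and
    # multiply the result by the original interface features.
    return [
        [sigmoid(l + g) * f for l, g, f in zip(local_row, global_features, row)]
        for local_row, row in zip(local, features)
    ]


# Two functions with two interface features each; identity kernels for clarity.
features = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse_interface_features(features, [0.0, 1.0, 0.0], [0.0, 1.0, 0.0])
```

In this sketch the activated sum acts as a per-feature gate between 0 and 1, so an interface feature that is zero stays zero, while nonzero features are scaled by how strongly the local and global contexts agree.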