US 12,248,876 B1
Network threat intelligence relational triple combined extraction method based on deep learning
Wenli Shang, Guangzhou (CN); Bowen Wang, Guangzhou (CN); Haotian Shi, Guangzhou (CN); Zhiwei Chang, Guangzhou (CN); Meng Zhang, Guangzhou (CN); Hai Jie, Guangzhou (CN); Zhong Cao, Guangzhou (CN); Man Zhang, Guangzhou (CN); and Sha Huan, Guangzhou (CN)
Assigned to GUANGZHOU UNIVERSITY, Guangzhou (CN)
Filed by Guangzhou University, Guangzhou (CN)
Filed on Oct. 22, 2024, as Appl. No. 18/923,078.
Claims priority of application No. 202311494302.4 (CN), filed on Nov. 9, 2023.
Int. Cl. G06N 3/08 (2023.01); G06F 40/12 (2020.01); G06F 40/211 (2020.01); G06F 40/279 (2020.01); H04L 9/40 (2022.01)
CPC G06N 3/08 (2013.01) [G06F 40/12 (2020.01); G06F 40/211 (2020.01); G06F 40/279 (2020.01); H04L 63/14 (2013.01)]
10 Claims
 
1. A network threat intelligence relational triple combined extraction method based on deep learning, comprising the following steps:
step S1: converting input text information into a vector representation and acquiring semantic information: encoding entity type words and relation type words to acquire label words, and encoding a network threat intelligence sentence sequence to obtain a sentence representation sequence S;
step S2: acquiring time sequence information and a long-term dependency relation of each word segment in the sentence representation sequence S to obtain a sentence representation sequence T;
step S3: acquiring a syntactic dependency relation between the word segments of the sentence representation sequence T and a label of the syntactic dependency relation, acquiring syntactic dependency information of each word segment and embedding the same into the sentence representation sequence T, and performing calculation in combination with a dependency type attention score and a standard graph convolutional network (GCN) formula to obtain a sentence representation sequence H;
step S4: extracting an entity span: listing all possible spans, dividing the sentence representation sequence H into three portions by taking the span as a center and performing max pooling on the three portions respectively to integrate span features, acquiring interactive information among time sequence information of the span, the span, and an entity label, and filtering out redundant spans to obtain an entity span set L;
step S5: extracting a relational triple: listing all possible entity pairs according to the entity span set L, dividing the sentence representation sequence H into five portions by taking an entity span pair as a center and performing max pooling respectively on the three portions other than the entity span pair to integrate span features, acquiring interactive information among time sequence information of the entity span pair, the entity span pair, and a relation label, and inputting an entity span pair sequence into a co-predictor to obtain a relational triple set Ts;
step S6: performing iterative training on network threat intelligence to obtain a network threat intelligence relational triple combined extraction model; and
step S7: inputting massive network threat intelligence into the network threat intelligence relational triple combined extraction model to obtain a relational triple.
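
The span handling recited in steps S4 and S5 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the `max_len` width limit, the end-exclusive span convention, and the choice to pool an empty portion to zeros are all assumptions made for illustration. The sentence representation sequence H is taken to be a list of per-token feature vectors.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Span = Tuple[int, int]  # (start, end), end exclusive (assumed convention)


def max_pool(rows: Sequence[Vector], dim: int) -> Vector:
    """Element-wise max over token vectors; an empty portion pools to zeros (assumption)."""
    if not rows:
        return [0.0] * dim
    return [max(column) for column in zip(*rows)]


def enumerate_spans(n_tokens: int, max_len: int) -> List[Span]:
    """List every candidate span of width 1..max_len (max_len is a hypothetical limit)."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start + 1, min(start + max_len, n_tokens) + 1)
    ]


def span_features(H: Sequence[Vector], span: Span) -> Vector:
    """Step S4 sketch: split H into left context, span, right context; max-pool each part."""
    start, end = span
    dim = len(H[0])
    parts = (H[:start], H[start:end], H[end:])
    pooled: Vector = []
    for part in parts:
        pooled += max_pool(part, dim)
    return pooled


def pair_context_features(H: Sequence[Vector], first: Span, second: Span) -> Vector:
    """Step S5 sketch: five portions (left, span 1, middle, span 2, right);
    only the three portions other than the entity span pair are pooled."""
    (s1, e1), (s2, e2) = sorted([first, second])
    dim = len(H[0])
    parts = (H[:s1], H[e1:s2], H[e2:])
    pooled: Vector = []
    for part in parts:
        pooled += max_pool(part, dim)
    return pooled


# Toy sequence H: 4 tokens, 2 features each (made-up values).
H = [[0.1, 0.9], [0.5, 0.2], [0.7, 0.4], [0.3, 0.8]]
spans = enumerate_spans(len(H), max_len=2)
single = span_features(H, (1, 3))
pair = pair_context_features(H, (0, 1), (2, 3))
```

In this toy run, `spans` holds the seven candidate spans of width one or two, `single` concatenates the pooled left context, span, and right context, and `pair` concatenates only the three context portions around the two entity spans. The filtering of redundant spans, the label interaction, and the co-predictor are left out of the sketch.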