US 12,293,191 B2
Fine-grained image recognition method and apparatus using graph structure represented high-order relation discovery
Jia Li, Beijing (CN); Yifan Zhao, Beijing (CN); Dingfeng Shi, Beijing (CN); and Qinping Zhao, Beijing (CN)
Assigned to BEIHANG UNIVERSITY, Beijing (CN)
Filed by BEIHANG UNIVERSITY, Beijing (CN)
Filed on Dec. 9, 2021, as Appl. No. 17/546,993.
Claims priority of application No. 202110567940.9 (CN), filed on May 24, 2021.
Prior Publication US 2022/0382553 A1, Dec. 1, 2022
Int. Cl. G06F 9/38 (2018.01); G06F 9/30 (2018.01); G06N 3/02 (2006.01)
CPC G06F 9/3836 (2013.01) [G06F 9/30036 (2013.01); G06N 3/02 (2013.01)]
20 Claims
 
1. A fine-grained image recognition method using graph structure represented high-order relation discovery, comprising:
inputting an image to be classified into a convolutional neural network feature extractor with multiple stages, and extracting two layers of network feature graphs Xi and Yi in the last stage;
constructing a hybrid high-order attention module enhanced by a space-gated network according to the network feature graphs Xi and Yi and forming a high-order feature vector pool according to the hybrid high-order attention module;
using each vector in the high-order feature vector pool as a node to construct a graph neural network, and utilizing semantic similarity among high-order features to form representative vector nodes in groups; and
performing global pooling on the representative vector nodes to obtain classification vectors, and obtaining a fine-grained classification result through a fully connected layer and a classifier based on the classification vectors.
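
The four steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the claim does not specify the operators, so everything here is an assumption made for illustration. Random arrays stand in for the two last-stage feature maps Xi and Yi. A sigmoid gate over spatial positions stands in for the space-gated network. A gated bilinear product X^T(gY) stands in for the hybrid high-order attention. A softmax soft assignment stands in for the grouping into representative nodes. All function names, parameter names, and dimensions (high_order_pool, group_nodes, classify, w_gate, W_embed, W_assign, W_fc, b_fc, N, C, D, K) are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def high_order_pool(X, Y, w_gate):
    """Step 2 (assumed form): gated second-order statistics of X and Y.

    X, Y: (N, C) feature maps flattened over N = H*W spatial positions.
    Returns a (C, C) array; each row is one vector of the feature pool.
    """
    gate = sigmoid(X @ w_gate)              # (N, 1) weight per position
    return X.T @ (gate * Y) / X.shape[0]    # (C, C) cross-layer relation

def group_nodes(pool, W_embed, W_assign):
    """Step 3 (assumed form): similarity graph, one propagation, grouping."""
    unit = pool / (np.linalg.norm(pool, axis=1, keepdims=True) + 1e-8)
    adjacency = softmax(unit @ unit.T, axis=1)             # (C, C) from cosine similarity
    hidden = np.maximum(adjacency @ pool @ W_embed, 0.0)   # (C, D) after ReLU
    assign = softmax(hidden @ W_assign, axis=1)            # (C, K) soft group membership
    return assign.T @ hidden                               # (K, D) representative nodes

def classify(X, Y, params):
    """Steps 2 to 4: pool -> graph grouping -> global pooling -> FC + softmax."""
    pool = high_order_pool(X, Y, params["w_gate"])
    reps = group_nodes(pool, params["W_embed"], params["W_assign"])
    vec = reps.mean(axis=0)                                # global average pooling, (D,)
    logits = vec @ params["W_fc"] + params["b_fc"]         # fully connected layer
    return softmax(logits)                                 # class probabilities

# Hypothetical sizes: 7x7 spatial grid, 16 channels, 8-dim nodes, 4 groups, 5 classes.
N, C, D, K, num_classes = 49, 16, 8, 4, 5
rng = np.random.default_rng(0)

# Step 1 placeholder: random arrays in place of the CNN's two last-stage feature maps.
X = rng.standard_normal((N, C))
Y = rng.standard_normal((N, C))

# Untrained random weights; a real system would learn these end to end.
params = {
    "w_gate": 0.1 * rng.standard_normal((C, 1)),
    "W_embed": 0.1 * rng.standard_normal((C, D)),
    "W_assign": 0.1 * rng.standard_normal((D, K)),
    "W_fc": 0.1 * rng.standard_normal((D, num_classes)),
    "b_fc": np.zeros(num_classes),
}

probs = classify(X, Y, params)
predicted_class = int(probs.argmax())
```

With untrained weights the predicted class is meaningless. The sketch only traces how the shapes move through the claimed pipeline: (N, C) feature maps, a pool of C vectors, K representative nodes, one classification vector, and a probability per class.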
 
10. A fine-grained image recognition device using graph structure represented high-order relation discovery, comprising:
at least one processor; and
a memory storing computer-executable instructions, wherein the computer-executable instructions are executed by the at least one processor, to enable the at least one processor to:
input an image to be classified into a convolutional neural network feature extractor with multiple stages, and extract two layers of network feature graphs Xi and Yi in the last stage;
construct a hybrid high-order attention module enhanced by a space-gated network according to the two layers of network feature graphs Xi and Yi, and form a high-order feature vector pool according to the hybrid high-order attention module;
use each vector in the high-order feature vector pool as a node to construct a graph neural network, and utilize a semantic similarity among high-order features to form representative vector nodes in groups; and
perform global pooling on the representative vector nodes to obtain classification vectors, and obtain a fine-grained classification result through a fully connected layer and a classifier based on the classification vectors.
 
19. A non-transitory computer-readable storage medium storing a computer program, wherein the computer program is executed by a computer to:
input an image to be classified into a convolutional neural network feature extractor with multiple stages, and extract two layers of network feature graphs Xi and Yi in the last stage;
construct a hybrid high-order attention module enhanced by a space-gated network according to the two layers of network feature graphs Xi and Yi, and form a high-order feature vector pool according to the hybrid high-order attention module;
use each vector in the high-order feature vector pool as a node to construct a graph neural network, and utilize a semantic similarity among high-order features to form representative vector nodes in groups; and
perform global pooling on the representative vector nodes to obtain classification vectors, and obtain a fine-grained classification result through a fully connected layer and a classifier based on the classification vectors.