CPC G06V 10/778 (2022.01) [G06V 10/751 (2022.01); G06V 10/761 (2022.01)] | 3 Claims |
1. A method for re-recognizing an object image based on a multi-feature information capture and correlation analysis, comprising:
a) collecting a plurality of object images to form an object image re-recognition database, labeling identifier (ID) information of an object image in the object image re-recognition database, and dividing the object image re-recognition database into a training set and a test set;
b) establishing an object image re-recognition model by using the multi-feature information capture and correlation analysis;
c) optimizing an objective function of the object image re-recognition model by using a cross-entropy loss function and a triplet loss function to obtain an optimized object image re-recognition model;
d) marking the object images with the ID information to obtain marked object images, inputting the marked object images into the optimized object image re-recognition model in step c) for training to obtain a trained object image re-recognition model and storing the trained object image re-recognition model;
e) inputting a to-be-retrieved object image into the trained object image re-recognition model in step d) to obtain a feature of a to-be-retrieved object; and
f) comparing the feature of the to-be-retrieved object with features of the object images in the test set and sorting comparison results by a similarity measurement;
wherein step b) comprises the following steps:
b-1) setting an image input network to two branch networks comprising a first feature branch network and a second feature branch network;
b-2) inputting an object image h in the training set into the first feature branch network, wherein $h \in$ [equation image not recovered]
b-3) inputting the object image h in the training set into the second feature branch network, wherein $h \in$ [equation image not recovered] wherein $h_i$ represents an embedding of an i-th block obtained through a Gaussian distribution initialization, and $i \in \{1, \ldots, n\}$; calculating an attention coefficient $a_i$ of the i-th block according to a formula $a_i = q^{T}\sigma(W_1 h_0 + W_2 h_i + W_3 h_a)$, wherein $q^{T}$ represents a weight, $\sigma$ represents the sigmoid function, $h_0$ represents a class marker, and $W_1$, $W_2$, and $W_3$ are weights; calculating a new embedding $h_l$ of each of the two-dimensional blocks according to a formula [equation image not recovered] and calculating a new class marker $h'_0$ according to a formula $h'_0 = W_4[h_0 \| h_1]$, wherein $W_4$ represents a weight;
b-4) taking the new class marker $h'_0$ and a sequence with an input size of $h_l \in$ [equation image not recovered] |
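
Step c) names a cross-entropy loss and a triplet loss but does not say how the two are combined. The sketch below is one plausible reading, not the claimed implementation: it assumes an unweighted sum, a Euclidean distance inside the triplet term, and a margin of 0.3. The function names and the margin value are all hypothetical.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample; `label` is the index of the true ID."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def euclidean(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

def total_loss(logits, label, anchor, positive, negative, margin=0.3):
    # Unweighted sum is an assumption; the claim does not give the weighting.
    return cross_entropy(logits, label) + triplet_loss(anchor, positive, negative, margin)
```

Here `positive` would be a feature of another image with the same ID as `anchor`, and `negative` a feature of an image with a different ID.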
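
Steps e) and f) describe retrieval as comparing the query feature with the test-set features and sorting by "a similarity measurement", without naming the measure. As an illustration only, the sketch below assumes cosine similarity and non-zero feature vectors; `rank_gallery` and the dictionary layout are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_gallery(query, gallery):
    """Score every gallery feature against the query, most similar first.

    `gallery` maps an image identifier to its feature vector.
    Returns a list of (identifier, similarity) pairs.
    """
    scored = [(image_id, cosine_similarity(query, feature))
              for image_id, feature in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A call such as `rank_gallery(query_feature, test_set_features)` would then put the closest test-set images at the front of the list.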
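
The attention coefficient in step b-3) can be evaluated directly from the stated formula, a_i = qᵀσ(W1·h0 + W2·hi + W3·ha). The sketch below is a literal transcription under two assumptions: the weights are plain nested lists, and `ha` is supplied by the caller, because its definition does not survive in the text above. It does not attempt the new-embedding formula, which is missing from the source.

```python
import math

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def attention_coefficient(q, W1, W2, W3, h0, hi, ha):
    """Return the scalar a_i = q^T sigmoid(W1 h0 + W2 hi + W3 ha)."""
    pre_activation = [a + b + c for a, b, c in
                      zip(matvec(W1, h0), matvec(W2, hi), matvec(W3, ha))]
    return sum(qk * sigmoid(p) for qk, p in zip(q, pre_activation))
```

With all weight matrices set to zero, every sigmoid evaluates to 0.5, so the coefficient reduces to half the sum of the entries of `q`, which is a convenient sanity check.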