CPC G06F 16/532 (2019.01) [G06F 16/51 (2019.01); G06F 16/538 (2019.01); G06F 16/55 (2019.01); G06F 16/5846 (2019.01); G06F 40/279 (2020.01); G06V 10/774 (2022.01); G06F 18/2148 (2023.01); G06F 18/2193 (2023.01); G06F 18/22 (2023.01)] | 12 Claims |
1. An apparatus comprising:
at least one processor configured to:
acquire a query image and a query text relating to a target object;
acquire candidate images of the target object, using the query text;
using the query image, identify from the candidate images a positive image containing a region demonstrating a similarity to the query image higher than or equal to a first threshold value, and identify a position of the region in the positive image;
output training data including the positive image, information representing the position of the region in the positive image, and a correct label based on the query text; and
train an object detection model for outputting information representing a position of the target object in an input image and the correct label, using the training data,
wherein
the candidate images include a first candidate image and a second candidate image, and
the at least one processor is further configured to:
acquire the first candidate image from a database storing an image group by conducting a search using the query text, and acquire the second candidate image by conducting a search using a query other than the query text;
identify, from the candidate images, a negative image containing no region demonstrating a similarity to the query image higher than or equal to a second threshold value; and
output negative data including the negative image and a label different from the correct label.
|
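The data-preparation steps of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the similarity measure (a sliding-window pixel difference), the toy grayscale image representation, the names `best_region` and `build_datasets`, the `"background"` label, the `"red mug"` query text, and the threshold values are all assumptions made for illustration. The database search and the training of the object detection model are omitted. The two candidate lists stand in for images retrieved with the query text and with a query other than the query text.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # toy grayscale image, pixel values 0..255
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def best_region(image: Image, query: Image) -> Tuple[float, Box]:
    """Slide the query over the image; return the highest similarity and its box.

    Similarity is 1 minus the normalized mean absolute pixel difference
    (an assumed measure; the claim does not specify one).
    """
    qh, qw = len(query), len(query[0])
    ih, iw = len(image), len(image[0])
    best_score, best_box = 0.0, (0, 0, qw, qh)
    for y in range(ih - qh + 1):
        for x in range(iw - qw + 1):
            diff = sum(
                abs(image[y + dy][x + dx] - query[dy][dx])
                for dy in range(qh)
                for dx in range(qw)
            )
            score = 1.0 - diff / (255.0 * qh * qw)
            if score > best_score:
                best_score, best_box = score, (x, y, qw, qh)
    return best_score, best_box


def build_datasets(
    candidates: List[Image],
    query_image: Image,
    query_text: str,
    first_threshold: float,
    second_threshold: float,
    negative_label: str = "background",  # hypothetical label differing from the correct label
) -> Tuple[List[Dict], List[Dict]]:
    """Split candidate images into training data (positives) and negative data."""
    training_data, negative_data = [], []
    for image in candidates:
        score, box = best_region(image, query_image)
        if score >= first_threshold:
            # Positive image: keep the region position and the label from the query text.
            training_data.append({"image": image, "box": box, "label": query_text})
        elif score < second_threshold:
            # Negative image: no region reaches the second threshold.
            negative_data.append({"image": image, "label": negative_label})
    return training_data, negative_data


# Toy data: a 2x2 query pattern and two 3x3 candidate images.
query_image = [[255, 0], [0, 255]]
first_candidates = [[[0, 0, 0], [0, 255, 0], [0, 0, 255]]]  # stand-in for the text search result
second_candidates = [[[0, 0, 0], [0, 0, 0], [0, 0, 0]]]     # stand-in for the other-query result

training_data, negative_data = build_datasets(
    first_candidates + second_candidates,
    query_image,
    query_text="red mug",
    first_threshold=0.9,
    second_threshold=0.6,
)
```

In this sketch a candidate whose best score falls between the two thresholds is placed in neither set, which is one possible reading of using two separate threshold values.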