US 11,741,153 B2
Training data acquisition apparatus, training apparatus, and training data acquiring method
Hiroo Saito, Kawasaki Kanagawa (JP); and Tomoyuki Shibata, Kawasaki Kanagawa (JP)
Assigned to Kabushiki Kaisha Toshiba, Tokyo (JP)
Filed by KABUSHIKI KAISHA TOSHIBA, Tokyo (JP)
Filed on Feb. 26, 2021, as Appl. No. 17/249,359.
Claims priority of application No. 2020-143678 (JP), filed on Aug. 27, 2020.
Prior Publication US 2022/0067081 A1, Mar. 3, 2022
Int. Cl. G06F 16/532 (2019.01); G06V 10/774 (2022.01); G06F 40/279 (2020.01); G06F 16/538 (2019.01); G06F 16/51 (2019.01); G06F 16/55 (2019.01); G06F 16/583 (2019.01); G06F 18/22 (2023.01); G06F 18/214 (2023.01); G06F 18/21 (2023.01)
CPC G06F 16/532 (2019.01) [G06F 16/51 (2019.01); G06F 16/538 (2019.01); G06F 16/55 (2019.01); G06F 16/5846 (2019.01); G06F 40/279 (2020.01); G06V 10/774 (2022.01); G06F 18/2148 (2023.01); G06F 18/2193 (2023.01); G06F 18/22 (2023.01)]
12 Claims
 
1. An apparatus comprising:
at least one processor configured to:
acquire a query image and a query text relating to a target object;
acquire candidate images of the target object, using the query text;
using the query image, identify from the candidate images a positive image containing a region demonstrating a similarity to the query image higher than or equal to a first threshold value, and identify a position of the region in the positive image;
output training data including the positive image, information representing the position of the region in the positive image, and a correct label based on the query text; and
train an object detection model for outputting information representing a position of the target object in an input image and the correct label, using the training data,
wherein
the candidate images include a first candidate image and a second candidate image, and
the at least one processor is further configured to:
acquire the first candidate image from a database storing an image group by conducting a search using the query text, and acquire the second candidate image by conducting a search using a query other than the query text;
identify, from the candidate images, a negative image containing no region demonstrating a similarity to the query image higher than or equal to a second threshold value; and
output negative data including the negative image and a label different from the correct label.
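The selection of positive and negative images recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the candidate images have already been retrieved (by the query text and by some other query), that each candidate comes with precomputed region boxes and feature vectors, and that cosine similarity stands in for the unspecified similarity measure. The names `Region`, `Candidate`, `build_training_data`, and the `"background"` label are hypothetical, and training of the object detection model is left out.

```python
from dataclasses import dataclass
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Region:
    box: tuple      # hypothetical (x, y, width, height) of the region
    feature: list   # hypothetical precomputed feature vector of the region


@dataclass
class Candidate:
    name: str       # identifier of the candidate image
    regions: list   # list of Region objects found in the image


def build_training_data(query_feature, query_text, candidates,
                        first_threshold, second_threshold,
                        negative_label="background"):
    """Split candidates into positive training data and negative data."""
    positives, negatives = [], []
    for cand in candidates:
        # Score every region of the candidate against the query image feature.
        scored = [(cosine(query_feature, r.feature), r) for r in cand.regions]
        best = max(scored, key=lambda s: s[0], default=None)
        if best is not None and best[0] >= first_threshold:
            # Positive image: keep the region position and the correct label.
            positives.append({"image": cand.name,
                              "box": best[1].box,
                              "label": query_text})
        elif best is None or best[0] < second_threshold:
            # Negative image: no region reaches the second threshold.
            negatives.append({"image": cand.name, "label": negative_label})
        # Candidates between the two thresholds are used for neither set.
    return positives, negatives
```

In this sketch a candidate whose best region scores between the two thresholds is simply discarded, which is one possible reading of having separate first and second threshold values; the claim itself does not say how such images are handled.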