US 12,293,578 B2
Object detection method, object detection apparatus, and non-transitory computer-readable storage medium storing computer program
Hikaru Kurasawa, Matsumoto (JP)
Assigned to SEIKO EPSON CORPORATION, Tokyo (JP)
Filed by SEIKO EPSON CORPORATION, Tokyo (JP)
Filed on Nov. 24, 2021, as Appl. No. 17/535,141.
Claims priority of application No. 2020-194817 (JP), filed on Nov. 25, 2020.
Prior Publication US 2022/0164577 A1, May 26, 2022
Int. Cl. G06V 10/82 (2022.01); G06V 10/40 (2022.01); G06V 10/74 (2022.01); G06V 10/94 (2022.01); G06V 20/00 (2022.01)

CPC G06V 20/00 (2022.01) [G06V 10/40 (2022.01); G06V 10/74 (2022.01); G06V 10/82 (2022.01); G06V 10/95 (2022.01)]

4 Claims

1. An object detection method of causing one or more processors to detect an object from an input image, by using a vector neural network type machine learning model having a plurality of vector neuron layers, the vector neuron layers having different channels, kernel sizes and strides, the machine learning model being configured to, when a patch image having a predetermined size smaller than the input image is input to the machine learning model, output a determination value indicating that the patch image belongs to one of a plurality of classes, the object detection method comprising:

generating a similarity image by inputting the input image to the machine learning model, and obtaining a similarity from an output of at least one specific layer among the plurality of vector neuron layers for each pixel of the specific layer, the similarity indicating a degree of being similar to a feature of any class among the plurality of classes; and

generating a discriminant image including at least an unknown label by comparing the similarity of each pixel in the similarity image to a predetermined threshold value, and, when the similarity is less than the predetermined threshold value, assigning the unknown label to the pixel, wherein

the generating of the similarity image includes generating an output image from an output of the machine learning model in response to the input of the input image, the output image in which a known label indicating the class to which the output image belongs among the plurality of classes is assigned to each pixel,

the generating of the discriminant image includes setting the unknown label to some pixels of the output image with reference to the discriminant image,

two or more specific layers are provided,

the generating of the similarity image includes obtaining the similarity image for each of the two or more specific layers, and

the generating of the discriminant image includes

obtaining the discriminant image for each of the two or more specific layers, and

when the unknown label is assigned to a predetermined number of corresponding pixels among corresponding pixels of the discriminant image for each of the two or more specific layers for each pixel of the output image, setting the unknown label to the pixel of the output image.
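
For illustration only, the thresholding and multi-layer combination steps recited in claim 1 can be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation. The similarity images are taken as already computed, since the claim does not fix how similarity is obtained. The names `UNKNOWN`, `discriminant_image`, `resize_nearest`, and `apply_unknown` are hypothetical. The sketch also assumes a single threshold shared by all specific layers, nearest-neighbour resampling to find the "corresponding pixels" of layers whose resolution differs from the output image, and a reading of "a predetermined number" as "at least that number".

```python
import numpy as np

UNKNOWN = -1  # hypothetical label value standing in for the "unknown label"


def discriminant_image(similarity, threshold):
    """Boolean mask that is True where the similarity is below the threshold,
    i.e. where the unknown label would be assigned for this layer."""
    return np.asarray(similarity) < threshold


def resize_nearest(image, shape):
    """Nearest-neighbour resize of a 2-D array to `shape` (rows, cols).
    Assumption: this is how layer pixels are matched to output-image pixels."""
    rows = np.arange(shape[0]) * image.shape[0] // shape[0]
    cols = np.arange(shape[1]) * image.shape[1] // shape[1]
    return image[np.ix_(rows, cols)]


def apply_unknown(output_labels, similarity_images, threshold, min_votes):
    """Set UNKNOWN on output pixels flagged by at least `min_votes` layers."""
    output_labels = np.asarray(output_labels)
    votes = np.zeros(output_labels.shape, dtype=int)
    for similarity in similarity_images:
        mask = discriminant_image(similarity, threshold)
        votes += resize_nearest(mask, output_labels.shape)
    result = output_labels.copy()
    result[votes >= min_votes] = UNKNOWN
    return result


# Made-up example data: known labels and similarity images for two layers.
labels = np.array([[0, 1], [1, 0]])
sim_layer_a = np.array([[0.9, 0.2], [0.8, 0.1]])
sim_layer_b = np.array([[0.95, 0.3], [0.4, 0.7]])

result = apply_unknown(labels, [sim_layer_a, sim_layer_b],
                       threshold=0.5, min_votes=2)
# result.tolist() → [[0, -1], [1, 0]]
```

With `min_votes=2`, only the pixel flagged by both layers receives the unknown label. Lowering `min_votes` to 1 marks every pixel flagged by either layer.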