US 12,223,261 B2
Image processing apparatus, image processing method, and storage medium
Takashi Miyauchi, Inagi (JP)
Assigned to Canon Kabushiki Kaisha, Tokyo (JP)
Filed by CANON KABUSHIKI KAISHA, Tokyo (JP)
Filed on Mar. 5, 2021, as Appl. No. 17/193,683.
Claims priority of application No. 2020-043075 (JP), filed on Mar. 12, 2020; and application No. 2020-148383 (JP), filed on Sep. 3, 2020.
Prior Publication US 2021/0286991 A1, Sep. 16, 2021
Int. Cl. G06V 30/41 (2022.01); G06F 18/22 (2023.01); G06F 40/174 (2020.01); G06F 40/284 (2020.01); G06V 10/74 (2022.01); G06V 30/19 (2022.01); G06V 30/262 (2022.01); G06V 30/413 (2022.01); G06V 30/414 (2022.01); G06V 30/416 (2022.01); G06V 30/10 (2022.01)
CPC G06F 40/174 (2020.01) [G06F 18/22 (2023.01); G06F 40/284 (2020.01); G06V 10/761 (2022.01); G06V 30/19007 (2022.01); G06V 30/268 (2022.01); G06V 30/413 (2022.01); G06V 30/414 (2022.01); G06V 30/416 (2022.01); G06V 30/10 (2022.01)]
15 Claims
 
1. An image processing apparatus comprising:
at least one memory that stores instructions; and
at least one processor that executes the instructions to perform:
detecting text blocks in an input image;
determining a registered document corresponding to the input image among a plurality of registered documents;
determining a text block in the input image that corresponds to a processing target item, based on a partial layout defined in the determined registered document and including a first text block corresponding to the processing target item and at least one second text block present near the first text block; and
obtaining a character string corresponding to the processing target item by performing character recognition processing on the determined text block,
wherein the determination of the text block is performed by superimposing the partial layout at any of positions in a search range in the input image and deriving a matching degree based on a size of an area in which the text blocks included in the partial layout overlap the text blocks in the input image, and
wherein, in the determination of the text block, positions in a vertical direction used to determine the text block corresponding to the processing target item are determined based on differences between positions of the text blocks included in the partial layout in the vertical direction and positions of the text blocks in the search range in the vertical direction, and the matching degree at each of positions in a case where the partial layout is superimposed in a horizontal direction at each of the positions in the vertical direction in the search range is derived, and
wherein, in the determination of the text block,
candidate positions in the input image are determined, the candidate position being a position at which the matching degree is equal to or higher than a threshold, and
in a case where the number of the candidate positions is one, the candidate position is determined as a position used to perform the determination of the text block in the input image, or
in a case where the number of candidate positions is two or more, positions in the registered document are obtained as similar positions and the candidate positions and the similar positions are associated with one another to determine the position used to perform the determination of the text block in the input image, the similar position being a position at which a matching degree is equal to or higher than a threshold in a case where the text blocks included in the partial layout are superimposed at one of positions in the registered document, the matching degree being derived by the same method as a method of deriving the matching degree to determine the candidate positions.
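
The matching steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: text blocks are axis-aligned boxes given as (x, y, width, height); the matching degree is normalized as the overlapped area divided by the total area of the partial layout (the claim says only that it is based on the size of the overlapping area); input text blocks do not overlap one another; and, when there are two or more candidates, candidate positions and similar positions are associated by their top-to-bottom, left-to-right order (the claim does not state how the association is made). All function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), an assumed format
Hit = Tuple[float, float, float]         # (dx, dy, matching degree)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two axis-aligned boxes (0 if disjoint)."""
    w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def matching_degree(layout: List[Box], blocks: List[Box],
                    dx: float, dy: float) -> float:
    """Overlapped area / layout area with the layout shifted by (dx, dy).

    The normalization is an assumption of this sketch.
    """
    total = sum(w * h for _, _, w, h in layout)
    if total == 0:
        return 0.0
    covered = 0.0
    for x, y, w, h in layout:
        shifted = (x + dx, y + dy, w, h)
        covered += sum(overlap_area(shifted, b) for b in blocks)
    return covered / total


def find_positions(layout: List[Box], blocks: List[Box],
                   x_offsets: Iterable[float], threshold: float) -> List[Hit]:
    """Positions whose matching degree is at or above the threshold.

    Vertical offsets come from differences between the vertical positions of
    the layout blocks and of the blocks being searched; each vertical offset
    is then scanned in the horizontal direction.
    """
    xs = list(x_offsets)
    dys = sorted({b[1] - l[1] for l in layout for b in blocks})
    hits: List[Hit] = []
    for dy in dys:
        for dx in xs:
            degree = matching_degree(layout, blocks, dx, dy)
            if degree >= threshold:
                hits.append((dx, dy, degree))
    return hits


def resolve_position(candidates: List[Hit],
                     similar: List[Hit]) -> Optional[Tuple[float, float]]:
    """Pick the position to use; order-based association is an assumption."""
    if len(candidates) == 1:
        return candidates[0][:2]
    if not candidates or len(candidates) != len(similar):
        return None  # this sketch does not handle unequal counts

    def reading_order(p: Hit) -> Tuple[float, float]:
        return (p[1], p[0])

    cands = sorted(candidates, key=reading_order)
    sims = sorted(similar, key=reading_order)
    # The layout's own place in the registered document has offset (0, 0).
    idx = min(range(len(sims)), key=lambda i: abs(sims[i][0]) + abs(sims[i][1]))
    return cands[idx][:2]


# Registered document in which the partial layout (first two boxes) repeats.
registered = [(0, 0, 10, 2), (0, 5, 10, 2), (0, 30, 10, 2), (0, 35, 10, 2)]
layout = registered[:2]
# Input image: the same blocks displaced by (3, 20).
scanned = [(x + 3, y + 20, w, h) for x, y, w, h in registered]

candidates = find_positions(layout, scanned, range(0, 7), 0.95)
similar = find_positions(layout, registered, [0], 0.95)
chosen = resolve_position(candidates, similar)
```

In this example the repeated layout yields two candidate positions in the input image and two similar positions in the registered document, so the single-candidate branch does not apply and the order-based pairing selects the upper candidate. Deriving the similar positions with the same `find_positions` function mirrors the claim's requirement that both matching degrees be derived by the same method.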