US 11,842,534 B2
Information processing apparatus for obtaining character string
Masaya Soga, Kanagawa (JP)
Assigned to Canon Kabushiki Kaisha, Tokyo (JP)
Filed by CANON KABUSHIKI KAISHA, Tokyo (JP)
Filed on Mar. 24, 2021, as Appl. No. 17/211,659.
Claims priority of application No. 2020-063778 (JP), filed on Mar. 31, 2020.
Prior Publication US 2021/0303895 A1, Sep. 30, 2021
Int. Cl. G06N 20/00 (2019.01); G06N 5/04 (2023.01); G06V 10/96 (2022.01); H04N 1/00 (2006.01); G06F 18/22 (2023.01); G06F 18/28 (2023.01); G06V 30/12 (2022.01); G06V 10/94 (2022.01); G06V 30/412 (2022.01); H04N 1/04 (2006.01); G06V 30/10 (2022.01)
CPC G06V 10/96 (2022.01) [G06F 18/22 (2023.01); G06F 18/28 (2023.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01); G06V 10/945 (2022.01); G06V 30/12 (2022.01); G06V 30/412 (2022.01); H04N 1/00331 (2013.01); G06V 30/10 (2022.01); H04N 1/04 (2013.01); H04N 2201/0094 (2013.01)] 15 Claims
11. An information processing method to be performed by an information processing apparatus, the information processing method comprising:
obtaining a first character recognition result by performing character recognition processing on a text region in a first scan image; and
learning, if a part of a character string of the first character recognition result is deleted in setting attribute information about the first scan image, a regular expression based on the deleted character string,
wherein, if a character string of a second character recognition result obtained by performing a character recognition processing on a text region in a second scan image matches the learned regular expression, the second character recognition result is corrected by deleting a part matching the learned regular expression from the character string of the second character recognition result.
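The learn-then-correct flow recited in claim 11 can be illustrated with a minimal sketch. The claim does not say how the regular expression is derived from the deleted character string, so everything below is an assumption made for illustration: the helper names (find_deleted_part, learn_regex, correct) are hypothetical, the deletion is assumed to be one contiguous run of characters, and the generalization rule (runs of ASCII digits become \d+, runs of ASCII letters become [A-Za-z]+, every other character is kept as a literal) is only one possible choice.

```python
import re
from itertools import groupby

DIGITS = r"\d"
LETTERS = "[A-Za-z]"


def find_deleted_part(ocr_text, edited_text):
    """Return the single contiguous substring the user removed, or None.

    Assumption: the user only deleted one contiguous run of characters
    from the OCR result when setting the attribute (e.g. a file name).
    """
    if len(edited_text) >= len(ocr_text):
        return None
    # Length of the common prefix of the two strings.
    i = 0
    while i < len(edited_text) and ocr_text[i] == edited_text[i]:
        i += 1
    # Whatever is left of the edited text must be a suffix of the OCR text.
    tail = edited_text[i:]
    if not ocr_text.endswith(tail):
        return None
    return ocr_text[i:len(ocr_text) - len(tail)]


def _char_class(ch):
    """Map one character to a regex fragment (hypothetical rule)."""
    if ch.isascii() and ch.isdigit():
        return DIGITS
    if ch.isascii() and ch.isalpha():
        return LETTERS
    return re.escape(ch)


def learn_regex(deleted):
    """Generalize the deleted string into a regular expression."""
    parts = []
    for cls, run in groupby(_char_class(c) for c in deleted):
        n = len(list(run))
        if cls in (DIGITS, LETTERS):
            parts.append(cls + "+")   # any length of the same class
        else:
            parts.append(cls * n)     # literals are repeated verbatim
    return "".join(parts)


def correct(ocr_text, pattern):
    """Delete the first part of ocr_text matching the learned pattern."""
    if not pattern:
        return ocr_text
    return re.sub(pattern, "", ocr_text, count=1)


# First scan: the user trims the OCR result while setting an attribute.
deleted = find_deleted_part("Invoice No. 12345", "Invoice")
pattern = learn_regex(deleted)

# Second scan: the learned pattern is applied to a new OCR result.
corrected = correct("Estimate No. 987", pattern)
unchanged = correct("Estimate", pattern)
```

In this sketch the pattern learned from the first scan image removes the trailing number label from the second character recognition result, while a string that does not match the pattern is returned as is. A practical system would also need to handle several deletions in one edit and patterns that match too broadly, neither of which is addressed here.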