CPC G06V 30/413 (2022.01) [G06F 16/953 (2019.01); G06F 40/20 (2020.01); G06V 30/12 (2022.01); G06V 30/19093 (2022.01); G06V 30/412 (2022.01); G06V 30/414 (2022.01); G06V 30/416 (2022.01)] | 13 Claims |
1. An apparatus for data structuring of text, the apparatus comprising:
a processor; and
a memory storing instructions executable by the processor,
wherein the processor is configured to execute the instructions to:
extract text and location information of the text from an image based on an optical character recognition (OCR) technique;
generate a text unit based on the text and the location information;
classify a form of the image based on the text;
label the text unit as first text, second text, and third text respectively corresponding to an item name, an item value, and others based on the classified form of the image;
structure the text by mapping the second text corresponding to the item value and the first text corresponding to the item name; and
determine misrecognition of the first text and correct the first text determined to be misrecognized.
|
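The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the OCR step is assumed to have already produced tokens with bounding boxes, and every name here (`Token`, `FORM_ITEM_NAMES`, `build_text_units`, `classify_form`, `match_item_name`, `structure`) is hypothetical, as are the thresholds. Form classification is reduced to keyword counting, and misrecognition is detected by fuzzy matching against a per-form list of expected item names using the standard-library `difflib`.

```python
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass
class Token:
    """One OCR result: recognized text plus its bounding box (assumed layout)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


# Hypothetical per-form vocabulary of expected item names.
FORM_ITEM_NAMES = {
    "invoice": ["Invoice No", "Total", "Date"],
    "receipt": ["Store", "Amount", "Date"],
}


def build_text_units(tokens, x_gap=15.0, y_tol=5.0):
    """Group tokens into rows by y, then merge horizontally close neighbours."""
    rows = []
    for tok in sorted(tokens, key=lambda t: t.y):
        if rows and abs(tok.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(tok)
        else:
            rows.append([tok])
    units = []
    for row in rows:
        row.sort(key=lambda t: t.x)
        current = row[0]
        for tok in row[1:]:
            if tok.x - (current.x + current.w) <= x_gap:
                # Close enough: extend the current unit to cover this token.
                current = Token(current.text + " " + tok.text, current.x,
                                current.y, tok.x + tok.w - current.x,
                                max(current.h, tok.h))
            else:
                units.append(current)
                current = tok
        units.append(current)
    return units


def classify_form(units):
    """Pick the form whose expected item names appear most often in the text."""
    text = " ".join(u.text for u in units).lower()
    scores = {form: sum(name.lower() in text for name in names)
              for form, names in FORM_ITEM_NAMES.items()}
    return max(scores, key=scores.get)


def match_item_name(text, names, cutoff=0.75):
    """Return the closest expected item name, or None if nothing is close."""
    hits = get_close_matches(text, names, n=1, cutoff=cutoff)
    return hits[0] if hits else None


def structure(units, y_tol=5.0):
    """Label units, map each item value to its item name, correct misreads."""
    form = classify_form(units)
    names = FORM_ITEM_NAMES[form]
    labeled = [(u, match_item_name(u.text, names)) for u in units]
    record, corrections, used = {}, [], set()
    for unit, name in labeled:
        if name is None:
            continue
        if unit.text != name:
            # Item name judged misrecognized; keep the corrected spelling.
            corrections.append((unit.text, name))
        # Item value: nearest non-name unit to the right on the same row.
        candidates = [v for v, n in labeled
                      if n is None and abs(v.y - unit.y) <= y_tol and v.x > unit.x]
        if candidates:
            value = min(candidates, key=lambda v: v.x)
            record[name] = value.text
            used.add(id(value))
    others = [u.text for u, n in labeled if n is None and id(u) not in used]
    return form, record, others, corrections


if __name__ == "__main__":
    # Hand-written stand-in for OCR output; "Tota1" simulates a misread "Total".
    tokens = [
        Token("Invoice", 10, 10, 50, 10), Token("No", 65, 10, 20, 10),
        Token("INV-001", 200, 10, 60, 10),
        Token("Tota1", 10, 30, 40, 10), Token("42.00", 200, 30, 40, 10),
        Token("Date", 10, 50, 30, 10), Token("2024-01-05", 200, 50, 80, 10),
        Token("Thank", 10, 90, 40, 10), Token("you", 55, 90, 25, 10),
    ]
    print(structure(build_text_units(tokens)))
```

In this sketch the corrected item name is used as the key of the structured record, so a misread label such as "Tota1" is still mapped to its value under "Total". Units that are neither an item name nor a mapped item value fall into the "others" list, corresponding to the third text of the claim.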