US 12,266,202 B2
Method and apparatus for classifying document based on attention mechanism and semantic analysis
Tae Hyun Kim, Seoul (KR); Eunbin Kim, Seoul (KR); and Jung Kyu Kim, Seoul (KR)
Assigned to HYUNDAI MOBIS CO., LTD., Seoul (KR)
Filed by HYUNDAI MOBIS CO., LTD., Seoul (KR)
Filed on Oct. 20, 2021, as Appl. No. 17/505,979.
Claims priority of application No. 10-2021-0094696 (KR), filed on Jul. 20, 2021.
Prior Publication US 2023/0027526 A1, Jan. 26, 2023
Int. Cl. G06V 30/413 (2022.01); G06F 16/2458 (2019.01); G06F 16/33 (2019.01); G06F 16/334 (2025.01); G06F 16/35 (2019.01); G06F 18/214 (2023.01); G06F 18/22 (2023.01); G06F 40/30 (2020.01); G06N 3/045 (2023.01); G06N 3/0455 (2023.01); G06N 3/088 (2023.01); G06V 30/262 (2022.01); G06V 30/416 (2022.01)
CPC G06V 30/413 (2022.01) [G06F 16/3347 (2019.01); G06F 18/2155 (2023.01); G06F 18/22 (2023.01); G06F 40/30 (2020.01); G06N 3/0455 (2023.01); G06N 3/088 (2013.01); G06V 30/274 (2022.01); G06V 30/416 (2022.01); G06F 16/2462 (2019.01); G06F 16/35 (2019.01); G06N 3/045 (2023.01)]
19 Claims
1. A method of operating a system for classifying a document, the method comprising:
obtaining a plurality of word embeddings from a plurality of words constituting a plurality of sentences included in the document;
providing, to a semantic analysis model, the plurality of word embeddings, wherein the semantic analysis model generates, based on the plurality of word embeddings, a plurality of document features representing the document, the plurality of document features including a keyword similarity and a sentence similarity;
extracting, from the semantic analysis model, the plurality of document features;
providing, to an inference model, the plurality of word embeddings and the plurality of document features, wherein the inference model evaluates the document based on the plurality of word embeddings and the plurality of document features and generates an evaluation result of the document;
extracting, from the inference model, the evaluation result; and
outputting the evaluation result,
wherein, for generating the evaluation result of the document, the inference model performs:
generating, using a hierarchical attention network (HAN), a document vector from the plurality of word embeddings;
concatenating the keyword similarity and sentence similarity with the document vector to generate a concatenated vector; and
generating, using a fully connected layer, the evaluation result of the document based on the concatenated vector.
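
The data flow recited in claim 1 can be illustrated with a minimal sketch in plain Python. It is not the patented implementation, and several parts are assumptions made only for illustration. The claim does not say how the keyword similarity and the sentence similarity are computed, so the sketch uses cosine similarity against a hypothetical `reference` vector. The hierarchical attention network is reduced to two levels of dot-product attention pooling (words to sentence vectors, sentence vectors to a document vector), leaving out the recurrent encoders a full HAN typically includes. All names (`document_features`, `evaluate`, `word_ctx`, `sent_ctx`, `fc_weights`, `fc_bias`) and the random, untrained weights are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    """Weighted sum of vectors; weights are a softmax over dot products with a context vector."""
    weights = softmax([dot(v, context) for v in vectors])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(len(vectors[0]))]

def mean(vectors):
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(vectors[0]))]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def document_features(doc, reference):
    """ASSUMED stand-in for the semantic analysis model: returns (keyword_similarity, sentence_similarity)."""
    words = [w for sentence in doc for w in sentence]
    keyword_similarity = max(cosine(w, reference) for w in words)
    sentence_similarity = sum(cosine(mean(s), reference) for s in doc) / len(doc)
    return keyword_similarity, sentence_similarity

def evaluate(doc, reference, word_ctx, sent_ctx, fc_weights, fc_bias):
    """Inference step: simplified HAN -> concatenate the two similarities -> fully connected layer."""
    sentence_vectors = [attention_pool(sentence, word_ctx) for sentence in doc]  # word-level attention
    document_vector = attention_pool(sentence_vectors, sent_ctx)                 # sentence-level attention
    keyword_similarity, sentence_similarity = document_features(doc, reference)
    concatenated = document_vector + [keyword_similarity, sentence_similarity]
    logit = dot(fc_weights, concatenated) + fc_bias  # single-output fully connected layer
    score = 1.0 / (1.0 + math.exp(-logit))           # ASSUMED sigmoid output as the evaluation result
    return score, concatenated

# Usage with random stand-in embeddings: 3 sentences of 4 words, embedding size 8.
random.seed(0)
DIM = 8

def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

doc = [[rand_vec(DIM) for _ in range(4)] for _ in range(3)]
score, concatenated = evaluate(
    doc,
    reference=rand_vec(DIM),
    word_ctx=rand_vec(DIM),
    sent_ctx=rand_vec(DIM),
    fc_weights=rand_vec(DIM + 2),
    fc_bias=0.0,
)
print(len(concatenated), score)
```

The concatenated vector has the document vector's dimensionality plus two entries, one per similarity feature, so the fully connected layer sees both the learned representation and the explicitly extracted document features.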