US 12,282,847 B2
Method and system for extraction and annotation using semantic attribute paths
Salah Ait-Mokhtar, Montbonnot-Saint-Martin (FR); Caroline Brun, Grenoble (FR); and Agnes Sandor, Meylan (FR)
Assigned to NAVER CORPORATION, Seongnam-si (KR)
Filed by NAVER CORPORATION, Seongnam-si (KR)
Filed on Apr. 19, 2021, as Appl. No. 17/234,011.
Claims priority of application No. 20305634 (EP), filed on Jun. 10, 2020.
Prior Publication US 2021/0390395 A1, Dec. 16, 2021
Int. Cl. G06N 3/08 (2023.01); G06F 16/2457 (2019.01); G06F 16/28 (2019.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06N 3/04 (2023.01); G06F 3/04842 (2022.01)
CPC G06N 3/08 (2013.01) [G06F 16/24578 (2019.01); G06F 16/285 (2019.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G06N 3/04 (2013.01); G06F 3/04842 (2013.01)]
20 Claims
 
1. An interactive annotation method for assisting a user in creation of annotated machine learning training data, the annotation method comprising:
providing, using a processor, a sequence of tokens to be annotated for display to a user in a graphical annotation interface, wherein the tokens comprise text;
receiving a span selection from the user via the graphical annotation interface, the span selection comprising one or more tokens from the sequence of tokens, wherein the user interacts with the graphical annotation interface using one or more selection devices;
in response to said received span selection, computing, by an artificial neural network using the processor, class probabilities for each token of the sequence of tokens, the class probabilities for a token corresponding to probabilities for the token to fall under respective classes of a predefined ontology;
computing, using the processor, scores for semantic attribute paths of the span selection, the scores for the semantic attribute paths being based on the class probabilities, wherein the semantic attribute paths correspond to paths in the predefined ontology;
providing a set of semantic attribute paths for the span selection for display to the user via the graphical annotation interface as an interactive graphical element;
receiving a user selection of a semantic attribute path from the set of displayed semantic attribute paths via the graphical annotation interface, wherein the user interacts with the graphical annotation interface using one or more selection devices; and
in response to said received user selection of a semantic attribute path, storing the sequence of tokens and the selected semantic attribute path for the span selection in the annotated machine learning training data.
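
The scoring, ranking, and storing steps recited in claim 1 can be illustrated with a minimal sketch. The claim only requires that path scores be "based on the class probabilities"; it does not fix a formula. The sketch below therefore makes several assumptions that are not taken from the patent: each class probability is treated as an independent per-token value, a path's score is the mean over the selected tokens of the product of the probabilities of the classes along the path, and the toy ontology, the probability values, and the names `score_paths` and `store_annotation` are all hypothetical. The neural network itself is not modeled; its output is represented by a hand-written table of probabilities.

```python
from typing import Dict, List, Sequence, Tuple

Path = Tuple[str, ...]

# Hypothetical toy ontology: each semantic attribute path is a tuple of class names.
ONTOLOGY_PATHS: List[Path] = [
    ("Restaurant", "Food", "Quality"),
    ("Restaurant", "Service"),
    ("Restaurant", "Ambience"),
]


def score_paths(
    class_probs: Sequence[Dict[str, float]],
    span: Tuple[int, int],
    paths: Sequence[Path],
) -> List[Tuple[Path, float]]:
    """Score every ontology path for a span and return them best-first.

    class_probs[i] maps class name -> probability for token i.
    span is (start, end) with end exclusive.
    Assumed scoring rule (not from the patent): mean over the span tokens
    of the product of the probabilities of the classes on the path.
    """
    start, end = span
    if not 0 <= start < end <= len(class_probs):
        raise ValueError("span must select at least one token of the sequence")
    scored = []
    for path in paths:
        token_scores = []
        for probs in class_probs[start:end]:
            p = 1.0
            for cls in path:
                p *= probs.get(cls, 0.0)  # a class missing from the table counts as 0
            token_scores.append(p)
        scored.append((path, sum(token_scores) / len(token_scores)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


def store_annotation(
    training_data: List[dict],
    tokens: Sequence[str],
    span: Tuple[int, int],
    path: Path,
) -> None:
    """Append the token sequence and the user-selected path for the span."""
    training_data.append(
        {"tokens": list(tokens), "span": list(span), "path": list(path)}
    )


# Example data (invented): probabilities a model might assign to each token.
tokens = ["The", "pasta", "was", "great"]
low = {"Restaurant": 0.1, "Food": 0.1, "Quality": 0.1, "Service": 0.1, "Ambience": 0.1}
class_probs = [
    low,
    {"Restaurant": 0.9, "Food": 0.8, "Quality": 0.5, "Service": 0.1, "Ambience": 0.05},
    low,
    low,
]

span = (1, 2)                      # the user selected the token "pasta"
ranked = score_paths(class_probs, span, ONTOLOGY_PATHS)
training_data: List[dict] = []
store_annotation(training_data, tokens, span, ranked[0][0])  # user picks the top path
```

A product of probabilities tends to favor shorter paths, so a practical system might normalize by path length or use a different aggregation; the choice here is only for illustration.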