US 11,868,726 B2
Named-entity extraction apparatus, method, and non-transitory computer readable storage medium
Yoshikata Tobita, Nishitokyo (JP); and Masaru Suzuki, Kawasaki (JP)
Assigned to KABUSHIKI KAISHA TOSHIBA, Tokyo (JP); and Toshiba Digital Solutions Corporation, Kawasaki (JP)
Filed by KABUSHIKI KAISHA TOSHIBA, Tokyo (JP); and Toshiba Digital Solutions Corporation, Kawasaki (JP)
Filed on Mar. 16, 2021, as Appl. No. 17/202,752.
Application 17/202,752 is a continuation of application No. PCT/JP2019/037915, filed on Sep. 26, 2019.
Claims priority of application No. 2018-183861 (JP), filed on Sep. 28, 2018.
Prior Publication US 2021/0200953 A1, Jul. 1, 2021
Int. Cl. G06F 40/295 (2020.01); G06F 16/93 (2019.01); G06N 20/00 (2019.01); G06F 40/166 (2020.01); G06F 40/242 (2020.01)
CPC G06F 40/295 (2020.01) [G06F 16/93 (2019.01); G06F 40/166 (2020.01); G06F 40/242 (2020.01); G06N 20/00 (2019.01)]
7 Claims
 
1. A named-entity extraction apparatus, comprising:
a first storage device that stores an extraction dictionary used when named entities of document data and relations between named entities are extracted from the document data;
a document receiving unit that receives input of extraction document data from which the named entities and the relations are extracted, and input of learning document data used for learning of the extraction dictionary;
an extraction unit that extracts, using the extraction dictionary, the named entities and the relations between named entities from the extraction document data received by the document receiving unit;
a designation unit that designates character strings corresponding to the named entities extracted by the extraction unit among character strings in the learning document data received by the document receiving unit;
a second storage device that stores a relation extraction rule in which relations between categories of named entities extracted from the extraction document data are defined;
a generator that generates, by applying the relation extraction rule stored in the second storage device, a learning document in which relations between named entities belonging to the categories defined by the relation extraction rule among the named entities designated by the designation unit are set; and
a learning unit that learns the extraction dictionary based on the learning document generated by the generator.
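
The designation and generation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names Entity, designate, and generate_learning_document, the string-matching strategy, and the representation of the relation extraction rule as a mapping from category pairs to a relation label are all assumptions made for illustration. The sketch also pairs every designated entity in the document with every other one, whereas the claim does not specify the pairing scope, and it omits the learning unit entirely.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Entity:
    """A designated character string with its category and span (hypothetical type)."""
    text: str
    category: str
    start: int
    end: int


def designate(learning_text: str, extracted: dict[str, str]) -> list[Entity]:
    """Designation step (assumed: exact substring matching).

    `extracted` maps each named-entity string already extracted from the
    extraction document to its category. Every occurrence of such a string
    in the learning text is designated.
    """
    found = []
    for text, category in extracted.items():
        pos = learning_text.find(text)
        while pos != -1:
            found.append(Entity(text, category, pos, pos + len(text)))
            pos = learning_text.find(text, pos + 1)
    return sorted(found, key=lambda e: e.start)


def generate_learning_document(
    entities: list[Entity],
    rule: dict[tuple[str, str], str],
) -> list[tuple[Entity, str, Entity]]:
    """Generation step (assumed rule format: (category, category) -> label).

    For each ordered pair of designated entities whose categories appear in
    the relation extraction rule, set a relation with the rule's label.
    """
    relations = []
    for source, target in permutations(entities, 2):
        label = rule.get((source.category, target.category))
        if label is not None:
            relations.append((source, label, target))
    return relations


if __name__ == "__main__":
    # Illustrative data only; not taken from the patent.
    extracted = {"pump P-101": "Equipment", "seal leakage": "Failure"}
    rule = {("Equipment", "Failure"): "has_failure"}
    text = "Inspection found seal leakage on pump P-101."

    for source, label, target in generate_learning_document(designate(text, extracted), rule):
        print(source.text, label, target.text)
```

The annotated relations returned by generate_learning_document stand in for the "learning document" of the claim; a learning unit would consume them to update the extraction dictionary.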