US 12,481,692 B2
Method, electronic apparatus, and computer-readable storage medium for identifying entity relationship pairs
Feng Hong, Beijing (CN); Min Huang, Beijing (CN); Weijie Zhou, Beijing (CN); Shanliang Xiong, Beijing (CN); Wenbi Cai, Beijing (CN); and Youpeng Wei, Beijing (CN)
Assigned to Beijing Hydrophis Network Technology Co., Ltd., Beijing (CN)
Filed by Beijing Hydrophis Network Technology Co., Ltd., Beijing (CN)
Filed on Mar. 8, 2024, as Appl. No. 18/599,543.
Claims priority of application No. 202310276088.9 (CN), filed on Mar. 20, 2023.
Prior Publication US 2024/0320253 A1, Sep. 26, 2024
Int. Cl. G06F 16/353 (2025.01); G06F 16/16 (2019.01); G06F 16/334 (2025.01)
CPC G06F 16/353 (2019.01) [G06F 16/3344 (2019.01)]
14 Claims
 
1. A method for identifying entity relationship pairs, wherein the method comprises:
acquiring a service text set, classifying service texts in the service text set based on text categories of the service texts to obtain a classified text set;
performing entity identification and relationship identification on classified texts in the classified text set to obtain an entity set and a relationship set;
constructing a positive example sample set and a negative example sample set based on the entity set and the relationship set; and
re-sampling the negative example sample set based on the positive example sample set to obtain a target entity relationship pair;
wherein the classifying service texts in the service text set based on text categories of the service texts to obtain a classified text set comprises:
traversing the service texts in the service text set and calculating a number of words in each traversed service text;
when the number of the words meets a preset word threshold, determining the corresponding traversed service text as a short text, and classifying the short text using a pre-built short text classification model;
when the number of the words does not meet the preset word threshold, determining the corresponding traversed service text as a long text, and classifying the long text using a pre-built long text classification model; and
summarizing all the classified service texts to obtain the classified text set;
wherein the performing entity identification and relationship identification on the classified texts in the classified text set to obtain an entity set and a relationship set comprises:
vectorizing the classified texts in the classified text set using a pre-built BERT model to obtain sentence feature vectors;
performing entity identification and extraction on the sentence feature vectors using a pre-built entity identification model to obtain the entity set;
determining a keyword from the classified texts based on a preset number of entities in the entity set, and determining a relationship between different numbers of entities based on a preset attribute of the keyword, and summarizing all identified relationships to obtain the relationship set.
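The length-based classification step recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that "meets the preset word threshold" means a word count at or below the threshold, that words are separated by whitespace (Chinese text would need a word segmenter instead), and that the two pre-built classification models are plain callables mapping a text to a category label. The names `classify_service_texts`, `short_model`, `long_model`, and `word_threshold` are hypothetical and do not appear in the patent.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a pre-built classification model:
# any callable that maps a text to a category label.
Classifier = Callable[[str], str]


def classify_service_texts(
    texts: List[str],
    short_model: Classifier,
    long_model: Classifier,
    word_threshold: int = 20,  # assumed value; the claim only says "preset"
) -> List[Dict[str, str]]:
    """Route each service text to a short-text or long-text classifier."""
    classified: List[Dict[str, str]] = []
    for text in texts:  # traverse the service text set
        # Assumption: whitespace tokenization approximates the word count.
        n_words = len(text.split())
        # Assumption: "meets the threshold" means n_words <= word_threshold.
        if n_words <= word_threshold:
            kind, category = "short", short_model(text)
        else:
            kind, category = "long", long_model(text)
        classified.append({"text": text, "kind": kind, "category": category})
    # The accumulated list plays the role of the classified text set.
    return classified
```

In this sketch the two models are injected as parameters, so any trained short-text or long-text classifier could be substituted without changing the routing logic.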