| CPC G06N 5/02 (2013.01) [G06F 16/288 (2019.01)] | 12 Claims |

|
1. A method for knowledge graph construction, comprising:
identifying, through a processing device and by using a web page parser, an entity concept from a title text of a target web page and at least one entity corresponding to the entity concept from a body text of the target web page;
constructing, through the processing device, a syntax parse tree of the title text based on syntax parse rules of a language to which the title text belongs, and determining, from the syntax parse tree, a modifier for modifying the entity concept through the processing device; and
generating, through the processing device, a knowledge graph based on the entity concept, the modifier, and the at least one entity;
wherein identifying, through the processing device, the at least one entity corresponding to the entity concept from the body text of the target web page comprises:
after obtaining page source code of the target web page, generating, through the processing device, a coding label tree corresponding to the page source code based on encoding labels in the page source code;
determining, from the coding label tree, a plurality of target encoding label subtrees having a similarity greater than a predetermined threshold through the processing device; and
for each of the target encoding label subtrees, determining, through the processing device, the entity from a body text segment corresponding to the target encoding label subtree;
wherein a text pattern of the title text is a top-K text pattern, and determining, from the coding label tree, the plurality of target encoding label subtrees having the similarity greater than the predetermined threshold through the processing device, comprises:
determining, through the processing device, a target encoding label node from the coding label tree, the target encoding label node having a number of encoding label subtrees greater than or equal to a predetermined number; and
determining, through the processing device, at least the predetermined number of target encoding label subtrees from encoding label subtrees under the target encoding label node;
wherein the predetermined number is determined by:
determining, from the syntax parse tree, a syntax subtree comprising the entity concept through the processing device; and
determining, from the syntax subtree, a quantifier K corresponding to a cardinal number label through the processing device.
|
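
The subtree-selection steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and it rests on several assumptions that do not appear in the claim: the quantifier K is read from the title with a regular expression rather than from a cardinal number label in a syntax parse tree; similarity between two encoding label subtrees is approximated by `difflib.SequenceMatcher` over their tag sequences; the threshold 0.8 is arbitrary; and the first text fragment of each subtree is taken as the entity. All function and class names (`quantifier_k`, `LabelTreeBuilder`, `find_target_subtrees`, `extract_entities`) are hypothetical, and the page source is assumed to be well-formed HTML without void elements.

```python
import re
from difflib import SequenceMatcher
from html.parser import HTMLParser


class Node:
    """One encoding label (tag) in the label tree."""
    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children, self.text = tag, parent, [], []


class LabelTreeBuilder(HTMLParser):
    """Builds a label tree from page source code (assumes well-formed HTML)."""
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open node with this tag, then step to its parent.
        node = self.current
        while node.parent is not None and node.tag != tag:
            node = node.parent
        if node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text.append(data.strip())


def quantifier_k(title):
    """Stand-in for the parse-tree step: first integer in the title, or None."""
    match = re.search(r"\b(\d+)\b", title)
    return int(match.group(1)) if match else None


def signature(node):
    """Pre-order tag sequence of a subtree."""
    return [node.tag] + [t for c in node.children for t in signature(c)]


def similarity(a, b):
    """Assumed similarity measure: sequence ratio of the two tag sequences."""
    return SequenceMatcher(None, signature(a), signature(b)).ratio()


def find_target_subtrees(root, k, threshold=0.8):
    """First node (document order) with at least k mutually similar child subtrees."""
    stack = [root]
    while stack:
        node = stack.pop()
        stack.extend(reversed(node.children))
        if len(node.children) < k:
            continue  # not a candidate target node
        for ref in node.children:
            group = [c for c in node.children if similarity(ref, c) > threshold]
            if len(group) >= k:
                return group
    return []


def first_text(node):
    """Assumed entity heuristic: first text fragment in the subtree."""
    if node.text:
        return node.text[0]
    for child in node.children:
        found = first_text(child)
        if found:
            return found
    return ""


def extract_entities(page_source, title, threshold=0.8):
    k = quantifier_k(title)
    if k is None:
        return []
    builder = LabelTreeBuilder()
    builder.feed(page_source)
    builder.close()
    return [first_text(s) for s in find_target_subtrees(builder.root, k, threshold)]


# Hypothetical sample page used only for illustration.
PAGE = (
    "<html><body><h1>Top 3 programming languages</h1>"
    "<ul><li><b>Python</b><p>desc</p></li>"
    "<li><b>Rust</b><p>desc</p></li>"
    "<li><b>Go</b><p>desc</p></li></ul>"
    "<div><a>About</a></div></body></html>"
)
```

With the sample `PAGE` and the title "Top 3 programming languages", `extract_entities(PAGE, "Top 3 programming languages")` selects the three `li` subtrees under `ul`, since `body` also has three children but they are structurally dissimilar. A production system would replace the regular expression with a real syntactic parser and would likely use a more robust tree similarity measure.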