US 11,886,822 B2
Hierarchical relationship extraction
Paidi Creed, London (GB); and Aaron Jefferson Khey Jin Sim, London (GB)
Assigned to BenevolentAI Technology Limited, London (GB)
Appl. No. 17/268,124
Filed by BENEVOLENTAI TECHNOLOGY LIMITED, London (GB)
PCT Filed Sep. 26, 2019, PCT No. PCT/GB2019/052721
§ 371(c)(1), (2) Date Feb. 12, 2021,
PCT Pub. No. WO2020/065326, PCT Pub. Date Apr. 2, 2020.
Claims priority of application No. 1815664 (GB), filed on Sep. 26, 2018.
Prior Publication US 2021/0312134 A1, Oct. 7, 2021
Int. Cl. G06F 40/30 (2020.01); G06F 16/35 (2019.01); G06N 20/00 (2019.01); G06F 40/279 (2020.01); G06F 40/237 (2020.01)
CPC G06F 40/30 (2020.01) [G06F 16/358 (2019.01); G06F 40/237 (2020.01); G06F 40/279 (2020.01); G06N 20/00 (2019.01)]
33 Claims
 
1. A computer-implemented method for embedding a portion of text describing a relationship for one or more entities of interest, the method comprising:
receiving a portion of text comprising data representative of a relationship for the one or more entities of interest, wherein the portion of text comprises multiple separable entities including one or more relationship entities and the one or more entities of interest;
for each of the multiple separable entities, generating a set of embeddings by (a) retrieving, from an embedding vocabulary dataset, one or more embeddings of entities associated with the separable entity and (b) forming a set of embeddings associated with the separable entity based on the retrieved one or more embeddings, wherein each set of embeddings comprises an embedding of the separable entity and at least one embedding of an entity associated with the separable entity;
sending at least one embedding from each of the sets of embeddings for input to a machine learning model or classifier; and
storing the generated sets of embeddings in the embedding vocabulary dataset, wherein the embedding vocabulary dataset comprises data representative of one or more entities mapped to one or more corresponding embeddings,
wherein,
retrieving from the embedding vocabulary dataset one or more embeddings of entities associated with a separable entity further comprises (a) determining whether an embedding corresponding to each of the separable entity and the one or more entities associated with the separable entity exists in the embedding vocabulary dataset, (b) retrieving those embeddings associated with the separable entity that exist in the embedding vocabulary dataset, (c) generating out-of-vocabulary embeddings for those embeddings associated with the separable entity that are not found in the embedding vocabulary dataset, and (d) generating a set of embeddings for the separable entity based on at least any retrieved embedding or any generated out-of-vocabulary embedding.
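The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the embedding vocabulary dataset is a plain dictionary, that the entities associated with each separable entity (for example, hierarchical parents) are already known, and that an out-of-vocabulary embedding can be stood in for by a deterministic pseudo-random vector. The names `oov_embedding`, `embedding_set`, `embed_text_portion`, `DIM`, and all example entities and vectors are hypothetical.

```python
import hashlib
import random

DIM = 4  # hypothetical embedding dimension, chosen only for the example


def oov_embedding(entity, dim=DIM):
    """Assumed OOV strategy: a deterministic pseudo-random vector seeded
    from a hash of the entity name. The claim does not prescribe this."""
    digest = hashlib.sha256(entity.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def embedding_set(entity, associated, vocab):
    """Build the set of embeddings for one separable entity."""
    result = {}
    for name in [entity, *associated.get(entity, [])]:
        if name in vocab:
            # (a) the embedding exists, so (b) retrieve it
            result[name] = vocab[name]
        else:
            # (c) not found: generate an out-of-vocabulary embedding
            result[name] = oov_embedding(name)
    # (d) the set combines retrieved and generated embeddings
    return result


def embed_text_portion(separable_entities, associated, vocab):
    """Generate one embedding set per separable entity, store the sets in
    the vocabulary, and pick one embedding per set as model input."""
    sets = {e: embedding_set(e, associated, vocab) for e in separable_entities}
    for emb_set in sets.values():
        vocab.update(emb_set)  # store the generated sets in the vocabulary
    # Here the entity's own embedding is the one sent to the model.
    model_input = [sets[e][e] for e in separable_entities]
    return sets, model_input


# Hypothetical text portion: "aspirin inhibits COX-1"
vocab = {
    "aspirin": [0.1, 0.2, 0.3, 0.4],
    "drug": [0.0, 0.1, 0.0, 0.1],
    "inhibits": [0.5, 0.5, 0.5, 0.5],
}
associated = {
    "aspirin": ["NSAID", "drug"],
    "inhibits": ["regulates"],
    "COX-1": ["enzyme"],
}
sets, model_input = embed_text_portion(
    ["aspirin", "inhibits", "COX-1"], associated, vocab
)
```

In this sketch the relationship entity ("inhibits") is handled exactly like the entities of interest, and each resulting set holds the separable entity's own embedding plus at least one embedding of an associated entity. Sending only the entity's own embedding to the model is one choice among several; the claim requires only that at least one embedding from each set is sent.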