US 12,135,938 B2
Extended open information extraction by identifying nested relationships
Ajay Patel, Santa Clara, CA (US); and Alex Sands, Austin, TX (US)
Assigned to CORASCLOUD, INC., McLean, VA (US)
Filed by CORASCLOUD, INC., McLean, VA (US)
Filed on May 11, 2022, as Appl. No. 17/742,258.
Claims priority of provisional application 63/186,969, filed on May 11, 2021.
Prior Publication US 2022/0366135 A1, Nov. 17, 2022
Int. Cl. G06F 40/253 (2020.01); G06F 40/279 (2020.01); G06F 40/211 (2020.01)
CPC G06F 40/253 (2020.01) [G06F 40/279 (2020.01); G06F 40/211 (2020.01)]
20 Claims
 
1. A method, comprising:
utilizing a trained machine learning model executed in a computing device to learn syntax dependency patterns and parts of speech tag patterns of text based on labeled training data;
contextualizing, by the computing device executing the trained machine learning model, vector embeddings from a language model for each word in the text before determining whether to apply a syntax dependency pattern or a parts of speech tag pattern;
extracting, by the computing device executing the trained machine learning model, relationships for a given fragment of the text based on the contextualization;
resolving, by the computing device executing the trained machine learning model, relationships between a plurality of identified verbs based on a plurality of heuristics to identify the syntax dependency patterns;
identifying, by the computing device executing the trained machine learning model, at least one nested relationship;
capturing, by the computing device executing the trained machine learning model, metadata associated with the at least one nested relationship; and
performing, by utilizing the extracted relationships via the computing device executing the trained machine learning model, at least one of
open-domain question answering and natural language question answering,
answering questions posed by a user in a virtual assistant or chatbot application,
summarizing documents by filtering to keep salient information with a high frequency or importance, or
measuring information overlap and information disagreement between two or more documents.
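
The following is a minimal illustrative sketch, not the claimed implementation, of two ideas recited in claim 1: a nested relationship that carries metadata, and a measure of information overlap between two documents. The `Relation` class, the `flatten`, `nesting_depth`, `relation_key`, and `information_overlap` helpers, the metadata keys, and the use of Jaccard similarity are all assumptions introduced here for illustration; the claim does not specify a data structure or an overlap metric, and the trained model that would produce the relations is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass(frozen=True)
class Relation:
    """Hypothetical container for one extracted relationship."""
    subject: str
    predicate: str
    # The object is either plain text or another relation (the nested case).
    obj: Union[str, "Relation"]
    # Metadata as a tuple of (key, value) pairs so the instance stays hashable.
    metadata: tuple = ()


def flatten(rel: Relation) -> Iterator[Relation]:
    """Yield rel, then each relation nested in its object slot, outermost first."""
    current: Union[str, Relation] = rel
    while isinstance(current, Relation):
        yield current
        current = current.obj


def nesting_depth(rel: Relation) -> int:
    """Number of relations nested below the outermost one."""
    return sum(1 for _ in flatten(rel)) - 1


def relation_key(rel: Relation) -> tuple:
    """Metadata-free, hashable key used to compare relations across documents."""
    obj = relation_key(rel.obj) if isinstance(rel.obj, Relation) else rel.obj
    return (rel.subject, rel.predicate, obj)


def information_overlap(doc_a: Iterable[Relation], doc_b: Iterable[Relation]) -> float:
    """Jaccard similarity over all (outer and nested) relation keys.

    Jaccard is an assumed stand-in; the claim names no specific metric.
    """
    keys_a = {relation_key(r) for rel in doc_a for r in flatten(rel)}
    keys_b = {relation_key(r) for rel in doc_b for r in flatten(rel)}
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


# "Alice said that Bob acquired Acme in 2020."
inner = Relation("Bob", "acquired", "Acme", metadata=(("time", "2020"),))
outer = Relation("Alice", "said", inner)
```

In this sketch, `outer` holds `inner` in its object slot, so `nesting_depth(outer)` is 1 and the time expression stays attached to the nested relation as metadata. Comparing a document containing `outer` with one containing only the metadata-free relation `Relation("Bob", "acquired", "Acme")` shares one of two distinct keys. Excluding metadata from the key is a design choice made here so that the same fact matches across documents that report it with different qualifiers.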