US 12,248,886 B2
Method and system for extraction of cause-effect relation from domain specific text
Ravina Vinayak More, Pune (IN); Sachin Sharad Pawar, Pune (IN); Girish Keshav Palshikar, Pune (IN); Swapnil Hingmire, Pune (IN); Pushpak Bhattacharyya, Patna (IN); and Vasudeva Varma Kalidindi, Hyderabad (IN)
Assigned to TATA CONSULTANCY SERVICES LIMITED, Mumbai (IN)
Filed by Tata Consultancy Services Limited, Mumbai (IN)
Filed on Mar. 23, 2021, as Appl. No. 17/209,995.
Claims priority of application No. 202021050762 (IN), filed on Nov. 21, 2020.
Prior Publication US 2022/0207400 A1, Jun. 30, 2022
Int. Cl. G06F 40/30 (2020.01); G06F 40/205 (2020.01); G06F 40/284 (2020.01); G06F 40/289 (2020.01); G06N 5/046 (2023.01)
CPC G06N 5/046 (2013.01) [G06F 40/205 (2020.01); G06F 40/284 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01)]
11 Claims
 
1. A processor implemented method for extraction of cause-effect relation from domain specific text, the method comprising:
receiving, via one or more hardware processors, the domain specific text and a causal trigger set, wherein the domain specific text comprises a set of sequence of words;
generating, via the one or more hardware processors, a dependency parse tree of the domain specific text;
identifying, via the one or more hardware processors, causal triggers in the domain specific text using the causal trigger set, if the causal triggers are an element of the causal trigger set; and
extracting, via the one or more hardware processors, a set of cause-effect relation(s) for each of the causal triggers from the domain specific text, wherein each of the cause-effect relation in the set of cause-effect relations is represented as a triplet, the triplet includes (i) a cause phrase (ii) a causal trigger out of the identified causal triggers and (iii) an effect phrase, the extracting comprising:
extracting a set of features for each pair of words in the domain specific text using a first set of predefined rules wherein one of the words in each pair of words is a causal trigger, wherein the set of features are (i) lexical features (ii) Part-of-speech (POS) tag based features and (iii) dependency based features;
obtaining a headword label for each word of the set of words from the set of features using one of (i) a trained classifier or (ii) a combination of the trained classifier and a second set of predefined rules, wherein the headword label is one or more of (i) cause headword, (ii) effect headword or (iii) negative;
expanding, using the dependency parse tree, the headword label to obtain:
cause phrase, if the headword label is classified as cause headword, and
effect phrase, if the headword label is classified as effect headword; and
updating each of the set of cause-effect relation for each of the causal trigger using the cause phrase, the causal trigger and the effect phrase,
wherein for obtaining the trained classifier, the training comprises:
receiving a training data and the causal trigger set for training a classifier wherein the training data is a set of annotated sentences wherein each sentence of the set of annotated sentences comprises (i) one or more cause-effect relation and (ii) one or more causal triggers;
generating a dependency parse tree for each sentence of the set of sentences in the training data;
generating, using the dependency parse tree, a set of cause headword and a set of effect headword for each of the cause-effect relation of the one or more cause-effect relation associated to each of the causal trigger of the one or more causal triggers;
obtaining a set of negative headword labels in the training data wherein the set of negative headword labels is a set of words in each sentence of the set of sentences and each word of the set of words is none of (i) a cause headword or (ii) an effect headword;
extracting a set of features for the set of cause headword, a set of features for the set of effect headword and a set of features for the set of headword classified as negative using the first set of predefined rules;
training the classifier using the set of features and the corresponding headword label.
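
The extraction steps recited in claim 1 (identifying causal triggers, labelling headwords, expanding headwords over the dependency parse tree, and forming triplets) can be illustrated with a minimal sketch. It is not the patented implementation. The dependency parse is written by hand instead of being produced by a parser, the trained classifier is replaced by a fixed table of headword labels, and phrase expansion is assumed to mean collecting the headword's subtree without crossing the trigger. All names (TOKENS, TRIGGER_SET, HEADWORD_LABELS, find_triggers, expand_headword, extract_triplets) are hypothetical.

```python
# Hand-written dependency parse of "Heavy rainfall caused severe flooding".
# Each token is (text, head_position); the root token points to itself.
# Assumption: a real system would obtain this from a dependency parser.
TOKENS = [
    ("Heavy", 1),     # modifier of "rainfall"
    ("rainfall", 2),  # subject of "caused"
    ("caused", 2),    # root of the sentence
    ("severe", 4),    # modifier of "flooding"
    ("flooding", 2),  # object of "caused"
]

# Hypothetical causal trigger set supplied as input.
TRIGGER_SET = {"caused", "causes", "leads", "triggers"}

# Stand-in for the trained classifier's output (assumption): for each
# trigger position, a label per candidate word position. Words that are
# absent from the table are treated as "negative".
HEADWORD_LABELS = {2: {1: "cause", 4: "effect"}}


def find_triggers(tokens, trigger_set):
    """Return positions of tokens that are elements of the trigger set."""
    return [i for i, (text, _) in enumerate(tokens) if text.lower() in trigger_set]


def expand_headword(tokens, head_pos, trigger_pos):
    """Expand a headword to a phrase by collecting its dependency subtree.

    The traversal never enters the trigger token, so the phrase cannot
    absorb the trigger or anything reachable only through it.
    """
    children = {}
    for i, (_, head) in enumerate(tokens):
        if head != i:  # skip the root's self-loop
            children.setdefault(head, []).append(i)

    covered, stack = set(), [head_pos]
    while stack:
        pos = stack.pop()
        if pos == trigger_pos or pos in covered:
            continue
        covered.add(pos)
        stack.extend(children.get(pos, []))

    # Emit the words in their original sentence order.
    return " ".join(tokens[i][0] for i in sorted(covered))


def extract_triplets(tokens, trigger_set, headword_labels):
    """Build (cause phrase, causal trigger, effect phrase) triplets."""
    triplets = []
    for trig in find_triggers(tokens, trigger_set):
        labels = headword_labels.get(trig, {})
        causes = [i for i, lab in labels.items() if lab == "cause"]
        effects = [i for i, lab in labels.items() if lab == "effect"]
        for c in causes:
            for e in effects:
                triplets.append((
                    expand_headword(tokens, c, trig),
                    tokens[trig][0],
                    expand_headword(tokens, e, trig),
                ))
    return triplets


print(extract_triplets(TOKENS, TRIGGER_SET, HEADWORD_LABELS))
# prints [('Heavy rainfall', 'caused', 'severe flooding')]
```

In this sketch the label table is keyed by trigger position because the claim extracts features for word pairs in which one word is a causal trigger, so a word's label is only meaningful relative to a specific trigger.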