CPC G06F 40/166 (2020.01) [G06F 40/289 (2020.01); G06N 20/00 (2019.01)] | 20 Claims |
1. A method executed by at least one processor, the method comprising:
receiving an input comprising natural language texts;
segmenting the natural language texts into a plurality of sections;
summarizing the natural language texts;
developing a first model based on the plurality of sections and the summary of the natural language texts;
identifying two or more salient sentences within the natural language texts using the first model;
determining a sentence quality score for each of the two or more salient sentences;
determining, for each of the two or more salient sentences, a sentence similarity score based on a similarity of the salient sentence to another salient sentence of the two or more salient sentences;
generating a second model, as a negative log-probability of a ground-truth extractive summary, based on performing batch matrix multiplication (BMM) between the sentence quality scores and the sentence similarity scores to calculate a matrix product;
combining the first model and the second model into a final model;
selecting sentences from the natural language texts based on the final model; and
generating an extractive summarization of the natural language texts using the selected sentences.
|
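Illustrative sketch (not part of the claim): the steps of claim 1 that combine sentence quality scores with sentence similarity scores can be pictured with a small NumPy example. The claim does not say how the matrix product is turned into a negative log-probability. The sketch below assumes a determinantal point process (DPP) style kernel, L = diag(q) S diag(q), which is only one possible reading. The function names `build_kernel` and `negative_log_prob`, the array shapes, and the toy numbers are all hypothetical.

```python
import numpy as np


def build_kernel(quality, similarity):
    """Combine quality and similarity scores with batched matrix multiplication.

    quality:    (B, N) array, one quality score per salient sentence.
    similarity: (B, N, N) array of pairwise sentence similarity scores.
    Returns a (B, N, N) array holding diag(q) @ S @ diag(q) per document.
    ASSUMPTION: this DPP-style kernel is illustrative; the claim only
    recites a BMM between the two sets of scores.
    """
    n = quality.shape[-1]
    diag_q = quality[..., :, None] * np.eye(n)  # (B, N, N) diagonal matrices
    # np.matmul multiplies the trailing two axes and broadcasts over the batch.
    return np.matmul(np.matmul(diag_q, similarity), diag_q)


def negative_log_prob(kernel, selected):
    """Negative log-probability of a ground-truth subset under an L-ensemble.

    kernel:   (N, N) kernel for a single document.
    selected: indices of the ground-truth extractive summary sentences.
    ASSUMPTION: P(Y) = det(L_Y) / det(L + I), the standard DPP form.
    """
    sub = kernel[np.ix_(selected, selected)]
    _, logdet_sub = np.linalg.slogdet(sub)
    _, logdet_norm = np.linalg.slogdet(kernel + np.eye(kernel.shape[0]))
    return logdet_norm - logdet_sub


# Toy batch with one document and three salient sentences (made-up values).
quality = np.array([[1.0, 2.0, 0.5]])
similarity = np.eye(3)[None, :, :]
kernel = build_kernel(quality, similarity)
loss = negative_log_prob(kernel[0], [0, 1])
print(loss)
```

Under this reading, a lower loss means the kernel assigns higher probability to the ground-truth summary sentences. How that loss is combined with the first model into the final model is not specified in the claim and is not modeled here.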