| CPC G06F 40/40 (2020.01) [G06F 40/226 (2020.01); G06F 40/295 (2020.01); G06N 5/04 (2013.01); G06F 16/35 (2019.01)] | 39 Claims |

|
1. A machine learning system for fact extraction and claim verification, comprising:
a memory; and
a processor in communication with the memory, the processor:
receiving a claim comprising one or more sentences;
retrieving, based at least in part on one or more machine learning models, a document from a dataset, the document having a first relatedness score higher than a first threshold, wherein the first relatedness score indicates that the one or more machine learning models determine that the document is most likely to be relevant to the claim, wherein the dataset comprises a plurality of supporting documents and a plurality of claims, the plurality of claims comprising a first group of claims supported by facts from more than two supporting documents from the plurality of supporting documents and a second group of claims not supported by the plurality of supporting documents;
selecting, based at least in part on the one or more machine learning models, a set of sentences from the document, the set of sentences having second relatedness scores higher than a second threshold, wherein the second relatedness scores indicate that the one or more machine learning models determine that the set of sentences are most likely to be relevant to the claim; and
determining, based at least in part on the one or more machine learning models, whether the claim includes one or more facts from the set of sentences.
|
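
The three processor steps recited in claim 1 (document retrieval above a first threshold, sentence selection above a second threshold, and a final determination) can be illustrated with a minimal sketch. It is not the claimed implementation. The claim does not specify the machine learning models, so a simple token-overlap function stands in for the learned relatedness score. All names (`overlap_score`, `retrieve_documents`, `select_sentences`, `verify`, `check_claim`) and all threshold values are hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, List, Set, Tuple

Scorer = Callable[[str, str], float]


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(claim: str, text: str) -> float:
    """ASSUMPTION: stand-in for a learned relatedness model.

    Returns the fraction of claim tokens that also appear in the text.
    """
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(text)) / len(claim_tokens)


def split_sentences(document: str) -> List[str]:
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def retrieve_documents(claim: str, dataset: List[str], score: Scorer,
                       first_threshold: float) -> List[str]:
    """Step 1: keep documents whose relatedness score exceeds the first threshold."""
    return [doc for doc in dataset if score(claim, doc) > first_threshold]


def select_sentences(claim: str, document: str, score: Scorer,
                     second_threshold: float) -> List[str]:
    """Step 2: keep sentences whose relatedness score exceeds the second threshold."""
    return [s for s in split_sentences(document)
            if score(claim, s) > second_threshold]


def verify(claim: str, sentences: List[str], score: Scorer,
           verify_threshold: float) -> bool:
    """Step 3 (placeholder): treat the claim as supported when the selected
    sentences, taken together, score above a hypothetical verification threshold."""
    if not sentences:
        return False
    return score(claim, " ".join(sentences)) > verify_threshold


def check_claim(claim: str, dataset: List[str], score: Scorer = overlap_score,
                first_threshold: float = 0.7, second_threshold: float = 0.5,
                verify_threshold: float = 0.8) -> Tuple[bool, List[str]]:
    """Run retrieval, sentence selection, and verification in sequence."""
    evidence: List[str] = []
    for doc in retrieve_documents(claim, dataset, score, first_threshold):
        evidence.extend(select_sentences(claim, doc, score, second_threshold))
    return verify(claim, evidence, score, verify_threshold), evidence


if __name__ == "__main__":
    docs = [
        "Paris is the capital of France. It lies on the Seine.",
        "Berlin is the capital of Germany.",
    ]
    print(check_claim("Paris is the capital of France", docs))
    print(check_claim("Madrid hosts the largest museum", docs))
```

Passing the scorer as a parameter keeps the pipeline independent of any particular model; in a system of the kind claimed, a trained retrieval or entailment model would presumably replace `overlap_score` at each stage.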