CPC G06F 16/2465 (2019.01) [G06F 16/22 (2019.01); G06F 16/248 (2019.01); G06F 16/3329 (2019.01); G06F 16/3344 (2019.01); G06F 16/367 (2019.01); G06F 16/9024 (2019.01); G06F 40/211 (2020.01); G06F 40/253 (2020.01); G06F 40/284 (2020.01); G06F 40/289 (2020.01); G06N 5/02 (2013.01)] | 21 Claims |
1. A system, comprising:
a processor; and
a non-transitory computer readable medium comprising instructions for:
receiving text as input from a data source;
creating a parse graph of the text, wherein creating the parse graph comprises:
parsing the text by segmenting the text into a set of evidence spaces based on one or more identifiers within the text;
ordering the set of evidence spaces;
chunking each of the set of evidence spaces into one or more chunks, each chunk comprising a permutation of tokens of an associated evidence space within a predetermined distance of one another in the associated evidence space, wherein chunking the associated evidence space into the one or more chunks utilizes a moving window of the predetermined distance to iterate over tokens of the associated evidence space, and an iteration over a token of the associated evidence space comprises generating one or more chunks based on the permutation of a set of the tokens of the associated evidence space within the predetermined distance of the token; and
based on the ordering and chunking, creating nodes and relationships of the parse graph representing a sequence of the evidence spaces and chunks within each evidence space.
|
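The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the patented implementation. The following are assumptions made only for illustration: sentence-ending punctuation serves as the "identifiers", document order serves as the ordering, the moving window looks forward from the current token, and the function names (`segment`, `chunk_evidence_space`, `build_parse_graph`) and relationship labels (`NEXT_SPACE`, `HAS_CHUNK`, `NEXT_CHUNK`) are hypothetical.

```python
import re
from itertools import permutations


def segment(text):
    """Split text into evidence spaces.

    Assumption: the identifiers are sentence-ending punctuation (. ! ?).
    The returned list keeps document order, which stands in for the
    ordering step of the claim.
    """
    parts = re.split(r"[.!?]+", text)
    return [p.strip() for p in parts if p.strip()]


def chunk_evidence_space(tokens, distance):
    """Generate chunks with a moving window of the given distance.

    For each token position i, the window covers positions i..i+distance
    (assumption: forward-looking window). Every permutation of window
    positions that includes position i becomes one chunk.
    """
    chunks = []
    for i in range(len(tokens)):
        window = range(i, min(len(tokens), i + distance + 1))
        for r in range(1, len(window) + 1):
            for perm in permutations(window, r):
                if i in perm:
                    chunks.append(tuple(tokens[j] for j in perm))
    return chunks


def build_parse_graph(text, distance=1):
    """Return (nodes, edges) describing evidence spaces and their chunks.

    nodes maps a node key to its content; edges is a list of
    (source_key, label, target_key) triples. Labels are hypothetical.
    """
    nodes = {}
    edges = []
    for s, space in enumerate(segment(text)):
        nodes[("space", s)] = space
        if s > 0:
            # Sequence of evidence spaces.
            edges.append((("space", s - 1), "NEXT_SPACE", ("space", s)))
        # Whitespace tokenization is an assumption of this sketch.
        chunks = chunk_evidence_space(space.split(), distance)
        for c, chunk in enumerate(chunks):
            nodes[("chunk", s, c)] = chunk
            edges.append((("space", s), "HAS_CHUNK", ("chunk", s, c)))
            if c > 0:
                # Sequence of chunks within one evidence space.
                edges.append((("chunk", s, c - 1), "NEXT_CHUNK", ("chunk", s, c)))
    return nodes, edges
```

Calling `build_parse_graph("a b. c d e.", distance=1)` in this sketch yields two evidence-space nodes and their chunk nodes, linked in sequence. Because every window is expanded into permutations, the number of chunks grows quickly with the distance, so a small predetermined distance keeps the graph manageable.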