| CPC G06F 40/211 (2020.01) [G06F 40/268 (2020.01)] | 16 Claims |

|
1. A method of automated text analysis performed by at least one processor, the method comprising:
receiving text by the at least one processor;
preprocessing, by the at least one processor, the received text to generate a plurality of descriptive data elements that are associated with words in the received text, the plurality of descriptive data elements being descriptive of parts of speech of the words and forming a language space for the received text, wherein multiple descriptive data elements of the plurality of descriptive data elements are associated simultaneously with a first word in a sentence of the received text; and
executing, by a language-processing virtual machine that operates on the at least one processor, an operation with a first finite state transducer to process the sentence and increase or decrease a number of the multiple descriptive data elements associated simultaneously with the first word, wherein:
the language-processing virtual machine comprises at least one finite state transducer that includes the first finite state transducer and at least one bi-machine transducer compiled from source code that comprises linguistic constraints,
the first finite state transducer comprises a first plurality of binary objects, and a first binary object of the first plurality of binary objects comprises a first identifying element and a first code segment or pointer to the first code segment associated with the first identifying element,
a first bi-machine transducer of the at least one bi-machine transducer comprises:
a forward transducer comprising a second plurality of binary objects; and
a reverse transducer comprising a third plurality of binary objects, wherein:
a first binary object of the second plurality of binary objects comprises a first intermediate identifier to match to a second intermediate identifier of a first binary object of the third plurality of binary objects, and
a second object of the first binary object of the third plurality of binary objects comprises a second code segment or pointer to the second code segment, and
the operation comprises:
identifying a match between the first identifying element in the first binary object of the first plurality of binary objects of the first finite state transducer and a first identifier of a first descriptive data element among the multiple descriptive data elements in the language space associated with the first word in the sentence; and
executing the first code segment to:
increase or decrease the number of the multiple descriptive data elements associated simultaneously with the first word, and
produce a modified language space in which a meaning of the first word in the sentence associated with the first descriptive data element is disambiguated.
|
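The operation recited at the end of claim 1 (match an identifying element against a descriptive data element of the first word, then execute the associated code segment to change how many elements remain) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `BinaryObject`, `drop_if_after_determiner`, `run_operation`, and the tag names are assumptions, a plain Python list stands in for the compiled finite state transducer, and the bi-machine transducer is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

# Hypothetical language space: one set of candidate part-of-speech tags
# (the "descriptive data elements") per word position in the sentence.
LanguageSpace = List[Set[str]]
CodeSegment = Callable[[LanguageSpace, int], None]


@dataclass
class BinaryObject:
    """Illustrative stand-in for a binary object of the claim."""
    identifying_element: str   # tag identifier to match in the language space
    code_segment: CodeSegment  # action executed when the match succeeds


def drop_if_after_determiner(tag: str) -> CodeSegment:
    """Build a code segment that removes `tag` from word i when the
    preceding word is unambiguously a determiner (an assumed constraint)."""
    def segment(space: LanguageSpace, i: int) -> None:
        if i > 0 and space[i - 1] == {"DET"}:
            space[i].discard(tag)
    return segment


def run_operation(transducer: List[BinaryObject],
                  space: LanguageSpace, i: int) -> None:
    """For word i, execute the code segment of every object whose
    identifying element matches a tag currently attached to that word."""
    for obj in transducer:
        if obj.identifying_element in space[i]:
            obj.code_segment(space, i)


# "the can rusted": "can" starts with three simultaneous tags.
sentence = ["the", "can", "rusted"]
space: LanguageSpace = [{"DET"}, {"NOUN", "VERB", "AUX"}, {"VERB"}]

transducer = [
    BinaryObject("VERB", drop_if_after_determiner("VERB")),
    BinaryObject("AUX", drop_if_after_determiner("AUX")),
]

run_operation(transducer, space, 1)
print(sentence[1], space[1])  # "can" is left with the single tag NOUN
```

In this sketch the number of tags attached to "can" decreases from three to one, which corresponds to the modified language space in which the word is disambiguated. A code segment that adds a tag instead would cover the "increase" branch of the claim.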