US 12,333,245 B2
Methods and apparatus to improve disambiguation and interpretation in automated text analysis using structured language space and transducers applied on automatons
Emmanuel Roche, Belmont, MA (US)
Assigned to Clover.AI, LLC, Belmont, MA (US)
Filed by Clover.AI, LLC, Belmont, MA (US)
Filed on Sep. 2, 2021, as Appl. No. 17/465,686.
Application 17/465,686 is a continuation of application No. PCT/US2020/020842, filed on Mar. 3, 2020.
Claims priority of provisional application 62/813,540, filed on Mar. 4, 2019.
Prior Publication US 2022/0004708 A1, Jan. 6, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/211 (2020.01); G06F 40/268 (2020.01)

CPC G06F 40/211 (2020.01) [G06F 40/268 (2020.01)]

16 Claims

1. A method of automated text analysis performed by at least one processor, the method comprising:

receiving text by the at least one processor;

preprocessing, by the at least one processor, the received text to generate a plurality of descriptive data elements that are associated with words in the received text, the plurality of descriptive data elements being descriptive of parts of speech of the words and forming a language space for the received text, wherein multiple descriptive data elements of the plurality of descriptive data elements are associated simultaneously with a first word in a sentence of the received text; and

executing, by a language-processing virtual machine that operates on the at least one processor, an operation with a first finite state transducer to process the sentence and increase or decrease a number of the multiple descriptive data elements associated simultaneously with the first word, wherein:

the language-processing virtual machine comprises at least one finite state transducer that includes the first finite state transducer and at least one bi-machine transducer compiled from source code that comprises linguistic constraints,

the first finite state transducer comprises a first plurality of binary objects, and a first binary object of the first plurality of binary objects comprises a first identifying element and a first code segment or pointer to the first code segment associated with the first identifying element,

a first bi-machine transducer of the at least one bi-machine transducer comprises:

a forward transducer comprising a second plurality of binary objects; and

a reverse transducer comprising a third plurality of binary objects, wherein:

a first binary object of the second plurality of binary objects comprises a first intermediate identifier to match to a second intermediate identifier of a first binary object of the third plurality of binary objects, and

a second object of the first binary object of the third plurality of binary objects comprises a second code segment or pointer to the second code segment, and

the operation comprises:

identifying a match between the first identifying element in the first binary object of the first plurality of binary objects of the first finite state transducer and a first identifier of a first descriptive data element among the multiple descriptive data elements in the language space associated with the first word in the sentence; and

executing the first code segment to:

increase or decrease the number of the multiple descriptive data elements associated simultaneously with the first word, and

produce a modified language space in which a meaning of the first word in the sentence associated with the first descriptive data element is disambiguated.
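
For illustration only, the matching and execution steps recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the names `LanguageSpace`, `BinaryObject`, `drop_verb_after_determiner`, and `apply_operation` are hypothetical, the part-of-speech tags are invented for the example, and a flat list of objects stands in for the states and transitions of an actual finite state transducer. The bi-machine transducer (forward and reverse passes) is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

# Hypothetical model: the language space holds, for each word position,
# the set of part-of-speech tags still associated with that word.
LanguageSpace = List[Set[str]]


@dataclass(frozen=True)
class BinaryObject:
    """Hypothetical stand-in for a binary object of the transducer."""
    identifier: str                                      # identifying element
    code_segment: Callable[[LanguageSpace, int], None]   # code run on a match


def drop_verb_after_determiner(space: LanguageSpace, i: int) -> None:
    """Code segment: remove verbal tags from word i if word i-1 is only DET."""
    if i > 0 and space[i - 1] == {"DET"}:
        space[i] -= {"VERB", "AUX"}


def apply_operation(transducer: List[BinaryObject],
                    space: LanguageSpace) -> LanguageSpace:
    """Match identifiers against each word's tags and run matching code segments.

    Returns a modified copy; the input language space is left unchanged.
    """
    modified = [set(tags) for tags in space]
    for i in range(len(modified)):
        for obj in transducer:
            if obj.identifier in modified[i]:
                obj.code_segment(modified, i)
    return modified


# Sentence "the can": "can" starts with three simultaneous tags.
space = [{"DET"}, {"NOUN", "VERB", "AUX"}]
transducer = [BinaryObject("VERB", drop_verb_after_determiner)]
result = apply_operation(transducer, space)
```

In this sketch the identifier `"VERB"` matches one of the tags attached to "can", the associated code segment runs, and the number of tags on that word decreases from three to one, leaving only the noun reading in the modified language space.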