US 11,755,838 B2
Machine learning for joint recognition and assertion regression of elements in text
Ian H. Magnusson, Cambridge, MA (US); Scott Ehrlich Friedman, Minneapolis, MN (US); and Sonja M. Schmer-Galunder, San Francisco, CA (US)
Assigned to Smart Information Flow Technologies, LLC, Minneapolis, MN (US)
Filed by Smart Information Flow Technologies, LLC, Minneapolis, MN (US)
Filed on Sep. 14, 2020, as Appl. No. 16/948,332.
Prior Publication US 2022/0083739 A1, Mar. 17, 2022
Int. Cl. G06F 40/295 (2020.01); G06N 20/00 (2019.01); G06N 5/04 (2023.01); G06F 40/284 (2020.01)

CPC G06F 40/295 (2020.01) [G06F 40/284 (2020.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)]

15 Claims

1. An inference method comprising:

receiving, at one or more computing machines, an input comprising unstructured text;

identifying, within the unstructured text, one or more entities using a named entity recognition (NER) engine in a trained machine learning model, wherein the trained machine learning model embeds tokens from the unstructured text into a vector space and uses generated embeddings to identify one or more tokens as being associated with the one or more entities, wherein the one or more tokens associated with the one or more entities in the unstructured text correspond to candidate spans of tokens;

determining, based on the embedded tokens and using the trained machine learning model that identifies the one or more entities and determines assertions as predictions made on the candidate span of tokens, an assertion applied, within the text, to at least one entity, wherein the assertion is represented as a vector, wherein each dimension of multiple dimensions of the vector corresponds to a part of the assertion and has a value corresponding to a probability or a log-odds of the part of the assertion being present, wherein the trained machine learning model is a span-level model; and

providing an output associated with the assertion applied to the at least one entity, wherein the assertion comprises a moral assertion applied by an author of the text to the at least one entity, wherein the vector comprises one or more integer or binary values, the integer or binary values representing one or more of: dehumanization, moral condemnation/justification, ingroup/outgroup, violence, and harmed/responsible-for-harm.
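
The assertion vector recited in claim 1 can be illustrated with a minimal sketch. It assumes one log-odds score per part of the assertion for a single candidate span, and derives probabilities and binary values from those scores. The class name `SpanAssertion`, the ordering of the dimensions, the 0.5 threshold, and the example scores are all hypothetical and are not taken from the patent; the trained span-level model that would produce the scores is not shown.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# Dimension names come from claim 1; their order here is an assumption.
ASSERTION_DIMS = (
    "dehumanization",
    "moral condemnation/justification",
    "ingroup/outgroup",
    "violence",
    "harmed/responsible-for-harm",
)


def sigmoid(log_odds: float) -> float:
    """Convert a log-odds score into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


@dataclass
class SpanAssertion:
    """Hypothetical container: one candidate span plus its assertion vector."""

    span: Tuple[int, int]   # token offsets [start, end) of the entity span
    log_odds: List[float]   # one score per entry of ASSERTION_DIMS

    def probabilities(self) -> List[float]:
        # Probability that each part of the assertion is present.
        return [sigmoid(x) for x in self.log_odds]

    def binary(self, threshold: float = 0.5) -> List[int]:
        # Thresholding is an illustrative choice, not taken from the claim.
        return [1 if p >= threshold else 0 for p in self.probabilities()]

    def present_parts(self, threshold: float = 0.5) -> List[str]:
        # Names of the assertion parts whose binary value is 1.
        bits = self.binary(threshold)
        return [name for name, bit in zip(ASSERTION_DIMS, bits) if bit]


# Made-up scores for an entity span covering tokens 3 and 4.
example = SpanAssertion(span=(3, 5), log_odds=[2.0, -1.0, -0.5, -3.0, 1.5])
```

Calling `example.present_parts()` lists the parts of the assertion whose probability meets the threshold. A signed integer encoding (for instance, distinguishing condemnation from justification) would need a different mapping than the simple threshold used here.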