CPC G06F 40/295 (2020.01) [G06F 18/214 (2023.01); G06F 40/242 (2020.01); G06N 20/00 (2019.01)] | 18 Claims |
1. A method in a computing system to establish a trained machine learning model, the method comprising:
accessing a plurality of training examples each comprising:
a sentence;
information identifying one or more multi-word expressions occurring in the sentence; and
for each multi-word expression identified as occurring in the sentence, an indication of whether the multi-word expression is a noun multi-word expression or a verb multi-word expression;
for each of the plurality of training examples:
for each of a plurality of constituent models, invoking the constituent model against the training example's sentence to obtain a constituent model result for the training example's sentence that identifies one or more portions of the training example's sentence each as a multi-word expression and specifies whether each multi-word expression is a noun multi-word expression or a verb multi-word expression;
constructing a training observation corresponding to the training example that comprises:
independent variable values comprising:
the training example's sentence; and
the constituent model result obtained for each of the plurality of constituent models; and
dependent variable values comprising:
the training example's information identifying one or more multi-word expressions occurring in the sentence; and
using the constructed training observation to train the machine learning model to predict dependent variable values based on independent variable values.
|
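The observation-building steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`TrainingExample`, `TrainingObservation`, `build_observations`, `lexicon_model`, `particle_model`) are hypothetical, the two constituent models are toy stand-ins, and the representation of a multi-word expression as a `(start_token, end_token_exclusive, kind)` tuple is an assumption made for illustration. The sketch also assumes that the noun/verb indication travels with the target spans. The final training step is left out because the claim does not name a particular learner.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed representation: (start_token, end_token_exclusive, kind),
# where kind is "noun" or "verb". The claim does not prescribe a format.
Span = Tuple[int, int, str]
ConstituentModel = Callable[[str], List[Span]]


@dataclass
class TrainingExample:
    sentence: str
    mwe_spans: List[Span]  # annotated multi-word expressions with noun/verb kind


@dataclass
class TrainingObservation:
    # Independent variable values
    sentence: str
    constituent_results: List[List[Span]]  # one result per constituent model
    # Dependent variable values
    target_spans: List[Span]


def lexicon_model(sentence: str) -> List[Span]:
    """Toy constituent model (hypothetical): look up word pairs in a tiny lexicon."""
    lexicon = {("hot", "dog"): "noun", ("give", "up"): "verb"}
    tokens = sentence.lower().split()
    return [
        (i, i + 2, lexicon[(tokens[i], tokens[i + 1])])
        for i in range(len(tokens) - 1)
        if (tokens[i], tokens[i + 1]) in lexicon
    ]


def particle_model(sentence: str) -> List[Span]:
    """Toy constituent model (hypothetical): flag any word followed by 'up' as a verb MWE."""
    tokens = sentence.lower().split()
    return [(i, i + 2, "verb") for i in range(len(tokens) - 1) if tokens[i + 1] == "up"]


def build_observations(
    examples: List[TrainingExample],
    constituent_models: List[ConstituentModel],
) -> List[TrainingObservation]:
    """Invoke every constituent model on each sentence and pair the results with the labels."""
    observations = []
    for example in examples:
        results = [model(example.sentence) for model in constituent_models]
        observations.append(
            TrainingObservation(
                sentence=example.sentence,
                constituent_results=results,
                target_spans=list(example.mwe_spans),
            )
        )
    return observations


examples = [
    TrainingExample(
        sentence="I will not give up my hot dog",
        mwe_spans=[(3, 5, "verb"), (6, 8, "noun")],
    )
]
observations = build_observations(examples, [lexicon_model, particle_model])
print(observations[0].constituent_results)
```

Each resulting observation could then be passed to any supervised learner that maps the sentence and the constituent model results to the annotated spans. This arrangement resembles stacked generalization, in which a combining model learns from the outputs of other models.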