| CPC G06F 40/295 (2020.01) [G06F 40/30 (2020.01); G06F 40/58 (2020.01)] | 20 Claims |

|
1. A method, comprising:
receiving origin text content to be analyzed using natural language processing;
generating a two-dimensional item sequence representation for at least a portion of the received origin text content, wherein the portion of the received origin text content used to generate the two-dimensional item sequence representation corresponds to a sentence included in the received origin text content, wherein the two-dimensional item sequence representation is arranged into rows, wherein each row includes a different consecutive sequence of a same number of words from the sentence, wherein a first row of the two-dimensional item sequence representation includes beginning consecutive words of the sentence and a last row of the two-dimensional item sequence representation includes last consecutive words of the sentence;
using one or more processors to determine one or more evaluation metrics based on an analysis of the two-dimensional item sequence representation, wherein the one or more evaluation metrics includes one or more horizontal evaluation metrics and a vertical evaluation metric, the one or more horizontal evaluation metrics are determined using an evaluation dictionary, and the vertical evaluation metric is based at least in part on the one or more horizontal evaluation metrics;
based on the one or more evaluation metrics, automatically generating a reduced version of the origin text content to assist in satisfying a constraint of a natural language processing model; and
using the reduced version of the origin text content as an input to the natural language processing model.
|
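The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not specify how the metrics are computed or how the reduced version is chosen, so the scoring rules below are assumptions made only for illustration. The function names (`build_rows`, `horizontal_metrics`, `vertical_metric`, `reduce_text`), the row width, the summed-weight row score, the use of the maximum row score as the vertical metric, and the word budget standing in for the model constraint are all hypothetical.

```python
import re
import string


def build_rows(sentence, width):
    """Arrange a sentence into rows of `width` consecutive words.

    Each row is shifted by one word, so the first row holds the
    beginning words of the sentence and the last row holds its final words.
    """
    words = sentence.split()
    if len(words) <= width:
        return [words]
    return [words[i:i + width] for i in range(len(words) - width + 1)]


def horizontal_metrics(rows, dictionary):
    """Score each row with an evaluation dictionary.

    Assumption: a row's score is the sum of the dictionary weights
    of its words; unknown words contribute 0.0.
    """
    def weight(word):
        return dictionary.get(word.strip(string.punctuation).lower(), 0.0)

    return [sum(weight(w) for w in row) for row in rows]


def vertical_metric(row_scores):
    """Combine row scores into one sentence-level score.

    Assumption: the maximum row score is used.
    """
    return max(row_scores, default=0.0)


def reduce_text(text, dictionary, width=3, max_words=12):
    """Keep the highest-scoring sentences that fit within `max_words`.

    `max_words` stands in for a model constraint such as an input
    length limit (an assumption; the claim does not name the constraint).
    """
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    scored = []
    for index, sentence in enumerate(sentences):
        rows = build_rows(sentence, width)
        score = vertical_metric(horizontal_metrics(rows, dictionary))
        scored.append((score, index, sentence))

    kept, used = [], 0
    # Visit sentences from highest to lowest score; ties keep source order.
    for score, index, sentence in sorted(scored, key=lambda t: (-t[0], t[1])):
        length = len(sentence.split())
        if used + length <= max_words:
            kept.append((index, sentence))
            used += length

    # Restore the original sentence order in the reduced text.
    return " ".join(sentence for _, sentence in sorted(kept))
```

With a hypothetical dictionary such as `{"refund": 2.0, "issued": 1.0}`, calling `reduce_text(text, dictionary, width=3, max_words=6)` keeps only the sentences whose word windows score highest and that fit within the six-word budget; the result would then be passed to the natural language processing model in place of the full origin text.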