| CPC G06F 40/126 (2020.01) [G06F 40/20 (2020.01)] | 12 Claims |

|
1. A method comprising:
receiving, by a natural language processing system comprising a processor, a first input text comprising n number of words;
receiving, by the natural language processing system, a second input text comprising m number of words, wherein the n number of words of the first input text is different than the m number of words of the second input text;
encoding, by the natural language processing system, the first input text into a first matrix using a word embedding algorithm, wherein the word embedding algorithm comprises a Word2Vec algorithm, wherein the first matrix of the first input text comprises a Word2Vec matrix of the first input text, and wherein encoding the first input text into the Word2Vec matrix of the first input text using the Word2Vec algorithm comprises embedding each word in the n number of words of the first input text into a k-dimensional Word2Vec vector using the Word2Vec algorithm resulting in the Word2Vec matrix of the first input text having a dimension of n×k;
encoding, by the natural language processing system, the second input text into a first matrix using the word embedding algorithm comprising the Word2Vec algorithm, wherein the first matrix of the second input text comprises a Word2Vec matrix of the second input text, wherein encoding the second input text into the Word2Vec matrix of the second input text using the Word2Vec algorithm comprises embedding each word in the m number of words of the second input text into a k-dimensional Word2Vec vector using the Word2Vec algorithm resulting in the Word2Vec matrix of the second input text having a dimension of m×k, and wherein the dimension of n×k of the Word2Vec matrix of the first input text is different from the dimension of m×k of the Word2Vec matrix of the second input text;
decoding, by the natural language processing system, the first matrix of the first input text into a second matrix of the first input text using a text embedding algorithm, wherein decoding the first matrix of the first input text into the second matrix of the first input text using the text embedding algorithm comprises decoding the Word2Vec matrix of the first input text into a first congruence derivative matrix using a congruence derivative vector representation, and wherein the first congruence derivative matrix has a first dimension;
decoding, by the natural language processing system, the first matrix of the second input text into a second matrix of the second input text using the text embedding algorithm, wherein decoding the first matrix of the second input text into the second matrix of the second input text using the text embedding algorithm comprises decoding the Word2Vec matrix of the second input text into a second congruence derivative matrix using the congruence derivative vector representation, wherein the second congruence derivative matrix has a second dimension, and wherein decoding the Word2Vec matrix of the first input text into the first congruence derivative matrix using the congruence derivative vector representation and decoding the Word2Vec matrix of the second input text into the second congruence derivative matrix using the congruence derivative vector representation results in the first dimension of the first congruence derivative matrix of the first input text being equal to the second dimension of the second congruence derivative matrix of the second input text; and
using, by a machine learning module executed by the natural language processing system, the first congruence derivative matrix and the second congruence derivative matrix as training data for a machine learning model, wherein the machine learning module requires matrices used for the training data to have uniform dimensions, and wherein using the first congruence derivative matrix and the second congruence derivative matrix as training data for a machine learning model comprises:
creating, by the machine learning module of the natural language processing system, the machine learning model, and
passing, by the machine learning module of the natural language processing system, the first congruence derivative matrix and the second congruence derivative matrix through the machine learning model to train the machine learning model to determine mistakes in other input texts.
|
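The dimensional argument in claim 1 can be illustrated with a short sketch: two texts of different lengths give word matrices of shape n×k and m×k, and a second step maps both to the same shape so they can be batched as training data. The claim does not define the "congruence derivative vector representation", so the sketch below substitutes the Gram matrix XᵀX (shape k×k) purely as an assumed stand-in with the required property. The helper names `embed_words` and `fixed_size_representation`, the value k = 4, and the pseudo-random vectors used in place of a trained Word2Vec model are all hypothetical.

```python
import numpy as np

K = 4  # embedding dimension k (illustrative value, not from the claim)


def embed_words(words, k=K):
    """Hypothetical stand-in for a Word2Vec lookup.

    Maps each word to a fixed k-dimensional vector. A real system would
    query a trained Word2Vec model instead of seeding a random generator.
    """
    rows = []
    for word in words:
        # Deterministic per-word seed: the same word always gets the same vector.
        seed = sum(ord(ch) * (i + 1) for i, ch in enumerate(word))
        rng = np.random.default_rng(seed)
        rows.append(rng.standard_normal(k))
    return np.vstack(rows)  # shape (number of words, k)


def fixed_size_representation(word_matrix):
    """Assumed stand-in for the claimed second-stage transform.

    Computes X^T X, which has shape (k, k) no matter how many rows
    (words) X has. This is NOT taken from the claim text.
    """
    return word_matrix.T @ word_matrix


first_text = "the quick brown fox jumps".split()  # n = 5 words
second_text = "hello world again".split()         # m = 3 words

X1 = embed_words(first_text)   # n x k word matrix
X2 = embed_words(second_text)  # m x k word matrix

C1 = fixed_size_representation(X1)  # k x k
C2 = fixed_size_representation(X2)  # k x k

# Uniform shapes allow stacking into one training batch.
training_batch = np.stack([C1, C2])

print(X1.shape, X2.shape)    # word matrices differ in row count
print(C1.shape, C2.shape)    # second-stage matrices share one shape
print(training_batch.shape)  # (number of texts, k, k)
```

Stacking with `np.stack` fails on the raw n×k and m×k matrices when n ≠ m, which is the uniform-dimension requirement the claim attributes to the machine learning module. Any transform whose output shape depends only on k would serve the same illustrative purpose.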