| CPC G06F 40/284 (2020.01) [G06F 40/242 (2020.01)] | 15 Claims |

|
1. A system for a codeword trained and operated large language model, comprising one or more computers with executable instructions that, when executed, cause the system to:
collect a comprehensive set of training data;
tokenize the set of training data into a plurality of training tokens;
create a codeword dictionary by assigning unique codewords to the plurality of training tokens;
convert all training tokens into a plurality of training codewords using the codeword dictionary;
train a large language model using the plurality of training codewords;
receive a text prompt from a user;
tokenize the prompt into a plurality of tokens;
convert the plurality of tokens into a plurality of prompt codewords using a codeword dictionary;
process the sequence of prompt codewords through a large language model to generate a codeword response; and
convert the codeword response back into tokens, which are rendered as a text response to the user.
|
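
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the claim does not specify a tokenizer, a codeword format, or a model architecture. The sketch assumes a lowercase whitespace tokenizer, integer codewords assigned in first-seen order, and a greedy bigram predictor standing in for the large language model. All function names are hypothetical, and dropping prompt tokens that are missing from the dictionary is likewise an assumption.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase, split on whitespace.
    return text.lower().split()


def build_codeword_dictionary(tokens):
    # Assign a unique integer codeword to each distinct token,
    # in the order tokens are first seen (assumed codeword format).
    dictionary = {}
    for token in tokens:
        if token not in dictionary:
            dictionary[token] = len(dictionary)
    return dictionary


def to_codewords(tokens, dictionary):
    # Convert tokens to codewords. Tokens absent from the dictionary
    # are dropped; the claim does not say how they are handled.
    return [dictionary[t] for t in tokens if t in dictionary]


def train_model(training_codewords):
    # Stand-in for LLM training: count which codeword follows which.
    counts = defaultdict(Counter)
    for prev, nxt in zip(training_codewords, training_codewords[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(model, prompt_codewords, max_new=3):
    # Stand-in for LLM inference: greedily extend the prompt with
    # the most frequent successor of the last codeword.
    response = []
    if not prompt_codewords:
        return response
    current = prompt_codewords[-1]
    for _ in range(max_new):
        if current not in model:
            break
        current = model[current].most_common(1)[0][0]
        response.append(current)
    return response


def to_text(codewords, dictionary):
    # Invert the dictionary and render the codewords as text.
    inverse = {code: token for token, code in dictionary.items()}
    return " ".join(inverse[c] for c in codewords)


def respond(prompt, dictionary, model, max_new=3):
    # Prompt path of the claim: tokenize, encode, run the model, decode.
    prompt_codewords = to_codewords(tokenize(prompt), dictionary)
    return to_text(generate(model, prompt_codewords, max_new), dictionary)


# Training path of the claim: collect, tokenize, build dictionary, encode, train.
training_tokens = tokenize("the cat sat on the mat")
codeword_dictionary = build_codeword_dictionary(training_tokens)
model = train_model(to_codewords(training_tokens, codeword_dictionary))
```

With this toy corpus, `respond("the cat", codeword_dictionary, model)` encodes the prompt as `[0, 1]`, generates the codeword response `[2, 3, 0]`, and decodes it to `"sat on the"`. The same dictionary serves both the training path and the prompt path, so the model only ever sees codewords.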