US 12,271,696 B1
System and method for training and operating large language models using codewords
Brian Galvin, Silverdale, WA (US)
Assigned to AtomBeam Technologies Inc., Moraga, CA (US)
Filed by AtomBeam Technologies Inc., Moraga, CA (US)
Filed on Aug. 1, 2024, as Appl. No. 18/791,465.
Application 18/791,465 is a continuation in part of application No. 18/736,498, filed on Jun. 6, 2024.
Claims priority of provisional application 63/651,359, filed on May 23, 2024.
Int. Cl. G06F 40/284 (2020.01); G06F 40/242 (2020.01)
CPC G06F 40/284 (2020.01) [G06F 40/242 (2020.01)]
15 Claims
 
1. A system for a codeword trained and operated large language model, comprising one or more computers with executable instructions that, when executed, cause the system to:
collect a comprehensive set of training data;
tokenize the set of training data into a plurality of training tokens;
create a codeword dictionary by assigning unique codewords to the plurality of training tokens;
convert all training tokens into a plurality of training codewords using the codeword dictionary;
train a large language model using the plurality of training codewords;
receive a text prompt from a user;
tokenize the prompt into a plurality of tokens;
convert the plurality of tokens into a plurality of prompt codewords using a codeword dictionary;
process the sequence of prompt codewords through a large language model to generate a codeword response; and
convert the codeword response back into tokens, which are rendered as a text response to the user.
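
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the whitespace tokenizer, the integer codewords, and the toy bigram model standing in for the large language model are all assumptions made for illustration, as are the names tokenize, build_codeword_dictionary, encode, decode, and BigramModel.

```python
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization stands in for the real tokenizer.
    return text.lower().split()


def build_codeword_dictionary(tokens):
    # Assign each distinct token a unique integer codeword, in first-seen order.
    dictionary = {}
    for token in tokens:
        if token not in dictionary:
            dictionary[token] = len(dictionary)
    return dictionary


def encode(tokens, dictionary):
    # Convert tokens to codewords; raises KeyError for a token not in the dictionary.
    return [dictionary[token] for token in tokens]


def decode(codewords, dictionary):
    # Invert the dictionary and render the codewords back as text.
    inverse = {code: token for token, code in dictionary.items()}
    return " ".join(inverse[code] for code in codewords)


class BigramModel:
    """Toy stand-in for the large language model (an assumption).

    It is trained on codewords only and predicts the most frequent
    successor of the last codeword it has seen.
    """

    def __init__(self):
        self.successors = {}

    def train(self, codewords):
        # Count how often each codeword follows another.
        for current, following in zip(codewords, codewords[1:]):
            self.successors.setdefault(current, Counter())[following] += 1

    def generate(self, prompt_codewords, max_len=5):
        # Emit up to max_len codewords, starting from the last prompt codeword.
        response = []
        current = prompt_codewords[-1]
        for _ in range(max_len):
            if current not in self.successors:
                break
            current = self.successors[current].most_common(1)[0][0]
            response.append(current)
        return response
```

Under these assumptions, training on "the cat sat on the mat" yields the dictionary the=0, cat=1, sat=2, on=3, mat=4. Prompting with "cat sat" and max_len=2 produces the codeword response [3, 0], which decodes to "on the". The model itself never handles token strings; only the dictionary lookups at either end do.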