| CPC G06F 16/3332 (2019.01) [G06F 16/334 (2019.01)] | 20 Claims |

|
1. A system for reducing hallucinations in a large language model using probabilistic sentence space expansion, the system comprising:
one or more processors configured by computer-readable media to:
identify a plurality of input-output pairs, each input-output pair assigned to a probability and comprising an example input text string and a corresponding output text string, the example input text string and the corresponding output text of an input-output pair each comprising a matching variable;
sample a set of input-output pairs from the plurality of input-output pairs based on the probabilities assigned to the plurality of input-output pairs;
generate a list of aggregated input-output pairs from the set of input-output pairs by:
concatenating each of the example input text strings of the set of input-output pairs in a plurality of orders; and
concatenating each of the corresponding output text strings in orders corresponding to the orders of the concatenated example input text strings;
for each aggregated input-output pair of the list of aggregated input-output pairs, generate one or more queries by:
identifying each variable included in the aggregated input-output pair;
retrieving one or more values for each identified variable from a database; and
iteratively replacing each identified variable within the aggregated input-output pair with a different value of the retrieved one or more values for the identified variable to generate a different query of the one or more queries; and
train, using the one or more queries for each aggregated input-output pair, a first large language model to convert input queries to machine-readable prompts configured for input into a second large language model and input the machine-readable prompts into the second large language model.
|
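
The data-preparation steps recited in claim 1 (probability-weighted sampling, concatenation in a plurality of orders, and variable replacement) can be illustrated with a minimal sketch. It is not the patented implementation. The example pairs, the `{name}` variable syntax, the `VALUES` dictionary standing in for the database, the separators, and all function names are hypothetical. Two choices are assumptions rather than anything the claim requires: every permutation is used as the "plurality of orders", and queries are generated from every combination of retrieved values. The training of the first large language model is not shown.

```python
import itertools
import random
import re

# Hypothetical input-output pairs: (input template, output template, probability).
PAIRS = [
    ("show sales for {region}", "SELECT * FROM sales WHERE region = '{region}'", 0.5),
    ("list orders from {year}", "SELECT * FROM orders WHERE year = {year}", 0.3),
    ("count users in {region}", "SELECT COUNT(*) FROM users WHERE region = '{region}'", 0.2),
]

# Hypothetical stand-in for the database of variable values.
VALUES = {"region": ["EMEA", "APAC"], "year": ["2022", "2023"]}

# Assumed variable syntax: a word wrapped in curly braces.
VARIABLE = re.compile(r"\{(\w+)\}")


def sample_pairs(pairs, k, rng):
    """Draw up to k distinct pairs, weighted by their assigned probabilities."""
    pool = list(pairs)
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = [p[2] for p in pool]
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
    return chosen


def aggregate(sampled, in_sep=" and ", out_sep="; "):
    """Concatenate inputs in every order; outputs follow the same order."""
    result = []
    for order in itertools.permutations(sampled):
        agg_in = in_sep.join(p[0] for p in order)
        agg_out = out_sep.join(p[1] for p in order)
        result.append((agg_in, agg_out))
    return result


def expand(agg_in, agg_out, values):
    """Replace each variable with each retrieved value, one query per combination."""
    names = sorted(set(VARIABLE.findall(agg_in)) | set(VARIABLE.findall(agg_out)))
    queries = []
    for combo in itertools.product(*(values[n] for n in names)):
        binding = dict(zip(names, combo))

        def fill(match):
            return binding[match.group(1)]

        # The same binding fills input and output, so matching variables stay matched.
        queries.append((VARIABLE.sub(fill, agg_in), VARIABLE.sub(fill, agg_out)))
    return queries


if __name__ == "__main__":
    rng = random.Random(0)
    for agg_in, agg_out in aggregate(sample_pairs(PAIRS, 2, rng)):
        for query, target in expand(agg_in, agg_out, VALUES):
            print(query, "->", target)
```

Each printed line is one candidate training example for the first model: an expanded natural-language query paired with its machine-readable target.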