CPC G06F 16/3329 (2019.01); G06F 40/284 (2020.01). 28 Claims.
1. A system for answering queries using one or more families of large language models (h-LLMs) comprising:
a processor;
a non-transitory computer-readable storage medium in communication with the processor and having stored thereon software that, when executed by the processor, is operable to:
provide a user interface to receive a user prompt;
operate an input broker operable to generate a plurality of derived prompts from the user prompt;
generate a plurality of prompt embeddings from the plurality of derived prompts by applying a plurality of embedding models;
transmit the plurality of prompt embeddings to a vector database, the vector database comprising a database of knowledge documents, each knowledge document in the database of knowledge documents having one or more embeddings associated therewith;
receive, at the input broker, one or more knowledge documents determined to be relevant to the plurality of prompt embeddings;
generate a plurality of context-aware prompts by the input broker responsive to the user prompt, the plurality of derived prompts, and the one or more knowledge documents;
transmit the plurality of context-aware prompts to the one or more h-LLMs;
operate an output broker operable to receive a plurality of h-LLM results, the h-LLM results being responses generated by the one or more h-LLMs responsive to receiving at least one of the context-aware prompts;
process the plurality of h-LLM results by the output broker to produce processed h-LLM results, each processed h-LLM result having a score;
identify one or more preferred results responsive to the scores; and
transmit the one or more preferred results to a user via the user interface; and
a communication device operable to facilitate the transmitting and receiving functions of the software.
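To illustrate how the recited elements fit together, the following is a minimal, hypothetical Python sketch of the pipeline of claim 1: an input broker derives prompts from the user prompt, prompt embeddings are matched against a vector database of knowledge documents, context-aware prompts are sent to one or more models, and an output broker scores the results and selects one or more preferred results. Every name in the sketch (InputBroker, OutputBroker, VectorDatabase, toy_embedding, answer_query, the length-based score) is an assumption introduced here for illustration only; the claim does not specify any particular embedding model, vector database, h-LLM, or scoring method.

```python
"""Illustrative, non-authoritative sketch of the claimed query-answering pipeline.

All classes and functions are hypothetical stand-ins; the claim does not name a
particular embedding model, vector database, h-LLM, or scoring method.
"""
import hashlib
import math
from typing import Callable, List, Tuple


def toy_embedding(text: str, dim: int = 32) -> List[float]:
    """Hypothetical embedding model: hashes tokens into a fixed-size unit vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class VectorDatabase:
    """In-memory stand-in for the claimed vector database of knowledge documents."""

    def __init__(self, documents: List[str]):
        self.entries = [(doc, toy_embedding(doc)) for doc in documents]

    def search(self, query_vec: List[float], top_k: int = 2) -> List[str]:
        # Rank stored documents by dot-product similarity to the prompt embedding.
        scored = [(sum(a * b for a, b in zip(query_vec, emb)), doc)
                  for doc, emb in self.entries]
        scored.sort(reverse=True)
        return [doc for _, doc in scored[:top_k]]


class InputBroker:
    """Generates derived prompts and context-aware prompts (claim elements)."""

    def derive_prompts(self, user_prompt: str) -> List[str]:
        # A real broker might paraphrase or decompose the prompt; this is a stub.
        return [user_prompt, f"Explain in detail: {user_prompt}"]

    def build_context_aware_prompts(self, user_prompt: str, derived: List[str],
                                    documents: List[str]) -> List[str]:
        context = "\n".join(documents)
        return [f"Context:\n{context}\n\nQuestion: {p}" for p in derived]


class OutputBroker:
    """Scores h-LLM results and identifies the preferred result(s)."""

    def process(self, results: List[str]) -> List[Tuple[str, float]]:
        # Hypothetical score: longer answers rank higher in this sketch.
        return [(r, float(len(r))) for r in results]

    def preferred(self, processed: List[Tuple[str, float]], n: int = 1) -> List[str]:
        return [r for r, _ in sorted(processed, key=lambda x: x[1], reverse=True)[:n]]


def answer_query(user_prompt: str, h_llms: List[Callable[[str], str]],
                 db: VectorDatabase) -> List[str]:
    """End-to-end flow mirroring the claim: derive, embed, retrieve, prompt, score, select."""
    input_broker, output_broker = InputBroker(), OutputBroker()
    derived = input_broker.derive_prompts(user_prompt)
    retrieved: List[str] = []
    for prompt in derived:
        retrieved.extend(db.search(toy_embedding(prompt)))
    prompts = input_broker.build_context_aware_prompts(user_prompt, derived, retrieved)
    results = [llm(p) for llm in h_llms for p in prompts]
    return output_broker.preferred(output_broker.process(results))


if __name__ == "__main__":
    db = VectorDatabase(["Vector databases index embeddings.",
                         "Brokers route prompts between models."])
    # Stand-in callables in place of actual h-LLMs.
    llms = [lambda p: f"[model-A] {p[-40:]}",
            lambda p: f"[model-B] answer to: {p[:40]}"]
    print(answer_query("How does the system pick an answer?", llms, db))
```

Running the example returns the highest-scoring synthetic answer from the two stand-in callables; in a deployed system the callables would be replaced by calls to the one or more h-LLMs, and the length-based score would be replaced by whatever ranking the output broker actually applies.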