CPC G06F 16/3329 (2019.01) [G06F 40/284 (2020.01)]; 28 Claims
1. A method of answering queries using one or more families of large language models (h-LLMs) by a computer comprising a processor, a non-transitory storage medium, and software on the storage medium, the method comprising:
receiving a user prompt at a user interface;
generating a plurality of derived prompts from the user prompt at an input broker;
generating a plurality of prompt embeddings from the plurality of derived prompts by applying a plurality of embedding models;
transmitting the plurality of prompt embeddings to a vector database, the vector database comprising a database of knowledge documents, each knowledge document in the database of knowledge documents having one or more embeddings associated therewith;
receiving one or more knowledge documents that are determined to be relevant to the plurality of prompt embeddings at the input broker;
generating a plurality of context-aware prompts by the input broker responsive to the user prompt, the plurality of derived prompts, and the one or more knowledge documents;
transmitting the plurality of context-aware prompts to the one or more h-LLMs;
receiving a plurality of h-LLM results at an output broker, the h-LLM results being generated responsive to the one or more h-LLMs receiving at least one context-aware prompt and generating a response thereto;
processing the plurality of h-LLM results by the output broker to produce processed h-LLM results, each processed h-LLM result having a score;
identifying one or more preferred results responsive to the scores; and
transmitting the one or more preferred results to a user via the user interface.
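The claimed method can be read as a retrieval-augmented, multi-model query pipeline: an input broker derives prompts and embeds them, a vector database returns relevant knowledge documents, context-aware prompts go to the h-LLMs, and an output broker scores the results and returns the preferred one. A minimal illustrative sketch follows; every name here (`derive_prompts`, the toy letter-count "embedding models", `VectorDB`, the length-based score, the stand-in h-LLMs) is a hypothetical placeholder chosen for runnability, not the patent's actual implementation.

```python
# Illustrative sketch of the claimed pipeline; all components are toy stand-ins.
import math

def derive_prompts(user_prompt):
    # Input broker step: generate a plurality of derived prompts (trivial rephrasings).
    return [user_prompt, "Explain: " + user_prompt, "Summarize: " + user_prompt]

def embed_low(text):
    # Toy "embedding model" #1: counts of letters a-m.
    return [text.lower().count(c) for c in "abcdefghijklm"]

def embed_high(text):
    # Toy "embedding model" #2: counts of letters n-z.
    return [text.lower().count(c) for c in "nopqrstuvwxyz"]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

class VectorDB:
    # Each knowledge document is stored with one embedding per embedding model.
    def __init__(self, documents, models):
        self.entries = [(d, [m(d) for m in models]) for d in documents]

    def search(self, prompt_embeddings, top_k=2):
        # prompt_embeddings: list of (model_index, vector) pairs; rank each
        # document by its best similarity to any prompt embedding from the
        # matching model, and return the top_k documents.
        scored = []
        for doc, doc_embs in self.entries:
            best = max(cosine(vec, doc_embs[i]) for i, vec in prompt_embeddings)
            scored.append((best, doc))
        scored.sort(reverse=True)
        return [doc for _, doc in scored[:top_k]]

def answer(user_prompt, db, models, h_llms, top_k=2):
    derived = derive_prompts(user_prompt)                              # derived prompts
    embs = [(i, m(p)) for p in derived for i, m in enumerate(models)]  # prompt embeddings
    docs = db.search(embs, top_k)                                      # relevant knowledge documents
    ctx = ["Context: " + " ".join(docs) + "\nQ: " + p for p in derived]  # context-aware prompts
    results = [llm(cp) for llm in h_llms for cp in ctx]                # h-LLM results
    # Output broker: score each result (here, naively, by length) and
    # return the preferred (highest-scoring) result.
    return max(results, key=len)

# Demonstration with stand-in knowledge documents and stand-in h-LLMs.
documents = ["quantum computing uses qubits", "classical bits are 0 or 1"]
models = [embed_low, embed_high]
db = VectorDB(documents, models)
h_llms = [lambda p: "A: " + p[-20:], lambda p: "B: " + p[-10:]]
preferred = answer("what is a qubit", db, models, h_llms)
```

The sketch keeps the claim's separation of roles: prompt derivation and embedding on the input side, document retrieval in the middle, and result scoring on the output side, so any single stage (embedding models, retrieval ranking, scoring rule) can be swapped without touching the others.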