US 12,411,877 B1
Method and system for multi-level artificial intelligence supercomputer design
Vijay Madisetti, Alpharetta, GA (US); and Arshdeep Bahga, Chandigarh (IN)
Assigned to Vijay Madisetti, Alpharetta, GA (US)
Filed by Vijay Madisetti, Alpharetta, GA (US)
Filed on Apr. 9, 2025, as Appl. No. 19/174,165.
Application 19/174,165 is a continuation of application No. 18/795,345, filed on Aug. 6, 2024, granted, now 12,299,018.
Application 18/795,345 is a continuation of application No. 18/470,487, filed on Sep. 20, 2023, granted, now 12,147,461, issued on Nov. 19, 2024.
Application 18/470,487 is a continuation of application No. 18/348,692, filed on Jul. 7, 2023, granted, now 12,001,462, issued on Jun. 4, 2024.
Claims priority of provisional application 63/469,571, filed on May 30, 2023.
Claims priority of provisional application 63/463,913, filed on May 4, 2023.
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/3329 (2025.01); G06F 40/284 (2020.01)
CPC G06F 16/3329 (2019.01) [G06F 40/284 (2020.01)]
20 Claims
 
1. A method of answering queries using one or more families of large language models (h-LLMs) by a computer comprising a processor, a non-transitory storage medium, and software on the storage medium, the method comprising:
receiving one or more user prompts via an application programming interface (API) at a query layer from a user interface;
determining an analysis mode of the one or more user prompts to be a real-time mode;
transmitting a query prompt comprising the one or more user prompts from the query layer to a real-time layer responsive to determining the analysis mode being a real-time mode, the real-time layer comprising at least one h-LLM and being configured to generate a result in a response period on the order of seconds;
receiving a real-time response from the real-time layer responsive to the query prompt; and
transmitting the real-time response to the user interface via the API.
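
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. All names here (`QueryLayer`, `RealTimeLayer`, `determine_mode`, `handle`, `max_realtime_chars`) are hypothetical, an h-LLM is modeled as a plain callable from prompt text to response text, and the length-based mode rule is an assumption made only for illustration, since the claim does not state how the analysis mode is determined.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an h-LLM: any callable mapping a prompt to text.
HLLM = Callable[[str], str]


@dataclass
class RealTimeLayer:
    """Layer holding at least one h-LLM, intended to respond within seconds."""
    models: List[HLLM]

    def answer(self, query_prompt: str) -> str:
        # Sketch only: route to the first model; the claim requires "at least one".
        return self.models[0](query_prompt)


@dataclass
class QueryLayer:
    """Receives user prompts (e.g. via an API) and routes them by analysis mode."""
    real_time_layer: RealTimeLayer
    max_realtime_chars: int = 2000  # assumed threshold, not taken from the patent

    def determine_mode(self, prompts: List[str]) -> str:
        # Assumed rule: short inputs are treated as real-time, longer ones as batch.
        total = sum(len(p) for p in prompts)
        return "real-time" if total <= self.max_realtime_chars else "batch"

    def handle(self, prompts: List[str]) -> str:
        # Step 1: prompts have been received at the query layer.
        # Step 2: determine the analysis mode.
        if self.determine_mode(prompts) != "real-time":
            raise NotImplementedError("only the real-time path of claim 1 is sketched")
        # Step 3: build the query prompt and transmit it to the real-time layer.
        query_prompt = "\n".join(prompts)
        # Step 4: receive the real-time response.
        response = self.real_time_layer.answer(query_prompt)
        # Step 5: return it to the caller, standing in for the API reply to the UI.
        return response
```

In this sketch, `QueryLayer(RealTimeLayer([my_model])).handle(["What is RAG?"])` would pass the prompt to `my_model` and return its reply, where `my_model` is any caller-supplied function.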