| CPC G06F 16/3329 (2019.01) [G06F 40/284 (2020.01)] | 20 Claims |

|
1. A method of answering queries using one or more families of large language models (h-LLMs) by a computer comprising a processor, a non-transitory storage medium, and software on the storage medium, the method comprising:
receiving one or more user prompts via an application programming interface (API) at a query layer from a user interface;
determining an analysis mode of the one or more user prompts to be a real-time mode;
transmitting a query prompt comprising the one or more user prompts from the query layer to a real-time layer responsive to determining the analysis mode being a real-time mode, the real-time layer comprising at least one h-LLM and being configured to generate a result in a response period on the order of seconds;
receiving a real-time response from the real-time layer responsive to the query prompt; and
transmitting the real-time response to the user interface via the API.
|
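The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `QueryLayer`, `RealTimeLayer`, `determine_analysis_mode`, and `echo_model` are hypothetical, the h-LLM is replaced by a plain callable, and the mode-selection rule is a placeholder because the claim does not say how the mode is determined. The API transport and the "order of seconds" response period are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an h-LLM: any callable mapping a prompt to text.
HLLM = Callable[[str], str]


@dataclass
class RealTimeLayer:
    """Illustrative real-time layer holding at least one h-LLM."""
    hllms: List[HLLM]

    def generate(self, query_prompt: str) -> str:
        # Assumption: the first configured model answers the query.
        return self.hllms[0](query_prompt)


def determine_analysis_mode(user_prompts: List[str]) -> str:
    # Placeholder rule (assumption): every request is treated as real-time.
    return "real-time"


@dataclass
class QueryLayer:
    """Illustrative query layer sitting behind the API."""
    real_time_layer: RealTimeLayer

    def handle(self, user_prompts: List[str]) -> str:
        # Step 1: the user prompts have been received via the API.
        # Step 2: determine the analysis mode.
        mode = determine_analysis_mode(user_prompts)
        if mode != "real-time":
            raise NotImplementedError("only the real-time path is sketched")
        # Step 3: build the query prompt and send it to the real-time layer.
        query_prompt = "\n".join(user_prompts)
        # Step 4: receive the real-time response.
        response = self.real_time_layer.generate(query_prompt)
        # Step 5: return it so the API can pass it to the user interface.
        return response


def echo_model(prompt: str) -> str:
    # Toy model used only to make the sketch runnable.
    return f"answer to: {prompt}"


layer = QueryLayer(RealTimeLayer([echo_model]))
result = layer.handle(["What is an h-LLM?"])
```

Here `result` holds whatever the stand-in model returns for the joined prompts; a real deployment would replace `echo_model` with calls to actual models and expose `handle` behind an API endpoint.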