| CPC G06F 16/3329 (2019.01) [G06F 40/284 (2020.01)] | 14 Claims |

|
1. A method of processing language model input data in a distributed computing environment, comprising:
receiving an input data stream;
tokenizing data received from the input data stream into a plurality of tokens using a map-reduce operation in the distributed computing environment;
processing the plurality of tokens in parallel in the distributed computing environment to produce a plurality of processed tokens;
aggregating the plurality of processed tokens using a reduce operation to produce a plurality of aggregated tokens;
generating one or more updated incrementally-updated families of large language models (h-LLMs) by updating one or more incrementally-updated h-LLMs with the plurality of aggregated tokens in real time; and
responding to a user query using the one or more updated incrementally-updated h-LLMs.
|
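
The steps recited in claim 1 can be illustrated with a minimal, single-machine sketch. It is not the patented implementation: a thread pool stands in for the distributed computing environment, whitespace splitting stands in for the tokenizer, and the hypothetical `ToyModel` class (a plain token-count table) stands in for an incrementally-updated h-LLM. All function and class names below are assumptions introduced for illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def tokenize_chunk(chunk: str) -> list[str]:
    """Map step: split one chunk of the input stream into whitespace tokens."""
    return chunk.split()


def process_token(token: str) -> str:
    """Per-token processing (assumed here: lowercase, strip edge punctuation)."""
    return token.lower().strip(".,;:!?")


def count_tokens(tokens: list[str]) -> Counter:
    """Partial aggregation of one chunk's processed tokens."""
    return Counter(t for t in tokens if t)


class ToyModel:
    """Hypothetical stand-in for one incrementally updated model."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def update(self, aggregated: Counter) -> None:
        # Incremental update: fold the new counts into the existing state.
        self.counts.update(aggregated)

    def respond(self, query: str) -> dict[str, int]:
        # Toy "response": how often each query token has been seen so far.
        return {t: self.counts[t] for t in map(process_token, query.split())}


def run_pipeline(stream, model: ToyModel, workers: int = 4) -> Counter:
    """Tokenize, process in parallel, reduce, then update the model."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map: tokenize every chunk of the stream in parallel.
        token_lists = list(pool.map(tokenize_chunk, stream))
        # Parallel processing of the tokens, one task per chunk.
        processed = list(
            pool.map(lambda toks: [process_token(t) for t in toks], token_lists)
        )
    # Reduce: merge the per-chunk counts into one aggregate.
    aggregated = reduce(lambda a, b: a + b, map(count_tokens, processed), Counter())
    model.update(aggregated)
    return aggregated
```

Calling `run_pipeline(["The cat sat.", "the dog sat!"], model)` on a fresh `ToyModel` and then `model.respond("The cat?")` yields `{"the": 2, "cat": 1}`. In a real deployment the thread pool would presumably be replaced by a cluster framework and the count table by actual model updates; the sketch only shows how the map, parallel-process, reduce, update, and respond steps chain together.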