| CPC G06F 16/3329 (2019.01) [G06F 40/284 (2020.01)] | 30 Claims |

|
1. A method of processing large contexts in a retrieval-augmented generation (RAG) system comprising:
receiving a query from a user;
retrieving a plurality of relevant documents from at least one database based on the query, wherein the relevant documents form a combined context;
partitioning the combined context into a plurality of context partitions;
generating a plurality of intermediate analysis results by:
processing each context partition of the plurality of context partitions using a mapper prompt; and
sending an output of the mapper prompt from each context partition to one or more large language models (LLMs);
generating a final response by processing the plurality of intermediate analysis results using a reducer prompt sent to one or more LLMs; and
transmitting the final response to the user.
|
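The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`retrieve`, `partition`, `answer`, `fake_llm`, the two prompt templates, and the `max_chars` parameter) is a hypothetical choice made for illustration. Retrieval is reduced to a toy keyword match over an in-memory list, partitioning is done by fixed character count, and the LLM is any callable that maps a prompt string to a response string.

```python
from typing import Callable, List

# Assumption: an "LLM" is modeled as any function from prompt text to response text.
LLM = Callable[[str], str]

# Hypothetical prompt templates; the claim does not specify their wording.
MAPPER_PROMPT = (
    "Question: {query}\nContext:\n{partition}\n"
    "Extract the facts relevant to the question."
)
REDUCER_PROMPT = (
    "Question: {query}\nPartial analyses:\n{analyses}\n"
    "Combine these into one final answer."
)


def retrieve(query: str, database: List[str]) -> List[str]:
    """Toy retrieval: keep documents that share at least one word with the query."""
    terms = set(query.lower().split())
    return [doc for doc in database if terms & set(doc.lower().split())]


def partition(context: str, max_chars: int) -> List[str]:
    """Split the combined context into consecutive pieces of at most max_chars."""
    return [context[i:i + max_chars] for i in range(0, len(context), max_chars)]


def answer(query: str, database: List[str], llm: LLM, max_chars: int = 200) -> str:
    # Retrieve relevant documents and join them into the combined context.
    combined = "\n".join(retrieve(query, database))
    # Partition the combined context.
    parts = partition(combined, max_chars)
    # Map step: fill the mapper prompt for each partition and send it to the LLM.
    intermediate = [
        llm(MAPPER_PROMPT.format(query=query, partition=p)) for p in parts
    ]
    # Reduce step: send all intermediate results to the LLM in one reducer prompt.
    reducer_input = REDUCER_PROMPT.format(
        query=query, analyses="\n".join(intermediate)
    )
    return llm(reducer_input)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only reports the prompt length."""
    return f"response to a {len(prompt)}-character prompt"


if __name__ == "__main__":
    docs = ["cats purr loudly", "dogs bark", "cats sleep a lot"]
    print(answer("cats", docs, fake_llm, max_chars=10))
```

Because each mapper call sees only one partition, no single prompt has to hold the whole combined context. In this sketch the mapper calls run sequentially, but they are independent of one another and could be issued in parallel or routed to different models.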