US 12,443,638 B1
Generalized validation framework for retrieval augmented generation (RAG)
Ankita Sinha, Mountain View, CA (US); and Karelia Del Carmen Pena-Pena, Mountain View, CA (US)
Assigned to INTUIT INC., Mountain View, CA (US)
Filed by INTUIT INC., Mountain View, CA (US)
Filed on Jul. 19, 2024, as Appl. No. 18/778,883.
Int. Cl. G06F 16/33 (2025.01); G06F 16/334 (2025.01)
CPC G06F 16/3347 (2019.01)
18 Claims
 
1. A method of validating a Retrieval Augmented Generation (RAG) system, comprising:
receiving rephrased text extracted from a database and rephrased by the RAG system in response to a user query;
searching and extracting corresponding sections from an original document corresponding to the rephrased text;
converting the rephrased text and the corresponding sections of the original document into semantic vectors using natural language processing techniques;
applying a sliding window to the semantic vectors of the corresponding sections of the original document, by systematically moving the sliding window through sentences of the corresponding sections of the original document;
calculating a semantic similarity score between the vectorized rephrased text and the vectorized text within the window for each position of the sliding window;
ranking the sentences based on their semantic similarity scores;
identifying the sentences that have a greatest semantic congruence with the rephrased text;
comparing the semantic similarity score to a predetermined threshold, and considering the rephrased text as semantically congruent and thus validated when the score is above the threshold; and
adjusting the sliding window as a variable size based on a length of the rephrased text.
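
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. All names in it (split_sentences, vectorize, cosine, validate) and the default threshold of 0.5 are hypothetical. A lowercase bag-of-words count vector stands in for the claimed "semantic vectors"; a real system would presumably use a learned sentence embedding. The window size is assumed to equal the number of sentences in the rephrased text, capped at the section length, which is only one possible reading of "variable size based on a length of the rephrased text".

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def vectorize(text):
    # ASSUMPTION: bag-of-words counts stand in for a semantic embedding.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def validate(rephrased, original_section, threshold=0.5):
    """Return (is_valid, ranked), where ranked is a list of
    (score, start_index, window_text) sorted by descending score."""
    sentences = split_sentences(original_section)
    if not sentences:
        return False, []

    # ASSUMPTION: window size = sentence count of the rephrased text,
    # clamped to the range [1, number of sentences in the section].
    window = max(1, min(len(split_sentences(rephrased)), len(sentences)))

    target = vectorize(rephrased)
    sentence_vecs = [vectorize(s) for s in sentences]

    scored = []
    # Slide the window one sentence at a time over the section's vectors.
    for start in range(len(sentences) - window + 1):
        window_vec = sum(sentence_vecs[start:start + window], Counter())
        text = " ".join(sentences[start:start + window])
        scored.append((cosine(target, window_vec), start, text))

    # Rank window positions by similarity, best match first.
    scored.sort(key=lambda item: item[0], reverse=True)

    # Validated only when the best score is above the threshold.
    return scored[0][0] > threshold, scored


if __name__ == "__main__":
    section = (
        "Refunds are issued within ten days. "
        "Contact support for help. "
        "Late fees apply after the due date."
    )
    ok, ranked = validate("Refunds are issued within ten days.", section)
    print(ok, ranked[0][1], ranked[0][2])
```

Summing per-sentence count vectors inside the window mirrors the claim's wording of applying the window to already-vectorized sentences. With a dense embedding model, a pooled or re-encoded window vector would take the place of that sum.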