CPC G06F 16/383 (2019.01) [G06F 16/3329 (2019.01)] | 20 Claims |
1. A system for reducing data retrieval times when accessing siloed data across disparate locations by generating a unified metadata graph via a Retrieval-Augmented Generation (RAG) framework, the system comprising:
at least one hardware processor; and
at least one non-transitory memory storing instructions, which, when executed by the at least one hardware processor, cause the system to:
receive, from a set of data silos, raw data comprising a set of metadata identifiers indicating (i) file-level metadata identifiers, (ii) container-level metadata identifiers, and (iii) system-level metadata identifiers;
select, from a set of structured Large Language Model (LLM) prompts, a first structured LLM prompt corresponding to a first metadata identifier of the set of metadata identifiers;
augment the first structured LLM prompt with the first metadata identifier to be provided to an LLM communicatively coupled to a set of domain-specific ontologies, wherein the LLM is configured to generate a first intermediate output indicating a second set of metadata identifiers corresponding to the first metadata identifier without accessing the set of domain-specific ontologies;
augment the first structured LLM prompt with the second set of metadata identifiers corresponding to the first metadata identifier to be provided to the LLM, wherein the LLM is configured to generate a second intermediate output indicating a filtered domain-specific metadata identifier by accessing the set of domain-specific ontologies;
generate a domain-specific unified metadata graph, via the LLM, using (i) the first metadata identifier and (ii) the second intermediate output indicating the filtered domain-specific metadata identifier, wherein the filtered domain-specific metadata identifier is a traversable identifier and the first metadata identifier is a non-traversable identifier within the domain-specific unified metadata graph;
perform a validation process on the domain-specific unified metadata graph by comparing first performance metrics of the domain-specific unified metadata graph to second performance metrics of another version of the domain-specific unified metadata graph; and
in response to determining that the first performance metrics fail to meet or exceed the second performance metrics of the other version of the domain-specific unified metadata graph, perform an update process on the domain-specific unified metadata graph.
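The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`MetadataGraph`, `build_graph`, `needs_update`, the `llm` callable and its `use_ontology` flag, the `{payload}` placeholder) is hypothetical and introduced only for illustration. The sketch assumes that the LLM can be modeled as a function returning a list of identifiers, that the graph is assembled in ordinary code rather than by the LLM itself, and that higher performance metric values are better.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for an LLM call: (prompt, use_ontology) -> identifiers.
LLMFn = Callable[[str, bool], List[str]]


@dataclass
class MetadataGraph:
    # identifier -> True if traversable, False if non-traversable
    nodes: Dict[str, bool] = field(default_factory=dict)
    edges: Set[Tuple[str, str]] = field(default_factory=set)


def build_graph(identifier: str, level: str,
                prompts: Dict[str, str], llm: LLMFn) -> MetadataGraph:
    # Select the structured prompt matching the identifier's level
    # (assumed keys: "file", "container", "system").
    template = prompts[level]

    # Pass 1: augment the prompt with the first identifier; the LLM
    # answers WITHOUT consulting the domain-specific ontologies.
    candidates = llm(template.format(payload=identifier), False)

    # Pass 2: augment the same prompt with the candidate identifiers;
    # the LLM now consults the ontologies and returns filtered identifiers.
    filtered = llm(template.format(payload=", ".join(candidates)), True)

    # Assemble the graph: the raw identifier is kept as a non-traversable
    # node, and each filtered identifier becomes a traversable node.
    graph = MetadataGraph()
    graph.nodes[identifier] = False
    for item in filtered:
        graph.nodes[item] = True
        graph.edges.add((identifier, item))
    return graph


def needs_update(first: Dict[str, float], second: Dict[str, float]) -> bool:
    # Validation: trigger an update when the new graph's metrics fail to
    # meet or exceed the other version's on any metric (higher = better,
    # which is an assumption of this sketch).
    return any(first[name] < second[name] for name in second)
```

A stub LLM is enough to exercise the flow: a function that returns a fixed candidate list when `use_ontology` is false and, when it is true, keeps only the candidates found in a small in-memory ontology set. In that setup the raw identifier ends up as a non-traversable node linked to the traversable, ontology-filtered identifiers.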