US 12,265,538 B2
Schema-adaptable data enrichment and retrieval
Lauren A. Garib, McLean, VA (US); Stephen Winn, McLean, VA (US); and Philip Henault, McLean, VA (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Nov. 3, 2021, as Appl. No. 17/517,718.
Prior Publication US 2023/0139783 A1, May 4, 2023
Int. Cl. G06F 16/2455 (2019.01); G06F 16/23 (2019.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01)
CPC G06F 16/2455 (2019.01) [G06F 16/2379 (2019.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01)]
20 Claims
1. A system for record retrieval comprising a computer system that comprises one or more processors programmed with computer program instructions that, when executed, cause the computer system to perform operations comprising:
accessing a database to obtain a schema;
obtaining a field of a data table of the schema, the field comprising a field name and a field descriptor, wherein the field descriptor comprises natural language text;
accessing a first set of records structured in accordance with the schema;
assigning a category indicating a value type to the field of the schema based on the field name and the field descriptor using a machine learning model, wherein attributes of the first set of records of the field conform to a string pattern associated with the category;
subsequent to the assignment of the category to the field, receiving a first identifier and determining that the first identifier satisfying the string pattern is associated with the category;
retrieving a directed graph of identifiers that indicates that a plurality of identifiers are directed to the first identifier, the plurality of identifiers comprising a second identifier;
obtaining a query template associated with the database or the schema;
generating a plurality of database queries based on the query template, the first identifier, the plurality of identifiers, and the category, wherein the plurality of database queries comprises a query indicating the data table and the field;
retrieving a second set of records from the database based on the plurality of database queries, wherein retrieving the second set of records comprises retrieving a subset of records based on the second identifier;
determining a key based on the first identifier and the category; and
storing the second set of records in an in-memory data store, wherein the second set of records is stored in association with the key, and wherein storing the second set of records comprises:
determining whether a number of edges connecting a node of the directed graph with other nodes of the directed graph satisfies a threshold, wherein the node is mapped to the second identifier; and
in response to a determination that the number of edges connecting the node of the directed graph with other nodes of the directed graph satisfies the threshold, storing the subset of records in the in-memory data store in association with the second identifier.
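The retrieval and caching steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it is hypothetical: the category patterns, the table and field names, the `category:identifier` key format, the "at least" reading of "satisfies a threshold", and the plain dictionary standing in for the in-memory data store are all assumptions made for illustration. The machine learning step that assigns categories to schema fields is not modeled; its result is hard-coded in `CATEGORY_FIELDS`.

```python
import re
from string import Template

# Assumption: string patterns per category. In the claim these are associated
# with categories that a machine learning model assigns to schema fields.
CATEGORY_PATTERNS = {
    "account_id": re.compile(r"ACC-\d{6}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}

# Assumption: (table, field) pairs already labeled with each category.
CATEGORY_FIELDS = {
    "account_id": [("accounts", "acct_id"), ("transactions", "acct_ref")],
}

# Assumption: one query template, with the identifier bound as a parameter.
QUERY_TEMPLATE = Template("SELECT * FROM $table WHERE $field = :identifier")


def categorize(identifier):
    """Return the first category whose pattern the identifier fully matches."""
    for category, pattern in CATEGORY_PATTERNS.items():
        if pattern.fullmatch(identifier):
            return category
    return None


def predecessors(graph, target):
    """Identifiers with an edge directed to `target` (graph: source -> targets)."""
    return sorted(src for src, dsts in graph.items() if target in dsts)


def edge_count(graph, node):
    """Number of edges connecting `node` with other nodes, in either direction."""
    outgoing = len(graph.get(node, ()))
    incoming = sum(1 for dsts in graph.values() if node in dsts)
    return outgoing + incoming


def build_queries(category, identifier):
    """One (sql, params) pair per table and field labeled with the category."""
    return [
        (QUERY_TEMPLATE.substitute(table=table, field=field),
         {"identifier": identifier})
        for table, field in CATEGORY_FIELDS[category]
    ]


def make_key(identifier, category):
    """Assumption: the key is the category and identifier joined by a colon."""
    return f"{category}:{identifier}"


def enrich(first_id, graph, run_query, cache, threshold=2):
    """Retrieve records for `first_id` and its linked identifiers, then cache them."""
    category = categorize(first_id)
    if category is None:
        raise ValueError(f"no category pattern matches {first_id!r}")
    records = []
    for ident in [first_id, *predecessors(graph, first_id)]:
        subset = []
        for sql, params in build_queries(category, ident):
            subset.extend(run_query(sql, params))
        records.extend(subset)
        # Cache a linked identifier's subset separately only when its node
        # has enough edges (assumption: "satisfies" means "at least").
        if ident != first_id and edge_count(graph, ident) >= threshold:
            cache[make_key(ident, category)] = subset
    cache[make_key(first_id, category)] = records
    return records


# Usage with a stub in place of a real database call.
graph = {
    "ACC-000002": {"ACC-000001"},
    "ACC-000003": {"ACC-000001", "ACC-000004"},
}
cache = {}
records = enrich(
    "ACC-000001",
    graph,
    run_query=lambda sql, params: [{"sql": sql, "id": params["identifier"]}],
    cache=cache,
)
```

In this usage, ACC-000003 has two edges and so receives its own cache entry, while ACC-000002 has one edge and is stored only as part of the full record set under the key for ACC-000001. Binding the identifier as a query parameter rather than substituting it into the template text is a design choice of the sketch, not something the claim specifies.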