US 12,242,444 B2
Generating rules for data processing values of data fields from semantic labels of the data fields
John Joyce, Newton, MA (US); Marshall A. Isman, Newton, MA (US); and Sandrick Melbouci, Myersville, MD (US)
Assigned to Ab Initio Technology LLC, Lexington, MA (US)
Filed by Ab Initio Technology LLC, Lexington, MA (US)
Filed on Dec. 19, 2023, as Appl. No. 18/545,416.
Application 18/545,416 is a continuation of application No. 17/006,504, filed on Aug. 28, 2020, granted, now 11,886,399.
Claims priority of provisional application 62/981,646, filed on Feb. 26, 2020.
Prior Publication US 2024/0152495 A1, May 9, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/215 (2019.01); G06F 16/22 (2019.01); G06F 16/28 (2019.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06F 16/215 (2019.01) [G06F 16/2228 (2019.01); G06F 16/285 (2019.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A method for determining a schema of a data record, the method including:
retrieving a label index that associates a label with a set of one or more fields in the data record, wherein the label identifies a type of information expected in a field of the set of the one or more fields;
accessing an index that associates the type of information indicated by the label with a set of attribute values representing requirements for values of the one or more fields associated with the label, the requirements including logical or syntactical characteristics of the values for the one or more fields; and
for a first field of a particular data record:
identifying, by accessing the label index, a particular label associated with the first field of the particular data record;
retrieving, from the index, an attribute value for the particular label, the attribute value specifying a particular requirement for the first field;
determining that the particular requirement specified by the attribute value includes a schema feature for values included in the first field;
determining, based on the schema feature, a relationship between the first field and a second field in the data record; and
generating an output that indicates the relationship between the first field and the second field in the data record.
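
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it is a hypothetical assumption chosen for illustration: the field names, the labels, the `schema_feature` and `refers_to` attribute keys, and the helpers `label_for` and `find_relationship`. It also assumes that the schema feature is a foreign-key marker and that the relationship is found by locating a second field whose label is the one the key refers to.

```python
# Hypothetical label index: label -> set of fields in the data record.
LABEL_INDEX = {
    "customer_id": {"id"},
    "customer_id_ref": {"referred_by"},
    "email": {"contact"},
}

# Hypothetical index: label -> attribute values expressing requirements.
# "pattern" stands in for a syntactical requirement and is not used below;
# "schema_feature" stands in for a requirement that carries a schema feature.
ATTRIBUTE_INDEX = {
    "customer_id": {"pattern": r"\d{6}", "schema_feature": "primary_key"},
    "customer_id_ref": {
        "pattern": r"\d{6}",
        "schema_feature": "foreign_key",
        "refers_to": "customer_id",
    },
    "email": {"pattern": r"[^@]+@[^@]+"},
}


def label_for(field):
    """Identify the label associated with a field by accessing the label index."""
    for label, fields in LABEL_INDEX.items():
        if field in fields:
            return label
    return None


def find_relationship(first_field, record_fields):
    """Return a relationship between first_field and a second field, or None."""
    label = label_for(first_field)
    if label is None:
        return None
    # Retrieve the attribute values (requirements) for the particular label.
    attributes = ATTRIBUTE_INDEX.get(label, {})
    # Check whether the requirement includes the assumed schema feature.
    if attributes.get("schema_feature") != "foreign_key":
        return None
    target_label = attributes["refers_to"]
    # Look for a second field in the same record that carries the target label.
    for second_field in record_fields:
        if second_field != first_field and label_for(second_field) == target_label:
            # The output indicating the relationship between the two fields.
            return {
                "first_field": first_field,
                "second_field": second_field,
                "relationship": "foreign_key_to_primary_key",
            }
    return None
```

In this sketch, calling `find_relationship("referred_by", ["id", "referred_by", "contact"])` reports that `referred_by` refers to `id`, while a field such as `contact`, whose label carries no schema feature, yields no relationship.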