US 12,248,455 B1
Systems and methods for generic data parsing applications
Roman Cwalina, Vernon, CT (US); Shashank Jain, Austin, TX (US); Vishesh Kain, Delhi (IN); Rudrappa Malapur, Karnataka (IN); and Kalyani Chidrawar, Maharashtra (IN)
Assigned to The Travelers Indemnity Company, Hartford, CT (US)
Filed by The Travelers Indemnity Company, Hartford, CT (US)
Filed on Nov. 2, 2023, as Appl. No. 18/500,390.
Application 18/500,390 is a continuation of application No. 17/235,148, filed on Apr. 20, 2021, granted, now 12,001,416.
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 16/182 (2019.01); G06F 16/21 (2019.01); G06F 16/22 (2019.01); G06F 16/2455 (2019.01); G06F 16/435 (2019.01)
CPC G06F 16/2272 (2019.01) [G06F 16/182 (2019.01); G06F 16/211 (2019.01); G06F 16/214 (2019.01); G06F 16/2455 (2019.01); G06F 16/24564 (2019.01); G06F 16/435 (2019.01)]
21 Claims
1. A method for loading semi-structured data into a data storage structure operable to accept and respond to structured queries, comprising:
deriving, by at least one electronic processing device and by execution of schema derivation rules stored in at least one non-transitory computer readable medium, a schema, by:
receiving, from a user and via a User Interface (UI), first input comprising a command and an identifier of a semi-structured data source;
extracting, utilizing the identifier of the semi-structured data source and from the semi-structured data source, a listing of tables and fields;
generating a unique identifier for each extracted table and field;
identifying keys linking two or more extracted tables;
creating a plurality of output files descriptive of the extracted tables and fields, the unique identifiers for the tables and fields, and the keys, thereby defining the schema;
generating, based on the plurality of output files, a plurality of table create statements; and
storing the plurality of table create statements in a schema definition file;
creating, by the at least one electronic processing device and by execution of the base layer creation rules stored in the at least one non-transitory computer readable medium, a base layer, by:
extracting, for each record in the semi-structured data source, all fields and corresponding values;
comparing each of the extracted fields and values to the plurality of output files descriptive of the extracted tables and fields;
mapping each of the extracted field values to a base layer table identified by a base layer table name;
storing an indication of the mapping for each of the extracted field values in a partition of a distributed file system;
creating, by executing at least one of the table create statements stored in the schema definition file, a plurality of base layer tables;
writing, utilizing the stored mapping for each of the extracted field values, each of the extracted field values into a corresponding base layer table of the plurality of base layer tables; and
creating, by the at least one electronic processing device and by execution of the Single Subject Layer (SSL) creation rules stored in the at least one non-transitory computer readable medium, an SSL layer, by:
creating, utilizing a mapping sheet comprising at least one column of data descriptive of at least one data relationship, an SSL configuration file;
generating, automatically, by executing a generic script and utilizing the SSL configuration file, SSL table creation code;
creating, utilizing the SSL table creation code, a plurality of SSL tables; and
loading the plurality of SSL tables with the corresponding values from the extracted fields of the records in the semi-structured data source.
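For illustration only, the schema derivation and base layer steps recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: it assumes the semi-structured source is a list of JSON-like records already loaded into memory, it types every column as STRING, and it keeps the base layer in plain dictionaries instead of a partition of a distributed file system. The UI input, the output files, and the SSL layer are omitted. All function names, the row_id and parent_row_id key columns, and the hash-based identifier scheme are hypothetical and do not come from the patent.

```python
import hashlib
import itertools


def _uid(name):
    """Short deterministic identifier for a table or field (hypothetical scheme)."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]


def _is_child_list(value):
    """True when a value is a non-empty list of objects, treated here as a child table."""
    return isinstance(value, list) and bool(value) and all(isinstance(v, dict) for v in value)


def _scalars(obj):
    """Yield (field, value) pairs that stay in the current table."""
    for key, value in obj.items():
        if not isinstance(value, dict) and not _is_child_list(value):
            yield key, value


def _children(obj):
    """Yield (field, child_object) pairs that become rows of a child table."""
    for key, value in obj.items():
        if isinstance(value, dict):
            yield key, value
        elif _is_child_list(value):
            for item in value:
                yield key, item


def derive_schema(records, root):
    """List tables and fields, assign identifiers, and record the parent link of each table."""
    schema = {}

    def visit(obj, table, parent):
        entry = schema.setdefault(table, {"uid": _uid(table), "parent": parent, "fields": {}})
        for key, _ in _scalars(obj):
            entry["fields"].setdefault(key, _uid(f"{table}.{key}"))
        for key, child in _children(obj):
            visit(child, f"{table}_{key}", table)

    for record in records:
        visit(record, root, None)
    return schema


def table_create_statements(schema):
    """Generate one CREATE TABLE statement per derived table (all columns assumed STRING)."""
    statements = []
    for table, entry in schema.items():
        columns = ["row_id STRING"]
        if entry["parent"] is not None:
            columns.append("parent_row_id STRING")  # key linking child to parent
        columns.extend(f"{field} STRING" for field in entry["fields"])
        statements.append(f"CREATE TABLE {table} ({', '.join(columns)});")
    return statements


def build_base_layer(records, schema, root):
    """Map every extracted field value to a row of its base layer table."""
    tables = {table: [] for table in schema}
    counter = itertools.count(1)

    def visit(obj, table, parent_row_id):
        row = {"row_id": f"{table}-{next(counter)}"}
        if schema[table]["parent"] is not None:
            row["parent_row_id"] = parent_row_id
        row.update(_scalars(obj))
        tables[table].append(row)
        for key, child in _children(obj):
            visit(child, f"{table}_{key}", row["row_id"])

    for record in records:
        visit(record, root, None)
    return tables


if __name__ == "__main__":
    sample = [{"number": "P1", "holder": {"name": "Ann"},
               "claims": [{"amount": 10}, {"amount": 20}]}]
    schema = derive_schema(sample, "policy")
    for statement in table_create_statements(schema):
        print(statement)
    print(build_base_layer(sample, schema, "policy"))
```

In this sketch, nested objects and lists of objects become child tables named after their parent and field, and the parent_row_id column plays the role of the key linking two extracted tables. A real deployment would presumably infer column types and persist the mapping before writing rows, which the sketch does not attempt.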