US 12,141,109 B2
Systems and methods for building and publishing schemas based on data patterns and data formats
Srinu Dasari, Euless, TX (US)
Assigned to JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed by JPMORGAN CHASE BANK, N.A., New York, NY (US)
Filed on Oct. 19, 2021, as Appl. No. 17/505,542.
Prior Publication US 2023/0124333 A1, Apr. 20, 2023
Int. Cl. G06F 16/21 (2019.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06F 40/20 (2020.01)
CPC G06F 16/213 (2019.01) [G06F 16/258 (2019.01); G06F 16/285 (2019.01); G06F 40/20 (2020.01)]
16 Claims
 
1. A method for building and publishing schemas, comprising:
producing, by a producer proxy agent or an ingestion API executed by a client device, objects based on a protocol requirement and a type requirement;
connecting producers to an object store by a load balancer based on one or more of a proximity based connection and a direct connection;
accessing, by a schema recommendation program executed by a computer processor, a plurality of ingested objects in an object store;
extracting, by a data crawler open source program executed by the computer processor, metadata from each of the plurality of ingested objects, wherein the data crawler open source program extracts the metadata by utilizing natural language processing to read text from each of the plurality of objects and determine a part of speech for the text;
determining, based on the metadata, a schema for the plurality of objects based on patterns determined by separations in data of the plurality of the objects, and wherein the metadata comprises an object name, a field name, and a field type;
determining a superschema for all objects in the object store based on the schema;
identifying, by the data crawler open source program, a plurality of potential schemas for the plurality of ingested objects based on the metadata;
receiving, by the schema recommendation program, a selection of one of the plurality of potential schemas;
applying, by the schema recommendation program, a restriction to the selected schema, wherein the restriction restricts modification to the selected schema, wherein the restriction is based on an entitlement; and
publishing, by the schema recommendation program, the selected potential schema to a catalog store.
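The schema-determination, superschema, restriction, and publishing steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the ingested objects are delimited text with a header row, treats "patterns determined by separations in data" as detection of a consistent delimiter, and omits the natural language processing step entirely. Every name in it (`detect_delimiter`, `infer_type`, `extract_metadata`, `build_superschema`, `Catalog`) is hypothetical, as is the rule that conflicting field types widen to "string" and the modelling of an entitlement as a set of permitted user names.

```python
# Illustrative sketch only; all names and rules here are assumptions,
# not taken from the patent.
CANDIDATE_DELIMITERS = [",", "|", "\t", ";"]


def detect_delimiter(text):
    """Return the delimiter that splits every non-empty line into the
    same number of fields (more than one), or None if none does."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    best, best_fields = None, 1
    for delimiter in CANDIDATE_DELIMITERS:
        counts = {len(ln.split(delimiter)) for ln in lines}
        if len(counts) == 1:
            n = counts.pop()
            if n > best_fields:
                best, best_fields = delimiter, n
    return best


def _parses_as(caster, value):
    try:
        caster(value)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Classify a column as 'integer', 'float', or 'string'."""
    if not values:
        return "string"
    if all(_parses_as(int, v) for v in values):
        return "integer"
    if all(_parses_as(float, v) for v in values):
        return "float"
    return "string"


def extract_metadata(object_name, text):
    """Produce (object name, field name, field type) records for one object.
    Assumes the first non-empty line is a header row."""
    delimiter = detect_delimiter(text)
    if delimiter is None:
        return []
    rows = [ln.split(delimiter) for ln in text.splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    return [
        {
            "object_name": object_name,
            "field_name": name.strip(),
            "field_type": infer_type([row[i].strip() for row in data]),
        }
        for i, name in enumerate(header)
    ]


def build_schema(metadata):
    """Collapse metadata records into a field-name -> field-type mapping."""
    return {m["field_name"]: m["field_type"] for m in metadata}


def build_superschema(schemas):
    """Union of all fields; conflicting types widen to 'string' (assumed rule)."""
    merged = {}
    for schema in schemas:
        for field, field_type in schema.items():
            if field not in merged:
                merged[field] = field_type
            elif merged[field] != field_type:
                merged[field] = "string"
    return merged


class Catalog:
    """In-memory stand-in for a catalog store with entitlement checks."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, schema, entitled_users):
        self._entries[name] = {
            "schema": dict(schema),
            "entitled": frozenset(entitled_users),
        }

    def get(self, name):
        return dict(self._entries[name]["schema"])

    def modify(self, name, user, field, field_type):
        entry = self._entries[name]
        if user not in entry["entitled"]:
            raise PermissionError(f"{user} is not entitled to modify {name}")
        entry["schema"][field] = field_type
```

In this sketch, the per-object schemas would be built with `build_schema(extract_metadata(name, text))`, merged with `build_superschema`, and then published through `Catalog.publish` together with the set of users entitled to modify the result. A production system would more plausibly rely on an existing crawler and a persistent catalog than on hand-rolled parsing like this.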