US 12,455,863 B2
Data ingestion and cleansing tool
Diego Bauducco, San Francisco, CA (US)
Assigned to Shipt, Inc., Birmingham, AL (US)
Filed by Shipt, Inc., Birmingham, AL (US)
Filed on Jan. 17, 2023, as Appl. No. 18/155,533.
Prior Publication US 2024/0241869 A1, Jul. 18, 2024
Int. Cl. G06F 16/20 (2019.01); G06F 16/21 (2019.01); G06F 16/215 (2019.01); G06F 16/25 (2019.01)
CPC G06F 16/215 (2019.01) [G06F 16/212 (2019.01); G06F 16/254 (2019.01)]
16 Claims
 
1. A method for ingesting data, the method comprising:
receiving, by a processor, a first CSV file, the first CSV file including a plurality of incoming fields;
translating, by the processor, the first CSV file to an internal schema, wherein translating the first CSV file to the internal schema comprises applying a mapping library to map one or more of the plurality of incoming fields to one or more of a plurality of internal fields of the internal schema;
displaying, by the processor, a user interface;
via the user interface, receiving, by the processor, an input corresponding to a mapping of an unmapped field of the plurality of incoming fields to a selected field of the plurality of internal fields;
determining, by the processor, a content of the incoming data of the first CSV file, wherein the content is a location or an identification;
validating, by the processor, the content using a third-party service;
in response to failing to validate the content, displaying, by the processor, an error user interface;
by the processor, receiving, via the error user interface, an updated data entry;
updating, by the processor, the mapping library to include the mapping of the unmapped field to the selected field of the plurality of internal fields, the mapping causing the unmapped field in subsequent received CSV files to be automatically mapped to the selected field;
receiving, by the processor, a second CSV file, the second CSV file including a second plurality of incoming fields, the second plurality of incoming fields including the unmapped field; and
translating, by the processor, the second CSV file to the internal schema, wherein translating the second CSV file to the internal schema comprises automatically applying the updated mapping library and the mapping of the unmapped field to the selected field.
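The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The internal field names, the sample CSV contents, and the helpers translate, learn_mapping, and validate_rows are all hypothetical. The user interface is reduced to a direct function call, and the third-party validation service is replaced by a simple non-empty check.

```python
import csv
import io

# Hypothetical internal schema; the claim does not name specific fields.
INTERNAL_FIELDS = {"store_id", "product_name", "address"}


def translate(csv_text, mapping_library):
    """Translate CSV text to the internal schema.

    Returns (rows, unmapped): rows keyed by internal field names, and the
    incoming column names that the mapping library does not yet cover.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    incoming = reader.fieldnames or []
    unmapped = [name for name in incoming if name not in mapping_library]
    rows = [
        {mapping_library[k]: v for k, v in row.items() if k in mapping_library}
        for row in reader
    ]
    return rows, unmapped


def learn_mapping(mapping_library, incoming_field, internal_field):
    """Record a user-selected mapping so later files are mapped automatically."""
    if internal_field not in INTERNAL_FIELDS:
        raise ValueError(f"unknown internal field: {internal_field}")
    mapping_library[incoming_field] = internal_field


def validate_rows(rows, field, validator):
    """Return the indices of rows whose value for `field` fails `validator`."""
    return [i for i, row in enumerate(rows) if not validator(row.get(field, ""))]


# Mapping library before the first file arrives (assumed starting state).
library = {"Store #": "store_id", "Item": "product_name"}

first_csv = "Store #,Item,Street Addr\n17,Milk,123 Main St\n18,Eggs,\n"
_, unmapped_first = translate(first_csv, library)  # "Street Addr" is unmapped

# Stand-in for the user interface input: map the unmapped field.
learn_mapping(library, "Street Addr", "address")
rows, _ = translate(first_csv, library)

# Stand-in for third-party validation of a location: reject empty values.
bad = validate_rows(rows, "address", lambda value: bool(value.strip()))

# Stand-in for the error user interface: accept an updated data entry.
for i in bad:
    rows[i]["address"] = "456 Oak Ave"

# A second file with the same column is now mapped without user input.
second_csv = "Store #,Item,Street Addr\n19,Bread,789 Pine Rd\n"
rows2, unmapped2 = translate(second_csv, library)
```

In this sketch the mapping library is a plain dictionary that persists between files, which is what lets the second file translate with no user interaction. A real system would presumably store the library durably and call an external address or identifier validation service in place of the lambda.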