US 11,886,382 B2
Customizable pipeline for integrating data
Abhinav Khanna, Los Altos, CA (US); Henry Tung, Redwood City, CA (US); Lucas Ray, San Francisco, CA (US); Stephen Yazicioglu, New York, NY (US); and Alexander Martino, New York, NY (US)
Assigned to Palantir Technologies Inc., Denver, CO (US)
Filed by Palantir Technologies Inc., Palo Alto, CA (US)
Filed on May 5, 2022, as Appl. No. 17/737,805.
Application 17/737,805 is a continuation of application No. 17/001,537, filed on Aug. 24, 2020, granted, now 11,379,407.
Application 17/001,537 is a continuation of application No. 16/035,250, filed on Jul. 13, 2018, granted, now 10,754,820.
Claims priority of provisional application 62/545,215, filed on Aug. 14, 2017.
Prior Publication US 2022/0261375 A1, Aug. 18, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 7/00 (2006.01); G06F 16/11 (2019.01); G06F 9/52 (2006.01); G06F 16/25 (2019.01)
CPC G06F 16/116 (2019.01) [G06F 9/52 (2013.01); G06F 16/254 (2019.01)]
16 Claims
1. A system comprising:
one or more processors; and a memory storing instructions that, when executed by the one or more processors, cause the system to perform:
determining a file is to be ingested into a data analysis platform in response to a user selecting the file, wherein the file is of a file type, and the file provides or enables error detection or encryption functionalities;
detecting, in parallel across detectors, the file type based on structure information, pattern information, or actual information within the file, wherein the detecting comprises continuously passing the file to the detectors until one of the detectors recognizes the file;
in response to the file being undetectable by the detectors, marking the file type as generic and creating a new detector to detect the file type;
in response to one of the existing detectors or the new detector detecting the file type:
generating, by the one of the existing detectors or the new detector, metadata including information relating to the file type or information relating to one or more operations to be performed based on the file type;
mapping each of the existing detectors to a transformer that executes at least a portion of the one or more operations based on the metadata;
in response to the creating of the new detector, mapping the new detector to a previously existing transformer or a new transformer;
identifying, by the one of the existing detectors or the new detector and based on the metadata, a particular transformer mapped to the one of the existing detectors or the new detector;
passing the metadata to the particular transformer; and
performing operations of normalizing or joining the file with a different file.
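
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `Detector`, `ingest`, `normalize_csv`, and `passthrough` are hypothetical, a thread pool stands in for "in parallel across detectors", and the "new detector" created for an unrecognized file is assumed to be a simple catch-all mapped to a new pass-through transformer.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Detector:
    """Hypothetical detector: recognizes one file type from the file's bytes."""
    name: str
    matches: Callable[[bytes], bool]  # inspects structure, pattern, or content

    def detect(self, data: bytes) -> Optional[dict]:
        # On a match, generate metadata describing the file type and the
        # operations a transformer should perform for that type.
        if self.matches(data):
            return {"file_type": self.name, "operations": ["normalize"]}
        return None


def normalize_csv(data: bytes, metadata: dict) -> list:
    """Hypothetical transformer: trim and lowercase every CSV cell."""
    rows = [line.split(",") for line in data.decode().splitlines() if line]
    return [[cell.strip().lower() for cell in row] for row in rows]


def passthrough(data: bytes, metadata: dict) -> list:
    """Hypothetical transformer for files marked as generic."""
    return [[data.decode(errors="replace")]]


def ingest(data: bytes, detectors: list, transformers: dict) -> tuple:
    # Offer the file to every detector in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda d: (d, d.detect(data)), detectors))
    hit = next(((d, m) for d, m in results if m is not None), None)

    if hit is None:
        # No detector recognized the file: mark it generic, create a new
        # detector, and map that detector to a transformer.
        # Assumption: the new detector is a catch-all.
        new_detector = Detector("generic", lambda _: True)
        detectors.append(new_detector)
        transformers[new_detector.name] = passthrough
        hit = (new_detector, new_detector.detect(data))

    detector, metadata = hit
    # Identify the transformer mapped to the detector and pass it the metadata.
    transformer = transformers[detector.name]
    return metadata, transformer(data, metadata)
```

In this sketch the detector-to-transformer mapping is a plain dictionary keyed by detector name, so a newly created detector can be mapped to either an existing or a new transformer by adding one entry. Joining the file with a different file is omitted for brevity.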