US 12,235,861 B2
Automated refinement and correction of exploration and/or production data in a data lake
Tze W. Ma, Spring, TX (US); Vincent Bergbauer, Katy, TX (US); and Krishna Mudda, Houston, TX (US)
Assigned to SCHLUMBERGER TECHNOLOGY CORPORATION, Sugar Land, TX (US)
Appl. No. 16/646,930
Filed by SCHLUMBERGER TECHNOLOGY CORPORATION, Sugar Land, TX (US)
PCT Filed Sep. 13, 2018, PCT No. PCT/US2018/050870
§ 371(c)(1), (2) Date Mar. 12, 2020,
PCT Pub. No. WO2019/055647, PCT Pub. Date Mar. 21, 2019.
Claims priority of provisional application 62/557,871, filed on Sep. 13, 2017.
Prior Publication US 2020/0278979 A1, Sep. 3, 2020
Int. Cl. G06F 16/25 (2019.01); G06F 16/215 (2019.01); G06F 16/23 (2019.01); G06N 20/00 (2019.01)
CPC G06F 16/254 (2019.01) [G06F 16/215 (2019.01); G06F 16/2365 (2019.01); G06N 20/00 (2019.01)] 17 Claims
 
1. A method implemented by one or more processors, the method comprising:
receiving data from a client device to obtain received data, the received data associated with an operation occurring at an exploration and production system of oilfield operations with a plurality of data sources;
ingesting the received data into a data lake using ingestor components of the plurality of data sources to generate ingested data, wherein ingesting the received data includes storing the received data in the data lake in a first format, the first format being a same format in which the received data is received;
indexing the ingested data into a search index;
receiving a request to export, from the data lake, the ingested data in a second format that is different than the first format in which the received data is received and stored;
generating, from the data lake and using the index, a consumption model for the ingested data, wherein the consumption model is configured to transform the ingested data in the first format;
applying the consumption model to the ingested data in the first format to generate modified data in the first format;
applying, after applying the consumption model to the ingested data, one or more transformations to the modified data to convert the modified data into formatted data that is formatted in the second format;
exporting the formatted data from the data lake to a consumption service;
tracking the one or more transformations made to the modified data to generate tracking data;
storing, in the data lake, the tracking data as a set of nodes and edges in a graph database;
receiving, after storing, an external change to the modified data to generate changed data;
reapplying, using the tracking data in the graph database, the one or more transformations to the changed data to generate reformatted changed data;
receiving changed received data comprising a change to the received data;
generating metadata by applying a transformation to the changed received data, wherein the generated metadata comprises at least one of quality score, verified channels and verified channel units;
rerunning, automatically using the tracking data, the consumption model on the changed received data and the metadata to generate changed modified data in the first format;
reapplying the one or more transformations to the changed modified data to convert the changed modified data into changed formatted data that is formatted in the second format; and
returning the changed formatted data to the consumption service.
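
The tracking and reapplication steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes an in-memory stand-in for the graph database, and every name in it (LineageGraph, normalize_channel, feet_to_metres, to_export_schema, REGISTRY, the "well-1" identifier, and the record fields) is hypothetical. The sketch keeps the received record unchanged, applies a consumption model that stays in the first format, records each format transformation as an edge between nodes, and replays those edges when the received data changes. Search indexing and metadata generation are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Transform = Callable[[Record], Record]


@dataclass
class LineageGraph:
    """Tracking data as nodes and edges (in-memory stand-in for a graph database)."""
    nodes: Dict[str, Record] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)  # (src, transform, dst)

    def steps_from(self, root_id: str) -> List[str]:
        """Walk the edges from root_id and return transform names in applied order."""
        by_src = {src: (name, dst) for src, name, dst in self.edges}
        names, current = [], root_id
        while current in by_src:
            name, current = by_src[current]
            names.append(name)
        return names


def normalize_channel(rec: Record) -> Record:
    """Hypothetical consumption model: cleans the record but keeps the first format."""
    return {**rec, "channel": str(rec["channel"]).strip().upper()}


def feet_to_metres(rec: Record) -> Record:
    """Hypothetical transformation: convert depth from feet to metres."""
    return {**rec, "depth": round(float(rec["depth"]) * 0.3048, 3), "unit": "m"}


def to_export_schema(rec: Record) -> Record:
    """Hypothetical transformation: reshape into the second (export) format."""
    return {"mnemonic": rec["channel"], "depth_m": rec["depth"]}


# Lookup used at replay time to turn a tracked transform name back into a function.
REGISTRY: Dict[str, Transform] = {f.__name__: f for f in (feet_to_metres, to_export_schema)}


def apply_tracked(rec: Record, root_id: str, steps: List[Transform],
                  graph: LineageGraph) -> Record:
    """Apply each step in order, storing one node per result and one edge per step."""
    graph.nodes[root_id] = dict(rec)
    current_id = root_id
    for i, step in enumerate(steps, start=1):
        rec = step(rec)
        next_id = f"{root_id}/{i}"
        graph.nodes[next_id] = dict(rec)
        graph.edges.append((current_id, step.__name__, next_id))
        current_id = next_id
    return rec


def replay(rec: Record, root_id: str, graph: LineageGraph,
           registry: Dict[str, Transform]) -> Record:
    """Reapply the tracked transformations to a changed record, without re-recording."""
    for name in graph.steps_from(root_id):
        rec = registry[name](rec)
    return rec


graph = LineageGraph()
received = {"channel": " gr ", "depth": "1000", "unit": "ft"}  # stored as received
modified = normalize_channel(received)                         # still first format
formatted = apply_tracked(modified, "well-1",
                          [feet_to_metres, to_export_schema], graph)

# A change to the received data triggers the same model and transformations again.
changed_received = {**received, "depth": "2000"}
changed_formatted = replay(normalize_channel(changed_received),
                           "well-1", graph, REGISTRY)
```

Storing transform names on the edges, rather than the function objects, is one possible design choice: it lets the tracking data be persisted and replayed later through a registry lookup.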