US 12,461,959 B2
System and method for data management
Swathi Shyam Sunder, Bengaluru (IN); Nataliia Rümmele, St. Pölten (AT); Tobias Aigner, Munich (DE); Yogesh Kamath, Karnataka (IN); Rani Joseph, Bangalore (IN); Yogesh Borkhade, Karnataka (IN); and Prithvi Raj Ramakrishnaraja, Secundrabad (IN)
Assigned to Siemens Schweiz AG, Zürich (CH)
Appl. No. 18/695,441
Filed by Siemens Schweiz AG, Zürich (CH)
PCT Filed Sep. 26, 2022, PCT No. PCT/EP2022/076632
§ 371(c)(1), (2) Date Mar. 26, 2024,
PCT Pub. No. WO2023/046945, PCT Pub. Date Mar. 30, 2023.
Claims priority of application No. 21199096 (EP), filed on Sep. 27, 2021.
Prior Publication US 2024/0419714 A1, Dec. 19, 2024
Int. Cl. G06F 16/353 (2025.01); G06F 16/332 (2025.01); G06F 16/36 (2019.01)
CPC G06F 16/353 (2019.01) [G06F 16/3326 (2019.01); G06F 16/367 (2019.01)] 8 Claims
 
1. A computer-implemented method for data management, the method comprising:
obtaining a dataset from a data source, by a processing unit, wherein the dataset comprises a plurality of datapoints, and wherein each of the datapoints belongs to a column among a plurality of columns;
predicting an ontology label for at least one column in the dataset using a machine learning model, wherein the predicted ontology label is associated with an ontology comprising a plurality of ontology labels;
generating a mapping between the dataset and the ontology based on the relation between the predicted ontology label and the column;
classifying the datapoints with respect to the ontology labels based on the mapping generated;
outputting the classified datasets on a user interface;
providing an option to a user on the user interface for validation of predicted ontology labels for a plurality of datasets by a user-input via the user interface;
if the user rejects the respective predicted ontology label, requesting the user to manually select the correct ontology label for the column from a list of ontology labels associated with the ontology for assigning the correct ontology label to the column;
teaching the machine learning model a relationship between the column and the assigned ontology label;
identifying a relation between at least another column and at least another ontology label from the plurality of ontology labels, based on a user-input received from the user interface; and
training the machine learning model based on the relation identified;
wherein identifying the relation between the at least another column and the at least another ontology label from the plurality of ontology labels based on the user-input received from the user interface includes:
receiving the user-input from a user via the user interface, wherein the user input corresponds to assigning the ontology label to the at least another column;
determining one or more attributes associated with the at least another column based on the user-input; and
determining the relation based on the one or more attributes associated with the at least another column and one or more properties associated with the ontology label.
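
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not specify the machine learning model, the ontology, or how attributes are derived. Everything named here is a hypothetical stand-in, including the `ONTOLOGY` dictionary, the `LabelModel` class (a keyword-overlap heuristic used in place of a trained model), the `column_attributes` helper, and the `validate` callback that plays the role of the user interface.

```python
from dataclasses import dataclass, field

# Hypothetical ontology: label -> property keywords (illustrative only).
ONTOLOGY = {
    "Temperature": {"temp", "temperature", "celsius"},
    "Humidity": {"humidity", "rh"},
    "Setpoint": {"setpoint", "sp", "target"},
}
TYPE_TAGS = {"numeric", "text"}


def column_attributes(name, values):
    """Derive simple attributes: lowercase name tokens plus a value-type tag."""
    tokens = set(name.lower().replace("-", "_").split("_"))
    numeric = all(isinstance(v, (int, float)) for v in values)
    tokens.add("numeric" if numeric else "text")
    return tokens


@dataclass
class LabelModel:
    """Toy stand-in for the machine learning model (assumption, not from the claim)."""
    learned: dict = field(default_factory=dict)  # attribute token -> ontology label

    def predict(self, name, values):
        attrs = column_attributes(name, values)
        # Relations taught by the user take priority over ontology keywords.
        for token in sorted(attrs):
            if token in self.learned:
                return self.learned[token]
        scores = {label: len(attrs & props) for label, props in ONTOLOGY.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else None

    def teach(self, name, values, label):
        """Store a relation between the column's name attributes and the assigned label."""
        for token in column_attributes(name, values) - TYPE_TAGS:
            self.learned[token] = label


def map_and_classify(dataset, model, validate):
    """Predict a label per column, apply user validation, then group datapoints by label."""
    mapping = {}
    for name, values in dataset.items():
        predicted = model.predict(name, values)
        # `validate` returns the predicted label if accepted, or the user's correction.
        chosen = validate(name, predicted, list(ONTOLOGY))
        if chosen != predicted:
            model.teach(name, values, chosen)
        mapping[name] = chosen
    classified = {}
    for name, values in dataset.items():
        classified.setdefault(mapping[name], []).extend(values)
    return mapping, classified


# Usage with made-up column names; `corrections` simulates the user's manual choice.
dataset = {"room_temp": [21.5, 22.0], "zn_sp": [20.0, 20.0], "xq1": [40, 42]}
corrections = {"xq1": "Humidity"}


def validate(name, predicted, labels):
    return corrections.get(name, predicted)


model = LabelModel()
mapping, classified = map_and_classify(dataset, model, validate)
relearned = model.predict("xq1", [55, 57])
```

In this sketch the column `xq1` matches no ontology keyword, so the simulated user assigns `Humidity`, and the stored relation lets a later call to `predict` return that label for a column with the same name. A real system would replace the keyword heuristic with a trained classifier and would retrain on the accumulated relations rather than storing a lookup table.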