| CPC G06F 16/213 (2019.01) [G06F 16/2365 (2019.01); G06F 16/285 (2019.01)] | 20 Claims |

|
1. A method, comprising:
selecting, by a computing device, a subset of methods to generate data schemas for input data, from a list of methods for generating data schemas, based on an output of a regression model, wherein the output of the regression model comprises a numeric indicator of schema accuracy for each method in the set of methods associated with the determined data category;
generating, by the computing device, a candidate schema for each method in the subset of methods to generate data schemas; and
generating, by the computing device, a master data schema for the input data by merging the candidate schema for each method in the subset of methods to generate data schemas, utilizing predetermined rules, wherein the predetermined rules comprise selecting three methods having a highest numeric indicator of schema accuracy.
|
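The three steps of claim 1 (select methods by predicted accuracy, generate one candidate schema per selected method, merge the candidates into a master schema) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the claim: the four `infer_*` functions are hypothetical schema-generation methods, the hard-coded `predicted_accuracy` dictionary stands in for the output of the regression model, and the per-field majority vote in `merge_schemas` is only one possible merging rule. The claim itself specifies only that the predetermined rules comprise selecting the three methods with the highest numeric indicator.

```python
from typing import Callable, Dict, List

Schema = Dict[str, str]                      # field name -> type name
SchemaMethod = Callable[[List[dict]], Schema]


# Hypothetical schema-generation methods (illustrative only).
def infer_from_first(records: List[dict]) -> Schema:
    return {k: type(v).__name__ for k, v in records[0].items()}


def infer_from_last(records: List[dict]) -> Schema:
    return {k: type(v).__name__ for k, v in records[-1].items()}


def infer_numeric_as_float(records: List[dict]) -> Schema:
    return {
        k: "float" if isinstance(v, (int, float)) else type(v).__name__
        for r in records
        for k, v in r.items()
    }


def infer_all_strings(records: List[dict]) -> Schema:
    return {k: "str" for r in records for k in r}


METHODS: Dict[str, SchemaMethod] = {
    "first": infer_from_first,
    "last": infer_from_last,
    "numeric_as_float": infer_numeric_as_float,
    "all_strings": infer_all_strings,
}


def select_methods(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k method names with the highest accuracy indicator."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def merge_schemas(candidates: List[Schema]) -> Schema:
    """Assumed merge rule: per-field majority vote over the candidates.

    Ties go to the earliest candidate, i.e. the highest-ranked method.
    """
    master: Schema = {}
    for schema in candidates:
        for field in schema:
            if field in master:
                continue
            votes = [c[field] for c in candidates if field in c]
            master[field] = max(votes, key=votes.count)
    return master


records = [{"id": 1, "price": 9.5}, {"id": 2, "price": 10, "note": "x"}]

# Stand-in for the regression model output: one score per method.
predicted_accuracy = {
    "first": 0.91,
    "last": 0.84,
    "numeric_as_float": 0.88,
    "all_strings": 0.40,
}

selected = select_methods(predicted_accuracy)            # step 1
candidates = [METHODS[name](records) for name in selected]  # step 2
master = merge_schemas(candidates)                       # step 3
print(selected, master)
```

In this sketch the lowest-scoring method is dropped before any schema is generated, so the merge only ever sees candidates from the three highest-ranked methods.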