CPC G06F 40/18 (2020.01) [G06F 9/5077 (2013.01); G06F 40/186 (2020.01); G06N 20/00 (2019.01)] | 20 Claims |
1. A method of spreadsheet data analysis utilizing machine learning, the method comprising:
processing a spreadsheet file comprising a formula algorithm to be applied to a dataset comprising two or more data entries, the formula algorithm outputting a dependent variable and accepting as inputs one or more independent variables,
wherein the formula algorithm comprises one or more spreadsheet formulas stored in a first set of one or more cells of the spreadsheet file, the one or more independent variables are referenced from the first set of one or more cells of the spreadsheet file, and the dependent variable is output in a cell of the spreadsheet file,
generating from the formula algorithm an extrapolated algorithm expressed in a programming language that is at least one of a query language, an interpreted programming language, and a functional programming language,
wherein each of the one or more spreadsheet formulas is equivalent to one or more functions of the programming language and each of the one or more independent variables defines a declared variable of at least one of the one or more functions of the programming language;
running an automatic machine learning process to automatically apply one or more predictive models to the dataset;
determining that a predictive model of the one or more predictive models fits the dataset; and
modifying the extrapolated algorithm in response to an application of the one or more predictive models to the dataset to result in a modified extrapolated algorithm.
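The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: every function name here (`formula_to_source`, `compile_source`, `fit_constant`, `fit_linear`, `select_model`) is hypothetical, the regex translation assumes formulas made only of arithmetic on cell references, and choosing between two candidate models by mean squared error is an assumed stand-in for the "automatic machine learning process", which the claim does not specify.

```python
import re
from statistics import mean

# Matches A1-style cell references such as A2 or BC17 (assumption: no ranges, no sheet names).
CELL_REF = re.compile(r"\b([A-Z]+)(\d+)\b")


def formula_to_source(formula, name="extrapolated"):
    """Translate a simple arithmetic spreadsheet formula into Python source text."""
    body = formula.lstrip("=")
    params = []
    # Each distinct cell reference becomes a declared parameter of the function.
    for col, row in CELL_REF.findall(body):
        ref = f"{col}{row}".lower()
        if ref not in params:
            params.append(ref)
    expr = CELL_REF.sub(lambda m: m.group(0).lower(), body)
    return f"def {name}({', '.join(params)}):\n    return {expr}\n"


def compile_source(source, name="extrapolated"):
    """Execute generated source and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def fit_constant(xs, ys, var):
    """Candidate model 1: always predict the mean of the observed outputs."""
    c = mean(ys)
    return f"{c!r}", (lambda x: c)


def fit_linear(xs, ys, var):
    """Candidate model 2: ordinary least squares with one independent variable."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return f"{slope!r}*{var}+{intercept!r}", (lambda x: slope * x + intercept)


def select_model(xs, ys, var):
    """Fit every candidate and keep the one with the lowest mean squared error."""
    candidates = [fit(xs, ys, var) for fit in (fit_constant, fit_linear)]

    def mse(predict):
        return mean([(predict(x) - y) ** 2 for x, y in zip(xs, ys)])

    return min(candidates, key=lambda c: mse(c[1]))


# Formula algorithm as it might be stored in a spreadsheet cell (illustrative only).
source = formula_to_source("=A2*2+1")
original = compile_source(source)

# Dataset of (independent, dependent) observations; values are made up for the sketch.
xs, ys = [1, 2, 3, 4], [5, 8, 11, 14]
best_expr, _ = select_model(xs, ys, "a2")

# Modify the extrapolated algorithm by swapping in the fitted expression.
modified_source = f"def modified(a2):\n    return {best_expr}\n"
modified = compile_source(modified_source, name="modified")
```

Generating source text rather than a closure keeps the extrapolated algorithm inspectable as code in the target language, which is closer to what the claim describes; a production translator would need a real formula parser in place of the regex.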