CPC G06N 20/00 (2019.01) | 29 Claims |
1. A computer-program product comprising a non-transitory machine-readable storage medium storing computer instructions that, when executed by one or more processors, perform operations comprising:
obtaining a raw dataset comprising a plurality of data samples that store historical values of a target entity that includes an energy commodity, healthcare data management, retail inventory, an energy market, an energy consumption, energy utilities, electricity grid management, or energy;
executing an outlier filtration process based on obtaining the raw dataset, wherein the outlier filtration process includes:
detecting, by a quantile-based outlier filtration algorithm, outlier data samples of the plurality of data samples that exceed a lower quantile threshold or an upper quantile threshold,
generating an intermediate outlier-reduced dataset that includes a subset of the plurality of data samples, wherein the intermediate outlier-reduced dataset excludes the outlier data samples that exceed the lower quantile threshold or the upper quantile threshold,
decomposing, by a matrix decomposition algorithm, the intermediate outlier-reduced dataset into a transformed features matrix and a sparse matrix, wherein the transformed features matrix includes a plurality of feature vectors of a plurality of principal components of the intermediate outlier-reduced dataset; and
generating a refined outlier-reduced dataset that includes a subset of the plurality of feature vectors, wherein the refined outlier-reduced dataset excludes feature vectors of the transformed features matrix that are associated with an anomalous value in the sparse matrix;
training a model using the refined outlier-reduced dataset;
maintaining risk mitigation preparedness by predicting via the trained model a value of the target entity that includes predicting for a future time a demand of the energy commodity, the healthcare data management, the retail inventory, the energy market, the energy consumption, the energy utilities, the electricity grid management, or the energy; and
predicting, via the trained model, the value of the target entity at the future time.
|
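The filtration, training, and prediction steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not state: each row of `X` is one data sample, `y` holds the target value aligned to each row, the quantile thresholds are 5% and 95%, the matrix decomposition is approximated by alternating a truncated SVD with hard-thresholding of the residual (a simple robust-PCA-style stand-in), and the trained model is an ordinary least-squares fit. All function names and parameters (`quantile_mask`, `robust_decompose`, `fit_linear`, `predict`, `run_pipeline`, `rank`, `threshold`) are hypothetical.

```python
import numpy as np


def quantile_mask(X, lower_q=0.05, upper_q=0.95):
    """True for rows whose values all lie inside the per-column quantile band."""
    lo = np.quantile(X, lower_q, axis=0)
    hi = np.quantile(X, upper_q, axis=0)
    return np.all((X >= lo) & (X <= hi), axis=1)


def robust_decompose(X, rank=1, threshold=6.0, n_iter=10):
    """Split X into principal-component scores and a sparse residual matrix S.

    Assumed stand-in for the claimed matrix decomposition: alternate a
    truncated SVD (low-rank part) with hard-thresholding of the residual.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    S = np.zeros_like(Xc)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Xc - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]          # low-rank estimate
        R = Xc - L                                        # residual
        scale = 1.4826 * np.median(np.abs(R)) + 1e-12     # robust (MAD) scale
        S = np.where(np.abs(R) > threshold * scale, R, 0.0)
    components = Vt[:rank]                                # principal directions
    scores = Xc @ components.T                            # transformed features
    return scores, S, mean, components


def fit_linear(scores, y):
    """Least-squares fit of y on the scores plus an intercept (assumed model)."""
    A = np.column_stack([scores, np.ones(len(scores))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(x_new, mean, components, coef):
    """Project new samples onto the components and apply the fitted model."""
    z = (np.atleast_2d(x_new) - mean) @ components.T
    return np.column_stack([z, np.ones(len(z))]) @ coef


def run_pipeline(X, y, rank=1):
    """Quantile filter, decomposition-based refinement, then training."""
    keep1 = quantile_mask(X)                   # intermediate outlier-reduced set
    X1, y1 = X[keep1], y[keep1]
    scores, S, mean, components = robust_decompose(X1, rank=rank)
    keep2 = ~np.any(S != 0.0, axis=1)          # drop rows flagged in S
    coef = fit_linear(scores[keep2], y1[keep2])
    return mean, components, coef
```

In this sketch, `run_pipeline(X, y)` returns the column means, the principal directions, and the fitted coefficients, which `predict` then applies to a sample for a future time. Returning boolean masks rather than filtered arrays keeps `X` and `y` aligned across both filtration stages.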