US 12,450,311 B2
Techniques for determining cross-validation parameters for time series forecasting
Ankit Kumar Aggarwal, Mumbai (IN); Anku Kumar Pandey, New Delhi (IN); Ravijeet Ranjit Kumar, Bangalore (IN); and Samik Raychaudhuri, Bangalore (IN)
Assigned to ORACLE INTERNATIONAL CORPORATION, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Mar. 14, 2022, as Appl. No. 17/694,323.
Claims priority of provisional application 63/254,957, filed on Oct. 12, 2021.
Prior Publication US 2023/0113287 A1, Apr. 13, 2023
Int. Cl. G06F 18/21 (2023.01); G06F 18/20 (2023.01); G06N 20/20 (2019.01)
CPC G06F 18/217 (2023.01) [G06F 18/285 (2023.01); G06N 20/20 (2019.01)]
20 Claims
 
1. A method comprising:
identifying, by a computing system, a set of one or more cross-validation parameters to be used for cross-validating a model to be used for generating a requested forecast, wherein the requested forecast includes a time series dataset and a forecast horizon identifying a number of time steps for which a forecast is to be made using the time series dataset;
identifying, by the computing system, an objective function to be minimized for determining optimal values for the set of one or more cross-validation parameters;
identifying, by the computing system, a set of constraints for one or more cross-validation parameters from the set of cross-validation parameters, wherein the objective function is represented as a set of penalty terms, and wherein:
a first penalty term in the set of penalty terms represents a cost of violation of a first constraint in the set of constraints on a first cross-validation parameter in the set of cross-validation parameters, wherein the first cross-validation parameter represents a left most fold cross-validation parameter for cross-validating the model; and
a second penalty term in the set of penalty terms represents a cost of violation of a second constraint in the set of constraints on a second cross-validation parameter in the set of cross-validation parameters, wherein the second cross-validation parameter represents a gap between the folds cross-validation parameter for cross-validating the model;
using, by the computing system, an optimization technique to determine the optimal values for the set of cross-validation parameters, wherein the optimal values for the set of cross-validation parameters are determined by:
determining one or more combinations of values to be assigned to the set of cross-validation parameters, wherein the set of cross-validation parameters comprises the left most fold cross-validation parameter, the gap between the folds cross-validation parameter and a number of folds cross-validation parameter;
for each combination of values from the one or more combinations of values, computing a penalty value for the combination of values; and
determining the optimal values for the set of cross-validation parameters by selecting the combination of values from the one or more combinations of values that has the lowest penalty value; and
using, by the computing system, the optimal values determined for the set of cross-validation parameters to perform cross-validation of the model to be used for making the requested forecast.
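
The selection procedure recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not state the concrete constraints, the form of the penalty terms, or the search strategy, so everything below is an assumption made for illustration: the names `CVParams`, `penalty`, `fits`, `select_cv_params` and `fold_splits`, the reading of the left most fold as the index of the first validation origin, the reading of the gap as the number of time steps between consecutive origins, the two example constraints (a minimum training length and a gap at least as long as the forecast horizon), the weights, and the exhaustive grid search.

```python
from itertools import product
from typing import Iterable, List, NamedTuple, Tuple


class CVParams(NamedTuple):
    left_most_fold: int  # assumed meaning: index of the first validation origin
    gap: int             # assumed meaning: time steps between consecutive origins
    n_folds: int         # number of folds


def penalty(p: CVParams, horizon: int, min_train: int,
            w_left: float = 1.0, w_gap: float = 1.0) -> float:
    """Sum of two penalty terms, one per (hypothetical) constraint."""
    # Hypothetical first constraint: at least `min_train` points before the first fold.
    left_violation = max(0, min_train - p.left_most_fold)
    # Hypothetical second constraint: the gap is at least the forecast horizon.
    gap_violation = max(0, horizon - p.gap)
    return w_left * left_violation + w_gap * gap_violation


def fits(p: CVParams, n_obs: int, horizon: int) -> bool:
    """True if the last fold's validation window ends inside the series."""
    last_origin = p.left_most_fold + (p.n_folds - 1) * p.gap
    return last_origin + horizon <= n_obs


def select_cv_params(n_obs: int, horizon: int, min_train: int,
                     left_values: Iterable[int],
                     gap_values: Iterable[int],
                     fold_values: Iterable[int]) -> CVParams:
    """Enumerate combinations, score each one, return the lowest-penalty one."""
    candidates = [CVParams(*c) for c in product(left_values, gap_values, fold_values)]
    candidates = [c for c in candidates if fits(c, n_obs, horizon)]
    if not candidates:
        raise ValueError("no combination fits inside the time series")
    # min() returns the first candidate with the lowest penalty when there are ties.
    return min(candidates, key=lambda c: penalty(c, horizon, min_train))


def fold_splits(p: CVParams, horizon: int) -> List[Tuple[int, int]]:
    """(train_end, validation_end) index pairs for each fold, left to right."""
    origins = [p.left_most_fold + k * p.gap for k in range(p.n_folds)]
    return [(o, o + horizon) for o in origins]


best = select_cv_params(n_obs=100, horizon=10, min_train=50,
                        left_values=[30, 50], gap_values=[5, 10], fold_values=[3])
splits = fold_splits(best, horizon=10)
```

In this sketch each fold would train on observations before `train_end` and validate on the `horizon` observations that follow. An exhaustive grid is used only because it is the simplest way to show "compute a penalty for each combination and keep the lowest"; the claim's "optimization technique" may be something else entirely.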