US 11,657,194 B2
Experimental design for symbolic model discovery
Lior Horesh, North Salem, NY (US); Kenneth L. Clarkson, Madison, NJ (US); Cristina Cornelio, White Plains, NY (US); and Sara Magliacane, Peekskill, NY (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 22, 2020, as Appl. No. 16/855,085.
Prior Publication US 2021/0334432 A1, Oct. 28, 2021
Int. Cl. G06F 30/20 (2020.01); G06F 111/10 (2020.01)
CPC G06F 30/20 (2020.01) [G06F 2101/14 (2013.01); G06F 2111/10 (2020.01)]
19 Claims
 
1. A computer-implemented method using a symbolic regression model to generate accurate output predictions, the method comprising:
obtaining a set of input values, wherein said set of input values are a starting point for a symbolic regression model discovery process relating to a system under investigation;
determining a prediction for a set of output values for a given inquiry data point, functional form and parameterization for conducting one or more experiments relating to said system under investigation that validates obtaining a set of output values from said set of input values, wherein said one or more experiments are selected to provide maximum information gained among a plurality of experiments available;
determining a sequence of actions for performing said at least one experiment and selecting one or more instrumentation to perform said at least one experiment, wherein said instrumentation generates one or more input parameter inquiry values for a next sequence of experiments;
collecting data from said one or more experiments and using the collected data in performing discovery of a plurality of underlying symbolic regression models;
selecting an optimal symbolic regression model from the plurality of underlying symbolic regression models, wherein the selected optimal symbolic regression model minimizes complexity for a bounded misfit, or minimizes a misfit measure, subject to bounded complexity; and
determining a new data point by using said selected optimal symbolic regression model and updating a posterior distribution, given results and data collected from said one or more experiments relating to the system under investigation to provide informed assessment among a plurality of functional forms and parameterizations to generate said predicted accurate output values.
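
The model-selection step recited above (minimize complexity for a bounded misfit, or minimize misfit subject to bounded complexity) can be illustrated with a minimal sketch. This is not the patented implementation. The `Candidate` class, the node-count complexity measure, the mean-squared-error misfit, and the toy data are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one candidate symbolic model."""
    name: str
    predict: Callable[[float], float]
    complexity: int  # assumed measure: node count of the expression tree


def misfit(model: Candidate, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Assumed misfit measure: mean squared error on the collected data."""
    return sum((model.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_min_complexity(cands, xs, ys, max_misfit: float) -> Optional[Candidate]:
    """Simplest model whose misfit stays within the bound (ties: lower misfit)."""
    feasible = [c for c in cands if misfit(c, xs, ys) <= max_misfit]
    return min(feasible, key=lambda c: (c.complexity, misfit(c, xs, ys)), default=None)


def select_min_misfit(cands, xs, ys, max_complexity: int) -> Optional[Candidate]:
    """Best-fitting model whose complexity stays within the bound (ties: simpler)."""
    feasible = [c for c in cands if c.complexity <= max_complexity]
    return min(feasible, key=lambda c: (misfit(c, xs, ys), c.complexity), default=None)


# Toy experimental data generated by y = x^2 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]

candidates = [
    Candidate("c", lambda x: 3.5, 1),
    Candidate("3*x", lambda x: 3.0 * x, 3),
    Candidate("x*x", lambda x: x * x, 3),
    Candidate("x*x + 0*x", lambda x: x * x + 0.0 * x, 7),
]

by_complexity = select_min_complexity(candidates, xs, ys, max_misfit=0.5)
by_misfit = select_min_misfit(candidates, xs, ys, max_complexity=1)
```

In this sketch the two criteria can select different models: with a misfit bound of 0.5 the simplest admissible candidate is `x*x`, while a complexity bound of 1 leaves only the constant model. Both functions return `None` when no candidate satisfies the bound.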