CPC G05B 13/048 (2013.01) [C12M 41/48 (2013.01); G05B 13/0265 (2013.01)] | 16 Claims |
1. An apparatus comprising:
a setting unit for setting an operation content for a manufacturing system configured to manufacture an object to be manufactured, wherein a state parameter set indicates a state of at least one of the manufacturing system or the object to be manufactured and wherein the state parameter set is referred to as a posterior state parameter set when acquired at a time point after the operation content is set by the setting unit,
a first acquisition unit for acquiring the posterior state parameter set indicating the state of at least one of the manufacturing system or the object to be manufactured after the operation content is set, and
a learning processing unit for executing, by using learning data including the operation content and the posterior state parameter set, a learning process of a control model of the manufacturing system configured to output the operation content that increases a reward value determined by a preset reward function in response to input of the state parameter set indicating the state of at least one of the manufacturing system or the object to be manufactured,
wherein the learning processing unit is configured to, in a case where an increase width of the reward value according to a setting result of one said operation content outputted in response to one said state parameter set being input to the control model is less than a reference width, execute the learning process of the control model not to output the one operation content in response to input of the one state parameter set.
|
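The conditional learning step recited in the final wherein clause of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The tabular `ControlModel` class, its `output` and `learn` methods, the `suppressed` set, and the temperature-based `reward` function are all hypothetical names introduced here for illustration. The sketch also assumes that the "increase width" is the reward of the posterior state parameter set minus the reward of the input state parameter set, which the claim does not state explicitly.

```python
from collections import defaultdict


class ControlModel:
    """Toy tabular stand-in for the control model (hypothetical, for illustration only)."""

    def __init__(self, operations):
        self.operations = list(operations)
        # score[state][operation]: learned preference; higher is preferred.
        self.score = defaultdict(lambda: {op: 0.0 for op in self.operations})
        # (state, operation) pairs the model has been trained not to output.
        self.suppressed = set()

    def output(self, state):
        """Return the preferred operation content for a state parameter set."""
        candidates = [op for op in self.operations
                      if (state, op) not in self.suppressed]
        if not candidates:
            return None
        scores = self.score[state]
        return max(candidates, key=lambda op: scores[op])

    def learn(self, state, operation, reward_before, reward_after, reference_width):
        """One learning step from an (operation content, posterior state) sample."""
        # Assumption: increase width = reward after the setting minus reward before it.
        increase = reward_after - reward_before
        if increase < reference_width:
            # Increase is below the reference width: train the model not to
            # output this operation content for this state parameter set.
            self.suppressed.add((state, operation))
        else:
            # Otherwise reinforce the operation in proportion to the increase.
            self.score[state][operation] += increase
        return increase


def reward(state):
    """Hypothetical preset reward function: closer to 37.0 degrees is better."""
    (temperature,) = state
    return -abs(temperature - 37.0)


model = ControlModel(["heat", "hold", "cool"])
state = (35.0,)                      # input state parameter set
operation = model.output(state)      # operation content set by the setting unit
posterior = (35.05,)                 # posterior state parameter set (made-up value)
model.learn(state, operation, reward(state), reward(posterior), reference_width=0.5)
```

In this run the reward rises by only about 0.05, which is below the reference width of 0.5, so the model no longer outputs the same operation content when the same state parameter set is input again. A real control model would more likely be a function approximator trained with a penalty term, and a hard suppression set is used here only to make the effect easy to see.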