| CPC G05B 19/4155 (2013.01) [G05B 19/4065 (2013.01); G05B 2219/37346 (2013.01); G05B 2219/40499 (2013.01); G05B 2219/50308 (2013.01)] | 10 Claims |

|
1. A machine learning device configured to perform machine learning with respect to a numerical control device configured to cause a machine tool to operate based on a machining program, the machine learning device comprising:
a memory that stores a program; and
a processor configured to execute the program and control the machine learning device to:
acquire, as the numerical control device executes the machining program set with at least a cutting amount for one time and a cutting rate and causes the machine tool to perform cutting work, state information including the cutting amount for one time and the cutting rate;
output action information including adjustment information for the cutting amount for one time and the cutting rate included in the state information;
acquire determination information that is information regarding at least a magnitude of pressure applied to a tool during the cutting work, a shape of a waveform of the pressure applied to the tool, and a period of time taken for the cutting work, and, based on the determination information that has been acquired, output a reward value used in reinforcement learning depending on a predetermined condition; and
update a value function based on the reward value, the state information, and the action information,
wherein the predetermined condition is either a condition for prioritizing machining time or a condition for prioritizing lifetime of the tool,
the processor outputs a first reward value under the condition for prioritizing machining time and outputs a second reward value under the condition for prioritizing the lifetime of the tool, and
the processor updates a first value function based on the first reward value, the state information, and the action information under the condition for prioritizing machining time and updates a second value function based on the second reward value, the state information, and the action information under the condition for prioritizing the lifetime of the tool.
|
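For illustration only, the arrangement recited in claim 1 (one reward value and one value function per predetermined condition) can be sketched as tabular Q-learning. The claim does not specify the learning algorithm, the reward formulas, or any names; everything below is a hypothetical assumption: the `State`, `Determination`, and `DualValueLearner` names, the condition labels `"time"` and `"tool_life"`, the `ACTIONS` adjustment steps, the +1/0/-1 reward scheme, the reduction of the waveform shape to a single boolean, and the learning-rate and discount values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """State information: cutting amount for one time and cutting rate."""
    cutting_amount: float
    cutting_rate: float


@dataclass(frozen=True)
class Determination:
    """Determination information (hypothetical encoding)."""
    peak_pressure: float   # magnitude of pressure applied to the tool
    waveform_ok: bool      # assumed one-bit summary of the waveform shape
    machining_time: float  # period of time taken for the cutting work


# Action information: assumed discrete adjustments (d_amount, d_rate).
ACTIONS = ((-0.1, 0.0), (0.1, 0.0), (0.0, -10.0), (0.0, 10.0), (0.0, 0.0))


def reward(condition, prev, cur):
    """Return a reward value that depends on the predetermined condition.

    The +1 / 0 / -1 scheme is an assumption, not taken from the claim.
    """
    if condition == "time":
        if cur.machining_time < prev.machining_time:
            return 1.0
        if cur.machining_time > prev.machining_time:
            return -1.0
        return 0.0
    if condition == "tool_life":
        if not cur.waveform_ok or cur.peak_pressure > prev.peak_pressure:
            return -1.0
        if cur.peak_pressure < prev.peak_pressure:
            return 1.0
        return 0.0
    raise ValueError(f"unknown condition: {condition!r}")


class DualValueLearner:
    """Keeps a separate value function (Q-table) for each condition."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = {"time": defaultdict(float), "tool_life": defaultdict(float)}

    def update(self, condition, state, action, r, next_state):
        """One Q-learning step on the value function for `condition` only."""
        q = self.q[condition]
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        td_error = r + self.gamma * best_next - q[(state, action)]
        q[(state, action)] += self.alpha * td_error


# Usage: one observed transition, evaluated under both conditions.
learner = DualValueLearner()
s = State(cutting_amount=1.0, cutting_rate=100.0)
a = ACTIONS[1]  # increase the cutting amount for one time
s2 = State(s.cutting_amount + a[0], s.cutting_rate + a[1])
prev = Determination(peak_pressure=50.0, waveform_ok=True, machining_time=12.0)
cur = Determination(peak_pressure=55.0, waveform_ok=True, machining_time=10.0)

r_time = reward("time", prev, cur)       # first reward value
r_life = reward("tool_life", prev, cur)  # second reward value
learner.update("time", s, a, r_time, s2)       # first value function
learner.update("tool_life", s, a, r_life, s2)  # second value function
```

In this sketch the same transition (a deeper cut that shortens the machining time but raises the peak pressure) moves the two value functions in opposite directions, which is why the two tables are kept separate rather than merged into a single weighted reward.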