US 11,748,611 B2
Method and apparatus for reinforcement learning training sessions with consideration of resource costing and resource utilization
Sumit Sanyal, Santa Cruz, CA (US); Anil Hebbar, Santa Cruz, CA (US); Abdul Puliyadan Kunnil Muneer, Bangalore (IN); Abhinav Kaushik, Bangalore (IN); Bharat Kumar Padi, Bangalore (IN); Jeroen Bédorf, Heerhugowaard (NL); and Tijmen Tieleman, Diemen (NL)
Filed by Sumit Sanyal, Santa Cruz, CA (US); Anil Hebbar, Santa Cruz, CA (US); Abdul Puliyadan Kunnil Muneer, Bangalore (IN); Abhinav Kaushik, Bangalore (IN); Bharat Kumar Padi, Bangalore (IN); Jeroen Bédorf, Heerhugowaard (NL); and Tijmen Tieleman, Diemen (NL)
Filed on Feb. 18, 2019, as Appl. No. 16/278,699.
Prior Publication US 2020/0265302 A1, Aug. 20, 2020
Int. Cl. G06N 3/08 (2023.01); G06N 5/022 (2023.01); G06Q 10/0631 (2023.01); G06N 3/045 (2023.01)

CPC G06N 3/08 (2013.01) [G06N 3/045 (2023.01); G06N 5/022 (2013.01); G06Q 10/06313 (2013.01); G06Q 2220/18 (2013.01)]

22 Claims

1. A computer implemented nesting reinforcement machine learning method applied to improve a cost effectiveness of a nested software-encoded reinforcement learning program, the method comprising:

(a.) a nesting reinforcement machine learning software-encoded program performing a plurality of cycles of execution (“executions”) of the nested software-encoded reinforcement learning program;

(b.) the nesting reinforcement machine learning software-encoded program evaluating variations of at least one financial cost value monitored during the executions of the nested software-encoded reinforcement learning program; and

(c.) the nesting reinforcement machine learning software-encoded program adjusting at least one resource parameter value of the nested software-encoded reinforcement learning program at least partly in consideration of the variations of the at least one financial cost value observed during the executions of the nested software-encoded reinforcement learning program.
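Steps (a.) through (c.) of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the nesting program is a simple epsilon-greedy bandit, which is one elementary form of reinforcement learning, and that the adjusted resource parameter is a worker count. Every name in it (`run_nested_execution`, `nesting_loop`, `price_per_worker_hour`, `cost_weight`) is hypothetical, as are the toy score and cost models.

```python
import random


def run_nested_execution(num_workers, price_per_worker_hour=0.5):
    """Hypothetical stand-in for one execution of the nested RL program.

    Returns (training_score, financial_cost). Both models are assumptions:
    the score has diminishing returns in the worker count, and the cost is
    linear in the worker count for a fixed one-hour run.
    """
    score = 10.0 * (1.0 - 0.5 ** num_workers)
    cost = num_workers * price_per_worker_hour * 1.0
    return score, cost


def nesting_loop(worker_options=(1, 2, 4, 8), cycles=200,
                 cost_weight=1.0, epsilon=0.1, seed=0):
    """Assumed nesting program: an epsilon-greedy bandit over worker counts."""
    rng = random.Random(seed)
    value = {w: 0.0 for w in worker_options}  # running mean of outer reward
    count = {w: 0 for w in worker_options}
    cost_history = []

    for cycle in range(cycles):
        # Step (c.): choose the resource parameter for the next execution.
        if cycle < len(worker_options):
            workers = worker_options[cycle]          # try each option once
        elif rng.random() < epsilon:
            workers = rng.choice(worker_options)     # occasional exploration
        else:
            workers = max(worker_options, key=lambda w: value[w])

        # Step (a.): run one execution of the nested program.
        score, cost = run_nested_execution(workers)

        # Step (b.): record the monitored cost and fold it into the reward.
        cost_history.append(cost)
        outer_reward = score - cost_weight * cost
        count[workers] += 1
        value[workers] += (outer_reward - value[workers]) / count[workers]

    best = max(worker_options, key=lambda w: value[w])
    return best, value, cost_history


if __name__ == "__main__":
    best, value, cost_history = nesting_loop()
    print("preferred worker count:", best)
    print("total simulated spend:", sum(cost_history))
```

Under these toy models the loop settles on 4 workers, because going from 4 to 8 workers doubles the simulated cost while adding little to the score. A real nesting program would replace `run_nested_execution` with an actual training run and a metered cost reading.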