US 11,775,817 B2
Reinforcement learning-based techniques for training a natural media agent
Jonathan Brandt, Santa Cruz, CA (US); Chen Fang, Sunnyvale, CA (US); Byungmoon Kim, Sunnyvale, CA (US); and Biao Jia, College Park, MD (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Aug. 23, 2019, as Appl. No. 16/549,072.
Prior Publication US 2021/0056408 A1, Feb. 25, 2021
Int. Cl. G06N 3/08 (2023.01); G09G 5/37 (2006.01); G06N 3/04 (2023.01)

CPC G06N 3/08 (2013.01) [G06N 3/04 (2013.01); G09G 5/37 (2013.01)]

20 Claims

1. One or more non-transitory computer readable media for training a natural media agent to implicitly learn a rendering policy in a multi-dimensional continuous action space from a set of training references, the one or more non-transitory computer readable media comprising instructions that, when executed by at least one processor of a reinforcement learning-based system, iteratively cause the system to:

direct a media rendering engine to perform at least one primitive graphic action on a canvas in a synthetic rendering environment, wherein the natural media agent is configured to apply the rendering policy to select the at least one primitive graphic action at each iteration based on a working observation of a current state of the system;

observe a visual state of the canvas and a position of a media rendering instrument within the synthetic rendering environment occurring as a result of performing the at least one primitive graphic action on the canvas;

apply a loss function to compute a reward based on a goal configuration and the visual state of the canvas occurring as a result of performing the at least one primitive graphic action, wherein the goal configuration comprises a current training reference of the set of training references; and

provide the reward to the natural media agent to learn the rendering policy by refining a policy function.
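
For illustration only, the iterative loop recited in claim 1 (act, observe, compute reward, refine policy) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the 8x8 grayscale canvas, the three-component action (dx, dy, pressure), the linear Gaussian policy, the mean-squared-error loss, the loss-decrease reward, and the REINFORCE-style update are all hypothetical choices made for the example, as are all function and variable names. In this toy version the policy conditions only on the instrument position, while the canvas state is observed to compute the reward.

```python
import random

H, W = 8, 8  # hypothetical canvas size (assumption, not from the claim)

def make_canvas():
    """Blank canvas: H rows of W pixel intensities in [0, 1]."""
    return [[0.0] * W for _ in range(H)]

def loss(canvas, reference):
    """Assumed loss: mean squared pixel difference to the training reference."""
    total = sum((c - r) ** 2
                for crow, rrow in zip(canvas, reference)
                for c, r in zip(crow, rrow))
    return total / (H * W)

def render(canvas, pos, action):
    """Toy rendering engine: move the instrument, then deposit pressure there."""
    dx, dy, pressure = action
    x = min(max(pos[0] + dx, 0.0), W - 1.0)   # keep instrument on the canvas
    y = min(max(pos[1] + dy, 0.0), H - 1.0)
    p = min(max(pressure, 0.0), 1.0)          # clamp pressure to [0, 1]
    row, col = int(round(y)), int(round(x))
    canvas[row][col] = min(1.0, canvas[row][col] + p)
    return (x, y)

def features(pos):
    """Observation features: normalized instrument position plus a bias term."""
    return [pos[0] / (W - 1), pos[1] / (H - 1), 1.0]

def policy_mean(theta, feats):
    """Linear policy: one row of weights per continuous action component."""
    return [sum(t * f for t, f in zip(row, feats)) for row in theta]

def train(reference, episodes=20, steps=16, lr=0.01, sigma=0.5, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * 3 for _ in range(3)]     # policy parameters
    history = []                              # total reward per episode
    for _ in range(episodes):
        canvas, pos = make_canvas(), (0.0, 0.0)
        prev = loss(canvas, reference)
        total = 0.0
        for _ in range(steps):
            feats = features(pos)
            mean = policy_mean(theta, feats)
            # Select a primitive action in the continuous action space.
            action = [rng.gauss(m, sigma) for m in mean]
            # Direct the rendering engine, then observe canvas and position.
            pos = render(canvas, pos, action)
            cur = loss(canvas, reference)
            # Reward: how much this action reduced the loss to the reference.
            reward = prev - cur
            prev = cur
            # Refine the policy function (REINFORCE-style step, fixed sigma).
            for i in range(3):
                grad_i = (action[i] - mean[i]) / sigma ** 2
                for j in range(3):
                    theta[i][j] += lr * reward * grad_i * feats[j]
            total += reward
        history.append(total)
    return theta, history

# Usage: a diagonal stroke serves as the single training reference.
reference = make_canvas()
for k in range(H):
    reference[k][k] = 1.0
theta, history = train(reference, episodes=5)
```

Using the per-step decrease in loss as the reward means the rewards within an episode sum to the overall improvement of the canvas relative to the reference; other reward shapings would fit the same loop.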