US 12,106,779 B1
System and method for semantically grounded video generation
Sasha Strelnikoff, Seattle, WA (US); Nicholas A. Ketz, Topanga, CA (US); and Praveen K. Pilly, West Hills, CA (US)
Assigned to HRL LABORATORIES, LLC, Malibu, CA (US)
Filed by HRL Laboratories, LLC, Malibu, CA (US)
Filed on Feb. 15, 2023, as Appl. No. 18/110,270.
Int. Cl. G11B 27/034 (2006.01); G06N 3/08 (2023.01); H04N 19/172 (2014.01); H04N 19/17 (2014.01)

CPC G11B 27/034 (2013.01) [G06N 3/08 (2013.01); H04N 19/172 (2014.11)]

21 Claims

1. A system for semantically grounded video generation, the system comprising:

one or more processors and associated memory, the memory being a non-transitory computer-readable medium having executable instructions encoded thereon, such that upon execution of the instructions, the one or more processors perform operations of:

receiving a raw video frame of a scene from one or more sensors on an autonomous platform;

encoding the raw video frame into a low-dimensional representation of the scene;

decoding the low-dimensional representation into a raw observation space;

decoding the low-dimensional representation into a corresponding semantic segmentation map for the scene;

feeding the low-dimensional representation into a controller model for the autonomous platform;

extracting semantic concepts in the low-dimensional representation that are related to an action selection by the controller model;

feeding the extracted semantic concepts into a world model to predict state and action dynamics of the autonomous platform;

feeding the raw observation space into discriminator networks that operate on frames and videos to determine between real and synthetically generated content;

modifying a generative capability of one or more encoders and decoders such that the discriminator networks are unable to distinguish between real and synthetically generated content; and

recursively generating semantically grounded videos using a conjunction of the world model and controller model.
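
The data flow recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: every name, array size, and weight matrix below is hypothetical, untrained random linear maps stand in for the encoder, decoders, controller model, world model, and frame discriminator, and the concept-extraction step is assumed to be a simple saliency mask over the latent vector. The video-level discriminator and the adversarial update of the encoders and decoders are omitted. The sketch only shows how the world model and controller model could be chained to generate frames and segmentation maps recursively.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
H, W, C = 8, 8, 3      # raw frame shape
LATENT = 16            # size of the low-dimensional representation
N_CLASSES = 4          # semantic classes in the segmentation map
N_ACTIONS = 3          # discrete actions available to the controller


def linear(n_in, n_out):
    """Random linear map standing in for a trained network (assumption)."""
    return rng.normal(scale=0.1, size=(n_in, n_out))


W_enc = linear(H * W * C, LATENT)
W_obs = linear(LATENT, H * W * C)
W_seg = linear(LATENT, H * W * N_CLASSES)
W_ctl = linear(LATENT, N_ACTIONS)
W_dyn = linear(LATENT + N_ACTIONS, LATENT)
W_disc = linear(H * W * C, 1)


def encode(frame):
    """Encode a raw frame into a low-dimensional representation."""
    return np.tanh(frame.reshape(-1) @ W_enc)


def decode_observation(z):
    """Decode the representation back into the raw observation space."""
    return (z @ W_obs).reshape(H, W, C)


def decode_segmentation(z):
    """Decode the representation into a per-pixel class map."""
    logits = (z @ W_seg).reshape(H, W, N_CLASSES)
    return logits.argmax(axis=-1)


def select_action(z):
    """Controller model: pick the highest-scoring action."""
    return int((z @ W_ctl).argmax())


def extract_concepts(z, action, k=4):
    """Assumed stand-in for concept extraction: keep the k latent
    dimensions that contribute most to the chosen action's score."""
    contribution = np.abs(z * W_ctl[:, action])
    mask = np.zeros_like(z)
    mask[np.argsort(contribution)[-k:]] = 1.0
    return z * mask


def world_model_step(concepts, action):
    """World model: predict the next latent state from concepts and action."""
    one_hot = np.eye(N_ACTIONS)[action]
    return np.tanh(np.concatenate([concepts, one_hot]) @ W_dyn)


def frame_discriminator(frame):
    """Return a score in (0, 1); higher means 'looks real' (untrained here)."""
    logit = (frame.reshape(-1) @ W_disc)[0]
    return 1.0 / (1.0 + np.exp(-logit))


def rollout(first_frame, steps):
    """Recursively generate frames and segmentation maps from one real frame."""
    z = encode(first_frame)
    frames, seg_maps, actions = [], [], []
    for _ in range(steps):
        frames.append(decode_observation(z))
        seg_maps.append(decode_segmentation(z))
        action = select_action(z)
        actions.append(action)
        z = world_model_step(extract_concepts(z, action), action)
    return np.stack(frames), np.stack(seg_maps), actions


first_frame = rng.random((H, W, C))  # placeholder for a sensor frame
video, seg_video, actions = rollout(first_frame, steps=5)
scores = [frame_discriminator(f) for f in video]
```

In a trained system the discriminator scores would presumably drive a loss that updates the encoder and decoders until generated frames are no longer distinguishable from real ones; here the scores are computed only to show where that signal would enter the loop.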