US 12,190,228 B2
Generating and executing context-specific neural network models based on target runtime parameters
Sek Meng Chai, Princeton, NJ (US); and Jagadeesh Kandasamy, Cupertino, CA (US)
Assigned to Latent AI, Inc., Menlo Park, CA (US)
Filed by Latent AI, Inc., Menlo Park, CA (US)
Filed on Apr. 22, 2021, as Appl. No. 17/237,569.
Application 17/237,569 is a continuation in part of application No. 17/016,908, filed on Sep. 10, 2020, granted, now 11,816,568.
Claims priority of provisional application 63/018,236, filed on Apr. 30, 2020.
Claims priority of provisional application 62/900,311, filed on Sep. 13, 2019.
Prior Publication US 2021/0241108 A1, Aug. 5, 2021
Int. Cl. G06N 3/063 (2023.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01); G06V 10/70 (2022.01); G06V 10/764 (2022.01)

CPC G06N 3/063 (2013.01) [G06N 3/045 (2023.01); G06N 3/08 (2013.01); G06V 10/764 (2022.01); G06V 10/768 (2022.01)]

29 Claims

1. A method for generating and executing a deep neural network (DNN) based on target runtime parameters, comprising:

receiving a trained original model and a set of target runtime parameters for the DNN, wherein the target runtime parameters are associated with one or more of the following for the DNN: desired operating conditions, desired resource utilization, and desired accuracy of results;

generating a context-specific model based on the original model and the set of target runtime parameters;

generating an operational plan for executing both the original model and the context-specific model to meet requirements of the target runtime parameters;

controlling execution of the original model and the context-specific model based on the operational plan; and

deploying and executing the context-specific model at a location in a hierarchy of computing nodes, wherein the location is determined based on the target runtime parameters.
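
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: every class, function, field, and numeric constant below is a hypothetical stand-in chosen for illustration. In particular, model generation is mocked by uniform scaling (a real system might use quantization or pruning), the operational plan is reduced to a confidence threshold that switches between the two models, and the hierarchy of computing nodes is assumed to be an ordered list running from the edge to the cloud.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TargetRuntimeParams:
    """Hypothetical target runtime parameters (resource use and accuracy)."""
    max_latency_ms: float
    max_memory_mb: float
    min_accuracy: float


@dataclass(frozen=True)
class Model:
    """Hypothetical summary of a trained model's cost and quality."""
    name: str
    size_mb: float
    latency_ms: float
    accuracy: float


@dataclass(frozen=True)
class Node:
    """One level in an assumed hierarchy of computing nodes."""
    name: str
    memory_mb: float
    round_trip_ms: float


@dataclass(frozen=True)
class OperationalPlan:
    """Assumed plan: run `primary`, fall back to `fallback` on low confidence."""
    primary: Model
    fallback: Model
    confidence_threshold: float


def generate_context_specific_model(original: Model,
                                    params: TargetRuntimeParams) -> Model:
    # Toy stand-in for compression: shrink the model to the memory budget and
    # assume latency shrinks with it, at a made-up accuracy cost.
    scale = min(1.0, params.max_memory_mb / original.size_mb)
    return Model(
        name=original.name + "-ctx",
        size_mb=original.size_mb * scale,
        latency_ms=original.latency_ms * scale,
        accuracy=original.accuracy - 0.05 * (1.0 - scale),
    )


def generate_operational_plan(original: Model, context_model: Model,
                              confidence_threshold: float = 0.6) -> OperationalPlan:
    # The plan covers both models: the small one by default, the original as backup.
    return OperationalPlan(context_model, original, confidence_threshold)


def control_execution(plan: OperationalPlan, confidence: float) -> Model:
    # Pick which model handles the next input, based on the plan.
    if confidence >= plan.confidence_threshold:
        return plan.primary
    return plan.fallback


def choose_location(model: Model, params: TargetRuntimeParams,
                    hierarchy: Sequence[Node]) -> Optional[Node]:
    # Walk the hierarchy from edge to cloud; return the first node where the
    # model fits in memory and still meets the latency target.
    for node in hierarchy:
        fits = model.size_mb <= node.memory_mb
        fast = model.latency_ms + node.round_trip_ms <= params.max_latency_ms
        if fits and fast:
            return node
    return None
```

With a 400 MB original model, a 100 MB memory target, and a 40 ms latency target, this sketch would produce a quarter-size context-specific model and place it on the first node in the list that satisfies both limits, while the original model may fit nowhere within the latency budget. The point of the example is only the data flow among the claimed steps, not the particular heuristics.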