US 11,657,124 B2
Integrating binary inference engines and model data for efficiency of inference tasks
Peter Zatloukal, Seattle, WA (US); Matthew Weaver, Bellevue, WA (US); Alexander Kirchhoff, Seattle, WA (US); Dmitry Belenko, Redmond, WA (US); Ali Farhadi, Seattle, WA (US); Mohammad Rastegari, Bothell, WA (US); Andrew Luke Chronister, Seattle, WA (US); Keith Patrick Wyss, Seattle, WA (US); and Chenfan Sun, Redmond, WA (US)
Assigned to Apple Inc., Cupertino, CA (US)
Filed by Apple Inc., Cupertino, CA (US)
Filed on Dec. 10, 2018, as Appl. No. 16/215,540.
Prior Publication US 2020/0184037 A1, Jun. 11, 2020
Int. Cl. G06F 21/10 (2013.01); G06N 3/08 (2006.01); G06N 3/10 (2006.01); H04L 9/30 (2006.01); H04L 9/08 (2006.01); G06F 21/12 (2013.01)
CPC G06F 21/105 (2013.01) [G06F 21/12 (2013.01); G06N 3/08 (2013.01); G06N 3/10 (2013.01); H04L 9/0891 (2013.01); H04L 9/30 (2013.01); G06F 2221/0755 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method comprising, by a computing system:
receiving, from a client device associated with a user, a first user request;
accessing an instructional file in response to the first user request, wherein the instructional file comprises a binary inference engine and encrypted model data corresponding to the binary inference engine, the encrypted model data comprising at least one of: a weight used in the binary inference engine or a declared syntax used in the binary inference engine;
decrypting, by a decryption key, the encrypted model data corresponding to the binary inference engine;
executing the binary inference engine based on the first user request and using the decrypted model data; and
sending, to the client device responsive to the first user request, one or more execution results responsive to the execution of the binary inference engine.
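The claimed flow can be sketched in a few lines. This is an illustrative sketch only, not the patent's implementation: the patent does not specify a cipher, file format, or engine, so a toy XOR transform stands in for the real decryption, a JSON-encoded weight list stands in for the encrypted model data, and a dot product stands in for the binary inference engine. All names (`handle_request`, `xor_bytes`, `encrypted_model_data`) are hypothetical.

```python
import json

def xor_bytes(data: bytes, key: bytes) -> bytes:
    # Toy symmetric transform standing in for a real cipher (hypothetical).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def handle_request(instructional_file: dict, decryption_key: bytes,
                   user_request: list) -> float:
    # Access the instructional file: it bundles the inference engine with
    # encrypted model data (here, model weights).
    encrypted = instructional_file["encrypted_model_data"]
    # Decrypt the model data with the decryption key.
    weights = json.loads(xor_bytes(encrypted, decryption_key))
    # Execute the engine on the user request using the decrypted weights
    # (a dot product stands in for the binary inference engine).
    result = sum(w * x for w, x in zip(weights, user_request))
    # Return the execution result responsive to the request.
    return result

# Build a sample instructional file with "encrypted" weights.
key = b"secret"
weights = [0.5, -1.0, 2.0]
instructional_file = {
    "encrypted_model_data": xor_bytes(json.dumps(weights).encode(), key),
}

print(handle_request(instructional_file, key, [1.0, 2.0, 3.0]))  # 4.5
```

The point of the bundling in the claim is that the engine and its model data ship as one file, with the model data usable only after decryption; the sketch mirrors that by keeping the weights unreadable until `handle_request` applies the key.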