CPC: G06F 9/4856 (2013.01) [G06F 9/544 (2013.01); G06F 12/0835 (2013.01)]
20 Claims

1. An apparatus comprising:
one or more neural processors configured to perform neural network model tasks;
a command processor configured to distribute neural network model tasks to the one or more neural processors; and
a shared memory shared by the one or more neural processors,
wherein the command processor is configured to cause:
directly accessing a memory in a host system to read an object database for a neural network model and store the object database in the shared memory, wherein the object database includes one or more objects indicated by indices;
determining whether a command descriptor describing a current command is in a first format or in a second format, wherein the first format includes a source memory address pointing to a memory area in the shared memory having a binary code to be accessed according to a direct memory access (DMA) scheme, and the second format includes one or more object indices, a respective one of the one or more object indices indicating an object in the object database;
in response to a determination that the command descriptor describing the current command is in the second format, converting a format of the command descriptor from the second format to the first format;
generating one or more task descriptors describing neural network model tasks based on the command descriptor in the first format; and
distributing the one or more task descriptors to the one or more neural processors,
wherein, if a respective one of the one or more neural processors receives a task descriptor, the respective neural processor directly accesses the shared memory based on the received task descriptor to load a binary code and executes the loaded binary code.
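The descriptor handling recited above (an object database in shared memory keyed by indices, a second format carrying object indices, and conversion into a first format carrying shared-memory addresses suitable for DMA) can be made concrete with a short sketch. The names below, including ObjectEntry, CommandDescriptorFirst, CommandDescriptorSecond, and convert_to_first_format, are illustrative assumptions and not part of the claimed apparatus.

```cpp
// Illustrative sketch only: struct layouts and the conversion logic are assumptions
// chosen for exposition; the claim does not prescribe concrete data structures.
#include <cstdint>
#include <unordered_map>
#include <vector>

// One object in the object database held in shared memory: where its binary
// code lies in shared memory and how large it is.
struct ObjectEntry {
    uint64_t shared_mem_addr;  // start of the object's binary code in shared memory
    uint32_t size_bytes;       // length of the binary code
};

// Object database read from host memory via DMA and stored in shared memory,
// keyed by object index as recited in the claim.
using ObjectDatabase = std::unordered_map<uint32_t, ObjectEntry>;

// First format: carries a source memory address pointing into shared memory
// so the binary code can be fetched with a DMA transfer.
struct CommandDescriptorFirst {
    uint64_t source_addr;   // shared-memory address of the binary code
    uint32_t size_bytes;    // size of the binary code to transfer
};

// Second format: carries object indices instead of raw addresses.
struct CommandDescriptorSecond {
    std::vector<uint32_t> object_indices;
};

// Convert a second-format descriptor into first-format descriptors by
// resolving each object index against the object database.
std::vector<CommandDescriptorFirst> convert_to_first_format(
        const CommandDescriptorSecond& cmd, const ObjectDatabase& db) {
    std::vector<CommandDescriptorFirst> out;
    out.reserve(cmd.object_indices.size());
    for (uint32_t idx : cmd.object_indices) {
        const ObjectEntry& obj = db.at(idx);  // look up the object by its index
        out.push_back({obj.shared_mem_addr, obj.size_bytes});
    }
    return out;
}
```

One way to read the conversion step is as normalization: once every command is expressed in the address-based first format, the downstream task-descriptor generation never needs to consult the object database.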
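The remaining steps (generating task descriptors from first-format command descriptors, distributing them to the neural processors, and having each processor directly access shared memory to load and execute the binary code) might look like the following sketch. TaskDescriptor, NeuralProcessor, and distribute_tasks are hypothetical names, and the memcpy merely stands in for the direct shared-memory access the claim describes.

```cpp
// Illustrative sketch only: class and function names are hypothetical.
#include <cstdint>
#include <cstring>
#include <vector>

// First-format command descriptor (same shape as in the previous sketch).
struct CommandDescriptorFirst {
    uint64_t source_addr;   // shared-memory address of the binary code
    uint32_t size_bytes;    // size of the binary code
};

// A task descriptor handed to a neural processor; it tells the processor where
// in shared memory the binary code for its task resides.
struct TaskDescriptor {
    uint64_t code_addr;   // shared-memory address of the binary code
    uint32_t code_size;   // size of the binary code in bytes
};

// Minimal stand-in for a neural processor that loads and runs a binary
// fetched directly from shared memory.
class NeuralProcessor {
public:
    explicit NeuralProcessor(const uint8_t* shared_mem) : shared_mem_(shared_mem) {}

    // On receiving a task descriptor, the processor directly accesses shared
    // memory (modeled here as a plain memcpy standing in for the DMA read),
    // then executes the loaded binary code.
    void run(const TaskDescriptor& task) {
        std::vector<uint8_t> binary(task.code_size);
        std::memcpy(binary.data(), shared_mem_ + task.code_addr, task.code_size);
        execute(binary);
    }

private:
    void execute(const std::vector<uint8_t>& /*binary*/) {
        // Placeholder: a real neural processor would decode and run the binary.
    }
    const uint8_t* shared_mem_;
};

// Command-processor side: turn first-format command descriptors into task
// descriptors and distribute them round-robin across the neural processors.
void distribute_tasks(const std::vector<CommandDescriptorFirst>& cmds,
                      std::vector<NeuralProcessor>& processors) {
    for (size_t i = 0; i < cmds.size(); ++i) {
        TaskDescriptor task{cmds[i].source_addr, cmds[i].size_bytes};
        processors[i % processors.size()].run(task);
    }
}
```

The round-robin dispatch is only one possible policy; the claim requires distribution of task descriptors to the neural processors but does not fix a scheduling strategy.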