| CPC G06V 10/82 (2022.01) [G06V 10/28 (2022.01); G06V 2201/07 (2022.01)] | 19 Claims |

|
1. A deep neural network-based real-time inference apparatus including a cloud server configured to infer an acquired image along with an edge device in a split manner, the apparatus comprising:
a memory configured to store information of a second artificial intelligence model identical to a first artificial intelligence model of the edge device; and
a processor executing one or more instructions stored in the memory, wherein the instructions, when executed by the processor, cause the processor to receive a quantized feature of an output of a first layer corresponding to a predetermined split point among a plurality of layers included in the first artificial intelligence model, and determine a processing result for the image based on the second artificial intelligence model by inputting the quantized feature to a second layer of the second artificial intelligence model corresponding to a layer immediately after the first layer.
|
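
The split arrangement recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the toy fully connected network, the 8-bit min-max quantizer, and every name in it (`make_model`, `run_layers`, `quantize`, `dequantize`, `edge_infer`, `server_infer`, `SPLIT`) are assumptions introduced only for illustration. The edge device runs the layers up to the split point and sends a quantized feature; the server, holding an identical model, feeds that feature to the layer immediately after the split point.

```python
import random

SPLIT = 2  # hypothetical split point: layers[0:SPLIT] run on the edge device


def make_model(sizes, seed=0):
    """Build a toy MLP as a list of (weights, bias) layers; same seed gives an identical model."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((w, b))
    return layers


def run_layers(layers, x, start, stop):
    """Apply layers[start:stop] in order, with a ReLU after each layer."""
    for w, b in layers[start:stop]:
        x = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bi)
             for row, bi in zip(w, b)]
    return x


def quantize(feature, bits=8):
    """Min-max quantization (an assumed scheme) to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(feature), max(feature)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant feature
    codes = [round((v - lo) / scale) for v in feature]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale + lo for c in codes]


def edge_infer(model, image):
    """Edge side: run up to the first layer (layers[SPLIT - 1]) and quantize its output."""
    return quantize(run_layers(model, image, 0, SPLIT))


def server_infer(model, payload):
    """Server side: feed the received feature to the second layer (layers[SPLIT]) onward."""
    codes, scale, lo = payload
    feature = dequantize(codes, scale, lo)
    return run_layers(model, feature, SPLIT, len(model))


if __name__ == "__main__":
    edge_model = make_model([8, 16, 16, 4], seed=42)    # first model (edge device)
    server_model = make_model([8, 16, 16, 4], seed=42)  # identical second model (server)
    image = [i / 8 for i in range(8)]                   # stand-in for an acquired image
    print(server_infer(server_model, edge_infer(edge_model, image)))
```

In this sketch only the integer codes plus two scalars (`scale`, `lo`) cross the link between the edge device and the server, and the server's result differs from an unsplit forward pass only by the quantization error introduced at the split point.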