US 12,423,966 B2
Deep neural network-based real-time inference method, and cloud device and edge device performing deep neural network-based real-time inference method
Joo Chan Lee, Suwon-si (KR); and Jong Hwan Ko, Suwon-si (KR)
Assigned to Research & Business Foundation SUNGKYUNKWAN UNIVERSITY, Suwon-si (KR)
Filed by Research & Business Foundation SUNGKYUNKWAN UNIVERSITY, Suwon-si (KR)
Filed on May 31, 2023, as Appl. No. 18/203,695.
Claims priority of application No. 10-2022-0066704 (KR), filed on May 31, 2022.
Prior Publication US 2023/0386192 A1, Nov. 30, 2023
Int. Cl. G06V 10/82 (2022.01); G06V 10/28 (2022.01)

CPC G06V 10/82 (2022.01) [G06V 10/28 (2022.01); G06V 2201/07 (2022.01)]

19 Claims

1. A deep neural network-based real-time inference apparatus including a cloud server configured to infer an acquired image along with an edge device in a split manner, the apparatus comprising:

a memory configured to store information of a second artificial intelligence model identical to a first artificial intelligence model of the edge device; and

a processor executing one or more instructions stored in the memory, wherein the instructions, when executed by the processor, cause the processor to receive a quantized feature of an output of a first layer corresponding to a predetermined split point among a plurality of layers included in the first artificial intelligence model, and determine a processing result for the image based on the second artificial intelligence model by inputting the quantized feature to a second layer of the second artificial intelligence model corresponding to a layer immediately after the first layer.
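
The split arrangement recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the toy three-layer model, the function names (make_model, quantize, dequantize, edge_infer, cloud_infer), the 8-bit min/max quantization scheme, and the dequantization step on the cloud side are all assumptions introduced here for illustration. The claim itself only states that the cloud receives a quantized feature of the first layer's output and inputs it to the layer immediately after the split point of an identical model.

```python
from typing import Callable, List, Tuple

Layer = Callable[[List[float]], List[float]]
Payload = Tuple[List[int], float, float]  # (integer codes, scale, offset)


def make_model() -> List[Layer]:
    """Hypothetical toy model; edge and cloud each hold an identical copy."""
    return [
        lambda x: [2.0 * v + 1.0 for v in x],  # layer 0: affine
        lambda x: [max(0.0, v) for v in x],    # layer 1: ReLU
        lambda x: [sum(x)],                    # layer 2: sum pooling
    ]


def quantize(feature: List[float], bits: int = 8) -> Payload:
    """Assumed scheme: uniform min/max quantization to `bits`-bit codes."""
    lo, hi = min(feature), max(feature)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for a constant feature
    codes = [round((v - lo) / scale) for v in feature]
    return codes, scale, lo


def dequantize(codes: List[int], scale: float, lo: float) -> List[float]:
    """Map integer codes back to approximate floating-point values."""
    return [c * scale + lo for c in codes]


def edge_infer(model: List[Layer], image: List[float], split: int) -> Payload:
    """Edge side: run layers 0..split, then quantize the split-layer output."""
    x = image
    for layer in model[: split + 1]:
        x = layer(x)
    return quantize(x)


def cloud_infer(model: List[Layer], payload: Payload, split: int) -> List[float]:
    """Cloud side: feed the received feature to the layer right after the split."""
    x = dequantize(*payload)  # assumption: the cloud restores floats first
    for layer in model[split + 1:]:
        x = layer(x)
    return x


SPLIT = 1  # predetermined split point (index of the "first layer")
image = [-1.0, 0.0, 0.5, 2.0]
payload = edge_infer(make_model(), image, SPLIT)
result = cloud_infer(make_model(), payload, SPLIT)
print(payload[0], result)
```

In this sketch only the integer codes plus two scalars cross the network, which is the motivation for quantizing at the split point; the choice of split index and bit width here is arbitrary.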