US 12,218,801 B2
Adaptive deep learning inference apparatus and method in mobile edge computing
Ryang Soo Kim, Gwangju (KR); Geun Yong Kim, Gwangju (KR); Sung Chang Kim, Gwangju (KR); Hark Yoo, Gwangju (KR); Jae In Kim, Gwangju (KR); Chor Won Kim, Gwangju (KR); Hee Do Kim, Gwangju (KR); and Byung Hee Son, Gwangju (KR)
Assigned to Electronics and Telecommunications Research Institute, Daejeon (KR)
Filed by ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, Daejeon (KR)
Filed on Nov. 4, 2021, as Appl. No. 17/519,352.
Claims priority of application No. 10-2020-0147642 (KR), filed on Nov. 6, 2020; and application No. 10-2021-0055977 (KR), filed on Apr. 29, 2021.
Prior Publication US 2022/0150129 A1, May 12, 2022
Int. Cl. H04L 41/14 (2022.01); G06N 3/045 (2023.01); G06V 10/82 (2022.01); G06V 20/56 (2022.01); H04L 41/16 (2022.01); H04L 43/0864 (2022.01)
CPC H04L 41/145 (2013.01) [G06N 3/045 (2023.01); G06V 10/82 (2022.01); G06V 20/56 (2022.01); H04L 41/16 (2013.01); H04L 43/0864 (2013.01)] 17 Claims
OG exemplary drawing
 
1. An adaptive deep learning inference apparatus configured to operate in a mobile edge computing environment including terminal devices and a wireless access network, the adaptive deep learning inference apparatus comprising an edge computing server, the edge computing server being configured to implement a software program to:
in response to a terminal device of the terminal devices sensing data and requesting a deep learning inference service, adjust the inference computation time required to provide a deep learning inference result according to a change in latency of the wireless access network, in order to provide deep learning inference data of deterministic latency with the overall service latency being fixed,
wherein the software program includes
a data receiving unit configured to receive the sensed data, the sensed data being transmitted by the terminal device over the wireless access network in order to request the deep learning inference service;
a network latency measurement unit configured to measure or predict the data latency of data transmission between the terminal device and the edge computing server, and to calculate the round-trip network latency;
an adaptive deep learning inference unit configured to determine a deep learning model inference computation scheme capable of satisfying a required deterministic latency of the deep learning inference service in consideration of round-trip network latency calculated by the network latency measurement unit, and to perform a deep learning inference computation; and
a data processing result transmission unit configured to transmit a result value of the deep learning inference computation of the adaptive deep learning inference unit to the terminal device using the wireless access network.
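The adaptive scheme recited in the claim can be sketched as follows. The patent does not specify the selection mechanism; this is a minimal illustration assuming a hypothetical set of candidate model variants with profiled compute times (e.g., early-exit branches of one network), where the unit subtracts the measured round-trip network latency from the fixed service-latency budget and picks the most accurate variant that still fits:

```python
from dataclasses import dataclass

@dataclass
class ModelVariant:
    name: str
    compute_ms: float   # profiled inference time on the edge server
    accuracy: float     # validation accuracy of this variant

def select_variant(variants, deadline_ms, rtt_ms):
    """Pick the most accurate variant whose compute time fits inside
    the fixed service-latency budget after subtracting the measured
    round-trip network latency."""
    budget_ms = deadline_ms - rtt_ms
    feasible = [v for v in variants if v.compute_ms <= budget_ms]
    if not feasible:
        return None  # no variant can meet the deterministic deadline
    return max(feasible, key=lambda v: v.accuracy)

variants = [
    ModelVariant("resnet18-exit1", 5.0, 0.71),
    ModelVariant("resnet18-exit2", 12.0, 0.82),
    ModelVariant("resnet18-full", 30.0, 0.90),
]

# With a 50 ms fixed service latency and 15 ms measured RTT,
# 35 ms remain for computation, so the full model fits.
print(select_variant(variants, 50.0, 15.0).name)   # resnet18-full
# If the network degrades to a 42 ms RTT, only 8 ms remain.
print(select_variant(variants, 50.0, 42.0).name)   # resnet18-exit1
```

Under this reading, a rising network latency shrinks the computation budget and the unit falls back to a cheaper (lower-accuracy) inference scheme, keeping the end-to-end service latency deterministic rather than letting it drift with the network.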