Construction of an Efficient Divided/Distributed Neural Network Model Using Edge Computing

Ryuta SHINGAI  Yuria HIRAGA  Hisakazu FUKUOKA  Takamasa MITANI  Takashi NAKADA  Yasuhiko NAKASHIMA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E103-D   No.10   pp.2072-2082
Publication Date: 2020/10/01
Publicized: 2020/07/02
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2019EDP7326
Type of Manuscript: PAPER
Category: Fundamentals of Information Systems
Keyword: 
convolutional neural network,  edge computing,  distributed neural network,  video compression,  

Full Text: PDF>>
Buy this Article




Summary: 
Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for the inference process of the neural network is large, it is processed not by the data acquisition location like a surveillance camera but by the server with abundant computing power installed in the data center. Edge computing is getting considerable attention to solve this problem. However, edge computing can provide limited computation resources. Therefore, we assumed a divided/distributed neural network model using both the edge device and the server. By processing part of the convolution layer on edge, the amount of communication becomes smaller than that of the sensor data. In this paper, we have evaluated AlexNet and the other eight models on the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced the compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set FPS to 30 or faster and object recognition accuracy to 69.7% or higher. This value is determined based on that of an approximation model that binarizes the activation of Neural Network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through the comprehensive evaluation, we found that the optimal configurations of all nine models. For small models, such as AlexNet, processing entire models in the edge was the best. On the other hand, for huge models, such as VGG16, processing entire models in the server was the best. For medium-size models, the distributed models were good candidates. We confirmed that our model found the most energy efficient configuration while satisfying FPS and accuracy requirements, and the distributed models successfully reduced the energy consumption up to 48.6%, and 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.