Design and Implementation of Deep Neural Network for Edge Computing

Junyang ZHANG  Yang GUO  Xiao HU  Rongzhen LI  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.8   pp.1982-1996
Publication Date: 2018/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7044
Type of Manuscript: PAPER
Category: Fundamentals of Information Systems
Keywords: edge computing, vector processor, convolutional neural network, multi-core optimization

In recent years, deep learning based image recognition, speech recognition, text translation, and related applications have brought great convenience to people's lives. With the advent of the Internet of Everything, running computationally intensive deep learning algorithms on resource-limited edge devices has become a major challenge. For a vector processor oriented to edge computing, and for specific neural network models, we propose a new data layout in which the input feature maps are placed in DDR and the convolution kernel parameters are rearranged in the in-core memory banks. Because two-dimensional matrix convolution is difficult to parallelize, we propose parallelizing the matrix convolution along the third dimension. In addition, initializing the vector register that holds the max pooling result to zero fuses the rectified linear unit (ReLU) activation function with the pooling operation, which reduces repeated accesses to intermediate data. On the basis of the single-core implementation, a multi-core implementation scheme for the Inception structure is proposed. Finally, based on the proposed vectorization method, we implement five neural network models, namely AlexNet, VGG16, VGG19, GoogLeNet, and ResNet18, and present performance statistics and analysis for a CPU, a GTX 1080 Ti, and the FT2000. Experimental results show that the vector processor has a computing advantage over the CPU and GPU, and can run large-scale neural network models in real time.
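
The ReLU and max pooling fusion mentioned in the abstract can be illustrated with a minimal scalar sketch. It is not the paper's vectorized implementation: the function names, the 2 x 2 window with stride 2, and the sample feature map are assumptions chosen for illustration. The idea is that max(0, max(window)) equals max(ReLU(x) for x in window), so starting the running maximum at zero removes the need for a separate ReLU pass and its intermediate buffer.

```python
def relu_then_maxpool(fmap, k=2):
    """Reference path: ReLU into an intermediate buffer, then k x k max pooling (stride k)."""
    activated = [[max(v, 0.0) for v in row] for row in fmap]  # extra intermediate data
    rows, cols = len(fmap) // k, len(fmap[0]) // k
    return [
        [
            max(activated[r * k + i][c * k + j] for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def fused_relu_maxpool(fmap, k=2):
    """Fused path: the running maximum starts at zero, so ReLU is applied implicitly."""
    rows, cols = len(fmap) // k, len(fmap[0]) // k
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = 0.0  # stands in for the zero-initialised vector register
            for i in range(k):
                for j in range(k):
                    acc = max(acc, fmap[r * k + i][c * k + j])
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 4 x 4 feature map; the top-right window is entirely negative.
fmap = [
    [-1.0, 2.0, -3.0, -4.0],
    [0.5, -6.0, -7.0, -8.0],
    [9.0, -1.0, 0.0, 3.0],
    [-2.0, 4.0, 5.0, -5.0],
]
assert fused_relu_maxpool(fmap) == relu_then_maxpool(fmap)
print(fused_relu_maxpool(fmap))  # [[2.0, 0.0], [9.0, 5.0]]
```

The all-negative window yields 0.0 in both paths, which is the case where the zero initial value matters. On a vector processor the same comparison would presumably run across many output elements per instruction, but that mapping is not shown here.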