An FPGA Implementation for a Flexible-Length-Arithmetic Processor Employing the FDFM Processor Core Approach

Tatsuya KAWAMOTO  Xin ZHOU  Jacir L. BORDIM  Yasuaki ITO  Koji NAKANO  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E99-D   No.12   pp.2901-2910
Publication Date: 2016/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2016PAP0029
Type of Manuscript: Special Section PAPER (Special Section on Parallel and Distributed Computing and Networking)
Category: Architecture
Keyword: 
multiple-length-numbers,  multiple-length-arithmetic,  FPGA,  RSA,  montgomery modular multiplication,  

Full Text: PDF(667.9KB)>>
Buy this Article




Summary: 
Algorithms requiring fast manipulation of multiple-length numbers are usually implemented in hardware. However, hardware implementation, using HDL (Hardware Description Language) for instance, is a laborious task and the quality of the solution relies heavily on the designer expertise. The main contribution of this work is to present a flexible-length-arithmetic processor based on FDFM (Few DSP slices and Few Memory blocks) approach that supports arithmetic operations on multiple-length numbers using FPGAs (Field Programmable Gate Array). The proposed processor has been implement on the Xilinx Virtex-6 FPGA. Arithmetic instructions of the proposed processor architecture include addition, subtraction, and multiplication of integer numbers exceeding 64-bits. To reduce the burden of implementing algorithm directly on the FPGA, applications requiring multiple-length arithmetic operations are written in a C-like language and translated into a machine program. The machine program is then transferred and executed on the proposed architecture. A 2048-bit RSA encryption/decryption implementation has been used to assess the goodness of the proposed approach. Experimental results shows that the computing time, using the proposed architecture, of a 2048-bit RSA encryption takes only 2.2 times longer than a direct FPGA implementation. Furthermore, by employing multiple FDFM cores for the same task, the computing time reduces considerably.