Unified Dual-Radix Architecture for Scalable Montgomery Multiplications in GF(P) and GF(2n)

Kazuyuki TANIMURA  Ryuta NARA  Shunitsu KOHARA  Youhua SHI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E92-A   No.9   pp.2304-2317
Publication Date: 2009/09/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E92.A.2304
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: VLSI Design Technology and CAD
Keyword: 
elliptic curve cryptography,  dual-radix,  modular multiplication,  Montgomery multiplication,  scalability,  unified,  

Full Text: PDF(1.6MB)>>
Buy this Article




Summary: 
Modular multiplication is the most dominant arithmetic operation in elliptic curve cryptography (ECC), that is a type of public-key cryptography. Montgomery multiplier is commonly used to compute the modular multiplications and requires scalability because the bit length of operands varies depending on its security level. In addition, ECC is performed in GF(P) or GF(2n), and unified architecture for multipliers in GF(P) and GF(2n) is required. However, in previous works, changing frequency is necessary to deal with delay-time difference between GF(P) and GF(2n) multipliers because the critical path of the GF(P) multiplier is longer. This paper proposes unified dual-radix architecture for scalable Montgomery multiplications in GF(P) and GF(2n). This proposed architecture unifies four parallel radix-216 multipliers in GF(P) and a radix-264 multiplier in GF(2n) into a single unit. Applying lower radix to GF(P) multiplier shortens its critical path and makes it possible to compute the operands in the two fields using the same multiplier at the same frequency so that clock dividers to deal with the delay-time difference are not required. Moreover, parallel architecture in GF(P) reduces the clock cycles increased by dual-radix approach. Consequently, the proposed architecture achieves to compute a GF(P) 256-bit Montgomery multiplication in 0.28 µs. The implementation result shows that the area of the proposal is almost the same as that of previous works: 39 kgates.