
For FullText PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.

Unified DualRadix Architecture for Scalable Montgomery Multiplications in GF(P) and GF(2^{n})
Kazuyuki TANIMURA Ryuta NARA Shunitsu KOHARA Youhua SHI Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Vol.E92A
No.9
pp.23042317 Publication Date: 2009/09/01
Online ISSN: 17451337
DOI: 10.1587/transfun.E92.A.2304
Print ISSN: 09168508 Type of Manuscript: PAPER Category: VLSI Design Technology and CAD Keyword: elliptic curve cryptography, dualradix, modular multiplication, Montgomery multiplication, scalability, unified,
Full Text: PDF(1.6MB)>>
Summary:
Modular multiplication is the most dominant arithmetic operation in elliptic curve cryptography (ECC), that is a type of publickey cryptography. Montgomery multiplier is commonly used to compute the modular multiplications and requires scalability because the bit length of operands varies depending on its security level. In addition, ECC is performed in GF(P) or GF(2^{n}), and unified architecture for multipliers in GF(P) and GF(2^{n}) is required. However, in previous works, changing frequency is necessary to deal with delaytime difference between GF(P) and GF(2^{n}) multipliers because the critical path of the GF(P) multiplier is longer. This paper proposes unified dualradix architecture for scalable Montgomery multiplications in GF(P) and GF(2^{n}). This proposed architecture unifies four parallel radix2^{16} multipliers in GF(P) and a radix2^{64} multiplier in GF(2^{n}) into a single unit. Applying lower radix to GF(P) multiplier shortens its critical path and makes it possible to compute the operands in the two fields using the same multiplier at the same frequency so that clock dividers to deal with the delaytime difference are not required. Moreover, parallel architecture in GF(P) reduces the clock cycles increased by dualradix approach. Consequently, the proposed architecture achieves to compute a GF(P) 256bit Montgomery multiplication in 0.28 µs. The implementation result shows that the area of the proposal is almost the same as that of previous works: 39 kgates.

