Large-Scale Integrated Circuit Design Based on a Nb Nine-Layer Structure for Reconfigurable Data-Path Processors

Akira FUJIMAKI  Masamitsu TANAKA  Ryo KASAGI  Katsumi TAKAGI  Masakazu OKADA  Yuhi HAYAKAWA  Kensuke TAKATA  Hiroyuki AKAIKE  Nobuyuki YOSHIKAWA  Shuichi NAGASAWA  Kazuyoshi TAKAGI  Naofumi TAKAGI  

Publication
IEICE TRANSACTIONS on Electronics   Vol.E97-C   No.3   pp.157-165
Publication Date: 2014/03/01
Online ISSN: 1745-1353
DOI: 10.1587/transele.E97.C.157
Print ISSN: 0916-8516
Type of Manuscript: INVITED PAPER (Special Section on Leading-Edge Technology of Superconductor Large-Scale Integrated Circuits)
Category: 
Keyword: 
advanced process,  cell-based design technique,  high-end computing,  large-scale integration,  rapid single-flux-quantum circuits,  

Full Text: FreePDF(2.3MB)


Summary: 
We describe a large-scale integrated circuit (LSI) design of rapid single-flux-quantum (RSFQ) circuits and demonstrate several reconfigurable data-path (RDP) processor prototypes based on the ISTEC Advanced Process (ADP2). The ADP2 LSIs are made up of nine Nb layers and Nb/AlOx/Nb Josephson junctions with a critical current density of 10kA/cm2, allowing higher operating frequencies and integration. To realize truly large-scale RSFQ circuits, careful design is necessary, with several compromises in the device structure, logic gates, and interconnects, balancing the competing demands of integration density, design flexibility, and fabrication yield. We summarize numerical and experimental results related to the development of a cell-based design in the ADP2, which features a unit cell size reduced to 30-µm square and up to four strip line tracks in the unit cell underneath the logic gates. The ADP LSIs can achieve ∼10 times the device density and double the operating frequency with the same power consumption per junction as conventional LSIs fabricated using the Nb four-layer process. We report the design and test results of RDP processor prototypes using the ADP2 cell library. The RDP processors are composed of many arrays of floating-point units (FPUs) and switch networks, and serve as accelerators in a high-performance computing system. The prototypes are composed of two-dimensional arrays of several arithmetic logic units instead of FPUs. The experimental results include a successful demonstration of full operation and reconfiguration in a 2×2 RDP prototype made up of 11.5k junctions at 45GHz after precise timing design. Partial operation of a 4×4 RDP prototype made up of 28.5k-junctions is also demonstrated, indicating the scalability of our timing design.