Scalability Analysis of Deeply Pipelined Tsunami Simulation with Multiple FPGAs

Antoniette MONDIGO  Tomohiro UENO  Kentaro SANO  Hiroyuki TAKIZAWA  

IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.5   pp.1029-1036
Publication Date: 2019/05/01
Publicized: 2019/02/05
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018RCP0007
Type of Manuscript: Special Section PAPER (Special Section on Reconfigurable Systems)
Category: Applications
Keywords: tsunami simulation, stream computing, scalability, multiple FPGAs, high-performance computing


Because the hardware resources of a single FPGA are limited, one way to scale the performance of FPGA-based HPC applications is to expand the design space across multiple FPGAs. This paper presents a scalable architecture for a deeply pipelined stream computing platform and investigates the available parallelism and inter-FPGA link characteristics needed to achieve scaled performance. To make exploration of this vast design space practical, a performance model is presented and verified by evaluating a tsunami simulation application implemented on Intel Arria 10 FPGAs. Finally, a scalability analysis is performed in which the computing pipeline is extended over multiple FPGAs while the problem size is kept fixed, and speedup is achieved. Performance scales with the number of FPGAs; however, it degrades when the available bandwidth is insufficient or when an inadequate data stream size causes large pipeline overhead. The tsunami simulation results show that, for 8 cascaded Arria 10 FPGAs, the highest scaled performance is achieved with a single pipeline of 5 stream processing elements (SPEs), which attains 2.5 TFlops at a parallel efficiency of 98%, indicating the strong scalability of the multi-FPGA stream computing platform.
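
As a rough illustration of what the reported figures imply, the following sketch assumes the conventional strong-scaling definition of parallel efficiency (speedup divided by the number of devices). The abstract does not state which baseline the paper uses, so the function names and the derived single-FPGA figure below are assumptions for illustration, not values reported by the authors.

```python
# Minimal sketch, assuming efficiency = speedup / number of devices.
# Only N_FPGAS, EFFICIENCY, and SCALED_TFLOPS come from the abstract;
# everything derived from them is an implied estimate, not a reported result.

def parallel_efficiency(speedup: float, n_devices: int) -> float:
    """Strong-scaling efficiency: achieved speedup relative to ideal linear speedup."""
    return speedup / n_devices


def implied_speedup(efficiency: float, n_devices: int) -> float:
    """Invert the definition above to recover the speedup from an efficiency."""
    return efficiency * n_devices


N_FPGAS = 8          # cascaded Arria 10 FPGAs (from the abstract)
EFFICIENCY = 0.98    # parallel efficiency (from the abstract)
SCALED_TFLOPS = 2.5  # scaled performance on 8 FPGAs (from the abstract)

speedup = implied_speedup(EFFICIENCY, N_FPGAS)
# Hypothetical single-FPGA baseline under the assumed definition.
single_fpga_tflops = SCALED_TFLOPS / speedup

print(f"implied speedup over one FPGA: {speedup:.2f}x")
print(f"implied single-FPGA performance: {single_fpga_tflops:.3f} TFlops")
```

Under this assumed definition, 98% efficiency on 8 FPGAs corresponds to a speedup of about 7.84x, which would put the single-FPGA baseline at roughly 0.32 TFlops. If the paper measures efficiency against a different baseline, such as modeled peak performance, these derived numbers would not apply.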