Fast Computation with Efficient Object Data Distribution for Large-Scale Hologram Generation on a Multi-GPU Cluster

Takanobu BABA  Shinpei WATANABE  Boaz JESSIE JACKIN  Kanemitsu OOTSU  Takeshi OHKAWA  Takashi YOKOTA  Yoshio HAYASAKI  Toyohiko YATAGAI  

IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.7   pp.1310-1320
Publication Date: 2019/07/01
Publicized: 2019/03/29
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7346
Type of Manuscript: PAPER
Category: Human-computer Interaction
Keyword: computer generated holography, large-scale CGH, GPU cluster

The 3D holographic display has long been anticipated as a future human interface because it does not require users to wear special devices. However, its heavy computational requirements have prevented the realization of such displays. A recent study indicates that objects and holograms of several giga-pixels must be processed in real time to achieve high resolution and a wide viewing angle. To address this problem, we first adapted a conventional FFT algorithm to a GPU cluster environment in order to avoid heavy inter-node communication. We then applied several single-node and multi-node optimization and parallelization techniques. The single-node optimizations include a change in the method of object decomposition, a reduction of data transfer between the CPU and GPU, kernel integration, stream processing, and the use of multiple GPUs within a node. The multi-node optimizations include methods for distributing object data from the host node to the other nodes. Experimental results show that the intra-node (single-node) optimizations attain an 11.52-fold speed-up over the original single-node code. Furthermore, the multi-node optimizations, using 8 nodes with 2 GPUs per node, attain an execution time of 4.28 sec for generating a 1.6 giga-pixel hologram from a 3.2 giga-pixel object. This corresponds to a 237.92-fold speed-up over sequential processing on a CPU and a 41.78-fold speed-up over multi-threaded execution on a multicore CPU, using a conventional FFT-based algorithm.
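
To put the reported figures in perspective, the following minimal Python sketch works through the arithmetic they imply. It assumes that the 237.92-fold and 41.78-fold speed-ups are measured against the same 4.28 sec workload, so that baseline time = accelerated time × speed-up. The helper names and the even per-GPU split are hypothetical illustrations, not the distribution method of the paper, and the results are approximate because the published figures are rounded.

```python
# Figures quoted in the abstract.
MULTI_NODE_TIME_S = 4.28         # 8 nodes, 2 GPUs per node
SPEEDUP_VS_SEQUENTIAL = 237.92   # relative to sequential CPU processing
SPEEDUP_VS_MULTITHREAD = 41.78   # relative to multi-threaded multicore CPU
HOLOGRAM_GPIXELS = 1.6
OBJECT_GPIXELS = 3.2
NODES = 8
GPUS_PER_NODE = 2


def implied_baseline_time(accelerated_time_s: float, speedup: float) -> float:
    """Baseline time implied by speed-up = baseline time / accelerated time."""
    return accelerated_time_s * speedup


def even_share(total: float, parts: int) -> float:
    """Hypothetical even split; the actual distribution scheme may differ."""
    return total / parts


# Implied CPU execution times for the same workload (assumption, see above).
sequential_s = implied_baseline_time(MULTI_NODE_TIME_S, SPEEDUP_VS_SEQUENTIAL)
multithread_s = implied_baseline_time(MULTI_NODE_TIME_S, SPEEDUP_VS_MULTITHREAD)

# Hologram pixels produced per second on the 16-GPU configuration.
throughput_gpix_per_s = HOLOGRAM_GPIXELS / MULTI_NODE_TIME_S

# Object data per GPU if the object were divided evenly (illustrative only).
total_gpus = NODES * GPUS_PER_NODE
object_gpix_per_gpu = even_share(OBJECT_GPIXELS, total_gpus)

print(f"implied sequential CPU time:     {sequential_s:.1f} s")
print(f"implied multi-threaded CPU time: {multithread_s:.1f} s")
print(f"hologram throughput:             {throughput_gpix_per_s:.3f} Gpixel/s")
print(f"object data per GPU (even):      {object_gpix_per_gpu:.2f} Gpixel")
```

Under these assumptions, the sequential CPU run would take roughly 1018 sec (about 17 minutes) and the multi-threaded run roughly 179 sec (about 3 minutes), compared with 4.28 sec on the cluster.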