Efficient Parallel Join Processing Exploiting SIMD in Multi-Thread Environments

Gilseok HONG  Seonghyeon KANG  Chang soo KIM  Jun-Ki MIN  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.3   pp.659-667
Publication Date: 2018/03/01
Online ISSN: 1745-1361
Type of Manuscript: PAPER
Category: Data Engineering, Web Information Systems
Keyword: sort-merge join, SIMD, kernel density estimator, multi-thread

In this paper, we study parallel join processing to improve the performance of the merge phase of sort-merge join by integrating all the forms of parallelism provided by mainstream CPUs. Modern CPUs support SIMD instruction sets with wide SIMD registers, which allow multiple data items to be processed by a single instruction. We therefore devise an efficient parallel join algorithm called Parallel Merge Join with SIMD instructions (PMJS). The proposed algorithm utilizes data parallelism by exploiting SIMD instructions and further improves performance by avoiding conditional branch instructions. Furthermore, to take advantage of multiple cores, the proposed algorithm runs in multi-thread environments. To distribute the workload evenly across threads, we devise an efficient workload balancing algorithm based on the kernel density estimator, which allows the workload of each thread to be estimated accurately.
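
To illustrate what avoiding conditional branches in the merge phase can look like, the following is a minimal scalar sketch in C++. It is not the authors' PMJS algorithm and uses no SIMD instructions. The function name `branch_free_match_count` is hypothetical, and the sketch assumes both inputs are sorted in ascending order and contain unique 32-bit keys. Instead of an if/else chain that decides which cursor to advance, each comparison result is used directly as a 0/1 increment, so the only branch left is the loop condition.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts the keys that appear in both sorted inputs.
// Assumption: r and s are sorted ascending and contain no duplicate keys.
std::size_t branch_free_match_count(const std::vector<std::uint32_t>& r,
                                    const std::vector<std::uint32_t>& s) {
    std::size_t i = 0;        // cursor into r
    std::size_t j = 0;        // cursor into s
    std::size_t matches = 0;  // number of equal key pairs found so far

    while (i < r.size() && j < s.size()) {
        const std::uint32_t a = r[i];
        const std::uint32_t b = s[j];

        // Each comparison yields a bool that converts to 0 or 1, so the
        // cursors advance by arithmetic rather than by an if/else chain.
        matches += (a == b);  // count a match when the keys are equal
        i += (a <= b);        // advance r when its key is not the larger one
        j += (b <= a);        // advance s when its key is not the larger one
    }
    return matches;
}
```

For example, calling the function on the key sequences {1, 3, 5, 7} and {3, 4, 5, 8} counts the two shared keys 3 and 5. A SIMD variant would compare several keys per instruction instead of one pair at a time, and a multi-thread variant would run such a loop on a separate key range in each thread; both are outside the scope of this sketch.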