Efficient Parallel Join Processing Exploiting SIMD in Multi-Thread Environments

Gilseok HONG  Seonghyeon KANG  Chang soo KIM  Jun-Ki MIN  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.3   pp.659-667
Publication Date: 2018/03/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDP7300
Type of Manuscript: PAPER
Category: Data Engineering, Web Information Systems
Keyword: sort-merge join, SIMD, kernel density estimator, multi-thread


Summary: 
In this paper, we study parallel join processing to improve the performance of the merge phase of sort-merge join by integrating all forms of parallelism provided by mainstream CPUs. Modern CPUs support SIMD instruction sets with wider SIMD registers, which allow multiple data items to be processed per instruction. We therefore devise an efficient parallel join algorithm, called Parallel Merge Join with SIMD instructions (PMJS). The proposed algorithm exploits data parallelism through SIMD instructions and further improves performance by avoiding conditional branch instructions. Furthermore, to take advantage of multiple cores, the proposed algorithm is multi-threaded. To distribute the workload evenly among threads, we devise an efficient workload balancing algorithm based on a kernel density estimator, which allows the workload of each thread to be estimated accurately.
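
The two ideas named in the summary, a merge step without data-dependent branches and workload balancing from a kernel density estimate, can be illustrated with a minimal sketch. This is not the PMJS implementation, which the summary does not specify. The function names, the Gaussian kernel, the fixed bandwidth, and the grid-based integration are all assumptions made for illustration, and the merge assumes sorted inputs with unique keys. In Python the interpreter still branches internally, so the first function only shows the data-flow pattern that a C or SIMD version would express with compare results instead of conditional jumps.

```python
import math

def merge_join_keys(a, b):
    """Intersect two sorted lists of unique keys (hypothetical helper).

    The loop body has no if/else: comparison results (True/False) are
    added to the cursors as 1/0, which is the branch-avoiding pattern.
    """
    out = [0] * min(len(a), len(b))
    i = j = k = 0
    while i < len(a) and j < len(b):
        x, y = a[i], b[j]
        out[k] = x          # speculative write, kept only on a match
        k += (x == y)       # advance output cursor on equal keys
        i += (x <= y)       # advance left cursor
        j += (y <= x)       # advance right cursor
    return out[:k]

def gaussian_kde(sample, bandwidth):
    """Return a density function estimated from sample (Gaussian kernel assumed)."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in sample)
    return density

def balanced_boundaries(sample, num_threads, bandwidth, grid_points=1000):
    """Pick num_threads - 1 key boundaries with roughly equal estimated mass.

    Assumes the sample contains at least two distinct key values.
    """
    lo, hi = min(sample), max(sample)
    density = gaussian_kde(sample, bandwidth)
    step = (hi - lo) / grid_points
    # Midpoint rule: estimated share of tuples falling in each grid cell.
    masses = [density(lo + (g + 0.5) * step) * step for g in range(grid_points)]
    target = sum(masses) / num_threads
    boundaries, acc = [], 0.0
    for g, m in enumerate(masses):
        acc += m
        if len(boundaries) < num_threads - 1 and acc >= target * (len(boundaries) + 1):
            boundaries.append(lo + (g + 1) * step)
    return boundaries
```

In this sketch each thread would merge the tuples whose keys fall between two consecutive boundaries, for example `balanced_boundaries(sampled_keys, 4, bandwidth=5.0)` for four threads. Skewed key distributions yield narrower ranges where keys are dense, which is the purpose of estimating the density rather than splitting the key domain evenly.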