MapReduce Job Scheduling Based on Remaining Job Sizes

Tatsuma MATSUKI  Tetsuya TAKINE  

Publication
IEICE TRANSACTIONS on Communications   Vol.E98-B   No.1   pp.180-189
Publication Date: 2015/01/01
Online ISSN: 1745-1345
Type of Manuscript: PAPER
Category: Network System
Keyword: 
MapReduce, Hadoop, job scheduling


Summary: 
The MapReduce job scheduler implemented in Hadoop decides which job is allowed to use idle resources. In terms of mean job response time, the performance of a job scheduler depends strongly on the job arrival pattern, that is, on the job sizes (i.e., the amounts of resources the jobs require) and the order in which the jobs arrive. Because existing schedulers do not use information about job sizes, they suffer severe performance degradation under some arrival patterns. In this paper, we propose a scheduler that estimates and uses remaining job sizes in order to achieve good performance regardless of the job arrival pattern. Through simulation experiments, we confirm that the proposed scheduler achieves better performance than the existing schedulers for various arrival patterns.
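
The dependence of mean response time on the arrival pattern can be illustrated with a toy model. The following Python sketch is not the scheduler proposed in the paper and does not use any Hadoop API: it assumes a single resource slot, unit time steps, and exactly known job sizes (the paper's scheduler estimates them), and every name in it (mean_response_time, fifo, smallest_remaining) is hypothetical. It compares first-in-first-out service with a policy that always serves the job with the smallest remaining size.

```python
def mean_response_time(jobs, pick):
    """Simulate one resource slot in unit time steps (toy model, an assumption).

    jobs: list of (arrival_time, size) tuples with positive integer sizes.
    pick: policy function that chooses which runnable job gets the next time unit.
    Returns the mean of (finish_time - arrival_time) over all jobs.
    """
    remaining = [size for _, size in jobs]
    finish = [None] * len(jobs)
    t = 0
    while any(f is None for f in finish):
        # Jobs that have arrived and are not yet finished.
        runnable = [i for i, (arrival, _) in enumerate(jobs)
                    if arrival <= t and finish[i] is None]
        if runnable:
            i = pick(runnable, jobs, remaining)
            remaining[i] -= 1
            if remaining[i] == 0:
                finish[i] = t + 1
        t += 1
    return sum(f - a for f, (a, _) in zip(finish, jobs)) / len(jobs)


def fifo(runnable, jobs, remaining):
    # Serve the earliest-arrived job; ignores job sizes entirely.
    return min(runnable, key=lambda i: (jobs[i][0], i))


def smallest_remaining(runnable, jobs, remaining):
    # Serve the job with the least work left (sizes assumed known here).
    return min(runnable, key=lambda i: (remaining[i], i))


if __name__ == "__main__":
    large_first = [(0, 10), (1, 1), (2, 1)]   # one large job arrives first
    small_first = [(0, 1), (1, 1), (2, 10)]   # the large job arrives last
    for name, jobs in [("large_first", large_first), ("small_first", small_first)]:
        print(name,
              mean_response_time(jobs, fifo),
              mean_response_time(jobs, smallest_remaining))
```

In this toy setting the two policies give the same mean response time when the large job arrives last, while FIFO is markedly worse when the large job arrives first. A real Hadoop cluster has many slots, separate map and reduce tasks, and unknown job sizes, so the sketch only conveys the intuition behind size-aware scheduling.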