For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Eager Memory Management for In-Memory Data Analytics
Hakbeom JANG Jonghyun BAE Tae Jun HAM Jae W. LEE
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2019/03/01
Online ISSN: 1745-1361
Type of Manuscript: LETTER
Category: Computer System
in-memory computing, spark, garbage collection, data spill,
Full Text: PDF>>
| Errata[Uploaded on April 1,2019]
This paper introduces e-spill, an eager spill mechanism, which dynamically finds the optimal spill-threshold by monitoring the GC time at runtime and thereby prevent expensive GC overhead. Our e-spill adopts a slow-start model to gradually increase the spill-threshold until it reaches the optimal point without substantial GCs. We prototype e-spill as an extension to Spark and evaluate it using six workloads on three different parallel platforms. Our evaluations show that e-spill improves performance by up to 3.80× and saves the cost of cluster operation on Amazon EC2 cloud by up to 51% over the baseline system following Spark Tuning Guidelines.