Towards Trusted Result Verification in Mass Data Processing Service

Yan DING  Huaimin WANG  Peichang SHI  Hongyi FU  Xinhai XU  

Publication
IEICE TRANSACTIONS on Communications   Vol.E97-B   No.1   pp.19-28
Publication Date: 2014/01/01
Online ISSN: 1745-1345
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Management for Flexible ICT Systems and Services)
Category: 
Keyword: result verification, mass data processing, MapReduce, trusted sampling, Merkle tree


Summary: 
Computation integrity is difficult to verify when mass data processing is outsourced. Current integrity protection mechanisms and policies verify the results generated by participating nodes within the computing environment of a service provider (SP), so they cannot prevent deliberate cheating by the SP itself. This paper analyzes and models computation integrity for mass data processing services. A third-party sampling-result verification method, named TS-TRV, is proposed to prevent lazy cheating by SPs. TS-TRV is a general solution for verifying the intermediate results of common MapReduce jobs. It uses the powerful computing capability of the SP to support the verification computation, thus lessening the computing and transmission burdens on the verifier. Theoretical analysis indicates that TS-TRV detects incorrect results with no false positives and almost no false negatives, while ensuring the authenticity of sampling. Extensive experiments show that the cheating detection rate of TS-TRV exceeds 99% with only a few samples, that the computation overhead falls mainly on the SP, and that the network transmission overhead of TS-TRV is only O(log N).
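
The summary does not spell out how TS-TRV is constructed, so the following is only a generic sketch of the two ideas it names, not the authors' implementation. It assumes the SP commits to N intermediate records with a Merkle tree, and that a verifier holding only the root checks a sampled record from an inclusion proof of about log2(N) hashes, which is where an O(log N) transmission cost would come from. The function names (`build_levels`, `prove`, `verify`, `detection_probability`), the choice of SHA-256, and the independent-sampling model are all assumptions made for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (hash choice is an assumption, not taken from the paper)."""
    return hashlib.sha256(data).digest()


def build_levels(records):
    """Return every tree level, leaves first; the last level holds the root.

    Prefix bytes keep leaf hashes and inner-node hashes distinct.
    An unpaired node at the end of a level is paired with itself.
    """
    level = [_h(b"\x00" + r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired node: its sibling is itself
        proof.append((level[sibling], index % 2 == 1))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(root, record, proof):
    """Recompute the root from one record and its proof; compare with `root`."""
    node = _h(b"\x00" + record)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


def detection_probability(cheat_fraction, samples):
    """Chance that at least one of `samples` independent uniform samples
    hits a wrong record when `cheat_fraction` of the records are wrong.
    A textbook sampling bound, not the paper's own analysis."""
    return 1.0 - (1.0 - cheat_fraction) ** samples


if __name__ == "__main__":
    records = [f"record-{i}".encode() for i in range(1024)]
    levels = build_levels(records)
    root = levels[-1][0]
    proof = prove(levels, 37)
    print(len(proof), verify(root, records[37], proof))
    print(detection_probability(0.5, 7))
```

Under these assumptions, a verifier checking one of 1024 records receives 10 sibling hashes rather than the whole data set, and a forged record fails the root comparison. The sampling function shows why a handful of samples can be enough against a lazy SP: if half of the records were skipped, seven independent samples would miss all of them with probability 0.5^7, which is under 1%.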