Privacy-Preserving Statistical Analysis Method by Splitting Server Roles for Distributed Real-World Data

Jun ISHII  Hiroyuki MAEOMICHI  Akihiro TSUTSUI  Ikuo YODA  

IEICE TRANSACTIONS on Communications   Vol.E97-B   No.9   pp.1779-1789
Publication Date: 2014/09/01
Online ISSN: 1745-1345
DOI: 10.1587/transcom.E97.B.1779
Type of Manuscript: Special Section PAPER (Special Section on Ambient Intelligence and Sensor Networks)
privacy-preserving, multiple pseudonyms, query auditing, WebSocket, in-memory database

This paper proposes a novel method for obtaining statistical results such as averages, variances, and correlations without leaking any raw data values from data holders by using multiple pseudonyms. At present, obtaining statistical results from a large amount of data requires collecting all of the data in the same storage device. However, gathering real-world data generated by different people is not easy because the data often contain private information. We split the roles of servers into publishing pseudonyms and collecting answers. With the roles split in this way, different entities can join as pseudonym servers more easily than in previous secure multi-party computation methods, and there is less chance of collusion between servers. Thus, our method enables data holders to protect themselves against malicious attacks from data users. We also anticipated a typical problem that can occur with our method and added a pseudonym availability confirmation protocol to prevent it. We report an evaluation of the effectiveness of our method through implementation and experimentation, and discuss how we incorporated the WebSocket protocol and the MySQL MEMORY storage engine to remove the bottleneck and improve the implementation. Finally, we show that our method can obtain averages, variances, and correlations from 5000 data holders within 50 seconds.
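
As a rough illustration of why aggregate answers can be enough for these statistics, the sketch below derives a mean, a population variance, and a Pearson correlation from five sums alone, so the party computing the result never needs per-holder records. It is not the protocol of the paper: the abstract does not describe how answers are collected under pseudonyms, and the function name `stats_from_sums`, its parameters, and the toy data are assumptions made for this example.

```python
import math


def stats_from_sums(n, sx, sy, sxx, syy, sxy):
    """Derive statistics from aggregate sums only (hypothetical helper).

    n   : number of data holders
    sx  : sum of x        sy  : sum of y
    sxx : sum of x*x      syy : sum of y*y
    sxy : sum of x*y
    """
    mean_x = sx / n
    mean_y = sy / n
    # Population variance: E[x^2] - (E[x])^2
    var_x = sxx / n - mean_x ** 2
    var_y = syy / n - mean_y ** 2
    # Covariance: E[xy] - E[x]E[y]
    cov = sxy / n - mean_x * mean_y
    # Pearson correlation; undefined when either variance is zero.
    corr = cov / math.sqrt(var_x * var_y)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": var_x,
        "var_y": var_y,
        "corr": corr,
    }


# Toy data, used here only to build the sums. In a privacy-preserving
# setting the sums would be accumulated without exposing these lists.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

result = stats_from_sums(
    n=len(xs),
    sx=sum(xs),
    sy=sum(ys),
    sxx=sum(x * x for x in xs),
    syy=sum(y * y for y in ys),
    sxy=sum(x * y for x, y in zip(xs, ys)),
)
print(result)
```

The point of the sketch is only that the three statistics named in the abstract reduce to a small, fixed set of sums. How those sums are gathered without exposing individual values is the contribution of the paper and is not reproduced here.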