Study on Record Linkage of Anonymized Data

Hiroaki KIKUCHI  Takayasu YAMAGUCHI  Koki HAMADA  Yuji YAMAOKA  Hidenobu OGURI  Jun SAKUMA  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E101-A   No.1   pp.19-28
Publication Date: 2018/01/01
Online ISSN: 1745-1337
Type of Manuscript: INVITED PAPER (Special Section on Cryptography and Information Security)
Keyword: data privacy, anonymization, re-identification risk, big data

Summary: 
Data anonymization is required before a big-data business can run effectively without compromising the privacy of the personal information it uses. Choosing the best algorithm to anonymize given data securely for a given purpose is not trivial. Accurately assessing the risk of data being compromised requires balancing utility against security. Therefore, using common pseudo microdata, we propose a competition for the best anonymization and re-identification algorithms. This paper reports the results of the competition and analyzes the effectiveness of the anonymization techniques. The results reveal a tradeoff between utility and security, and 20.9% of records were re-identified on average.
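
To make the notion of a re-identification rate concrete, the following is a minimal sketch of record linkage by nearest-neighbor matching. It is not the method used in the competition; the function names, the toy records, and the squared Euclidean distance are all assumptions chosen for illustration only.

```python
from typing import List, Sequence


def link_records(original: Sequence[Sequence[float]],
                 anonymized: Sequence[Sequence[float]]) -> List[int]:
    """For each anonymized record, guess the index of the closest original
    record under squared Euclidean distance (an illustrative choice)."""
    guesses = []
    for anon in anonymized:
        best = min(
            range(len(original)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(original[i], anon)),
        )
        guesses.append(best)
    return guesses


def reidentification_rate(guesses: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of anonymized records linked back to the correct original."""
    correct = sum(1 for g, t in zip(guesses, truth) if g == t)
    return correct / len(truth)


# Hypothetical toy data: (age, height) pairs, not taken from the paper.
original = [(25.0, 170.0), (40.0, 160.0), (60.0, 180.0)]
# Anonymized copies, shuffled and perturbed; the last one is perturbed heavily.
anonymized = [(41.0, 161.0), (59.0, 178.0), (38.0, 162.0)]
# Ground truth: index of the original record behind each anonymized record.
truth = [1, 2, 0]

guesses = link_records(original, anonymized)
rate = reidentification_rate(guesses, truth)
print(guesses, rate)
```

In this toy case the heavily perturbed record is linked to the wrong original, so two of three records are re-identified. Stronger perturbation lowers the rate but also distorts the data more, which is the utility and security tradeoff described in the summary.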