Model Inversion Attacks for Online Prediction Systems: Without Knowledge of Non-Sensitive Attributes

Seira HIDANO  Takao MURAKAMI  Shuichi KATSUMATA  Shinsaku KIYOMOTO  Goichiro HANAOKA  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.11   pp.2665-2676
Publication Date: 2018/11/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017ICP0013
Type of Manuscript: Special Section PAPER (Special Section on Information and Communication System Security)
Category: Forensics and Risk Analysis
Keywords: black box, model inversion, data poisoning, online ML systems

The number of IT services that use machine learning (ML) algorithms is growing continuously and rapidly, and many of them are used in practice to make predictions from personal data. Not surprisingly, due to this sudden boom in ML, the way personal data are handled in ML systems is starting to raise serious privacy concerns that were previously unconsidered. Recently, Fredrikson et al. [USENIX 2014] [CCS 2015] proposed a novel attack against ML systems, called the model inversion attack, that aims to infer sensitive attribute values of a target user. In their work, for the model inversion attack to succeed, the adversary must obtain two types of information about the target user prior to the attack: the output value (i.e., prediction) of the ML system and all of the non-sensitive values used to learn the output. Therefore, although the attack does raise new privacy concerns, it is not clear how much risk it actually incurs, since the adversary is required to know all of the non-sensitive values in advance. In particular, even though users may regard these values as non-sensitive, it may be difficult for the adversary to obtain all of them prior to the attack, which would make the attack infeasible. The goal of this paper is to quantify the risk of model inversion attacks in the case where the non-sensitive attributes of a target user are not available to the adversary. To this end, we first propose a general model inversion (GMI) framework, which models the amount of auxiliary information available to the adversary. Our framework captures the model inversion attack of Fredrikson et al. as a special case, while also capturing model inversion attacks that infer sensitive attributes without knowledge of the non-sensitive attributes. For the latter attack, we provide a general methodology for inferring sensitive attributes of a target user without knowledge of the non-sensitive attributes.
At a high level, we use the data poisoning paradigm in a conceptually novel way: we inject malicious data into the ML system in order to transform the internal ML model in use into a target ML model, a special type of ML model that allows one to perform model inversion attacks without knowledge of the non-sensitive attributes. Finally, following our general methodology, we cast ML systems that internally use linear regression models into our GMI framework and propose a concrete algorithm for model inversion attacks that does not require knowledge of the non-sensitive attributes. We show the effectiveness of our model inversion attack through experimental evaluation using two real data sets.
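
The role of a target model can be illustrated with a minimal sketch for linear regression. The abstract does not specify the algorithm, so everything below is an assumption made for illustration: the weights, the example record, the helper names `predict` and `invert_sensitive`, and in particular the choice of a target model whose weights on the non-sensitive attributes are zero. The poisoning step that would produce such a model is not shown, and the inversion is a simplified nearest-output search rather than the estimator of Fredrikson et al.

```python
def predict(weights, bias, x):
    """Linear regression prediction: bias + sum_i weights[i] * x[i]."""
    return bias + sum(w * v for w, v in zip(weights, x))


def invert_sensitive(weights, bias, y, candidates, sens_idx, known=None):
    """Return the candidate sensitive value whose prediction is closest to y.

    known maps attribute index -> value for the non-sensitive attributes the
    adversary knows; unknown ones are set to 0.0 (a simplifying assumption).
    """
    known = known or {}
    best, best_err = None, float("inf")
    for c in candidates:
        x = [known.get(i, 0.0) for i in range(len(weights))]
        x[sens_idx] = c
        err = abs(predict(weights, bias, x) - y)
        if err < best_err:
            best, best_err = c, err
    return best


# Hypothetical model: attribute 0 is sensitive (binary), attributes 1-2 are not.
original = ([2.0, 1.5, -1.0], 0.5)
# Hypothetical target model (assumed form): non-sensitive weights are zero.
target = ([2.0, 0.0, 0.0], 0.5)

user = [0.0, 3.0, 1.0]  # hypothetical record; the true sensitive value is 0.0
candidates = [0.0, 1.0]

y_orig = predict(*original, user)
y_target = predict(*target, user)

# Adversary knows the output and all non-sensitive values.
full_knowledge = invert_sensitive(*original, y_orig, candidates, 0,
                                  known={1: 3.0, 2: 1.0})
# Adversary knows only the output of the original model.
no_knowledge_original = invert_sensitive(*original, y_orig, candidates, 0)
# Adversary knows only the output of the assumed target model.
no_knowledge_target = invert_sensitive(*target, y_target, candidates, 0)

print(full_knowledge, no_knowledge_original, no_knowledge_target)
```

In this toy setting, the output of the original model mixes the sensitive and non-sensitive contributions, so an adversary without the non-sensitive values guesses wrongly, whereas the output of the assumed target model depends on the sensitive attribute alone and can be inverted directly.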