Formal Verification of a Decision-Tree Ensemble Model and Detection of Its Violation Ranges
Naoto SATO Hironobu KURUMA Yuichiroh NAKAGAWA Hideto OGAWA
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2020/02/01
Online ISSN: 1745-1361
Type of Manuscript: PAPER
Category: Dependable Computing
Keyword: machine learning, formal verification, decision-tree ensemble model
As one type of machine-learning model, a “decision-tree ensemble model” (DTEM) is represented by a set of decision trees. A DTEM is known to be effective mainly for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value (called the “prediction value”) for every input value (called the “attribute value”). Accordingly, when a DTEM is used in a system that requires reliability, it is important to comprehensively detect, during development, attribute values that lead to malfunctions of the system (failures) and to take appropriate countermeasures. One conceivable solution is to install an input filter that controls the input to the DTEM and to use separate software to process attribute values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition for the attribute values that lead to a malfunction of the system. In consideration of that necessity, we propose a method for formally verifying a DTEM and, if the verification finds an attribute value leading to a failure, extracting the range in which such attribute values exist. The proposed method can comprehensively extract the ranges in which attribute values leading to failures exist; therefore, creating an input filter based on those ranges makes it possible to prevent the failures. To demonstrate the feasibility of the proposed method, we performed a case study using a dataset of house prices. Through the case study, we also evaluated the scalability of the method and showed that the number and depth of the decision trees are important factors that determine its applicability.
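The idea of extracting violation ranges and turning them into an input filter can be illustrated with a minimal Python sketch. It is not the verification procedure of the paper: it simply enumerates every combination of leaves in a toy two-tree ensemble, and it assumes that the prediction value is the sum of the leaf values. The tree encoding, the function names (`leaves`, `violation_ranges`, `is_filtered`), the attribute domain, and the property "the prediction value must not be negative" are all hypothetical choices made for illustration.

```python
import itertools

# Hypothetical tree encoding: a node is either a float (leaf value) or a tuple
# (feature_index, threshold, left, right); x[feature_index] < threshold goes left.

def leaves(node, box):
    """Yield (box, value) per reachable leaf; box maps feature -> half-open (lo, hi)."""
    if not isinstance(node, tuple):
        yield box, node
        return
    f, t, left, right = node
    lo, hi = box[f]
    if lo < t:   # some x in [lo, hi) satisfies x < t
        yield from leaves(left, {**box, f: (lo, min(hi, t))})
    if t < hi:   # some x in [lo, hi) satisfies x >= t
        yield from leaves(right, {**box, f: (max(lo, t), hi)})

def intersect(a, b):
    """Intersect two boxes over the same features; return None if empty."""
    out = {}
    for f in a:
        lo, hi = max(a[f][0], b[f][0]), min(a[f][1], b[f][1])
        if lo >= hi:
            return None
        out[f] = (lo, hi)
    return out

def violation_ranges(trees, domain, violates):
    """Return every box of attribute values whose summed prediction violates the property."""
    per_tree = [list(leaves(t, domain)) for t in trees]
    ranges = []
    for combo in itertools.product(*per_tree):  # one leaf per tree
        box = domain
        for leaf_box, _ in combo:
            box = intersect(box, leaf_box)
            if box is None:
                break
        if box is not None and violates(sum(v for _, v in combo)):
            ranges.append(box)
    return ranges

def predict(trees, x):
    """Evaluate the ensemble on one attribute vector (assumed: sum of leaf values)."""
    total = 0.0
    for node in trees:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if x[f] < t else right
        total += node
    return total

def is_filtered(x, ranges):
    """Input filter: True if x falls inside any extracted violation range."""
    return any(all(lo <= x[f] < hi for f, (lo, hi) in box.items()) for box in ranges)

# Toy ensemble over two attributes (0: floor area, 1: building age); values are made up.
trees = [
    (0, 100.0, 10.0, 30.0),
    (1, 50.0, 5.0, -20.0),
]
domain = {0: (0.0, 200.0), 1: (0.0, 100.0)}

# Hypothetical property: the prediction value must not be negative.
ranges = violation_ranges(trees, domain, lambda y: y < 0)
print(ranges)
```

In this toy setting the only violation range is a floor area below 100 combined with a building age of at least 50, so `is_filtered` would divert exactly those inputs to separate processing. The number of leaf combinations grows exponentially with the number of trees and with their depth, which is consistent with the scalability observation above.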