Kernel Methods for Chemical Compounds: From Classification to Design

Tatsuya AKUTSU  Hiroshi NAGAMOCHI  

IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.10   pp.1846-1853
Publication Date: 2011/10/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.1846
Print ISSN: 0916-8532
Type of Manuscript: INVITED PAPER (Special Section on Information-Based Induction Sciences and Machine Learning)
Keywords: chemoinformatics, kernel method, pre-image, dynamic programming, enumeration, graph detachment


In this paper, we briefly review kernel methods for the analysis of chemical compounds, focusing on the authors' work. We begin with a brief review of existing kernel functions used for classifying chemical compounds and predicting their activities. We then focus on the pre-image problem for chemical compounds, which is to infer a chemical structure that is mapped to a given feature vector, and which has a potential application to the design of novel chemical compounds. In particular, we consider the pre-image problem for feature vectors consisting of frequencies of labeled paths of length at most K. We present several time complexity results, including an NP-hardness result for the general case, a polynomial-time algorithm for tree-structured compounds with fixed K, and a polynomial-time algorithm for K=1 based on graph detachment. We then review practical algorithms for the pre-image problem, which are based on enumeration of chemical structures satisfying given constraints. We also briefly review related results, including efficient enumeration of stereoisomers of tree-like chemical compounds and efficient enumeration of outerplanar graphs.
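
To make the feature vector in the abstract concrete, the following is a minimal Python sketch of counting labeled paths of length at most K in a small molecular graph. It is an illustration only, not the authors' implementation. The function name `path_frequency`, the graph encoding (a label dictionary plus adjacency lists), and the ethanol-like example are all hypothetical. The sketch also assumes that paths are simple, that length means the number of edges, and that each direction of traversal is counted separately; the paper's exact conventions may differ.

```python
from collections import Counter


def path_frequency(labels, adj, K):
    """Count label sequences of simple paths with at most K edges.

    labels: dict mapping vertex -> atom label (assumed encoding)
    adj:    dict mapping vertex -> list of neighbouring vertices
    K:      maximum path length, measured in edges
    Each directed traversal is counted once, so an undirected path
    with at least one edge contributes in both directions (assumption).
    """
    counts = Counter()

    def extend(path):
        # Record the label sequence of the current path.
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) - 1 == K:
            return  # path already has K edges, do not extend further
        for w in adj[path[-1]]:
            if w not in path:  # keep the path simple (no repeated vertex)
                extend(path + [w])

    for v in labels:
        extend([v])
    return counts


# Hypothetical example: heavy-atom skeleton of ethanol, C-C-O.
labels = {0: "C", 1: "C", 2: "O"}
adj = {0: [1], 1: [0, 2], 2: [1]}

features_k1 = path_frequency(labels, adj, K=1)
features_k2 = path_frequency(labels, adj, K=2)
```

Under these assumptions, the pre-image problem runs in the opposite direction: given a vector such as `features_k1`, find a chemical graph whose path frequencies match it. Brute-force enumeration like the above only evaluates the forward map and does not reflect the dynamic programming or graph detachment techniques named in the abstract.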