Spectral Methods for Thesaurus Construction
Nobuyuki SHIMIZU Masashi SUGIYAMA Hiroshi NAKAGAWA
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2010/06/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Info-Plosion)
Category: Natural Language Processing
synonym acquisition, synonym extraction, thesaurus, spectral clustering, graph Laplacian
Traditionally, popular synonym acquisition methods are based on the distributional hypothesis: a metric such as the Jaccard coefficient is used to evaluate the similarity between the contexts of words, and the most similar words are returned as synonyms for a query. When compiling and cleaning a thesaurus, however, one often already has a modest number of synonym relations at hand. Could something be done with a half-built thesaurus alone? We propose the use of spectral methods and discuss their relation to other network-based algorithms in natural language processing (NLP), such as PageRank and bootstrapping. Since compiling a thesaurus is very laborious, we believe that adding the proposed method to the toolkit of thesaurus constructors would significantly ease this task.
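
To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' algorithm, whose details the abstract does not give. It assumes NumPy, treats the half-built thesaurus as an undirected graph of known synonym pairs, and embeds words with the low-order eigenvectors of the symmetric normalized graph Laplacian, one common spectral construction. The function names (`jaccard`, `spectral_embedding`, `rank_candidates`) and the toy word list are hypothetical and chosen only for illustration.

```python
import numpy as np


def jaccard(contexts_a, contexts_b):
    """Jaccard coefficient between two sets of context features."""
    a, b = set(contexts_a), set(contexts_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def spectral_embedding(words, synonym_pairs, dim=2):
    """Embed words using eigenvectors of the normalized graph Laplacian.

    Assumption: known synonym relations are unweighted, undirected edges.
    Returns the eigenvalues (ascending) and a row-normalized embedding.
    """
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    adjacency = np.zeros((n, n))
    for u, v in synonym_pairs:
        adjacency[index[u], index[v]] = 1.0
        adjacency[index[v], index[u]] = 1.0

    degree = adjacency.sum(axis=1)
    # D^{-1/2}, with isolated nodes (degree 0) mapped to 0.
    inv_sqrt = np.where(degree > 0, 1.0 / np.sqrt(np.maximum(degree, 1e-12)), 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    laplacian = np.eye(n) - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]

    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    embedding = eigvecs[:, :dim]
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    return eigvals, embedding / np.maximum(norms, 1e-12)


def rank_candidates(query, words, embedding):
    """Rank all other words by cosine similarity to the query in the embedding."""
    q = embedding[words.index(query)]
    scores = [(w, float(embedding[i] @ q)) for i, w in enumerate(words) if w != query]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy half-built thesaurus: two groups of already-known synonym pairs.
words = ["big", "large", "huge", "small", "tiny", "little"]
pairs = [("big", "large"), ("large", "huge"), ("small", "tiny"), ("tiny", "little")]

_, emb = spectral_embedding(words, pairs, dim=2)
print(rank_candidates("big", words, emb))
print(jaccard({"eat", "red", "sweet"}, {"eat", "green", "sweet"}))
```

In this toy graph, "big" and "huge" are never directly linked, yet they fall in the same connected component and therefore end up close in the embedding. That is the sense in which a graph-based method can propose new synonym candidates from existing relations alone, whereas the Jaccard baseline needs context data for each word.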