Experimental Study on a Two Phase Method for Biomedical Named Entity Recognition
Seonho KIM Juntae YOON
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2007/07/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Natural Language Processing
Keywords: information extraction, named entity recognition, two-phase model, ME, CRF, SVM, FST
In this paper, we describe a two-phase method for biomedical named entity recognition consisting of term boundary detection and biomedical category labeling. Term boundary detection can be defined as the task of assigning a label sequence to a given sentence, whereas biomedical category labeling can be viewed as a local classification problem that does not require knowledge of the labels of other named entities in the sentence. The advantage of dividing the recognition process into two phases is that we can measure the effectiveness of the models at each phase and select the appropriate model for each subtask separately. To obtain better performance in biomedical named entity recognition, we conducted comparative experiments using several learning methods at each phase. Moreover, the results of these machine-learning-based models are refined by rule-based postprocessing. We tested our methods on the JNLPBA 2004 shared task and the GENIA corpus.
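
The two-phase decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`bio_to_spans`, `toy_boundary_tagger`, `toy_category_classifier`, `recognize`), the B/I/O tag scheme, and the word list are all assumptions made for illustration, and the two toy functions are rule-based stand-ins for the learned models (for example, a CRF for phase 1 and an SVM or ME classifier for phase 2) that the paper compares. The category names are borrowed from the JNLPBA 2004 entity classes.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # [start, end) token offsets


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a B/I/O boundary tag sequence into [start, end) spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # A new term begins here; close any term that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def toy_boundary_tagger(tokens: List[str]) -> List[str]:
    """Phase 1 stand-in (hypothetical): tags the whole sentence as a sequence.

    A real system would use a trained sequence model here.
    """
    term_words = {"IL-2", "gene", "NF-kappa", "B", "T", "cells"}  # made-up lexicon
    tags, inside = [], False
    for tok in tokens:
        if tok in term_words:
            tags.append("I" if inside else "B")
            inside = True
        else:
            tags.append("O")
            inside = False
    return tags


def toy_category_classifier(term_tokens: List[str]) -> str:
    """Phase 2 stand-in (hypothetical): labels one term in isolation.

    It sees only the term itself, not the labels of other terms.
    """
    head = term_tokens[-1].lower()
    if head == "gene":
        return "DNA"
    if head == "cells":
        return "cell_type"
    return "protein"


def recognize(
    tokens: List[str],
    tagger: Callable[[List[str]], List[str]],
    classifier: Callable[[List[str]], str],
) -> List[Tuple[int, int, str]]:
    """Run phase 1 (boundaries), then phase 2 (category) on each detected span."""
    tags = tagger(tokens)
    return [(s, e, classifier(tokens[s:e])) for s, e in bio_to_spans(tags)]


if __name__ == "__main__":
    sentence = "IL-2 gene expression requires NF-kappa B in T cells".split()
    for start, end, label in recognize(sentence, toy_boundary_tagger, toy_category_classifier):
        print(" ".join(sentence[start:end]), "->", label)
```

Because `recognize` takes the tagger and the classifier as separate arguments, either phase can be swapped or evaluated on its own, which mirrors the advantage the abstract attributes to the two-phase design.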