Recognition of Collocation Frames from Sentences

Xiaoxia LIU  Degen HUANG  Zhangzhi YIN  Fuji REN  

IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.3   pp.620-627
Publication Date: 2019/03/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7255
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: collocation patterns, collocation frames, recursive nature of collocations, collocation rules

Collocation is a ubiquitous phenomenon in languages, and accurate collocation recognition and extraction are of great significance to many natural language processing tasks. Collocations range from simple bigram collocations to collocation frames (distant multi-gram collocations). So far, little attention has been paid to collocation frames. Oriented to translation and parsing, this study aims to recognize and extract the longest possible collocation frames from given sentences. We first extract bigram collocations with a method based on distributional semantics that introduces collocation patterns and integrates several state-of-the-art association measures. From the bigram collocations extracted by the proposed method, we then obtain the longest collocation frames according to the recursive nature and linguistic rules of collocations. Compared with the baseline systems, the proposed method performs significantly better in bigram collocation extraction in both precision and recall. In extracting collocation frames, the proposed method performs even better, with precision similar to that of its bigram collocation extraction.
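
The two-stage idea in the abstract (score bigram candidates, then grow frames from bigrams that share a word) can be illustrated with a minimal Python sketch. It is not the authors' method: pointwise mutual information (PMI) stands in for the unspecified association measures, the collocation patterns and linguistic rules are omitted, and the function names `pmi_scores` and `longest_frames` and the `window` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sentences, window=4):
    """Score ordered word pairs that co-occur within `window` tokens by PMI.

    Assumption: PMI is only a stand-in for the association measures
    mentioned in the abstract, which are not specified there.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), n in pair_counts.items():
        p_pair = n / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


def longest_frames(tokens, collocations, window=4):
    """Merge bigram collocations that share a token into longer frames.

    Assumption: "recursive nature" is approximated here by repeatedly
    merging overlapping frames; no linguistic rules are applied.
    """
    # Each frame is a set of token positions; start from matched bigrams.
    frames = []
    for i, left in enumerate(tokens):
        for j in range(i + 1, min(len(tokens), i + 1 + window)):
            if (left, tokens[j]) in collocations:
                frames.append({i, j})
    # Keep merging any two frames that share a position until none overlap.
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(frames)), 2):
            if frames[a] & frames[b]:
                frames[a] |= frames[b]
                del frames[b]
                merged = True
                break
    return [[tokens[k] for k in sorted(frame)] for frame in frames]
```

For example, given the tokens of "she made a very important decision" and the hypothetical bigram collocations ("made", "decision"), ("important", "decision"), and ("very", "important"), `longest_frames` merges the three bigrams into the single frame "made very important decision". In a fuller pipeline, the `collocations` set would be the pairs whose association score exceeds some threshold.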