A Machine Learning Method for Automatic Copyright Notice Identification of Source Files

Shi QIU  Daniel M. GERMAN  Katsuro INOUE

IEICE TRANSACTIONS on Information and Systems   Vol.E103-D   No.12   pp.2709-2712
Publication Date: 2020/12/01
Publicized: 2020/09/18
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2020EDL8089
Type of Manuscript: LETTER
Category: Software Engineering
Keyword: software maintenance, open source software, software copyright


For Free and Open Source Software (FOSS), identifying copyright notices is important. However, both the collaborative nature of FOSS project development and the large number of source files make this task difficult. In this paper, we aim to automatically identify the copyright notices in source files using machine learning techniques. The evaluation experiment shows that our method outperforms FOSSology, the only existing method, which is based on regular expressions.
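
To make the regular-expression baseline concrete, the following is a minimal sketch of how a pattern-based detector might flag candidate copyright lines in a source file. It is an illustration only: the pattern `COPYRIGHT_RE` and the function `find_copyright_lines` are hypothetical names introduced here, and they reproduce neither FOSSology's actual rules nor the machine learning method proposed in this letter.

```python
import re

# Hypothetical pattern for illustration; NOT FOSSology's actual rule set.
# It requires a copyright marker followed, later on the same line, by a 4-digit year.
COPYRIGHT_RE = re.compile(
    r"(?:copyright|\(c\)|©)[^\n]*?\b(?:19|20)\d{2}\b",
    re.IGNORECASE,
)


def find_copyright_lines(source_text):
    """Return (line_number, line) pairs that look like copyright notices."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if COPYRIGHT_RE.search(line):
            hits.append((number, line))
    return hits


sample = """/*
 * Copyright (C) 2019 Example Author
 * Licensed under the Example License.
 */
int copyright_year = 0;
"""

for number, line in find_copyright_lines(sample):
    print(number, line.strip())
```

A fixed pattern like this one misses notices written without a year and can misfire on ordinary prose that mentions copyright next to a date, which is the kind of brittleness that a learned classifier is presumably meant to reduce.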