An Exploratory Study of Copyright Inconsistency in the Linux Kernel

Daniel M. GERMAN
Katsuro INOUE

IEICE TRANSACTIONS on Information and Systems, Vol.E104-D, No.2, pp.254-263
Publication Date: 2021/02/01
Publicized: 2020/11/17
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2020EDP7107
Type of Manuscript: PAPER
Category: Software Engineering
Keywords: software maintenance, mining software repositories, open source software, software copyright

Software copyright grants the copyright owner the exclusive right to determine whether, and under what conditions, others can modify, reuse, or redistribute the software. For Free and Open Source Software (FOSS), it is very important to identify the copyright owner, who can control those activities through license compliance. A copyright notice is a short statement of a few sentences, usually placed as a comment in the header of a source file or in a license document of a FOSS project, and it is an important clue for establishing the ownership of the project. Repositories of FOSS projects also contain rich and varied information about development, including the source code contributors, who are another important clue for establishing ownership. In this paper, as a first step toward understanding copyright ownership, we explore the state of software copyright in the Linux kernel, a typical example of FOSS, by analyzing and comparing two kinds of datasets: copyright notices in source files and source code contributors in the software repositories. We define the discrepancy between the two kinds of analysis results as copyright inconsistency. The analysis indicates that copyright inconsistencies are prevalent in the Linux kernel. We also found that code reuse, affiliation change, refactoring, support function, and others' contributions potentially affect the occurrence of copyright inconsistencies in the Linux kernel. This study exposes the difficulty of managing software copyright in FOSS and highlights the usefulness of future work to address software copyright problems.
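
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the regular expression, the function names `holders_in_notices` and `copyright_inconsistency`, and the exact-name matching are all assumptions made for illustration, and real kernel notices and contributor identities are far more varied than this handles.

```python
import re

# Hypothetical pattern for notices such as
#   "Copyright (C) 2005 Alice Example <alice@example.org>"
# It is an illustrative assumption, not the pattern used in the paper.
NOTICE_RE = re.compile(
    r"Copyright\s+(?:\(c\)\s*|©\s*)?"          # optional "(C)" or "©" marker
    r"(?:\d{4}(?:\s*[-,]\s*\d{4})*,?\s+)?"     # optional year or year range
    r"(?P<holder>[^<\n*]+)",                   # holder name, up to "<", "*" or end of line
    re.IGNORECASE,
)


def holders_in_notices(source_text):
    """Return the set of holder names found in copyright notice lines."""
    holders = set()
    for match in NOTICE_RE.finditer(source_text):
        name = match.group("holder").strip(" ,\t")
        if name:
            holders.add(name)
    return holders


def copyright_inconsistency(source_text, contributors):
    """Compare notice holders with repository contributors for one file.

    `contributors` is assumed to be a collection of names already extracted
    from the repository history (for example, commit authors of the file).
    """
    notice_holders = holders_in_notices(source_text)
    contributors = set(contributors)
    return {
        # named in a notice but absent from the contributor list
        "notice_only": notice_holders - contributors,
        # contributed code but not named in any notice
        "contributor_only": contributors - notice_holders,
    }
```

In this sketch a file would be flagged as inconsistent when either returned set is non-empty. Matching names exactly is a deliberate simplification: affiliation changes and name or e-mail variants, which the abstract lists among the potential causes, would need identity resolution that is omitted here.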