Hidden Singer: Distinguishing Imitation Singers Based on Training with Only the Original Song

Hosung PARK  Seungsoo NAM  Eun Man CHOI  Daeseon CHOI  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.12   pp.3092-3101
Publication Date: 2018/12/01
Publicized: 2018/08/24
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7140
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keyword: singer authentication, autoencoder, neural network, artificial intelligence

Hidden Singer is a Korean television program in which the original singer and four imitating singers sing a song while hidden behind a screen. The audience and TV viewers try to identify the original singer by listening to the singing voices. The audience usually gives few correct answers, because the imitators are well trained and highly skilled. We propose a computerized system for distinguishing the original singer from the imitating singers. During the training phase, the system learns only the original singer's song, because that is the one the audience has heard before. During the testing phase, the songs of the five candidates are provided to the system, which then determines the original singer. The system uses a 1-class authentication method, in which only a model of the subject is built. The subject model is used to measure how similar each candidate song is to the original. Unlike existing studies on artist identification, this problem does not allow multi-class classifiers or supervised learning, because neither the imitators' songs nor their labels are provided during the training phase. We therefore evaluate several 1-class learning algorithms to determine which is most effective at distinguishing an original singer from highly skilled imitators. The experimental results show that the proposed system using an autoencoder achieves higher accuracy (63.33%) than the other 1-class learning algorithms: the Gaussian mixture model (GMM) (50%) and the one-class support vector machine (OCSVM) (26.67%). We also conduct a human contest to compare the proposed system with human perception. The accuracy of the proposed system (63.33%) is higher than the average accuracy of human perception (33.48%).
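
The 1-class setup described above (fit a subject model on the original singer only, then score all five candidates against it and pick the best match) can be illustrated with a minimal sketch. This is not the paper's implementation: a single diagonal Gaussian stands in for the autoencoder, GMM, and OCSVM models, the feature vectors are synthetic placeholders rather than real audio features, and all function names (fit_subject_model, avg_log_likelihood, pick_original, synth_frames) are hypothetical.

```python
import math
import random


def fit_subject_model(frames):
    """Fit a diagonal Gaussian using ONLY the original singer's frames.

    Assumption: a single diagonal Gaussian is a simple stand-in for the
    1-class models named in the abstract (autoencoder, GMM, OCSVM).
    """
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Small constant keeps the variance strictly positive.
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n + 1e-6
           for d in range(dim)]
    return mean, var


def avg_log_likelihood(model, frames):
    """Similarity score: mean per-frame log-likelihood under the subject model."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def pick_original(model, candidates):
    """Score every candidate and return the index of the best match."""
    scores = [avg_log_likelihood(model, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def synth_frames(rng, center, n=200):
    """Hypothetical feature frames: Gaussian noise around a voice 'center'."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


rng = random.Random(0)
original_center = [0.0, 1.0, -1.0, 0.5]  # made-up 4-dimensional features

# Training phase: only the original singer's song is available.
model = fit_subject_model(synth_frames(rng, original_center))

# Testing phase: five candidates; index 2 is the original, the others are
# imitators whose features are shifted away from the original's.
offsets = [1.5, -1.2, 0.0, 2.0, -1.8]
candidates = [synth_frames(rng, [c + off for c in original_center])
              for off in offsets]

best, scores = pick_original(model, candidates)
print("predicted original singer: candidate", best)
```

The point of the sketch is the data flow rather than the model: no imitator data or labels are used during fitting, so the decision reduces to ranking the five candidates by similarity to a single subject model. Skilled imitators would sit much closer to the original than the shifted toy data here.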