Encoding Detection and Bit Rate Classification of AMR-Coded Speech Based on Deep Neural Network

Seong-Hyeon SHIN  Woo-Jin JANG  Ho-Won YUN  Hochong PARK  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.1   pp.269-272
Publication Date: 2018/01/01
Publicized: 2017/10/20
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDL8155
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: bit rate, speech codec, AMR, deep neural network, feature vector

A method for encoding detection and bit rate classification of AMR-coded speech is proposed. For each texture frame, 184 features consisting of the short-term and long-term temporal statistics of speech parameters are extracted; these features effectively measure the amount of distortion introduced by AMR coding. A deep neural network then analyzes the extracted features and classifies the bit rate of the speech. The proposed features are confirmed to perform better than conventional spectral features designed for bit rate classification of coded audio.
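
As a rough illustration of the feature extraction idea in the abstract, the following Python sketch computes short-term and long-term temporal statistics of per-frame speech parameters over one texture frame. It is not the authors' implementation. The function name texture_frame_statistics, the block length short_len, the choice of statistics (mean, standard deviation, mean absolute frame-to-frame difference) and the random input are all assumptions made for illustration, and the sketch does not reproduce the 184 features of the letter.

```python
import numpy as np


def texture_frame_statistics(params, short_len=5):
    """Summarize one texture frame of per-frame speech parameters.

    params    : array of shape (n_frames, n_params); each row holds the
                parameters of one analysis frame (hypothetical layout).
    short_len : number of frames per short-term block (assumed value).

    Returns a 1-D vector of length 4 * n_params:
    [long-term mean, long-term std, short-term std, mean abs delta].
    """
    params = np.asarray(params, dtype=float)
    if params.ndim != 2:
        raise ValueError("params must have shape (n_frames, n_params)")
    n_frames, n_params = params.shape
    if short_len < 1 or n_frames < max(short_len, 2):
        raise ValueError("texture frame is too short for the requested block length")

    # Long-term statistics: computed over the whole texture frame.
    long_mean = params.mean(axis=0)
    long_std = params.std(axis=0)

    # Short-term statistics: standard deviation inside consecutive blocks
    # of short_len frames, averaged over the blocks. Trailing frames that
    # do not fill a complete block are ignored.
    n_blocks = n_frames // short_len
    blocks = params[: n_blocks * short_len].reshape(n_blocks, short_len, n_params)
    short_std = blocks.std(axis=1).mean(axis=0)

    # Frame-to-frame variation: mean absolute first difference.
    mean_abs_delta = np.abs(np.diff(params, axis=0)).mean(axis=0)

    return np.concatenate([long_mean, long_std, short_std, mean_abs_delta])


# Example with synthetic data: 50 frames, 3 hypothetical parameters per frame.
rng = np.random.default_rng(0)
features = texture_frame_statistics(rng.normal(size=(50, 3)), short_len=5)
print(features.shape)
```

A vector of this kind, computed per texture frame, is the sort of input that a classifier such as a deep neural network could map to a bit rate label; the network itself is outside the scope of this sketch.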