Animal Sound Classification Using A Convolutional Neural Network


Sasmaz E., Tek F. B.

3rd International Conference on Computer Science and Engineering (UBMK), Sarajevo, Bosnia and Herzegovina, 20-23 September 2018, pp. 625-629

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/ubmk.2018.8566449
  • City of Publication: Sarajevo
  • Country of Publication: Bosnia and Herzegovina
  • Page Numbers: pp. 625-629
  • Keywords: Animal sound classification, Mel Frequency Cepstral Coefficient (MFCC), Convolution Neural Network (CNN), Confusion Matrix (CF)
  • Affiliated with Istanbul Technical University: No

Abstract

In this paper, we investigate the problem of animal sound classification using deep learning and propose a system based on a convolutional neural network architecture. As input to the network, sound files were preprocessed to extract Mel Frequency Cepstral Coefficients (MFCC) using the LibROSA library. To train and test the system, we collected 875 animal sound samples covering 10 different animal types from an online sound source site. We report classification confusion matrices and the results obtained with different gradient descent optimizers. The best accuracy, 75%, was obtained with the Nesterov-accelerated Adaptive Moment Estimation (Nadam) optimizer.
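
The abstract reports results as confusion matrices and an overall accuracy. As a minimal sketch of how those two quantities relate, the pure-Python example below builds a confusion matrix from true and predicted class indices and derives accuracy from its diagonal. The function names, the three-class setup, and the toy labels are all hypothetical and chosen for illustration only; they are not the paper's data, code, or results.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Build a num_classes x num_classes count matrix.

    Rows index the true class, columns index the predicted class.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy labels for three hypothetical classes (illustrative, not from the paper).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)

for row in cm:
    print(row)
print(f"accuracy = {acc:.3f}")
```

Off-diagonal entries show which classes are mistaken for which, detail that a single accuracy figure hides. For the paper's setting, the same functions would presumably be called with `num_classes=10`.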