Fractional-Order Calculus-Based Data Augmentation Methods for Environmental Sound Classification with Deep Learning


Yazgaç B. G. , Kırcı M.

Fractal and Fractional, vol.6, no.10, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 6 Issue: 10
  • Publication Date: 2022
  • Doi Number: 10.3390/fractalfract6100555
  • Journal Name: Fractal and Fractional
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus, INSPEC, Directory of Open Access Journals
  • Keywords: data augmentation, deep learning, environmental sound classification, fractional order calculus
  • Istanbul Technical University Affiliated: Yes

Abstract

© 2022 by the authors. In this paper, we propose two fractional-order calculus-based data augmentation methods for audio signals. The first approach is based on fractional differentiation of the Mel scale: by using a randomly selected fractional derivative order, we warp the Mel scale and thereby augment Mel-scale-based time-frequency representations of audio data. The second approach builds on previous fractional-order image edge enhancement methods. Since many deep learning approaches treat Mel spectrogram representations as images, a fractional-order differential-based mask is applied to them. The mask parameters are generated from randomly selected fractional-order derivative parameters. The proposed data augmentation methods are applied to the UrbanSound8k environmental sound dataset. To classify the dataset and test the methods, an arbitrary convolutional neural network is implemented. Our results show that fractional-order calculus-based methods can be employed as data augmentation methods. When the dataset was increased to six times its original size, classification accuracy increased by around 8.5%. Additional tests on more complex networks also produced better accuracy than training on the non-augmented dataset. To our knowledge, this paper is the first example of employing fractional-order calculus as an audio data augmentation tool.
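
The mask-based idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the mask layout, the truncation length, or the range of derivative orders, so the function names (`gl_coefficients`, `fractional_diff_rows`, `augment`), the three-term truncation, and the order range 0.1 to 0.9 are all assumptions chosen for illustration. The sketch uses the standard Grünwald-Letnikov weights of a fractional difference and applies them along the time axis of a spectrogram-like 2D list, with the order drawn at random for each augmented copy.

```python
import random


def gl_coefficients(alpha, n_terms):
    """Grünwald-Letnikov weights (-1)^k * C(alpha, k) for k = 0..n_terms-1."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        # Recurrence: c_k = c_{k-1} * (1 - (alpha + 1) / k)
        coeffs.append(coeffs[-1] * (1.0 - (alpha + 1.0) / k))
    return coeffs


def fractional_diff_rows(spec, alpha, n_terms=3):
    """Apply a truncated GL fractional difference along each row (time axis).

    `spec` is a list of rows (frequency bins), each a list of floats.
    Samples before the start of a row are treated as zero.
    """
    coeffs = gl_coefficients(alpha, n_terms)
    out = []
    for row in spec:
        new_row = []
        for t in range(len(row)):
            acc = 0.0
            for k, c in enumerate(coeffs):
                if t - k >= 0:
                    acc += c * row[t - k]
            new_row.append(acc)
        out.append(new_row)
    return out


def augment(spec, rng, low=0.1, high=0.9):
    """Draw a random order in [low, high] (assumed range) and filter `spec`."""
    alpha = rng.uniform(low, high)
    return alpha, fractional_diff_rows(spec, alpha)


if __name__ == "__main__":
    toy_spec = [[1.0, 3.0, 6.0, 10.0], [2.0, 2.0, 2.0, 2.0]]
    order, augmented = augment(toy_spec, random.Random(0))
    print(order, augmented)
```

With `alpha = 0` the weights reduce to the identity and with `alpha = 1` to an ordinary first difference, so intermediate orders interpolate between leaving the spectrogram unchanged and fully differentiating it. Calling `augment` several times with different random orders yields several variants of each example, which is one way a dataset could be enlarged to a multiple of its original size.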