Improving Low Resource Turkish Speech Recognition with Data Augmentation and TTS


Ramazan G., YALÇIN H.

2019 16th International Multi-Conference on Systems, Signals & Devices (SSD), Istanbul, Turkey, 21 - 24 March 2019

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/ssd.2019.8893184
  • City of Publication: Istanbul, Turkey
  • Keywords: speech recognition, data augmentation, speech synthesis, low resource languages
  • Affiliated with Istanbul Technical University: Yes

Abstract

One of the major problems facing speech recognition researchers is the lack of training data. In this paper, we compare alternative ways of compensating for that lack. We run experiments with very limited training data to measure the effects of data augmentation and speech synthesis on speech recognition. For augmentation, we apply speed and volume perturbation. In addition, we generate synthetic speech with two different speech synthesis methods: in the first, Google Translate Text-to-Speech (gTTS) serves as the synthesizer; in the second, we train our own end-to-end Turkish TTS system. Finally, we examine the effects of all these methods on speech recognition for low-resource languages.
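
As a rough illustration of the two perturbations named in the abstract, the sketch below applies speed and volume perturbation to a waveform held in a NumPy array. It is not the authors' implementation: the function names `speed_perturb` and `volume_perturb`, the linear-interpolation resampling, the assumption that samples lie in [-1, 1], and the factors in the usage lines are all assumptions made for the example, since the abstract does not give these details.

```python
import numpy as np


def speed_perturb(signal: np.ndarray, factor: float) -> np.ndarray:
    """Resample `signal` so it plays `factor` times faster (hypothetical helper).

    Uses plain linear interpolation, so tempo and pitch change together.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = max(1, int(round(len(signal) / factor)))
    # Positions in the original signal from which each output sample is read.
    src_positions = np.linspace(0, len(signal) - 1, num=n_out)
    return np.interp(src_positions, np.arange(len(signal)), signal)


def volume_perturb(signal: np.ndarray, gain: float) -> np.ndarray:
    """Scale the amplitude by `gain`, clipping to the assumed [-1, 1] range."""
    return np.clip(signal * gain, -1.0, 1.0)


# Usage: one second of a 440 Hz tone at an assumed 16 kHz sampling rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

# Illustrative factors only; the abstract does not state the values used.
augmented = [speed_perturb(tone, f) for f in (0.9, 1.1)]
augmented.append(volume_perturb(tone, 1.5))
```

Each augmented copy keeps the transcript of the original utterance, which is what lets perturbation enlarge a small training set without new labeling effort.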