Ego noise cancellation of a robot using missing feature masks


Ince G., NAKADAI K., RODEMANN T., TSUJINO H., IMURA J.

APPLIED INTELLIGENCE, vol.34, no.3, pp.360-371, 2011 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 34 Issue: 3
  • Publication Date: 2011
  • DOI: 10.1007/s10489-011-0285-0
  • Journal Name: APPLIED INTELLIGENCE
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.360-371
  • Istanbul Technical University Affiliated: No

Abstract

We describe an architecture that gives a robot the capability to recognize speech by cancelling ego noise, even while the robot is moving. The system consists of three blocks: (1) a multi-channel noise reduction block, comprising consecutive stages of microphone-array-based sound localization, geometric source separation and post-filtering; (2) a single-channel noise reduction block utilizing template subtraction; and (3) an automatic speech recognition block. In this work, we specifically investigate a missing feature theory-based automatic speech recognition (MFT-ASR) approach in block (3). This approach makes use of spectro-temporal elements derived from (1) and (2) to measure the reliability of the acoustic features, and generates masks to filter unreliable acoustic features. We then evaluate this system on a robot using word correct rates. Furthermore, we present a detailed analysis of recognition accuracy to determine optimal parameters. Implementation of the proposed MFT-ASR approach resulted in significantly higher recognition performance than single- or multi-channel noise reduction methods.
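The mask-generation idea in the abstract can be illustrated with a small sketch. The abstract does not give the paper's reliability measure, so everything here is an assumption made for illustration: reliability is taken as the share of a time-frequency bin's energy that remains after noise reduction, a fixed threshold turns it into a binary (hard) mask, and the names `reliability`, `hard_mask` and `threshold` are hypothetical. The energies are made-up toy values, not data from the paper.

```python
def reliability(cleaned_energy, noise_estimate):
    """Share of one time-frequency bin's energy attributed to speech.

    Assumed measure for illustration only; the paper's definition may differ.
    """
    total = cleaned_energy + noise_estimate
    return cleaned_energy / total if total > 0.0 else 0.0


def hard_mask(cleaned, noise, threshold=0.5):
    """Binary missing-feature mask: 1 marks a reliable bin, 0 an unreliable one."""
    return [
        [1 if reliability(c, n) >= threshold else 0 for c, n in zip(c_row, n_row)]
        for c_row, n_row in zip(cleaned, noise)
    ]


# Toy energies: 2 frames x 3 frequency bins (invented numbers).
cleaned = [[4.0, 1.0, 0.0], [9.0, 2.0, 3.0]]   # energy after noise reduction
noise = [[1.0, 3.0, 2.0], [1.0, 2.0, 9.0]]     # estimated ego-noise energy
mask = hard_mask(cleaned, noise)
print(mask)  # [[1, 0, 0], [1, 1, 0]]
```

In an MFT-ASR setting, a recognizer would then discount or ignore the features whose mask value is 0 when scoring acoustic models; a soft mask with values between 0 and 1 is a common alternative to the hard threshold used here.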