Whole Body Motion Noise Cancellation of a Robot for Improved Automatic Speech Recognition

Ince G., Nakadai K., Rodemann T., Tsujino H., Imura J.

ADVANCED ROBOTICS, vol.25, pp.1405-1426, 2011 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 25
  • Publication Date: 2011
  • Doi Number: 10.1163/016918611x579448
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1405-1426
  • Istanbul Technical University Affiliated: No


The motors of a robot produce ego-motion noise that degrades the quality of recorded sounds. This paper describes an architecture that enhances the capability of a robot to perform automatic speech recognition (ASR) even as the entire body of the robot moves. The architecture consists of three blocks: (i) a multi-channel noise reduction block, consisting of microphone-array-based sound localization, geometric source separation and post-filtering, (ii) a single-channel template subtraction block and (iii) an ASR block. As the first step of our analysis strategy, we divided the whole-body motion noise problem into three subdomains of arm, leg and head motion noise, according to their intensity levels and spatial locations. Subsequently, by following a synthesis-by-analysis approach, we determined the best method for suppressing each type of ego-motion noise. Finally, we proposed a control module for our ASR framework; this module was designed to make decisions based on instantaneously detected motions, allowing it to switch to the most appropriate method for the current type of noise. The proposed system improved word correct rates by up to 50 points compared with single-microphone recognition during arm, leg and head motions. © Koninklijke Brill NV, Leiden, 2011
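
The switching behaviour of the control module described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Motion` enum, the method labels, the `select_method` function, the priority rule for simultaneous motions, and above all the example policy, whose entries are placeholders. The abstract does not state which suppression method was found best for which motion type, so the mapping is left to the caller.

```python
from enum import Enum
from typing import Dict, Iterable, Sequence


class Motion(Enum):
    """The three ego-motion noise subdomains named in the abstract."""
    ARM = "arm"
    LEG = "leg"
    HEAD = "head"


# Hypothetical labels for the available processing paths.
MULTI_CHANNEL = "multi_channel_noise_reduction"
TEMPLATE_SUBTRACTION = "template_subtraction"
NO_SUPPRESSION = "no_suppression"


def select_method(
    active: Iterable[Motion],
    policy: Dict[Motion, str],
    priority: Sequence[Motion],
    default: str = NO_SUPPRESSION,
) -> str:
    """Pick a suppression method for the currently detected motions.

    `policy` maps each motion type to a method label. `priority` is an
    assumed tie-break order for the case where several motions are
    detected at once; the first active motion in that order decides.
    """
    active_set = set(active)
    for motion in priority:
        if motion in active_set:
            return policy[motion]
    return default


# PLACEHOLDER policy and priority: illustrative values only,
# not results reported by the paper.
example_policy = {
    Motion.ARM: TEMPLATE_SUBTRACTION,
    Motion.LEG: MULTI_CHANNEL,
    Motion.HEAD: TEMPLATE_SUBTRACTION,
}
example_priority = [Motion.HEAD, Motion.ARM, Motion.LEG]

chosen = select_method([Motion.ARM, Motion.LEG], example_policy, example_priority)
print(chosen)
```

Keeping the policy as data rather than hard-coding it in the function means the per-motion choice can be replaced with whatever an evaluation actually supports, without touching the dispatch logic.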