Dance movement is intrinsically connected to the rhythm of music and is a fundamental form of nonverbal communication present in daily human interactions. To interact with humans through dance in natural real-world environments, robots must be able to listen to music, robustly track the beat of continuous musical stimuli, and simultaneously respond to human speech. In this paper, we propose integrating a real-time beat tracking system with state recovery and different preprocessing solutions used in robot audition, for application to interactive dancing robots. The proposed system is assessed under real-world acoustic conditions of increasing complexity, which consider multiple audio sources of different kinds, multiple noise sources of different natures, continuous musical and speech stimuli, and the effects of beat-synchronous ego-motion noise and of jitter in ego noise (EN). The overall results suggest improved beat tracking accuracy with lower reaction times to music transitions, while still enhancing automatic speech recognition (ASR) run in parallel, even in the most challenging conditions. These results support the application of the proposed system to interactive dancing robots.