A Supervised Learning Classifier for Replayed Voice Attack Detection

Abdulrahman N., Al Shareeda S. Y. A., Ali D.

2022 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies, 3ICT 2022, Virtual, Online, Bahrain, 20 - 21 November 2022, pp.167-174

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/3ict56508.2022.9990744
  • City: Virtual, Online
  • Country: Bahrain
  • Page Numbers: pp.167-174
  • Keywords: biometric security, GNB classifier, replayed voice detector, shallow machine learning, voice features
  • Istanbul Technical University Affiliated: Yes


Google's voice assistant offers a Voice Match feature that recognizes only its enrolled user's voice. However, it cannot distinguish between an authentic live human voice and an audio replay of the same person's voice. This work develops a shallow-learning Gaussian Naive Bayes (GNB) voice-replay detector to add this missing layer of verification. In the front-end feature extraction stage, the model extracts the Mel-Frequency Cepstral Coefficients (MFCC) and Constant-Q Cepstral Coefficients (CQCC) from the input voice signal. The extracted features are passed to the GNB classifier, which labels the input speech as either genuine (from a live source) or replayed (from a previously recorded source). The GNB classifier is trained on extensive datasets of labeled speech feature samples from both classes. Classifier performance is measured by the Equal Error Rate (%EER). The trained GNB classifier is tested on extensive development and evaluation datasets to optimize performance under various reduction, normalization, and filtration settings. The best (lowest) %EER values for the GNB classifier are 14.3553% on the development set and 19.8722% on the evaluation set. A real-time experiment with the developed model is conducted to support the obtained performance results.
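
The classify-then-score pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors below are synthetic Gaussian samples standing in for MFCC/CQCC features, the label convention (1 = genuine, 0 = replayed) is an assumption, and every function name (`fit_gnb`, `genuine_score`, `equal_error_rate`, `make_features`) is hypothetical. It only shows how a Gaussian Naive Bayes model could produce a score per utterance and how an %EER could be read off those scores.

```python
import math
import random


def fit_gnb(X, y):
    """Estimate log-prior, per-feature means and variances for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n, d = len(rows), len(rows[0])
        means = [sum(r[j] for r in rows) / n for j in range(d)]
        # Small constant keeps variances strictly positive.
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / n + 1e-9
            for j in range(d)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def log_joint(model, label, x):
    """log P(class) + sum of per-feature Gaussian log-densities."""
    log_prior, means, variances = model[label]
    total = log_prior
    for xj, m, v in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
    return total


def genuine_score(model, x):
    """Higher score means 'more likely genuine' (assumed labels: 1 genuine, 0 replayed)."""
    return log_joint(model, 1, x) - log_joint(model, 0, x)


def equal_error_rate(genuine_scores, replay_scores):
    """Sweep thresholds over observed scores; return EER in percent.

    FRR = genuine trials rejected (score < threshold),
    FAR = replayed trials accepted (score >= threshold).
    The EER is taken where |FAR - FRR| is smallest.
    """
    best_gap, eer = float("inf"), None
    for thr in sorted(set(genuine_scores) | set(replay_scores)):
        frr = sum(s < thr for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= thr for s in replay_scores) / len(replay_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), 100.0 * (far + frr) / 2
    return eer


def make_features(center, n, dim, rng):
    """Synthetic stand-in for extracted MFCC/CQCC feature vectors (assumption)."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
DIM = 4  # illustrative feature dimension, not taken from the paper

# Training data: 200 "genuine" and 200 "replayed" synthetic vectors.
train_X = make_features(1.0, 200, DIM, rng) + make_features(-1.0, 200, DIM, rng)
train_y = [1] * 200 + [0] * 200
model = fit_gnb(train_X, train_y)

# Held-out scores for each class, then the EER over them.
eval_genuine = [genuine_score(model, x) for x in make_features(1.0, 100, DIM, rng)]
eval_replay = [genuine_score(model, x) for x in make_features(-1.0, 100, DIM, rng)]
eer = equal_error_rate(eval_genuine, eval_replay)
print(f"EER = {eer:.2f}%")
```

Scoring with a log-likelihood ratio rather than a hard class label is a design choice of this sketch: the EER needs a continuous score so that the acceptance threshold can be swept until the false acceptance and false rejection rates meet.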