Voice Command Recognition for Drone Control by Deep Neural Networks on Embedded System


Yapicioglu C., Dokur Z., Ölmez T.

8th International Conference on Electrical and Electronics Engineering (ICEEE), Antalya, Turkey, 9-11 April 2021, pp.65-72

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/iceee52452.2021.9415964
  • City: Antalya
  • Country: Turkey
  • Page Numbers: pp.65-72
  • Keywords: speech recognition, spectrogram, convolutional neural networks, deep learning, speech processing, image processing, embedded systems
  • Istanbul Technical University Affiliated: Yes


Speech recognition and its application to system control have been an important and attractive research topic over the last few decades. Controlling electronic devices with speech commands lets users manage systems quickly and easily, since they need no additional information or remote controller. Communicating with a system through speech commands also brings the requirement of a fast and accurate response. For this reason, speech recognition algorithms currently run mostly on high-performance computers. However, improvements in system-on-a-chip (SoC) boards and in deep neural network based algorithms make it possible to execute such programs on embedded boards. This study proposes a model for controlling a drone in real time with Turkish directional speech commands, based on deep learning applied to spectrogram images. First, speech commands are detected in real time using signal energy and zero-crossing rate, and the detected segments are converted to log-spectrogram images. A CNN (three convolutional layers and one fully connected layer) is built and trained on those images. The trained model is then moved to an embedded board to achieve real-time, low-cost performance. Speech commands spoken by the user are passed immediately to the model as input. The algorithm then decides which directional command was given, and the corresponding operation is performed on the drone. With the proposed model, accuracies of 95.72% on the offline dataset and 92.88% for real-time classification are obtained.
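
The detection and conversion steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length and hop, both thresholds, the decision rule in is_speech, and the synthetic test signal are all assumptions chosen for illustration, and a naive DFT is used only to keep the example free of external libraries.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_speech(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Assumed rule: loud frames, or quieter frames with many zero crossings."""
    energy = short_time_energy(frame)
    if energy > energy_thresh:
        return True
    return energy > 0.1 * energy_thresh and zero_crossing_rate(frame) > zcr_thresh

def log_spectrum(frame):
    """Log power spectrum of a Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2 * math.pi * k * i / n) for i, w in enumerate(windowed))
        im = -sum(w * math.sin(2 * math.pi * k * i / n) for i, w in enumerate(windowed))
        # Small offset avoids log(0) on silent bins.
        spectrum.append(math.log(re * re + im * im + 1e-10))
    return spectrum

# Synthetic stand-in for a recording: silence, a 1 kHz tone at 8 kHz sampling, silence.
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 8000) for n in range(400)]
signal = [0.0] * 400 + tone + [0.0] * 400

frames = frame_signal(signal, frame_len=200, hop=100)
flags = [is_speech(f) for f in frames]
# One column of the log spectrogram per detected frame.
log_spectrogram = [log_spectrum(f) for f, keep in zip(frames, flags) if keep]
```

In a real-time setting the thresholds would likely need tuning against background noise, and an FFT routine would replace the naive DFT; the resulting columns would then be stacked into the image that is fed to the CNN.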