Volume 12, Number 2

Tuning Dari Speech Classification Employing Deep Neural Networks

  Authors

Mursal Dawodi and Jawid Ahmad Baktash, Avignon University, France

  Abstract

Recently, many researchers have focused on building and improving speech recognition systems to facilitate and enhance human-computer interaction. Today, Automatic Speech Recognition (ASR) systems have become important and common tools in applications ranging from games to translation systems and robots. However, there is still a need for research on speech recognition systems for low-resource languages. This article addresses isolated-word recognition for the Dari language, using Mel-frequency cepstral coefficients (MFCCs) for feature extraction and three different deep neural networks: a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), and a Multilayer Perceptron (MLP). We evaluate our models on a corpus of isolated Dari words that we built ourselves, consisting of 1000 utterances of 20 short Dari terms. The study obtained an average accuracy of 98.365%.

  Keywords

Dari, deep neural network, speech recognition, recurrent neural network, multilayer perceptron, convolutional neural network