Volume 13, Number 1

ASERS-CNN: Arabic Speech Emotion Recognition System based on CNN Model


Mohammed Tajalsir1, Susana Muñoz Hernández2 and Fatima Abdalbagi Mohammed1, 1Sudan University of Science and Technology, Sudan, 2Technical University of Madrid (UPM), Computer Science School (FI), Spain


When two people talk on the phone, they cannot observe each other's facial expressions or physiological state, yet each can roughly estimate the other's emotional state from voice alone. In medical care, if the emotional state of a patient is known, especially a patient with an expression disorder, care measures can be adapted to the patient's mood to increase the amount of care. A system capable of recognizing a person's emotional state from speech is known as a speech emotion recognition (SER) system. Deep learning is one of the most widely used techniques in emotion recognition studies. In this paper we propose ASERS-CNN, a model for Arabic speech emotion recognition based on a convolutional neural network (CNN). We evaluate the model on an Arabic speech dataset, the Basic Arabic Expressive Speech corpus (BAES-DB). In addition, we compare the accuracy of our previous ASERS-LSTM model with that of the new ASERS-CNN model and find that ASERS-CNN outperforms ASERS-LSTM, reaching 98.18% accuracy.


BAES-DB, ASERS-LSTM, Deep learning, Speech emotion recognition.