End-to-End Speech Emotion Recognition Using Deep Learning

Ajai Jose Jacob; Aswin A. Jacob; Ashly Mathew

Authors

Ajai Jose Jacob, Department of Computer Applications, Saintgits College of Applied Sciences, Kottayam, India
Aswin A. Jacob, Department of Computer Applications, Saintgits College of Applied Sciences, Kottayam, India
Ashly Mathew, Department of Computer Applications, Saintgits College of Applied Sciences, Kottayam, India

Keywords:

Deep neural networks, DNN architecture, Voice Activity Detector (VAD), Deep learning

Abstract

This work evaluates speech emotion recognition with a deep neural network built from convolutional, pooling, and fully connected layers, using the German Corpus database to test the technique's effectiveness. The database contains audio recordings of ten actors, five male and five female. The recordings cover seven emotional states for each actor, of which only three are used here. Each recording is divided into 20-millisecond segments. Segments that contain no speech are identified by a Voice Activity Detector (VAD) and removed. The remaining segments are split into training, validation, and testing sets. The deep neural network is optimized using stochastic gradient descent. In the experiment, the network reached an accuracy of about 96%.
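The preprocessing described in the abstract (20-millisecond segmentation, removal of empty segments, and a three-way split) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the energy threshold is a simple stand-in for the Voice Activity Detector, and the sample rate and the 80/10/10 split fractions are illustrative values that the abstract does not specify.

```python
import numpy as np


def split_into_segments(signal, sample_rate, segment_ms=20):
    """Cut a 1-D signal into non-overlapping segments of segment_ms milliseconds."""
    seg_len = int(sample_rate * segment_ms / 1000)
    n_segments = len(signal) // seg_len
    # Trailing samples that do not fill a whole segment are discarded.
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)


def drop_silent_segments(segments, energy_threshold=1e-4):
    """Keep segments whose mean energy exceeds a threshold.

    Assumption: an energy threshold stands in for the VAD here; the
    abstract does not say which detector was used.
    """
    energy = np.mean(segments.astype(np.float64) ** 2, axis=1)
    return segments[energy > energy_threshold]


def split_dataset(segments, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle segments and split them into training, validation and testing sets.

    Assumption: the 80/10/10 fractions are illustrative only.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(segments))
    n_train = int(train_frac * len(segments))
    n_val = int(val_frac * len(segments))
    train = segments[order[:n_train]]
    val = segments[order[n_train:n_train + n_val]]
    test = segments[order[n_train + n_val:]]
    return train, val, test


if __name__ == "__main__":
    sample_rate = 16000  # assumed sample rate, for illustration only
    t = np.arange(sample_rate) / sample_rate
    voiced = 0.5 * np.sin(2 * np.pi * 440 * t)  # one second of synthetic tone
    silence = np.zeros(sample_rate)             # one second of silence
    signal = np.concatenate([voiced, silence])

    segments = split_into_segments(signal, sample_rate)
    kept = drop_silent_segments(segments)
    train, val, test = split_dataset(kept)
    print(segments.shape, kept.shape, len(train), len(val), len(test))
```

In practice the segments would carry emotion labels, and a split by recording or by speaker would avoid segments from the same utterance landing in both the training and testing sets. The abstract does not say how the split was made.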
