Empirical-Based Fusion Deep Convolutional Neural Network for Multimodal Emotion Recognition
Abstract
Emotion recognition plays an important role in identifying a person's feelings. Relying on a single modality often yields inaccurate recognition when that modality is ambiguous. This research develops a new model, an empirical-based fusion deep convolutional neural network (EBF-DCNN), for emotion recognition. The proposed EBF-DCNN model extracts features from audio, visual, and text modalities to enhance the emotion recognition process. In this approach, three DCNN models are trained, one per modality, which reduces time dependencies and makes recognition faster than other methods. The model adopts an empirical fusion method to combine the three modalities, which helps avoid over-fitting. The DCNN model achieves better results while also reducing computational complexity. Moreover, the model is flexible and scalable for recognizing human emotions. The performance of the EBF-DCNN model is evaluated using four metrics: accuracy, precision, recall, and F1 score, achieving 94.33%, 93.80%, 94.08%, and 93.94%, respectively, for emotion recognition, outperforming other state-of-the-art methods.
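To make the described architecture concrete, the sketch below shows one plausible reading of it in PyTorch: three modality-specific CNN branches whose feature vectors are fused by empirically chosen weights before classification. This is not the authors' implementation; all layer sizes, input shapes, the class count, and the fusion weights are illustrative assumptions, since the abstract does not specify them.

# Minimal sketch of a three-branch multimodal CNN with weighted late fusion.
# All hyperparameters here are hypothetical placeholders.
import torch
import torch.nn as nn

class ModalityBranch(nn.Module):
    """A small CNN mapping one modality (given as a 2-D map, e.g. a
    spectrogram, face crop, or text-embedding grid) to a feature vector."""
    def __init__(self, in_channels: int, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> (B, 64, 1, 1)
            nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, x):
        return self.net(x)

class EBFDCNNSketch(nn.Module):
    """Three modality branches fused by fixed scalar weights; under the
    paper's empirical fusion idea these weights would presumably be tuned
    on a validation set rather than hard-coded."""
    def __init__(self, num_classes: int = 7, feat_dim: int = 128):
        super().__init__()
        self.audio = ModalityBranch(1, feat_dim)   # e.g. mel-spectrogram
        self.visual = ModalityBranch(3, feat_dim)  # e.g. RGB face crop
        self.text = ModalityBranch(1, feat_dim)    # e.g. embedding matrix
        # Hypothetical empirically tuned fusion weights (sum to 1).
        self.register_buffer("w", torch.tensor([0.4, 0.4, 0.2]))
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, audio, visual, text):
        fused = (self.w[0] * self.audio(audio)
                 + self.w[1] * self.visual(visual)
                 + self.w[2] * self.text(text))
        return self.classifier(fused)

# Usage with dummy inputs (batch of 2):
# logits = EBFDCNNSketch()(torch.randn(2, 1, 64, 64),
#                          torch.randn(2, 3, 64, 64),
#                          torch.randn(2, 1, 64, 64))

Fusing at the feature level with a small set of tuned scalar weights, rather than training one large joint network, is consistent with the abstract's claims that each branch can be trained independently and that the fusion step helps limit over-fitting.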