Multimodal Emotion Recognition from Speech and Facial Expressions using a Novel PSO-AOA-MAUT and a Deep Convolutional Neural Network
Abstract
Rapid growth in automation and robotics has driven demand for effective human-machine interaction that accounts for human emotions. Deep learning (DL) based multimodal emotion recognition (MER) has shown higher reliability, accuracy, and security than unimodal emotion recognition (UER) systems. However, the effectiveness of MER is limited by the large feature vector, which increases the complexity and the number of trainable parameters of the DL framework. This work presents an MER system based on speech and facial expressions using a Deep Convolutional Neural Network (DCNN). It employs a novel hybrid of Particle Swarm Optimization, the Archimedes Optimization Algorithm, and Multi-Attribute Utility Theory (PSO-AOA-MAUT) for prominent feature selection. The AOA component helps attain better convergence and a balanced trade-off between the exploration and exploitation of particles. The multi-criteria decision-making MAUT algorithm computes the weights of the fitness function in the PSO-AOA-based feature selection scheme. The proposed MER system is evaluated on the BAUM dataset, where it achieves an improved accuracy of 98.33%, a recall of 0.98, a precision of 0.97, and an F1-score of 0.97 compared with traditional methods.
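
To make the feature-selection idea concrete, below is a minimal, self-contained sketch, not the authors' implementation: a binary PSO wrapper whose fitness is a MAUT-style weighted utility combining classification accuracy with feature reduction. The dataset (scikit-learn's wine data), the KNN evaluator, the utility weights w_acc/w_size, and all PSO coefficients are illustrative assumptions; the AOA update step of the paper's hybrid is indicated only by a comment.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)   # stand-in for the BAUM feature vectors
n_feat = X.shape[1]

def fitness(mask):
    # MAUT-style utility: weighted sum of classification accuracy and a
    # feature-reduction term (both weights are assumed, not from the paper).
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(),
                          X[:, mask.astype(bool)], y, cv=3).mean()
    reduction = 1.0 - mask.sum() / n_feat
    w_acc, w_size = 0.9, 0.1
    return w_acc * acc + w_size * reduction

n_particles, n_iter = 10, 15
pos = (rng.random((n_particles, n_feat)) > 0.5).astype(float)  # binary masks
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    # Standard PSO velocity update; the paper's hybrid would interleave AOA
    # density/volume updates here to balance exploration and exploitation.
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    prob = 1.0 / (1.0 + np.exp(-vel))          # sigmoid transfer to [0, 1]
    pos = (rng.random(pos.shape) < prob).astype(float)
    fits = np.array([fitness(p) for p in pos])
    better = fits > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fits[better]
    gbest = pbest[pbest_fit.argmax()].copy()

print(f"selected {int(gbest.sum())}/{n_feat} features, "
      f"utility={pbest_fit.max():.3f}")
```

In a full pipeline, the selected mask would be applied to the fused speech and facial feature vectors before they are fed to the DCNN, shrinking its input size and trainable parameter count as the abstract describes.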