A Study of Federated Learning based Speaker Verification System
Abstract
In recent years, speaker verification has gained significant importance as a crucial component of biometric authentication systems. Deep learning (DL) techniques have revolutionized speaker verification by enabling systems to automatically learn discriminative features from raw audio signals. However, the effectiveness of DL models relies heavily on the availability of large-scale datasets, which raises privacy concerns associated with centralized data collection. To address these challenges, federated learning (FL) has emerged as a promising approach, allowing collaborative model training across distributed data sources while preserving data privacy. This paper provides a comprehensive review of recent advancements in speaker verification through the integration of deep federated learning (DFL). Several deep learning techniques, namely convolutional neural networks (CNNs), deep neural networks (DNNs), recurrent neural networks (RNNs), and deep belief networks (DBNs), are combined with the federated averaging algorithm to enhance speaker verification performance. The CNN-based federated learning model exhibits the best overall performance, with an EER of 2.42% and a MinDCF of 0.048, compared to the DNN, RNN, and DBN models with EERs of 3.45%, 3.64%, and 4.18% and MinDCFs of 0.0567, 0.0670, and 0.0725, respectively.
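For readers unfamiliar with the aggregation step underlying the reviewed systems, the following is a minimal sketch of federated averaging (FedAvg), in which a server combines locally trained client models weighted by their data sizes. The model layout, client counts, data sizes, and local update routine are illustrative placeholders, not the implementation evaluated in this work.

    # Minimal FedAvg sketch: average client parameter sets, weighted by local data size.
    import numpy as np

    def federated_average(client_weights, client_sizes):
        """Return the size-weighted average of per-client parameter lists."""
        total = float(sum(client_sizes))
        averaged = []
        for layer_idx in range(len(client_weights[0])):
            layer = sum(
                (size / total) * weights[layer_idx]
                for weights, size in zip(client_weights, client_sizes)
            )
            averaged.append(layer)
        return averaged

    # Example round: three clients each refine a copy of a small two-layer model.
    rng = np.random.default_rng(0)
    global_model = [rng.standard_normal((4, 3)), rng.standard_normal(3)]

    def local_update(model, noise=0.01):
        # Stand-in for local training on a client's private speech data.
        return [w + noise * rng.standard_normal(w.shape) for w in model]

    clients = [local_update(global_model) for _ in range(3)]
    sizes = [1200, 800, 500]  # hypothetical utterance counts per client
    global_model = federated_average(clients, sizes)

In each communication round, only these averaged parameters leave the clients; the raw audio never does, which is the privacy property that motivates the federated setting discussed above.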