Balancing Privacy and Performance: A Review of Differential Privacy in Deep and Federated Learning
Abstract
The growing dependence on machine learning and deep learning models trained on sensitive user data has raised important privacy concerns that demand strong safeguards against the disclosure of personal information. Differential Privacy (DP) has become a crucial technique for limiting what an adversary can learn about any individual from a dataset, and in distributed settings such as federated learning the need for private and secure computation is even more pressing. However, while DP approaches are rigorous from a privacy standpoint, realizing them in practical systems is not straightforward, especially for large and complex models such as deep neural networks. These difficulties often force a trade-off between model accuracy and computational cost, which limits the usability of DP in practical settings. This review surveys the use of DP in deep learning and federated learning frameworks for protecting personal data by adding calibrated noise to data and models. The study examines the core mechanisms of DP, the probability distributions used to generate noise (Gaussian and Laplace), and how they are applied in practice to bridge the gap between theory and real-world deployment. The comparison highlights major trends in privacy, accuracy, and robustness, and reveals significant shortcomings in maintaining model performance while achieving strong privacy guarantees. It further discusses the challenge of scaling DP to large datasets and of controlling the level of noise so that it does not compromise the predictive capability of the model. The paper concludes with a detailed account of how DP improves the privacy of deep learning and the issues that must be addressed for better implementation in challenging learning environments.
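For readers unfamiliar with how calibrated noise is added in practice, the sketch below illustrates the classical Laplace and Gaussian mechanisms that the review builds on. It is a minimal, illustrative example only: NumPy is assumed, and the function names and the bounded-mean query are ours for demonstration, not taken from any specific system discussed in the paper.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` with (epsilon, 0)-DP by adding Laplace noise.

    `sensitivity` is the L1 sensitivity of the query, i.e. the largest
    change in `value` caused by adding or removing one individual.
    """
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return value + rng.laplace(loc=0.0, scale=scale)

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    """Release `value` with (epsilon, delta)-DP by adding Gaussian noise.

    `sensitivity` here is the L2 sensitivity; the standard deviation
    below follows the classical bound (valid for epsilon < 1).
    """
    rng = rng or np.random.default_rng()
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(loc=0.0, scale=sigma)

# Illustrative query: privately release the mean of a bounded attribute in [0, 1].
data = np.clip(np.random.rand(1000), 0.0, 1.0)
true_mean = data.mean()
sens = 1.0 / len(data)  # sensitivity of the mean for values bounded in [0, 1]
print(laplace_mechanism(true_mean, sens, epsilon=0.5))
print(gaussian_mechanism(true_mean, sens, epsilon=0.5, delta=1e-5))
```

The example also hints at the trade-off the abstract describes: smaller epsilon (stronger privacy) means larger noise scale, which degrades the accuracy of the released statistic, and the same tension carries over to gradients and model parameters in deep and federated learning.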