Ravi Kumar, Rushil Shah, and Shaurya Jain. 2024. "Privacy-Preserving Machine Learning: Balancing Innovation and Data Security." ESP International Journal of Advancements in Science & Technology (ESP-IJAST) 2(3): 82-94.
Privacy-preserving machine learning (PPML) is an emerging interdisciplinary field concerned with training machine learning models without exposing the underlying data. As data-driven systems spread into sensitive domains, the information used for ML must be protected. This paper reviews the state of the art in PPML, covering differential privacy, homomorphic encryption, secure multiparty computation, and federated learning, and examines how these methods balance innovation against privacy regulation, computational cost, algorithmic trade-offs, and accountability. We first survey the current literature to assess the performance of existing methodologies, and then design a novel framework for privacy-preserving machine learning that combines state-of-the-art privacy-enhancing techniques with a modular ML pipeline suited to a wide range of applications. Experimental results illustrate the typical costs and trade-offs involved in protecting privacy. The paper concludes with directions for further research, emphasizing the role of interdisciplinary collaboration in advancing PPML.
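To make the differential-privacy building block named in the abstract concrete, the following minimal Python sketch releases a statistic via the Laplace mechanism, with noise calibrated to sensitivity/epsilon in the standard way. This is an illustration only, not the framework proposed in the paper; the function name laplace_mechanism, the toy data in [0, 1], and the choice epsilon = 0.5 are assumptions made for the example.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release a noisy statistic satisfying epsilon-differential privacy.

    Noise is drawn from Laplace(0, sensitivity / epsilon), the classic
    calibration of noise to query sensitivity.
    """
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Illustrative use: privately release the mean of values bounded in [0, 1].
# Changing one of the n records moves the mean by at most 1/n, so the
# sensitivity of the mean query is 1/n.
data = np.random.rand(1000)          # hypothetical toy dataset
sensitivity = 1.0 / len(data)
private_mean = laplace_mechanism(data.mean(), sensitivity, epsilon=0.5)
print(f"true mean = {data.mean():.4f}, private mean = {private_mean:.4f}")
```

Smaller epsilon values inject more noise and give stronger privacy, which is the accuracy-versus-privacy trade-off the paper's experiments quantify.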
Keywords: Privacy-Preserving Machine Learning, Differential Privacy, Homomorphic Encryption, Federated Learning, Secure Multiparty Computation, Artificial Intelligence.