| Graduate Student: | 陳彥廷 Chen, Yan-Ting |
|---|---|
| Thesis Title: | 基於雙向知識蒸餾增強個人化聯盟式學習效能之方法 (Enhancing Personalized Federated Learning Using Bidirectional Knowledge Distillation) |
| Advisor: | 劉任修 Liu, Ren-Shiou |
| Degree: | Master (碩士) |
| Department: | Institute of Information Management, College of Management (管理學院 - 資訊管理研究所) |
| Year of Publication: | 2023 |
| Graduation Academic Year: | 111 (ROC calendar; AY 2022–2023) |
| Language: | Chinese |
| Pages: | 67 |
| Chinese Keywords: | 聯盟式學習、知識蒸餾、雙向蒸餾 |
| English Keywords: | Knowledge Distillation, Bidirectional Distillation, Federated Learning |
With the rapid development of artificial intelligence, industries' expectations for AI performance keep rising: machine learning models must make ever more accurate predictions and recommendations, which in turn demands far larger and more diverse training data. Yet most domains hold only limited amounts of data, and competing enterprises cannot easily exchange it. In recent years, the public has also grown increasingly aware of data privacy, making data collection substantially harder. Against these challenges, federated learning has emerged as a key enabling technology.
Federated learning's decentralized training architecture lets participating devices train without uploading their private data: each device uses its own private data to jointly train a shared model, addressing both data collection and privacy concerns at once. As federated learning has developed rapidly in recent years, its problems have also surfaced. Beyond unequal capabilities across devices, which prevent the shared model from generalizing well, differences in each device's data distribution also degrade model performance. Achieving more personalized federated learning has therefore become a major direction for future research.
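The shared-model training described above is commonly realized with FedAvg-style aggregation, where the server averages client parameters weighted by local dataset size. The following is a minimal sketch of that aggregation step only, as a generic illustration rather than the thesis's exact procedure; all function and variable names are ours.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Weighted average of client parameter vectors, with each client
    weighted by the size of its local (private) dataset."""
    coeffs = np.asarray(client_sizes, dtype=float)
    coeffs /= coeffs.sum()
    # (n_clients,) @ (n_clients, n_params) -> (n_params,)
    return coeffs @ np.stack(client_params)

# Toy usage: three clients, parameters flattened into 10-dim vectors.
rng = np.random.default_rng(0)
client_params = [rng.standard_normal(10) for _ in range(3)]
global_params = fedavg(client_params, client_sizes=[1200, 300, 500])
```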
This study incorporates bidirectional distillation, which can bridge differences between models, into the federated learning setting to cope with model heterogeneity across training devices. Models with similar characteristics are first clustered, and each cluster has a Prototype Model. A larger Meta Model (global model) is then trained on top of the Prototype Models, and bidirectional distillation transfers knowledge between them, improving the performance of both and, in turn, of every device-side model. Experiments show that the new Prototype Models obtained by distilling from the Meta Model achieve higher accuracy than the original models.
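The abstract does not specify which similarity measure drives the grouping step. As one hedged reading, the sketch below clusters clients by k-means over their flattened parameter vectors; both the criterion and the use of scikit-learn are our assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_clients(param_vectors, n_clusters=3, seed=0):
    """Group clients whose uploaded models look alike, so that each
    cluster can be aggregated into its own Prototype Model. The
    similarity criterion (k-means on flattened parameters) is an
    assumption; the abstract does not name one."""
    X = np.stack(param_vectors)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(X)  # labels[i] = Prototype-Model group of client i
```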
In recent years, the demand for artificial intelligence (AI) has risen steadily across industries, and training AI models effectively now requires significantly larger and more diverse datasets. In reality, however, most fields have limited access to data, and businesses struggle to exchange information because of competitive relationships. Growing public awareness of data privacy has further complicated data collection. The resulting inability to acquire sufficient data makes it difficult to train high-performing models. In response to these difficulties, federated learning has emerged as a prominent solution.
In the federated learning framework, every device contributes to the collaborative training of a shared model using its own data. This eliminates the need to upload private data, addressing concerns about both data collection and privacy. However, as federated learning has progressed rapidly in recent years, its challenges have also come to light. One is the unequal capability across devices, which hinders model generalization; another is the variation in data distribution across devices, which degrades model performance. Achieving more personalized federated learning has therefore emerged as a key research direction for the future.
This study incorporates bidirectional distillation, which bridges differences between models, into the federated learning framework. First, models with similar characteristics are grouped together. Each group has a Prototype Model, into which the parameters transmitted by the client devices are aggregated. Each Prototype Model is then trained on unlabeled data, and a larger Meta Model is trained on top of these Prototype Models. Through bidirectional distillation, the performance of the Prototype Models is enhanced, which in turn improves the performance of the individual models on each device.
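As a concrete reading of the bidirectional distillation step, the sketch below applies the standard temperature-softened KL distillation loss (Hinton et al., 2015) in both directions between the Meta Model and a Prototype Model on a batch of unlabeled data. The temperature, the symmetric loss structure, and the detach-based target handling are assumptions for illustration, not details taken from the thesis.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label distillation loss (Hinton et al., 2015): KL divergence
    between temperature-softened class distributions, scaled by T^2."""
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def bidirectional_step(meta_logits, proto_logits, T=2.0):
    """One bidirectional step on a batch of unlabeled data: the Meta
    Model distills from the Prototype Model and vice versa. Each side
    treats the other's predictions as fixed targets (detached)."""
    loss_meta = kd_loss(meta_logits, proto_logits.detach(), T)   # Prototype -> Meta
    loss_proto = kd_loss(proto_logits, meta_logits.detach(), T)  # Meta -> Prototype
    return loss_meta, loss_proto
```

Detaching the opposite model's logits keeps each distillation direction from back-propagating into its teacher, so the two losses can be minimized by separate optimizers for the Meta Model and the Prototype Model.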