| Author: | 薛翔宇 Xue, Xiang-Yu |
|---|---|
| Thesis Title: | SD-pFLAC: Adaptive Collaborative Training for Semi-Decentralized Personalized Federated Learning on Non-IID Data (自適應協同訓練之半去中心個人化聯邦學習於非獨立同分布資料) |
| Advisor: | 曾繁勛 Tseng, Fan-Hsun |
| Degree: | Master |
| Department: | Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2025 |
| Academic Year of Graduation: | 113 (ROC calendar) |
| Language: | English |
| Number of Pages: | 79 |
| Keywords: | aggregation strategy, data heterogeneity, network topology, personalized, selection strategy, semi-decentralized federated learning, training efficiency |
With the rapid advancement of artificial intelligence, concerns over data privacy have grown significantly. Federated learning (FL) enables clients to train a model collaboratively while keeping their data local and private, and is therefore regarded as a promising privacy-preserving, distributed machine learning paradigm. However, the traditional FL architecture relies heavily on a central server for model aggregation, which makes it vulnerable to single points of failure and communication bottlenecks. Furthermore, in real-world scenarios with highly heterogeneous client data, centralized FL often suffers from poor model generalization and high communication costs. To address these challenges, this thesis presents SD-pFLAC, an adaptive collaborative training framework for semi-decentralized federated learning (semi-DFL) with personalization capability, which aims to improve training efficiency, local task performance, and model generalization simultaneously. The semi-decentralized architecture is hierarchical: clients are partitioned into clusters, and each cluster is managed by a designated relay node that coordinates its collaborative training. To enhance local task accuracy and mitigate intra-cluster data heterogeneity, model splitting divides each client model into shared and private layers. The shared layers participate in aggregation to learn general representations, while the private layers remain local to capture client-specific characteristics, thereby preserving features that reflect the distribution of local data. For inter-cluster training, this thesis introduces a partner selection strategy that constructs a dynamic network topology, together with an aggregation strategy based on the degree of model dissimilarity, to avoid the limitations of static topologies, which overlook differences in data distribution across clusters. Moreover, existing semi-decentralized methods typically initiate inter-cluster training from the very beginning, even though early-stage models may diverge under intra-cluster heterogeneity, which destabilizes convergence; inter-cluster training also imposes additional time and communication overhead on cluster heads. To solve this problem, the proposed framework introduces an adaptive collaboration activation mechanism that evaluates task difficulty from early-stage model accuracy and dynamically determines when to activate inter-cluster training. Experimental results show that the proposed framework reduces unnecessary communication and computation costs while preserving training effectiveness.
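To make the model-splitting idea concrete, below is a minimal sketch, assuming a PyTorch implementation, of a client model divided into shared layers (exchanged for aggregation) and private layers (kept local). The layer sizes and the `extract_shared`/`load_shared` helpers are illustrative assumptions, not the thesis's actual code.

```python
import torch
import torch.nn as nn


class SplitModel(nn.Module):
    """Client model divided into shared layers (aggregated) and private layers (local)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Shared layers: exchanged and aggregated to learn general representations.
        self.shared = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
        )
        # Private layers: never transmitted; they capture client-specific characteristics.
        self.private = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.private(self.shared(x))


def extract_shared(model: SplitModel) -> dict[str, torch.Tensor]:
    # Only the shared sub-module is serialized and sent to the relay node.
    return {k: v.detach().clone() for k, v in model.shared.state_dict().items()}


def load_shared(model: SplitModel, aggregated: dict[str, torch.Tensor]) -> None:
    # Overwrite the shared layers with the aggregated result; private layers stay intact.
    model.shared.load_state_dict(aggregated)
```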
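The dissimilarity-based aggregation between cluster heads can likewise be sketched. The abstract does not specify the exact formula, so the inverse-L2-distance weighting below is an assumption chosen only to illustrate the idea of down-weighting partners whose models diverge most from the local one.

```python
import torch


def model_distance(a: dict, b: dict) -> float:
    """L2 distance between two shared-layer state dicts (matching keys assumed)."""
    squared = sum(torch.sum((a[k].float() - b[k].float()) ** 2).item() for k in a)
    return squared ** 0.5


def dissimilarity_weighted_average(own: dict, peers: list[dict],
                                   eps: float = 1e-8) -> dict:
    """Aggregate peer models, giving smaller weight to more dissimilar peers.

    Inverse-distance weighting is an illustrative assumption, not the
    thesis's published aggregation rule.
    """
    raw = [1.0 / (model_distance(own, p) + eps) for p in peers]
    total = 1.0 + sum(raw)  # the local model keeps a reference weight of 1.0
    merged = {k: v.float() / total for k, v in own.items()}
    for w, peer in zip(raw, peers):
        for k in merged:
            merged[k] += (w / total) * peer[k].float()
    return merged
```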
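Finally, the adaptive collaboration activation mechanism gates inter-cluster training on early-round performance. Reading a plateau in intra-cluster accuracy as the activation signal is one plausible interpretation of the abstract; `window` and `plateau_delta` are hypothetical hyperparameters, not values from the thesis.

```python
def should_activate_inter_cluster(acc_history: list[float],
                                  window: int = 5,
                                  plateau_delta: float = 0.005) -> bool:
    """Return True once intra-cluster accuracy has stopped improving noticeably.

    acc_history: per-round validation accuracy of the cluster model so far.
    window / plateau_delta: illustrative assumptions, not thesis values.
    """
    if len(acc_history) < window:
        return False  # too early to judge task difficulty
    recent = acc_history[-window:]
    return max(recent) - min(recent) < plateau_delta


# Example: activation fires once the accuracy curve flattens out.
history = [0.42, 0.55, 0.61, 0.640, 0.641, 0.642, 0.643, 0.644]
print(should_activate_inter_cluster(history))  # True
```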
Full text available on campus from 2028-07-01.