Graduate student: 高嘉謙 (Gao, Jia-Cian)
Thesis title: Federated Deep Reinforcement Learning for Improving Network Sum Rate in Distributed Power Control of Wireless Networks (聯邦深度強化學習應用於無線網路功率控制以提升網路總速率)
Advisor: 蘇銓清 (Sue, Chuan-Ching)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2024
Graduation academic year: 112 (ROC calendar)
Language: Chinese
Number of pages: 53
Chinese keywords: 聯邦學習, 深度強化學習, 集群聯邦學習, 無線網路功率控制
English keywords: Federated Learning, Deep Reinforcement Learning, Clustered Federated Learning, Wireless Network Power Control
  • As demand for real-time services grows, dense deployment of small cell networks has become one way to meet it. However, as the deployment density of small cell networks increases, intra-cell and inter-cell interference becomes more severe, which makes power allocation crucial in wireless networks. Traditional methods struggle to make effective decisions in complex, dynamic environments, whereas reinforcement learning can learn a suitable transmit power strategy by interacting with the environment, trying actions repeatedly, and gathering experience.
    This thesis proposes two methods based on federated deep reinforcement learning, which incorporate clustered federated learning and personalization, respectively, to learn strategies suited to location. In experiments, we compared our methods with algorithms that do not consider location information; the results show that our methods achieve a higher network sum rate.

    Abstract (Chinese) II
    Summary III
    Acknowledgements VII
    List of Tables X
    List of Figures XI
    1 Introduction 1
    2 Background and Related Work 5
        2.1 Wireless Network Power Control 5
        2.2 Federated Learning 5
        2.3 Federated Reinforcement Learning 6
        2.4 Clustered Federated Learning 7
        2.5 Motivation 7
    3 Problem Formulation and System Architecture 8
        3.1 Problem Formulation 8
        3.2 System Architecture 11
            3.2.1 CFPPO Framework 11
            3.2.2 PFDQN Framework 12
    4 Proposed Method 13
        4.1 Proximal Policy Optimization 13
        4.2 DRL Agent 14
            4.2.1 State Space 14
            4.2.2 Action Space 14
            4.2.3 Reward Function 14
        4.3 Clustering and Personalization 15
            4.3.1 Clustering 15
            4.3.2 Personalization 16
        4.4 Algorithm 17
            4.4.1 CFPPO Algorithm 17
            4.4.2 PFDQN Algorithm 19
    5 Evaluation 21
        5.1 Training & Testing Environment 21
        5.2 Implementation & Hyperparameter 22
        5.3 Result 24
            5.3.1 PPO Performance 24
            5.3.2 Different Aggregation Period 26
            5.3.3 Result for CFPPO 29
            5.3.4 Result for PFDQN 30
            5.3.5 Result for 7×7 and 9×9 cellular network 31
        5.4 Discussion 33
    6 Conclusion and Future Work 34
    7 Reference 35
    8 Appendix 39
        8.1 PPO Framework 39
        8.2 PFPPO 40
        8.3 CFDQN 41

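    On campus: publicly available from 2029-08-22
    Off campus: publicly available from 2029-08-22
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.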