
Graduate Student: 游輝吉 (Yu, Hui-Chi)
Thesis Title: Deep Reinforcement Learning Based Joint Power Allocation and User Association for Joint Transmission in Cellular Networks
Advisor: 劉光浩 (Liu, Kuang-Hao)
Degree: Master
Department: Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108 (2019-2020)
Language: English
Number of Pages: 44
Chinese Keywords: deep reinforcement learning, Q-learning, coordinated multi-point joint transmission, user association, power allocation
Foreign Keywords: deep reinforcement learning, deep Q-learning, joint transmission, user association, power allocation
    3GPP introduced Coordinated Multi-Point Joint Transmission (CoMP-JT) in Release 11, a technique that converts interfering signals into useful signals. With CoMP-JT, multiple transmission-reception points in the same cooperating cluster jointly transmit a user's data, thereby improving the user's data rate. In this thesis, we study downlink power allocation and user association under CoMP-JT in multi-TRP networks, with the goal of maximizing the system spectral efficiency. Since joint user association and power allocation is a problem of extremely high complexity, we propose a deep reinforcement learning approach and consider two designs: a deep Q-learning scheme that handles user association and power allocation jointly, and a two-stage deep Q-learning scheme that solves the two problems separately. In the proposed deep Q-learning schemes, each transmitter collects channel information only from neighboring cells and handles user association and power allocation with limited information exchange; by restricting the amount of information exchanged among transmitters, the channel feedback overhead is reduced and the computational complexity of deep Q-learning is alleviated at the same time. Simulation results show that, under various network configurations, the proposed algorithms effectively find near-optimal solutions and demonstrate that deep Q-learning provides good performance for each sub-problem.
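    To make the optimization target described above concrete, the following is a minimal sketch of a sum-spectral-efficiency objective under non-coherent joint transmission, where the received powers from a UE's serving cluster add. The notation ($\mathcal{B}_k$ for the cooperating cluster of UE $k$, $p_b$ for the transmit power of TRP $b$, $g_{b,k}$ for the channel gain from TRP $b$ to UE $k$, $\sigma^2$ for the noise power) is illustrative and not taken from the thesis:

\[
\mathrm{SINR}_k = \frac{\sum_{b \in \mathcal{B}_k} p_b\, g_{b,k}}{\sum_{b \notin \mathcal{B}_k} p_b\, g_{b,k} + \sigma^2},
\qquad
\max_{\{\mathcal{B}_k\},\,\{p_b\}} \; \sum_{k} \log_2\!\bigl(1 + \mathrm{SINR}_k\bigr),
\]

    subject to a per-TRP power limit $0 \le p_b \le P_{\max}$ and a limit on the cluster size $|\mathcal{B}_k|$. The user-association variables determine the clusters $\mathcal{B}_k$ and the power-allocation variables determine $p_b$, which is why the two sub-problems are coupled.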

    In multi-TRP (Transmission Reception Point) networks, user equipments (UEs) suffer interference from neighboring TRPs. Coordinated Multi-Point Joint Transmission (CoMP-JT) is a technique that converts interfering signals into useful signals: with CoMP-JT, the data for a UE is jointly transmitted by multiple TRPs in the same cooperating cluster, thus improving the UE's data rate. In this thesis, we investigate downlink power allocation and user association for CoMP-JT to maximize the system spectral efficiency. Since the problem of joint user association (UA) and power allocation (PA) is NP-hard, we develop a deep reinforcement learning approach, given its ability to provide approximate solutions to large-scale problems. We also develop a two-stage approach in which the joint UA and PA problem is decomposed into two sub-problems that are solved by separate deep Q-learning (DQL) algorithms. Simulation results show that the proposed methods achieve near-optimal solutions compared with benchmark algorithms and demonstrate that DQL provides good performance for each sub-problem.
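    As a rough illustration of the deep Q-learning machinery referred to above, the sketch below implements a generic DQN agent with experience replay, a target network, and epsilon-greedy exploration in PyTorch. The state and action dimensions, network sizes, and hyperparameters are placeholders and do not reproduce the thesis design; in this setting the state would plausibly encode a TRP's locally observed channel gains, the discrete actions would index association or power-level choices, and the reward would be the resulting spectral efficiency.

import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim


class QNetwork(nn.Module):
    """Small fully connected Q-network: state -> one Q-value per discrete action."""

    def __init__(self, state_dim, num_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state):
        return self.net(state)


class DQNAgent:
    """Generic DQN agent (replay buffer, target network, epsilon-greedy policy).

    The mapping to this thesis (local channel gains as state, association or
    power-level indices as actions, spectral efficiency as reward) is an
    assumption for illustration only.
    """

    def __init__(self, state_dim, num_actions, gamma=0.95, lr=1e-3,
                 buffer_size=10000, batch_size=64, eps=1.0, eps_min=0.05,
                 eps_decay=0.995, target_sync=200):
        self.q_net = QNetwork(state_dim, num_actions)
        self.target_net = QNetwork(state_dim, num_actions)
        self.target_net.load_state_dict(self.q_net.state_dict())
        self.optimizer = optim.Adam(self.q_net.parameters(), lr=lr)
        self.replay = deque(maxlen=buffer_size)
        self.num_actions = num_actions
        self.gamma, self.batch_size = gamma, batch_size
        self.eps, self.eps_min, self.eps_decay = eps, eps_min, eps_decay
        self.target_sync, self.steps = target_sync, 0

    def act(self, state):
        # Epsilon-greedy selection over the discrete action set.
        if random.random() < self.eps:
            return random.randrange(self.num_actions)
        with torch.no_grad():
            q = self.q_net(torch.as_tensor(state, dtype=torch.float32))
        return int(q.argmax().item())

    def remember(self, state, action, reward, next_state):
        # Each state is assumed to be a flat list of floats of length state_dim.
        self.replay.append((state, action, reward, next_state))

    def learn(self):
        # Wait until enough transitions have been collected.
        if len(self.replay) < self.batch_size:
            return
        batch = random.sample(self.replay, self.batch_size)
        states, actions, rewards, next_states = map(list, zip(*batch))
        states = torch.as_tensor(states, dtype=torch.float32)
        actions = torch.as_tensor(actions, dtype=torch.int64).unsqueeze(1)
        rewards = torch.as_tensor(rewards, dtype=torch.float32)
        next_states = torch.as_tensor(next_states, dtype=torch.float32)

        # Standard DQN target: r + gamma * max_a' Q_target(s', a').
        q_sa = self.q_net(states).gather(1, actions).squeeze(1)
        with torch.no_grad():
            target = rewards + self.gamma * self.target_net(next_states).max(1).values
        loss = nn.functional.mse_loss(q_sa, target)

        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()

        # Periodically copy the online weights into the target network and decay epsilon.
        self.steps += 1
        if self.steps % self.target_sync == 0:
            self.target_net.load_state_dict(self.q_net.state_dict())
        self.eps = max(self.eps_min, self.eps * self.eps_decay)

    A two-stage design of the kind described above could instantiate two such agents, one whose actions select the serving cluster for each UE and one whose actions select a discretized power level for each TRP.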

    Chinese Abstract i
    Abstract ii
    Acknowledgement iii
    Table of Contents iv
    List of Figures vi
    List of Tables vii
    List of Symbols viii
    List of Acronyms ix
    1 Introduction 1
      1.1 Thesis Structure 2
    2 Related Work 3
    3 System Model and Problem Formulation 6
      3.1 Network Architecture 6
      3.2 Clustering Method 7
      3.3 Channel Model 8
      3.4 Problem Description 10
    4 Proposed Method 12
      4.1 DQL Framework 12
      4.2 DQL Design for Joint UA and PA 15
      4.3 DQL Design for Separate UA and PA 20
        4.3.1 DQUA 21
        4.3.2 DQPA 22
    5 Results and Discussions 24
      5.1 Simulation Set Up 24
      5.2 User Association and Power Allocation 27
      5.3 User Association 35
      5.4 Power Allocation 38
    6 Conclusion 42
    References 43


    Full text available on campus: 2022-08-31
    Full text available off campus: 2022-08-31