
Author: Chen, Shiang-Wei (陳祥瑋)
Thesis Title: A Cache Replacement Policy via Deep Reinforcement Learning in Edge Caching with Dynamic User Model (一個深度強化學習的快取替換方法應用在動態使用者模型的邊緣快取)
Advisor: Sue, Chuan-Ching (蘇銓清)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2022
Academic Year of Graduation: 110
Language: English
Number of Pages: 39
Keywords (Chinese): 物聯網、邊緣快取、快取替換方法、深度強化式學習、動態使用者模型
Keywords (English): Internet of Things, Edge Caching, Cache Replacement Policy, Dynamic User Model, Deep Reinforcement Learning
  • With the recent rapid development of the Internet of Things, more and more users and smart devices access data frequently. Edge caching has emerged to relieve the cloud and the core network as much as possible by exploiting the edge layer, which is closer to users. This study designs a cache replacement policy for edge caching. We propose a deep reinforcement learning (DRL) method for cache replacement to improve the cache hit rate. The experimental results show that, under the three dynamic user request models we designed, the raw hit rate of the proposed method is on average 21.79% higher than that of several traditional cache replacement algorithms and 1.36% higher than that of [16]. In addition, our raw hit rate is on average 3.2% higher than that of imitation learning, which demonstrates the advantage of deep reinforcement learning.

    With the rapid development of the Internet of Things, more and more users and smart devices are accessing data frequently. Edge caching has been proposed to reduce the load on the cloud server and the backhaul network. This study focuses on cache replacement in edge caching. We propose a deep reinforcement learning (DRL) model for cache replacement to increase the cache hit rate. The experimental results show that, under the three dynamic user request models, the raw hit rate of our proposed method is 21.79% higher on average than that of several traditional cache replacement algorithms, and 1.36% higher than that of [16]. In addition, our raw hit rate is 3.2% higher on average than that of imitation learning, which shows the advantage of deep reinforcement learning.
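    To make the formulation in the abstract concrete, the sketch below frames cache replacement as a small DQN problem over a Zipf-like request stream: the state is a recency/frequency feature pair per cached item, the action selects which cached item to evict on a miss when the cache is full, and the reward is the number of hits accumulated until the next eviction decision. This is a rough illustration only, not the thesis implementation: the feature set, reward definition, catalog and cache sizes, network architecture, and the use of PyTorch are assumptions made for readability (a target network and other standard DQN refinements are omitted).

# Minimal sketch (assumptions, not the thesis implementation): cache replacement
# framed as a DQN problem. State = recency/frequency features of cached items;
# action = which cached item to evict on a miss with a full cache;
# reward = number of hits accumulated until the next eviction decision.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

CACHE_SIZE = 5               # edge cache capacity (assumed)
NUM_CONTENTS = 50            # content catalog size (assumed)
STATE_DIM = CACHE_SIZE * 2   # [recency, frequency] per cached slot (assumed features)

class QNet(nn.Module):
    """Small MLP mapping the cache state to one Q-value per eviction slot."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, CACHE_SIZE))
    def forward(self, x):
        return self.net(x)

def zipf_request(alpha=0.8):
    """Draw one content id from a static Zipf-like popularity law [21]."""
    ranks = np.arange(1, NUM_CONTENTS + 1)
    p = ranks ** (-alpha)
    return int(np.random.choice(NUM_CONTENTS, p=p / p.sum()))

def state_vec(cache, last_access, freq, t):
    """Normalized time-since-last-access and access count for each cached item."""
    feats = []
    for item in cache:
        feats += [(t - last_access[item]) / (t + 1), freq[item] / (t + 1)]
    return torch.tensor(feats, dtype=torch.float32)

def train(episodes=50, steps=2000, gamma=0.9, eps=0.1, batch=32):
    qnet, replay = QNet(), deque(maxlen=10_000)
    opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
    for ep in range(episodes):
        cache, last_access, freq = [], {}, {}
        hits, prev_sa, hits_since = 0, None, 0
        for t in range(steps):
            req = zipf_request()
            last_access[req] = t
            freq[req] = freq.get(req, 0) + 1
            if req in cache:
                hits += 1
                hits_since += 1
            elif len(cache) < CACHE_SIZE:
                cache.append(req)          # free slot: admit without eviction
            else:
                s = state_vec(cache, last_access, freq, t)
                if random.random() < eps:  # epsilon-greedy eviction choice
                    a = random.randrange(CACHE_SIZE)
                else:
                    with torch.no_grad():
                        a = int(qnet(s).argmax())
                if prev_sa is not None:    # close the previous transition
                    replay.append((*prev_sa, float(hits_since), s))
                prev_sa, hits_since = (s, a), 0
                cache[a] = req             # evict slot a, admit the request
            if len(replay) >= batch:       # one SGD step (no target network, kept simple)
                sample = random.sample(replay, batch)
                S = torch.stack([x[0] for x in sample])
                A = torch.tensor([x[1] for x in sample])
                R = torch.tensor([x[2] for x in sample])
                S2 = torch.stack([x[3] for x in sample])
                q = qnet(S).gather(1, A.unsqueeze(1)).squeeze(1)
                with torch.no_grad():
                    target = R + gamma * qnet(S2).max(1).values
                loss = nn.functional.mse_loss(q, target)
                opt.zero_grad(); loss.backward(); opt.step()
        print(f"episode {ep}: hit rate {hits / steps:.3f}")

if __name__ == "__main__":
    train()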

    Chinese Abstract
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    1 Introduction
    2 Background and Related Work
      2.1 Edge Caching Topic
      2.2 Cache Replacement
      2.3 Reinforcement Learning
      2.4 Motivation
    3 System Model
      3.1 System Architecture
      3.2 User Request Model
        3.2.1 Shot Noise Model
        3.2.2 Dynamic Zipf Model
      3.3 Deep Reinforcement Learning Model
        3.3.1 DRL Design
        3.3.2 Algorithm
    4 Performance Evaluation
      4.1 User Environment Setting
      4.2 DQN Parameter
      4.3 Experimental Setup
      4.4 Result
    5 Conclusion
    Reference
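    The request models listed under Section 3.2 are not described on this page. As a rough illustration only, the sketch below generates a Zipf-distributed request stream [21] whose rank-to-content mapping is reshuffled periodically, one common way to realize a dynamic Zipf workload; the catalog size, skew parameter, and reshuffle period are assumptions rather than the thesis's settings.

# Generic sketch of a time-varying ("dynamic") Zipf request generator; the
# thesis's actual Shot Noise / Dynamic Zipf models are not reproduced here.
import numpy as np

def zipf_probs(num_contents, alpha):
    """Zipf popularity law: P(rank k) proportional to k^(-alpha) [21][28]."""
    ranks = np.arange(1, num_contents + 1, dtype=float)
    p = ranks ** (-alpha)
    return p / p.sum()

def dynamic_zipf_requests(num_contents=100, alpha=0.8, steps=10_000,
                          shuffle_every=2_000, seed=0):
    """Yield content ids; the rank-to-content mapping is reshuffled every
    `shuffle_every` requests so the popular contents drift over time."""
    rng = np.random.default_rng(seed)
    probs = zipf_probs(num_contents, alpha)
    ranking = rng.permutation(num_contents)       # rank -> content id
    for t in range(steps):
        if t and t % shuffle_every == 0:
            ranking = rng.permutation(num_contents)
        rank = rng.choice(num_contents, p=probs)  # draw a popularity rank
        yield int(ranking[rank])

if __name__ == "__main__":
    from collections import Counter
    print(Counter(dynamic_zipf_requests()).most_common(5))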

    [1] S. Barbarossa, S. Sardellitti, and P. D. Lorenzo, “Communicating while computing: Distributed mobile cloud computing over 5G heterogeneous networks,” IEEE Signal Process. Mag., vol. 31, no. 6, pp. 45–55, Nov. 2014.
    [2] [Online]. Available: https://www.statista.com/statistics/1017863/worldwide-iot-connected-devices-data-size/
    [3] D. Liu, B. Chen, C. Yang and A. F. Molisch, “Caching at the wireless edge: design aspects, challenges, and future directions”, IEEE Commun. Mag., vol. 54, no. 9, pp. 22-28, 2016.
    [4] B. Varghese, N. Wang, S. Barbhuiya, P. Kilpatrick and D.S. Nikolopoulos, “Challenges and Opportunities in Edge Computing”, 2016 IEEE International Conference on Smart Cloud (SmartCloud), pp. 20-26, 2016.
    [5] L. A. Belady, "A study of replacement algorithms for a virtual-storage computer", in IBM Systems Journal, vol. 5, no. 2, pp. 78-101, 1966.
    [6] Evan Zheran Liu, Milad Hashemi, Kevin Swersky, Parthasarathy Ranganathan, and Junwhan Ahn, “An Imitation Learning Approach for Cache Replacement”, International Conference on Machine Learning (ICML), pp. 6237-6247, July 2020.
    [7] H. Zhu, Y. Cao, X. Wei, W. Wang, T. Jiang and S. Jin, “Caching Transient Data for Internet of Things: A Deep Reinforcement Learning Approach”, in IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2074-2083, April 2019.
    [8] D. S. Berger, N. Beckmann and M. Harchol-Balter, “Practical bounds on optimal caching with variable object sizes”, Proceedings of the ACM on Measurement and Analysis of Computing Systems, pp. 1-38, 2018.
    [9] P. Wu, J. Li, L. Shi, M. Ding, K. Cai and F. Yang, “Dynamic Content Update for Wireless Edge Caching via Deep Reinforcement Learning”, in IEEE Communications Letters, vol. 23, no. 10, pp. 1773-1777, Oct. 2019.
    [10] Zhang, L. Li, H. Chen, and B. Daniel, “A cache replacement algorithm for industrial edge computing application”, Journal of Computer Research and Development, vol. 1, pp. 1533-1543, 2021.
    [11] P. Cao and S. Irani, “Cost-Aware WWW Proxy Caching Algorithms”, USENIX Symposium on Internet Technologies and Systems (USITS 97), pp. 193-206, 1997.
    [12] A. Nasehzadeh and P. Wang, “A Deep Reinforcement Learning-Based Caching Strategy for Internet of Things”, 2020 IEEE/CIC International Conference on Communications in China (ICCC), pp. 969-974, 2020.
    [13] Christopher J.C.H. Watkins and Peter Dayan, “Q-learning”, Machine learning, vol. 8, pp. 279-292, May 1992.
    [14] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller, “Playing Atari with deep reinforcement learning”, NIPS Deep Learning Workshop, pp. 1-9, Dec. 2013.
    [15] V. Mnih, K. Kavukcuoglu, D. Silver, Andrei A. Rusu, J. Veness, Marc G. Bellemare, A. Graves, M. Riedmiller, Andreas K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, D. Hassabis, “Human-level control through deep reinforcement learning”, Nature, vol. 518, pp. 529-533, 2015.
    [16] Chen-Yi Chang, “A Deep Reinforcement Learning Approach for Cache Replacement in OM2M-based Heterogeneous Access Controller Management System”, Master's thesis, National Cheng Kung University, Institute of Computer Science and Information Engineering, Jul. 2021.
    [17] D. Antonogiorgakis, A. Britzolakis, P. Chatziadam, A. Dimitriadis, S. Gikas, E. Michalodimitrakis, M. Oikonomakis, N. Siganos, E. Tzagkarakis, Y. Nikoloudakis, S. Panagiotakis, E. Pallis and E. K. Markakis, “A view on edge caching applications”, arXiv preprint arXiv:1907.12359, 2019.
    [18] Capra, M., Peloso, R., Masera, G., Ruo Roch, M., Martina, M, “Edge Computing: A Survey On the Hardware Requirements in the Internet of Things World”, Future Internet, vol. 11, no. 4, pp. 100, 2019.
    [19] S. Traverso, M. Ahmed, M. Garetto, P. Giaccone, E. Leonardi and S. Niccolini, “Temporal locality in today's content caching: Why it matters and how to model it”, ACM SIGCOMM Computer Communication Review, vol. 43, no. 5, pp. 5-12, 2013.
    [20] M. Leconte, G. Paschos, L. Gkatzikis, M. Draief, S. Vassilaras and S. Chouvardas, “Placing dynamic content in caches with small population”, IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, pp. 1-9, 2016.
    [21] George Kingsley Zipf, “Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology”, Addison-Wesley Press, 1949.
    [22] D. Morse and G. Richardson, “The LIFO/FIFO Decision”, Journal of Accounting Research, vol. 21, no. 1, pp. 106-127, 1983.
    [23] D. Lee, J. Choi, J.-H. Kim, S. H. Noh, S. L. Min, Y. Cho and C. S. Kim, “On the existence of a spectrum of policies that subsumes the least recently used (LRU) and least frequently used (LFU) policies”, Proceedings of the 1999 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 134-143, 1999.
    [24] Weng Meizhen, Shang Yanlei and Tian Yue, “The design and implementation of LRU-based web cache”, 2013 8th International Conference on Communications and Networking in China (CHINACOM), pp. 400-404, 2013.
    [25] L. Cherkasova, “Improving WWW proxies performance with greedy-dual-size-frequency caching policy”, Hewlett-Packard Laboratories, Palo Alto, CA, pp. 1-14, 1998.
    [26] S. Rahman, M. G. R. Alam and M. M. Rahman, “Deep Learning-based Predictive Caching in the Edge of a Network”, International Conference on Information Networking (ICOIN), pp. 797-801, 2020.
    [27] Arryon D. Tijsma, Madalina M. Drugan, Marco A. Wiering, “Comparing exploration strategies for Q-learning in random stochastic mazes”, IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1-8, Dec. 2016.
    [28] M. E. J. Newman, “Power laws, Pareto distributions and Zipf's law”, Contemporary Physics, pp. 323-351, 2005.
    [29] K. Fukushima, “Cognitron: A self-organizing multilayered neural network”, Biological Cybernetics, vol. 20, no. 3, pp. 121-136, 1975.
    [30] G. van Rossum, “Python tutorial”, Technical Report CS-R9526, Centrum voor Wiskunde en Informatica (CWI), Amsterdam, May 1995.
    [31] Lincan Li, Chiew Foong Kwong, Qianyu Liu, Jing Wang, “A Smart Cache Content Update Policy Based on Deep Reinforcement Learning”, Wireless Communications and Mobile Computing, vol. 2020, 11 pages, 2020.
    [32] Asaf Cidon, Assaf Eisenman, Mohammad Alizadeh, and Sachin Katti, “Cliffhanger: Scaling Performance Cliffs in Web Memory Caches”, 13th USENIX Symposium on Networked Systems Design and Implementation, pp. 379-392, March 2016.

    On-campus access: full text available from 2027-09-15.
    Off-campus access: full text available from 2027-09-15.
    The electronic thesis has not yet been authorized for public release; please consult the library catalog for the print copy.