
Graduate student: 徐健棠
Syu, Jian-Tang
Thesis title: 基於強化學習之通用計算圖形處理器的智慧功率管理
Intelligent Power Management Based on Reinforcement Learning for General-Purpose Computing on Graphics Processing Units
Advisor: 邱瀝毅
Chiou, Lih-Yih
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 40
Chinese keywords: 圖形處理器, 動態電壓頻率調整, 機器學習, 強化學習, 動態功率管理
English keywords: GPU, Dynamic Voltage and Frequency Scaling, Machine Learning, Reinforcement Learning, Power Management
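  • As smartphones have spread, they have gradually become an indispensable part of people's lives. Smartphones rely on the graphics processing unit (GPU) to process images and video. Moreover, with the development of artificial intelligence, all kinds of applications now use machine learning to solve problems, and the GPU, with its large number of computing units, is well suited to running such applications.
    However, the GPU's many computing units inevitably bring high power consumption, which is a serious drawback for energy-limited mobile devices such as smartphones. This thesis proposes an intelligent power management scheme for GPUs based on reinforcement learning. It collects hardware runtime information from the GPU under different applications and uses reinforcement learning to predict the GPU's execution state, then adjusts voltage and frequency accordingly to reduce the chip's power consumption. The method saves about 8.59% of power consumption with only a very limited performance loss.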

    As smartphones have grown more popular, they have become a necessary part of people's lives. Smartphones rely on graphics processing units (GPUs) to process images and videos. With the development of artificial intelligence (AI) technology, the GPU is also widely used by machine learning applications because of its massive number of computing units. However, these computing units cause large power consumption, which is especially undesirable for power-limited mobile devices such as smartphones.

    We propose an intelligent power management scheme for the GPU. Hardware runtime information is fed into a reinforcement learning engine on the host CPU to predict the future state of the GPU. The power management unit then adjusts the voltage and frequency of the GPU's major components according to the predicted state. Power consumption is reduced by 8.59% on average with a limited performance loss.
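    The loop described in the abstract (observe hardware state, choose a voltage/frequency setting, learn from the outcome) can be illustrated with a minimal sketch. This is not the thesis's implementation: the thesis uses a Double Deep Q-Network on a GPU virtual platform, whereas the sketch below swaps in plain tabular Q-learning on a toy workload. The operating points, the utilization buckets, the `f * V^2` power model, the reward shape, and all function names are assumptions made for illustration only.

```python
import random

# Hypothetical DVFS operating points (frequency in MHz, voltage in V).
OPERATING_POINTS = [(400, 0.8), (600, 0.9), (800, 1.0)]
NUM_STATES = 3  # hypothetical utilization buckets: 0 = low, 1 = medium, 2 = high


def relative_power(action):
    """Dynamic power modeled as f * V^2, normalized to the fastest point."""
    f, v = OPERATING_POINTS[action]
    f_max, v_max = OPERATING_POINTS[-1]
    return (f * v * v) / (f_max * v_max * v_max)


def reward(state, action, perf_penalty=2.0):
    """Toy reward: less power is better, but utilization bucket i needs
    operating point i or faster, otherwise a slowdown penalty applies."""
    slowdown = max(0, state - action)
    return -relative_power(action) - perf_penalty * slowdown


def train(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning over a randomly varying toy workload."""
    rng = random.Random(seed)
    num_actions = len(OPERATING_POINTS)
    q = [[0.0] * num_actions for _ in range(NUM_STATES)]
    state = rng.randrange(NUM_STATES)
    for _ in range(steps):
        # Epsilon-greedy choice of the next operating point.
        if rng.random() < epsilon:
            action = rng.randrange(num_actions)
        else:
            action = max(range(num_actions), key=lambda a: q[state][a])
        r = reward(state, action)
        # Toy assumption: the next utilization bucket is independent of the action.
        next_state = rng.randrange(NUM_STATES)
        target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    # Greedy policy: index of the preferred operating point per utilization bucket.
    return [max(range(num_actions), key=lambda a: q[s][a]) for s in range(NUM_STATES)]


if __name__ == "__main__":
    policy = train()
    for s, a in enumerate(policy):
        f, v = OPERATING_POINTS[a]
        print(f"utilization bucket {s}: {f} MHz at {v} V")
```

    In this toy setting the agent should learn to pick the slowest operating point that still meets each utilization bucket's demand. A real power manager would replace the random workload with measured hardware counters and the table with a learned function approximator.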

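    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Preface
      1.2 Graphics Processing Unit
      1.3 Machine Learning
        1.3.1 Artificial Neural Networks
        1.3.2 Reinforcement Learning
        1.3.3 Q-Learning
        1.3.4 Deep Q-Network
        1.3.5 Double Deep Q-Network
      1.4 Research Motivation
      1.5 Research Contributions
      1.6 Thesis Organization
    Chapter 2 Related Work
      2.1 GPU Power Management
      2.2 Power Management Using Reinforcement Learning
      2.3 Summary
    Chapter 3 Intelligent Power Management Based on Reinforcement Learning for General-Purpose Computing on Graphics Processing Units
      3.1 Problem Description
      3.2 GPU Virtual Platform
        3.2.1 GPU Virtual Platform Hardware Specifications
        3.2.2 GPU Virtual Platform Power Model
      3.3 Intelligent Power Management Based on Reinforcement Learning for General-Purpose Computing on Graphics Processing Units
        3.3.1 Method Overview
    Chapter 4 Experimental Results
      4.1 Experiment 1: Determining the Double Deep Q-Network Parameters
      4.2 Experiment 2: Evaluating Each Benchmark Run Individually
      4.3 Experiment 3: Testing the DDQN's Ability to Adapt to Change
      4.4 Experiment 4: Measuring How Quickly the DDQN Adapts to Change
    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References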


    Full text available on campus: 2024-09-01
    Full text available off campus: 2024-09-01