| Graduate student: | 張哲彬 Chang, Che-Pin |
|---|---|
| Thesis title: | 針對邊緣運算裝置之高能量效率通用計算圖形處理器之硬體動態使用率功率管理 An Energy-Efficient General-Purpose Computing on GPU for Edge Computing Devices Using Hardware Runtime Utilization |
| Advisor: | 邱瀝毅 Chiou, Lih-Yih |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 48 |
| Chinese keywords: | 邊緣運算裝置, 通用計算圖形處理器, 動態電壓頻率調整, 功率管理 |
| English keywords: | Edge Computing Devices, GPGPU, Dynamic Voltage and Frequency Scaling (DVFS), Power Management |
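With the development of artificial intelligence (AI), the Internet of Things (IoT), and 5G communications, data volumes keep growing, and the concept of edge computing has been widely discussed as a result. The main idea of edge computing is to pre-process part of the data at the edge, which reduces both the overall data-processing time and the time needed to transfer data to the cloud. Against this background, edge computing devices have become one of the market's focal points. In terms of computing capability, the general-purpose graphics processing unit (GPGPU), already widely used across many kinds of applications, is a candidate compute device because of its large number of execution units and its ability to handle massively parallel workloads. However, an edge computing device has a limited energy source and a limited volume, so when designing a GPGPU for such a device, high computing capability is no longer the primary design goal. The key consideration becomes delivering as much computing capability as possible within a limited power budget.
This thesis proposes a high-energy-efficiency solution for GPGPUs that combines a dynamic management method based on hardware runtime utilization with low-power design. The method monitors the current utilization of the compute cores at runtime and applies dynamic voltage and frequency scaling (DVFS) to different blocks, thereby achieving high energy efficiency. According to the experimental results, on average the method reduces power consumption by 22.3% at a performance cost of 3.2%, which lowers energy consumption by about 20%.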
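The reported power, performance, and energy figures can be checked against each other. The short derivation below assumes that energy is average power multiplied by execution time and that the 3.2% performance cost corresponds to a 3.2% longer execution time; both are assumptions made here for illustration, and the symbols are not taken from the thesis.

```latex
\frac{E_{\text{new}}}{E_{\text{base}}}
  = \frac{P_{\text{new}}}{P_{\text{base}}} \cdot \frac{t_{\text{new}}}{t_{\text{base}}}
  = (1 - 0.223)(1 + 0.032)
  = 0.777 \times 1.032
  \approx 0.802
```

Under these assumptions the energy drops by roughly 19.8%, which is consistent with the stated reduction of about 20%.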
With the rapid development of artificial intelligence (AI) and the Internet of Things (IoT), the amount of data is growing rapidly. In this situation, edge computing on edge computing devices has been widely discussed because it can pre-process data and thereby reduce the latency of transferring data to the cloud server. Among candidate computing devices, the general-purpose graphics processing unit (GPGPU) has been widely used because of its strong computing capability. However, when designing a GPGPU for edge computing devices, high computing capability is no longer the main consideration because the energy source and the device volume are limited. Instead, high energy efficiency and power management become the main issues.
This thesis proposes an energy-efficient GPGPU for edge computing devices that uses a hardware runtime utilization-aware power management scheme together with low-power techniques. The proposed scheme detects the utilization of the SIMT cores and then dynamically adjusts the voltage and frequency of different domains to reduce unnecessary power consumption at runtime. According to the experimental results, the scheme reduces power consumption by 22.3% on average with only a 3.2% performance overhead.
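The general idea of utilization-aware DVFS can be illustrated with a minimal software sketch. This is not the thesis's hardware design: the operating points, the thresholds, and the names `LEVELS`, `UP_THRESHOLD`, `DOWN_THRESHOLD`, `next_level`, `relative_dynamic_power`, and `run_governor` are all hypothetical, and the power estimate uses only the textbook approximation that dynamic power scales with V²f.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    voltage_v: float
    freq_mhz: float


# Hypothetical operating points, ordered from lowest to highest performance.
LEVELS = (
    OperatingPoint(0.7, 200.0),
    OperatingPoint(0.9, 400.0),
    OperatingPoint(1.1, 600.0),
)

# Hypothetical utilization thresholds (fraction of busy cycles in a window).
UP_THRESHOLD = 0.80
DOWN_THRESHOLD = 0.40


def next_level(current: int, utilization: float) -> int:
    """Pick the next operating level from one utilization sample."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    if utilization > UP_THRESHOLD:
        return min(current + 1, len(LEVELS) - 1)  # busy: step up
    if utilization < DOWN_THRESHOLD:
        return max(current - 1, 0)  # mostly idle: step down
    return current  # in the hysteresis band: hold


def relative_dynamic_power(level: int) -> float:
    """Dynamic power relative to the top level, using P ~ V^2 * f."""
    top = LEVELS[-1]
    op = LEVELS[level]
    return (op.voltage_v / top.voltage_v) ** 2 * (op.freq_mhz / top.freq_mhz)


def run_governor(samples, start_level=len(LEVELS) - 1):
    """Apply next_level to each sample and return the level chosen each time."""
    level = start_level
    trace = []
    for utilization in samples:
        level = next_level(level, utilization)
        trace.append(level)
    return trace


if __name__ == "__main__":
    trace = run_governor([0.9, 0.3, 0.2, 0.5, 0.95])
    for level in trace:
        print(level, LEVELS[level], round(relative_dynamic_power(level), 3))
```

The gap between the two thresholds acts as a hysteresis band so that the level does not oscillate when utilization hovers near a single cut-off. A scheme with several voltage and frequency domains, as described in the abstract, would run one such decision per domain.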