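| Graduate student: | Hsu, Yung (徐鏞) |
|---|---|
| Thesis title: | An HSAIL Conformed GPU Design Platform for General Purpose Computing and 3D Rendering Applications (符合HSA中介語言並支援三維繪圖與通用運算之繪圖處理器設計平台) |
| Advisor: | Chen, Chung-Ho (陳中和) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 104 |
| Language: | English |
| Number of pages: | 87 |
| Keywords (Chinese): | 繪圖處理器 (graphics processing unit), 異質架構系統 (heterogeneous system architecture), 平行運算 (parallel computing), 繪圖管線 (rendering pipeline) |
| Keywords (English): | GPU, heterogeneous system architecture, parallel computing, rendering pipeline |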
Graphics processing units (GPUs) offer powerful parallel computing capability, so they are used not only for 3D graphics applications but also for general-purpose tasks. This work proposes a system-level GPU design platform that supports both 3D rendering and general-purpose computing applications. The goal of the platform is to help processor architects explore and verify hardware and software in the early design stage. The platform includes a simulator that models the hardware architecture of a modern GPU, including programmable Single Instruction Multiple Thread (SIMT) processors with a customized instruction set architecture, dedicated modules for the rendering pipeline, and the memory system. The GPU design targets high-performance and heterogeneous computing, and it conforms to the Heterogeneous System Architecture (HSA) execution model and the HSA Intermediate Language (HSAIL). The platform also provides a dedicated compilation flow and toolchain that compile OpenGL shader programs and OpenCL kernels to HSAIL and to our custom binary instruction set. To run OpenCL and OpenGL applications on the platform, we also develop a simulation framework that implements the OpenCL and OpenGL APIs and runtime libraries, the driver for the simulator, and a customized context and window management library. Several OpenCL and OpenGL benchmarks have been ported to the platform, so developers can profile program behavior and evaluate performance issues for both kinds of applications.
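For readers unfamiliar with the SIMT execution style mentioned in the abstract, the following toy sketch may help. It is not taken from the thesis or its simulator: `WARP_SIZE`, `simt_abs_diff`, and the two-pass masking scheme are hypothetical names and simplifications chosen only to illustrate how lanes that share one instruction stream can handle a divergent branch.

```python
# Illustrative only: a toy SIMT "warp" whose lanes run in lockstep.
# WARP_SIZE and simt_abs_diff are hypothetical names, not part of the
# thesis platform or of any real GPU toolchain.

WARP_SIZE = 8


def simt_abs_diff(a, b):
    """Compute |a[i] - b[i]| per lane, handling the branch SIMT-style.

    All lanes share one instruction stream, so a divergent if/else is
    executed as two passes, each gated by a per-lane mask.
    """
    assert len(a) == len(b) == WARP_SIZE
    result = [0] * WARP_SIZE

    # Every lane evaluates the branch condition with the same instruction.
    take_then = [a[lane] >= b[lane] for lane in range(WARP_SIZE)]

    # Pass 1: then-path. Only lanes whose mask bit is set write a result.
    for lane in range(WARP_SIZE):
        if take_then[lane]:
            result[lane] = a[lane] - b[lane]

    # Pass 2: else-path, executed with the inverted mask.
    for lane in range(WARP_SIZE):
        if not take_then[lane]:
            result[lane] = b[lane] - a[lane]

    # Both paths are done; the lanes reconverge at this point.
    return result, take_then


if __name__ == "__main__":
    values, mask = simt_abs_diff([5, 1, 7, 3, 0, 9, 2, 4],
                                 [2, 6, 7, 8, 1, 3, 2, 0])
    print(values)
    print(mask)
```

The sketch serializes the two branch paths, which is why divergent code tends to cost more on SIMT hardware than uniform code. Real designs, presumably including the one modeled in the thesis, use more elaborate reconvergence mechanisms than this.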