Graduate student: | 賴韋諺 Lai, Wei-Yan |
---|---|
Thesis title: | 使用CUDA及圖形處理器作有限元素法計算分析 Finite Element Analysis with CUDA and Graphics Processor |
Advisor: | 何旭彬 Ho, Shi-Pin |
Degree: | 碩士 Master |
Department: | 工學院 - 機械工程學系 College of Engineering - Department of Mechanical Engineering |
Year of publication: | 2013 |
Academic year of graduation: | 101 (ROC calendar) |
Language: | Chinese |
Number of pages: | 60 |
Chinese keywords: | 圖形處理器 、有限元素法 、CUDA |
English keywords: | graphic processor, finite element, CUDA |
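In recent years, the floating-point performance of graphics processors has far surpassed that of central processors, and graphics processors now support double-precision floating-point arithmetic. For large volumes of highly repetitive computation, a graphics processor is more efficient than a central processor.
When a problem is solved with the finite element method, solving the resulting system of linear equations accounts for a large share of the total computation. This thesis solves that system with the conjugate gradient method combined with a Jacobi preconditioner. The vector inner products, scaled vector additions, and sparse matrix-vector multiplications that occur in each iteration are run on the graphics processor and analyzed. Dense matrix-matrix multiplication and dense matrix-vector multiplication on the graphics processor are analyzed as well. Finally, a finite element problem is solved on both the graphics processor and the central processor, and the results are compared.
This thesis uses NVIDIA's CUDA technology and two NVIDIA graphics processors: the Fermi-architecture GeForce GTX 580 and the Kepler-architecture GeForce GTX TITAN.
In the tests, when solving the finite element problem, the GeForce GTX 580 was 79.09 times faster than a single core of the Intel® Core™ i5-2500 central processor, and the GeForce GTX TITAN was 93.14 times faster.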
In recent years, graphics processors have surpassed central processors in floating-point capability, and they now support double-precision floating-point operations. For numerous, highly repetitive computations, the graphics processor is therefore more efficient than the central processor.
In finite element computations, most of the computation time is spent solving a set of linear equations. In this thesis, the Jacobi conjugate gradient method is used to solve them. The iterative process consists of vector inner products, vector-vector addition and multiplication, and sparse matrix-vector multiplication; these computations are carried out and analyzed on the graphics processor. Full matrix-matrix multiplication and full matrix-vector multiplication are carried out and analyzed as well. Finally, a finite element problem is solved on the graphics processor and on the central processor, and the results are compared.
This thesis uses NVIDIA's CUDA (Compute Unified Device Architecture) technology and two NVIDIA graphics processors: the Fermi-architecture GTX 580 and the Kepler-architecture GTX TITAN. The test results show that the GTX 580 and the GTX TITAN are 79.09 and 93.14 times faster, respectively, than a single core of the Intel® Core™ i5-2500.
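This record does not reproduce the thesis's CUDA kernels. As an illustration of the solver the abstract describes, the following is a minimal CPU-side sketch in Python of a Jacobi-preconditioned conjugate gradient iteration, built from the three operations the abstract names: the vector inner product, the scaled vector addition, and the sparse matrix-vector product. The CSR storage layout, the function names (`dot`, `axpy`, `csr_matvec`, `csr_diagonal`, `jacobi_pcg`), and the relative-residual stopping rule are assumptions made for this sketch and are not taken from the thesis.

```python
from typing import List, Tuple

Vector = List[float]


def dot(x: Vector, y: Vector) -> float:
    """Vector inner product."""
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha: float, x: Vector, y: Vector) -> Vector:
    """Scaled vector addition: returns alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]


def csr_matvec(values: Vector, col_idx: List[int], row_ptr: List[int],
               x: Vector) -> Vector:
    """Sparse matrix-vector product y = A x, with A in CSR form (assumed layout)."""
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y


def csr_diagonal(values: Vector, col_idx: List[int],
                 row_ptr: List[int]) -> Vector:
    """Extract the diagonal of A; the Jacobi preconditioner is its inverse."""
    diag = [0.0] * (len(row_ptr) - 1)
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    return diag


def jacobi_pcg(values: Vector, col_idx: List[int], row_ptr: List[int],
               b: Vector, tol: float = 1e-10,
               max_iter: int = 1000) -> Tuple[Vector, int]:
    """Solve A x = b for symmetric positive definite A.

    Returns the approximate solution and the number of iterations used.
    Stops when ||r|| <= tol * ||b|| (an assumed stopping rule).
    """
    n = len(b)
    inv_diag = [1.0 / d for d in csr_diagonal(values, col_idx, row_ptr)]
    x = [0.0] * n
    b_norm2 = dot(b, b)
    if b_norm2 == 0.0:
        return x, 0
    r = list(b)                                   # r = b - A*0
    z = [ri * mi for ri, mi in zip(r, inv_diag)]  # z = M^-1 r
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, ap)
        x = axpy(alpha, p, x)                     # x = x + alpha * p
        r = axpy(-alpha, ap, r)                   # r = r - alpha * A p
        if dot(r, r) <= tol * tol * b_norm2:
            return x, it
        z = [ri * mi for ri, mi in zip(r, inv_diag)]
        rz_new = dot(r, z)
        p = axpy(rz_new / rz, p, z)               # p = z + beta * p
        rz = rz_new
    return x, max_iter
```

Each iteration performs one sparse matrix-vector product, two inner products for the step lengths plus one for the residual check, and three scaled vector additions. These are independent element-wise or row-wise loops, which is why they map naturally onto a graphics processor.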