National Cheng Kung University Theses and Dissertations System


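Graduate student:	賴韋諺 Lai, Wei-Yan
Thesis title:	使用CUDA及圖形處理器作有限元素法計算分析 Finite Element Analysis with CUDA and Graphics Processor
Advisor:	何旭彬 Ho, Shi-Pin
Degree:	Master
Department:	College of Engineering - Department of Mechanical Engineering
Year of publication:	2013
Academic year of graduation:	101 (ROC calendar)
Language:	Chinese
Number of pages:	60
Chinese keywords:	圖形處理器 (graphics processor), 有限元素法 (finite element method), CUDA
English keywords:	graphic processor, finite element, CUDA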

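In recent years, the floating-point performance of graphics processors has far surpassed that of central processors, and graphics processors now support double-precision floating-point arithmetic. For large volumes of highly repetitive computation, a graphics processor is more efficient than a central processor.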

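Solving the system of linear equations produced by the finite element method accounts for a large share of the total computation. This thesis solves that system with the conjugate gradient method combined with a Jacobi preconditioner. We run the operations of the iteration on the graphics processor and analyze them: the vector inner product, vector addition and multiplication, and sparse matrix-vector multiplication. In addition, we analyze full matrix-matrix multiplication and full matrix-vector multiplication on the graphics processor. Finally, a finite element problem is solved on the graphics processor and on the central processor, and the results are compared.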

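This thesis uses NVIDIA's CUDA technology and two NVIDIA graphics processors: the Fermi-architecture GeForce GTX 580 and the Kepler-architecture GeForce GTX TITAN.
The tests show that, when solving the finite element problem, the GeForce GTX 580 is 79.09 times faster than a single core of the Intel® Core™ i5-2500 central processor, and the GeForce GTX TITAN is 93.14 times faster.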

In recent years, graphics processors have surpassed central processors in floating-point capability, and they now support double-precision floating-point operations. For large, highly repetitive computations, a graphics processor is therefore more efficient than a central processor.

In finite element computations, most of the computation time is spent solving a system of linear equations. In this thesis, the Jacobi-preconditioned conjugate gradient method is used to solve that system. The iterative process consists of the vector inner product, vector addition and multiplication, and sparse matrix-vector multiplication. These operations are computed and analyzed on the graphics processor. Full matrix-matrix multiplication and full matrix-vector multiplication are computed and analyzed as well. Finally, a finite element problem is solved on the graphics processor and on the central processor.

This thesis uses CUDA (Compute Unified Device Architecture) technology and graphics processors manufactured by NVIDIA: the Fermi-architecture GTX 580 and the Kepler-architecture GTX TITAN. The test results show that the GTX 580 and the GTX TITAN are 79.09 times and 93.14 times faster, respectively, than a single core of the Intel® Core™ i5-2500.
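
To make the solver named in the abstract concrete, the following is a minimal serial sketch of a Jacobi-preconditioned conjugate gradient iteration built from the three operations the abstract lists: the vector inner product, vector addition and multiplication, and sparse matrix-vector multiplication. It is not the thesis's CUDA implementation. The function names, the compressed sparse row (CSR) storage layout, the relative-residual stopping rule, and the small test matrix are all assumptions made for illustration, and the matrix is assumed to be symmetric positive definite.

```python
def csr_matvec(n, row_ptr, col_idx, values, x):
    """Return y = A x for an n-by-n matrix A stored in CSR form (assumed layout)."""
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y


def dot(u, v):
    """Vector inner product."""
    return sum(a * b for a, b in zip(u, v))


def axpy(alpha, x, y):
    """Return alpha * x + y (vector addition and multiplication)."""
    return [alpha * a + b for a, b in zip(x, y)]


def jacobi_pcg(n, row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with Jacobi-preconditioned CG; A is assumed SPD.

    Returns the approximate solution and the number of iterations used.
    """
    # Jacobi preconditioner: the inverse of the diagonal of A.
    inv_diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                inv_diag[i] = 1.0 / values[k]
        if inv_diag[i] == 0.0:
            raise ValueError("row %d has no stored diagonal entry" % i)

    x = [0.0] * n
    r = list(b)                                   # residual for x = 0
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)

    for it in range(1, max_iter + 1):
        q = csr_matvec(n, row_ptr, col_idx, values, p)
        alpha = rz / dot(p, q)
        x = axpy(alpha, p, x)                     # x <- x + alpha * p
        r = axpy(-alpha, q, r)                    # r <- r - alpha * A p
        if dot(r, r) ** 0.5 <= tol * b_norm:      # relative residual test
            return x, it
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = axpy(rz_new / rz, p, z)               # p <- z + beta * p
        rz = rz_new
    return x, max_iter


# Hypothetical example: 3x3 tridiagonal matrix [[2,-1,0],[-1,2,-1],[0,-1,2]].
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
solution, iterations = jacobi_pcg(3, row_ptr, col_idx, values, [1.0, 0.0, 1.0])
print(solution, iterations)
```

For this small system the exact solution is the all-ones vector, so the printed solution should be close to it. Each iteration performs one sparse matrix-vector product, two or three inner products, and three vector updates, which is why those operations are the natural candidates for GPU execution.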

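Abstract (Chinese)	I
Abstract (English)	II
Acknowledgements	III
Table of Contents	IV
List of Tables	VI
List of Figures	VII
Nomenclature	IX
Chapter 1 Introduction	1
1.1 Motivation	1
1.2 Literature Review	5
1.3 Thesis Organization	5
Chapter 2 Related Theory	7
2.1 Preconditioned Conjugate Gradient Method	7
2.2 Data Storage Format	10
Chapter 3 Graphics Processor Architecture	12
3.1 Review	12
3.2 CUDA	13
3.3 Fermi Architecture	14
3.4 Kepler Architecture	16
3.5 Memory Architecture	20
3.6 Operating Model	22
3.6.1 Thread Hierarchy	22
3.6.2 Memory Hierarchy	23
3.6.3 Heterogeneous Computing	24
3.6.4 Compute Capability	25
Chapter 4 Performance Optimization Assessment	26
4.1 Memory Optimization	26
4.1.1 Global Memory	26
4.1.2 Constant Memory	28
4.1.3 Texture Memory	28
4.1.4 Shared Memory	28
4.2 Code Optimization	31
4.2.1 Blocks and Warps	31
4.2.2 Avoiding Data Transfers Between Host and Device	31
4.2.3 Use of Control Flow Instructions	31
Chapter 5 Results	32
5.1 Vector Inner Product	33
5.2 Vector Addition and Multiplication	39
5.3 Full Matrix-Matrix Multiplication	41
5.4 Full Matrix-Vector Multiplication	43
5.5 Sparse Matrix-Vector Multiplication	47
5.6 Solving with the B-Spline Finite Element Method	51
Chapter 6 Conclusions	57
References	58
Vita	60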

                                    


On campus: publicly available from 2016-07-31
Off campus: publicly available from 2018-07-31
