
Author: Chang, Wei-Sen (張惟森)
Title: Development of a Split AUSM Method Applied to Transient Compressible Flow Using Multiple GPUs and OpenMP Parallelization
Advisor: Matthew Smith (李汶樺)
Degree: Master
Department: Department of Mechanical Engineering, College of Engineering
Year of Publication: 2016
Academic Year of Graduation: 104 (ROC calendar)
Language: English
Pages: 150
Keywords: Parallel Computing, Computational Fluid Dynamics (CFD), Finite Volume Method (FVM), Split Advection Upstream Splitting Method (Split AUSM), Advection Upstream Splitting Method (AUSM), Open Multi-Processing (OpenMP), Graphics Processing Unit (GPU), Multi-GPUs, CUDA

    Conventional Computational Fluid Dynamics (CFD) simulations are usually expensive in multiple dimensions: obtaining more precise results requires a finer computational mesh, which in turn demands much more computation time. To reduce this cost, parallel computing is applied to CFD simulation. Programmers have traditionally employed techniques such as OpenMP, which computes with multiple CPU cores, and GPU computing, which uses multiple CUDA cores; both can effectively share the workload and reduce computation time. To achieve still higher performance for CFD simulations, this study integrates the OpenMP technique with the GPU architecture into a multiple-GPU computing model. OpenMP generates several threads, each of which controls one GPU; because the threads work simultaneously, the GPUs share the workload concurrently. Hence, the hybrid GPU/OpenMP technique presented in this thesis can be several times faster than a single GPU.
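
    As a concrete illustration of this model, the sketch below pairs one OpenMP thread with one GPU via cudaSetDevice(). It is only a minimal outline under assumed names (the kernel update_cells and a uniform split of N cells across devices are inventions for the example); the thesis's own kernels, and the halo exchange between neighbouring slices that a real solver needs, are not shown.

        #include <stdio.h>
        #include <omp.h>
        #include <cuda_runtime.h>

        /* Hypothetical kernel standing in for the finite-volume update;
           the thesis's actual flux and state-update kernels are not shown. */
        __global__ void update_cells(float *u, int n)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) u[i] += 1.0f;   /* placeholder cell update */
        }

        int main(void)
        {
            int ngpu = 0;
            cudaGetDeviceCount(&ngpu);
            if (ngpu < 1) { fprintf(stderr, "no CUDA device found\n"); return 1; }

            const int N = 1 << 20;        /* total cell count (assumed) */
            const int chunk = N / ngpu;   /* uniform domain decomposition (assumed) */

            /* One OpenMP thread per GPU: each thread binds to its own device
               and launches kernels on its own slice, so the GPUs run
               concurrently. Boundary (halo) exchange is omitted here. */
            #pragma omp parallel num_threads(ngpu)
            {
                int tid = omp_get_thread_num();
                cudaSetDevice(tid);                     /* thread tid -> GPU tid */

                float *d_u;
                cudaMalloc((void **)&d_u, chunk * sizeof(float));
                cudaMemset(d_u, 0, chunk * sizeof(float));

                update_cells<<<(chunk + 255) / 256, 256>>>(d_u, chunk);
                cudaDeviceSynchronize();                /* wait for this GPU only */

                cudaFree(d_u);
            }
            printf("work shared across %d GPU(s)\n", ngpu);
            return 0;
        }

    A file organized this way compiles with, e.g., nvcc -Xcompiler -fopenmp. Because each OpenMP thread issues work to a different device, kernel launches overlap across GPUs without any explicit CUDA streams.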

    This study presents a new method, the Split Advection Upstream Splitting Method (Split AUSM), based on the Finite Volume Method (FVM) for solving governing equations such as the Euler equations. Unlike the conventional AUSM scheme, the Split AUSM scheme is a purely vector-split flux method and is therefore well suited to GPU-based parallelization. In addition, the Split AUSM method is extended, via a Taylor series expansion, to second-order spatial reconstruction to obtain higher resolution. Commonly, a second-order TVD-MUSCL method with a limiter such as the MC limiter is used to reconstruct the fluxes at the cell interface. Because conventional MUSCL reconstruction can over-reconstruct and generate oscillations, the MC limiter with limited reconstruction is employed instead and is demonstrated to provide much more stable results.
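
    For context, the conventional AUSM flux of Liou and Steffen (1993), from which the Split AUSM scheme departs, splits the interface Mach number and pressure into upwind contributions. The following first-order sketch for the one-dimensional Euler equations is a standard rendering of that conventional scheme, not the thesis's Split AUSM variant; the State layout and gamma = 1.4 are assumptions for the example.

        #include <math.h>

        #define GAMMA 1.4   /* ratio of specific heats (assumed: ideal air) */

        /* Primitive state on one side of a cell interface (assumed layout). */
        typedef struct { double rho, u, p; } State;

        /* Split Mach numbers: M+/- = +/-(M +/- 1)^2 / 4 for |M| <= 1,
           else (M +/- |M|) / 2. */
        static double mach_plus(double M)
        { return (fabs(M) <= 1.0) ? 0.25 * (M + 1.0) * (M + 1.0)
                                  : 0.5 * (M + fabs(M)); }

        static double mach_minus(double M)
        { return (fabs(M) <= 1.0) ? -0.25 * (M - 1.0) * (M - 1.0)
                                  : 0.5 * (M - fabs(M)); }

        /* Split pressures: p+/- = p (M +/- 1)^2 (2 -/+ M) / 4 for |M| <= 1,
           else p (M +/- |M|) / (2 M). */
        static double p_plus(double p, double M)
        { return (fabs(M) <= 1.0) ? 0.25 * p * (M + 1.0) * (M + 1.0) * (2.0 - M)
                                  : 0.5 * p * (M + fabs(M)) / M; }

        static double p_minus(double p, double M)
        { return (fabs(M) <= 1.0) ? 0.25 * p * (M - 1.0) * (M - 1.0) * (2.0 + M)
                                  : 0.5 * p * (M - fabs(M)) / M; }

        /* First-order AUSM interface flux for the 1D Euler equations:
           F[0..2] = mass, momentum, energy fluxes. */
        void ausm_flux(State L, State R, double F[3])
        {
            double aL = sqrt(GAMMA * L.p / L.rho), aR = sqrt(GAMMA * R.p / R.rho);
            double ML = L.u / aL, MR = R.u / aR;

            double Mh = mach_plus(ML) + mach_minus(MR);     /* interface Mach     */
            double ph = p_plus(L.p, ML) + p_minus(R.p, MR); /* interface pressure */

            /* Convected vector Phi = (rho a, rho a u, rho a H), upwinded on
               the sign of the interface Mach number; H is total enthalpy. */
            double HL = GAMMA / (GAMMA - 1.0) * L.p / L.rho + 0.5 * L.u * L.u;
            double HR = GAMMA / (GAMMA - 1.0) * R.p / R.rho + 0.5 * R.u * R.u;
            const State *S = (Mh >= 0.0) ? &L : &R;
            double a = (Mh >= 0.0) ? aL : aR, H = (Mh >= 0.0) ? HL : HR;

            F[0] = Mh * S->rho * a;
            F[1] = Mh * S->rho * a * S->u + ph;
            F[2] = Mh * S->rho * a * H;
        }

    The upwind selection on the sign of Mh is what a vector-split scheme removes: a fully split flux evaluates fixed left and right contributions without branching, which maps more naturally onto GPU threads.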

    To demonstrate that the Split AUSM method works efficiently with the multi-GPU technique, several benchmarks are discussed: the one-dimensional shock tube and shock-acoustic wave interaction problems, and the two-dimensional blast wave, Euler four-contact, Euler four-shock, and shock-bubble interaction problems. The speedup is evaluated using one GTX Titan X GPU, two GTX Titan Z GPUs (four GK110 chips in total), and three Tesla M2070 GPUs, each compared against a single core of an Intel i5-4590 CPU.
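
    The one-dimensional shock tube benchmark is the classic Sod (1978) problem; a minimal sketch of its standard initial condition follows, reusing the State struct from the sketch above. The 400-cell mesh is an assumed choice, not taken from the thesis.

        /* Sod (1978) shock tube initial condition:
           left  (rho, u, p) = (1.0,   0.0, 1.0)
           right (rho, u, p) = (0.125, 0.0, 0.1), diaphragm at x = 0.5. */
        #define NCELL 400                  /* mesh size (assumed) */

        void init_sod(State cells[NCELL])
        {
            for (int i = 0; i < NCELL; i++) {
                double x = (i + 0.5) / NCELL;     /* cell centre in [0, 1] */
                cells[i] = (x < 0.5) ? (State){1.0, 0.0, 1.0}
                                     : (State){0.125, 0.0, 0.1};
            }
        }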

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Nomenclature
    Chapter 1  Introduction
      1.1 Background and Motivation
      1.2 Governing Equations
        1.2.1 Conservation Equations
        1.2.2 Navier-Stokes and Euler Equations
      1.3 Finite Volume Methods
        1.3.1 CFL Number
      1.4 High Resolution Schemes
        1.4.1 Total Variation Diminishing Scheme
        1.4.2 Weighted Essentially Non-Oscillatory Scheme
      1.5 Flux Schemes
        1.5.1 Rusanov Flux Method
        1.5.2 Harten, Lax and Van Leer (HLL) Method
        1.5.3 HLLG Method
        1.5.4 Split HLL (SHLL) Method
        1.5.5 Advection Upstream Splitting Method (AUSM)
      1.6 Parallel Computing Theory
        1.6.1 Parallel Computing Theory
        1.6.2 Open Multi-Processing (OpenMP)
      1.7 Graphics Processing Unit (GPU)
        1.7.1 CUDA Memory Architecture
        1.7.2 Single Instruction Multiple Threads (SIMT) Architecture
        1.7.3 CUDA Threads, Blocks and Grids
          (I) One-dimensional Blocks and One-dimensional Threads
          (II) One-dimensional Blocks and Two-dimensional Threads
          (III) Two-dimensional Blocks and Two-dimensional Threads
        1.7.4 CUDA API
        1.7.5 Fermi Architecture
        1.7.6 Kepler Architecture
        1.7.7 Maxwell Architecture
    Chapter 2  Methodology
      2.1 Split Advection Upstream Splitting Method
      2.2 Second Order Extension of Spatial Reconstruction
        2.2.1 TVD-MUSCL Schemes
        2.2.2 Limited Spatial Reconstruction Method
      2.3 OpenMP Parallelization
      2.4 GPU Parallelization
        2.4.1 Memory Management
        2.4.2 GPU Kernels Launched by CPU
        2.4.3 GPU Kernels Launched by GPU
        2.4.4 Compilation of CUDA in Linux
        2.4.5 Optimization Options
      2.5 Multi-GPU Parallelization
        2.5.1 Data Structure
        2.5.2 Connection with Multiple GPUs and OpenMP
        2.5.3 Multi-GPU Implementation
    Chapter 3  Results and Discussion
      3.1 One-dimensional Shock Tube Problem
      3.2 Shock-acoustic Wave Interaction
      3.3 Two-dimensional Blast Wave Simulation
      3.4 Euler Four Contacts Interaction
      3.5 Euler Four Shocks Interaction
      3.6 Two-dimensional Shock Bubble Interaction
      3.7 Analysis of Parallel Performance
        3.7.1 Computational Performance of Split AUSM Scheme
        3.7.2 Parallel Performance Using Multi-GPU Technique
    Chapter 4  Conclusion
    References
    Tables
    Figures

    Full-text access: available on campus from 2018-06-30; off campus from 2018-06-30.