
Graduate Student: Chang, Chieh-Ming (張介明)
Thesis Title: Application of unstructured and adaptive mesh Finite Volume Methods applied to GPU parallel computing (運用非結構性及適應性網格的有限體積法應用在圖形顯示晶片的平行計算)
Advisor: Matthew R. Smith (李汶樺)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of Publication: 2016
Graduation Academic Year: 104 (ROC calendar)
Language: English
Number of Pages: 141
Keywords: Computational Fluid Dynamics (CFD), Unstructured grid, Tetrahedral grid, Adaptive Mesh Refinement (AMR), Approximate Riemann Solver, HLL method, Split HLL (SHLL) method, Parallel Computing, Graphics Processing Unit (GPU), CUDA
Usage: 152 views; 2 downloads

    Conventional transient Computational Fluid Dynamics (CFD) has focused mainly on structured Cartesian grids. As devices continue to shrink, more components are packed into the same volume and the computational domains become increasingly complex. In such cases it is difficult to keep using a structured Cartesian grid, because the mesh cannot conform to the actual geometry. One part of this study applies an unstructured tetrahedral grid to handle complex three-dimensional geometries. To develop the unstructured solver, we first implement a two-dimensional unstructured Cartesian grid and then extend it to a three-dimensional unstructured tetrahedral grid. The unstructured solver is expensive to execute, however, because accessing the grid connectivity information is time-consuming, so we employ the parallel computing architecture of Graphics Processing Units (GPUs) to accelerate the computation. Using the analytical Riemann solver, an NVIDIA GTX Titan Black achieves a speed-up of approximately 59x over a single core of an Intel i5-4590 CPU, and the speed-up increases to roughly 70x when texture memory is employed.
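    As a rough illustration of the GPU parallelization strategy described above, the sketch below assigns one CUDA thread to each cell of an unstructured grid and gathers neighbour data through the read-only data cache (`__ldg`), which is served by the texture units on Kepler-class GPUs such as the GTX Titan Black. The array names (`rho`, `neighbor`, `face_area`), the flat per-cell face layout, and the scalar Rusanov-style update are illustrative assumptions, not the thesis implementation.

```cuda
#include <cuda_runtime.h>

// One thread per cell: accumulate face fluxes from the per-cell neighbour list.
// Layout assumption (hypothetical): entry (c, f) of neighbor[] and face_area[]
// is stored at c * faces_per_cell + f.
__global__ void cell_flux_kernel(const double* __restrict__ rho,       // cell-averaged density
                                 const int*    __restrict__ neighbor,  // neighbour cell index per face (-1 = boundary)
                                 const double* __restrict__ face_area,
                                 double*       rho_new,
                                 int num_cells, int faces_per_cell,
                                 double dt_over_vol, double wave_speed)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= num_cells) return;

    double rho_c = __ldg(&rho[c]);           // read-only cache load (texture path on Kepler)
    double net_flux = 0.0;

    for (int f = 0; f < faces_per_cell; ++f) {
        int nb = __ldg(&neighbor[c * faces_per_cell + f]);
        if (nb < 0) continue;                 // boundary face: treated as zero flux here
        double rho_nb = __ldg(&rho[nb]);      // irregular gather of the neighbour state
        double area   = __ldg(&face_area[c * faces_per_cell + f]);
        // Rusanov-style dissipative flux for a placeholder scalar quantity.
        net_flux += 0.5 * wave_speed * (rho_c - rho_nb) * area;
    }
    rho_new[c] = rho_c - dt_over_vol * net_flux;
}
```

    A launch such as `cell_flux_kernel<<<(num_cells + 255) / 256, 256>>>(...)` updates every cell in one pass; the thesis solver evaluates full Euler fluxes with an approximate Riemann solver at each face, but the irregular gather over per-cell neighbour lists, which motivates the texture-path loads, is of the same kind.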
    The other part of this research applies an Adaptive Mesh Refinement (AMR) algorithm to transient simulations. Within a flow field, some regions may be under-resolved, making it difficult to capture the flow features, while others may be over-resolved, wasting computational effort. During the computation we use a dimensionless quantity, the local normalized density gradient (|∇ρ|/ρ)·∆x, to flag under-resolved cells for splitting into smaller cells and over-resolved cells for merging with their neighbours. Although the AMR technique greatly reduces the total flux computation, its overall performance remains roughly an order of magnitude below that of the structured-grid solver, because GPU memory bandwidth and the time required for mesh adaptation limit the gain. Another drawback of AMR for transient flows is that the adaptation can feed back into the physics of the flow field and produce non-physical flow features that may not appear on a structured grid with the same minimum cell size.
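    A minimal sketch of the cell-flagging step implied by this criterion is given below, assuming the local normalized density gradient is approximated from face-neighbour density differences and compared against two hypothetical thresholds (`refine_tol`, `coarsen_tol`); the actual splitting, merging, and neighbour-rebuilding logic in the thesis is far more involved.

```cuda
#include <cuda_runtime.h>

enum AdaptFlag { KEEP = 0, REFINE = 1, COARSEN = 2 };

// One thread per cell: estimate the normalized density gradient and flag the cell.
// The gradient estimate (maximum face-neighbour difference) is a simplification
// chosen for illustration: |grad rho| ~ max|drho| / dx, so (|grad rho| / rho) * dx
// reduces to max|drho| / rho and the dx factors cancel.
__global__ void amr_flag_kernel(const double* __restrict__ rho,
                                const int*    __restrict__ neighbor,   // per-face neighbour indices (-1 = boundary)
                                int* flag,
                                int num_cells, int faces_per_cell,
                                double refine_tol, double coarsen_tol)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= num_cells) return;

    double rho_c = rho[c];
    double max_drho = 0.0;
    for (int f = 0; f < faces_per_cell; ++f) {
        int nb = neighbor[c * faces_per_cell + f];
        if (nb < 0) continue;                        // skip boundary faces
        max_drho = fmax(max_drho, fabs(rho[nb] - rho_c));
    }
    double indicator = max_drho / rho_c;             // local normalized density gradient estimate

    if      (indicator > refine_tol)  flag[c] = REFINE;
    else if (indicator < coarsen_tol) flag[c] = COARSEN;
    else                              flag[c] = KEEP;
}
```

    Cells flagged REFINE would subsequently be split into smaller cells and COARSEN-flagged siblings merged, after which the neighbour lists must be rebuilt; it is this rebuild, together with the associated GPU memory traffic, that the paragraph above identifies as the main performance limit.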
    In addition to the analytical Riemann solver, this research also applies the HLL (Harten, Lax and van Leer) method and the Split HLL (SHLL) method on the tetrahedral grid. Although the speed-up achieved with the HLL and SHLL methods is lower than that of the analytical Riemann solver, the total computation time is reduced dramatically. Because the flux calculation is simplified, however, numerical diffusion appears and influences the final results. A TVD-type scheme is also applied in the adaptive-mesh calculations to raise the spatial accuracy to second order, and the MINMOD limiter is employed to avoid non-physical oscillations in space.
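    For reference, the HLL flux and the MINMOD limiter mentioned above have standard closed forms; the sketch below writes them as CUDA device helpers for the one-dimensional Euler equations with Davis-type wave-speed estimates. The state layout and names are illustrative and not taken from the thesis code.

```cuda
#include <math.h>

struct Euler1D { double rho, mom, E; };   // conserved variables: density, momentum, total energy

__host__ __device__ inline Euler1D physical_flux(const Euler1D& U, double gamma)
{
    double u = U.mom / U.rho;
    double p = (gamma - 1.0) * (U.E - 0.5 * U.rho * u * u);
    return { U.mom, U.mom * u + p, u * (U.E + p) };
}

// HLL flux: F = (S_R F_L - S_L F_R + S_L S_R (U_R - U_L)) / (S_R - S_L)
// when S_L < 0 < S_R, with upwind fall-backs otherwise.
__host__ __device__ inline Euler1D hll_flux(const Euler1D& UL, const Euler1D& UR, double gamma)
{
    double uL = UL.mom / UL.rho, uR = UR.mom / UR.rho;
    double pL = (gamma - 1.0) * (UL.E - 0.5 * UL.rho * uL * uL);
    double pR = (gamma - 1.0) * (UR.E - 0.5 * UR.rho * uR * uR);
    double aL = sqrt(gamma * pL / UL.rho), aR = sqrt(gamma * pR / UR.rho);

    // Davis wave-speed estimates
    double SL = fmin(uL - aL, uR - aR);
    double SR = fmax(uL + aL, uR + aR);

    Euler1D FL = physical_flux(UL, gamma);
    Euler1D FR = physical_flux(UR, gamma);

    if (SL >= 0.0) return FL;              // flow entirely to the right
    if (SR <= 0.0) return FR;              // flow entirely to the left
    double inv = 1.0 / (SR - SL);
    return { (SR * FL.rho - SL * FR.rho + SL * SR * (UR.rho - UL.rho)) * inv,
             (SR * FL.mom - SL * FR.mom + SL * SR * (UR.mom - UL.mom)) * inv,
             (SR * FL.E   - SL * FR.E   + SL * SR * (UR.E   - UL.E  )) * inv };
}

// MINMOD slope limiter used in the second-order (TVD) reconstruction.
__host__ __device__ inline double minmod(double a, double b)
{
    if (a * b <= 0.0) return 0.0;
    return (a > 0.0) ? fmin(a, b) : fmax(a, b);
}
```

    Roughly speaking, the SHLL variant estimates each wave speed from one side only, so the flux separates into a left-state and a right-state contribution that can be evaluated independently; this maps well onto GPU threads but is also the source of the extra numerical diffusion noted above.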

    Table of Contents:
    Chinese Abstract; Abstract; Acknowledgements; Contents; List of Tables; List of Figures; Nomenclature
    Chapter 1 - Introduction
      1.1 Motivation and Background
      1.2 Governing Equations
        1.2.1 Advection-Diffusion Equation
        1.2.2 Navier-Stokes and Euler Equations
      1.3 Computational Fluid Dynamics
        1.3.1 Finite Volume Method
        1.3.2 CFL Number
        1.3.3 Higher Order Method
        1.3.4 Adaptive Mesh Refinement Method
      1.4 Riemann Problem and Riemann Solver
        1.4.1 Elementary Wave Solutions of the Riemann Problem
        1.4.2 Analytical Solution for the Riemann Problem
        1.4.3 Rusanov Method
        1.4.4 Harten, Lax and Van Leer (HLL) Method
        1.4.5 Split HLL (SHLL) Method
      1.5 Parallel Computing
        1.5.1 Parallelization Theory
        1.5.2 Message Passing Interface (MPI)
        1.5.3 Open Multi-Processing (OpenMP)
      1.6 General-Purpose Computing on Graphics Processing Units
        1.6.1 GPU Memory
        1.6.2 CUDA Threads, Blocks, Grids
        1.6.3 CUDA API
        1.6.4 Kepler Architecture
        1.6.5 Maxwell Architecture
    Chapter 2 - Methodology
      2.1 Implementation
        2.1.1 Unstructured Finite Volume Method (FVM) Implementation
        2.1.2 Flux Solvers
          2.1.2.1 Approximate Riemann Solver Implementation
          2.1.2.2 HLL Method Implementation
          2.1.2.3 SHLL Method Implementation
        2.1.3 Unstructured Tetrahedral Grid Generation
        2.1.4 AMR Implementation
          2.1.4.1 Refining Criterion
          2.1.4.2 Cell Splitting Procedures
          2.1.4.3 Cell Merging Procedures
          2.1.4.4 Organization of Cell Neighbors
          2.1.4.5 Flux Calculation - 2nd Order Extension
          2.1.4.6 Additional Truncation Error Derivative
          2.1.4.7 Relative Difference and Performance
        2.1.5 Unstructured Visualization
      2.2 Data Structure
        2.2.1 Cartesian Grid
        2.2.2 Unstructured Cartesian Grid
        2.2.3 Unstructured Tetrahedral Grid
        2.2.4 Adaptive Mesh Refinement (AMR) Techniques
      2.3 GPU Parallelization
        2.3.1 Memory Management on the GPU using the CUDA API
          2.3.1.1 Shared Memory
          2.3.1.2 Texture Memory
        2.3.2 CPU-Launched GPU Kernels (Global Functions)
        2.3.3 GPU-Launched GPU Kernels (Device Functions)
        2.3.4 Compiling and Building the Solver
    Chapter 3 - Results
      3.1 One-Dimensional Shock Tube Problem
      3.2 Numerical Diffusion Test
      3.3 Forward Facing Step
      3.4 Blast Wave
      3.5 Analysis of Parallel Performance
      3.6 Influence of AMR on Accuracy for Transient Flows
    Chapter 4 - Conclusion
    References
    Tables
    Figures


    Full-Text Availability: on campus from 2019-06-30; off campus from 2020-06-30