
Graduate Student: 莊濬維 (Zhuang, Jun-Wei)
Thesis Title: 利用深度強化學習尋找最佳傳播軌跡
Finding Optimal Trajectories in Front Propagation Problem by Deep Reinforcement Learning
Advisor: 劉育佑 (Liu, Yu-Yu)
Degree: Master
Department: College of Science, Department of Mathematics (Master's and Doctoral Program in Applied Mathematics)
Year of Publication: 2026
Academic Year of Graduation: 114
Language: English
Number of Pages: 41
Chinese Keywords: 最佳控制軌跡、紊流燃燒速度、深度強化學習
English Keywords: Optimal Control Trajectories, Turbulent Flame Speed, Deep Reinforcement Learning
Access Count: Views: 22, Downloads: 0
  • This study uses reinforcement learning to investigate control policies for optimal propagation trajectories in flow fields; the problem originates from the flame propagation speeds described by the interface model of turbulent combustion. A deep reinforcement learning algorithm is adopted, and the agent is trained with TD3. In the steady cellular flow, the training results agree with numerical simulations of the interface combustion model; in unsteady cellular flows, control policies for the flow fields are obtained and propagation trajectories are generated. Previous work could not apply descent through stochastic perturbation to chaotic flows; this work uses reinforcement learning to handle the optimal control problem in unsteady cellular flows.

    In this thesis we investigate optimal control policies for trajectories propagating in flow fields. The problem originates from the flame propagation speeds described by the G-equation model in turbulent combustion. We set up a deep reinforcement learning environment and train the agent with the twin delayed deep deterministic policy gradient (TD3) algorithm. For the steady cellular flow, the training results are consistent with numerical simulations of the G-equation. For unsteady cellular flows, the previous approach of descent through stochastic perturbation fails because the flows are chaotic; here the control policies for these flow fields are still obtained, and the trajectories generated from them yield the propagation speeds.
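    The record does not reproduce the underlying model, so for orientation here is the standard inviscid G-equation together with its control representation, in the form used throughout the cited literature ([6], [8], [14]); the notation s_L (laminar flame speed), the flow intensity A, and the cellular stream function ψ follow the usual conventions of those references and are assumed here rather than quoted from the thesis:

    \[ G_t + V(x,t)\cdot\nabla G = s_L\,\lvert\nabla G\rvert \]
    \[ \dot{X}(t) = V(X(t),t) + s_L\,a(t), \qquad \lvert a(t)\rvert \le 1 \]
    \[ s_T(e) = \lim_{t\to\infty}\,\sup_{\lvert a\rvert\le 1}\,\frac{X(t)\cdot e}{t}, \qquad V = (-\psi_y,\,\psi_x), \quad \psi(x,y) = A\sin x\,\sin y \]

    In this reading the agent's action plays the role of the control a(t), and the long-run progress X(t)·e is what the reward of the DRL environment has to encode.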
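    As a concrete illustration of such an environment, the following minimal sketch pairs the steady cellular flow above with an off-the-shelf TD3 implementation (Gymnasium plus stable-baselines3). Everything here is an illustrative assumption rather than the thesis code: the class name CellularFlowEnv, the per-step reward (horizontal progress), and the constants A, SL, DT, HORIZON are hypothetical choices.

    # Minimal sketch, not the thesis code: TD3 steering a point through
    # the steady cellular flow psi = A sin(x) sin(y); constants are illustrative.
    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces
    from stable_baselines3 import TD3

    A, SL, DT, HORIZON = 1.0, 1.0, 0.05, 400

    class CellularFlowEnv(gym.Env):
        """Particle with dynamics x' = V(x) + SL * a, |a| <= 1."""
        def __init__(self):
            super().__init__()
            self.observation_space = spaces.Box(0.0, 2*np.pi, shape=(2,), dtype=np.float32)
            self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

        def _velocity(self, x):
            # V = (-psi_y, psi_x) for the stream function psi = A sin(x) sin(y)
            return np.array([-A*np.sin(x[0])*np.cos(x[1]), A*np.cos(x[0])*np.sin(x[1])])

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            self.x = np.array([0.0, np.pi/2])
            self.t = 0
            return np.mod(self.x, 2*np.pi).astype(np.float32), {}

        def step(self, action):
            a = action / max(1.0, float(np.linalg.norm(action)))  # project onto the unit ball
            x_old = self.x[0]
            self.x = self.x + DT*(self._velocity(self.x) + SL*a)  # explicit Euler step
            self.t += 1
            reward = float(self.x[0] - x_old)                     # horizontal progress per step
            obs = np.mod(self.x, 2*np.pi).astype(np.float32)
            return obs, reward, False, self.t >= HORIZON, {}

    model = TD3("MlpPolicy", CellularFlowEnv(), verbose=0)
    model.learn(total_timesteps=20_000)

    Averaging the trained agent's horizontal displacement over long rollouts, X(t)·e / t, would then give an estimate of the propagation speed; the thesis's actual state representation, reward shaping, and hyperparameters may differ.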

    Abstract (Chinese) I
    Abstract (English) II
    Acknowledgements III
    Contents IV
    List of Tables V
    List of Figures VI
    1 Introduction 1
    2 Reinforcement Learning 4
      2.1 Basic Definitions 4
      2.2 Markov Decision Process (MDP) 6
      2.3 Actor-Critic 7
      2.4 Twin Delayed Deep Deterministic Policy Gradient (TD3) 8
      2.5 Rollout with a Base Policy 12
    3 Trajectory Optimization Problem 15
      3.1 Trajectories in Steady Flows 16
      3.2 Problem Setting for DRL 18
      3.3 Numerical Results for Steady Cellular Flow 20
      3.4 Numerical Results for Unsteady Flows 24
    4 Conclusion 30
    References 31

    [1] Dimitri P. Bertsekas. A Course in Reinforcement Learning. 2nd ed. Belmont, Massachusetts: Athena Scientific, 2025.
    [2] Lawrence C. Evans. Partial Differential Equations. 2nd ed. Vol. 19. American Mathematical Society, 2010.
    [3] Scott Fujimoto, Herke van Hoof, and David Meger. “Addressing Function Approximation Error in Actor-Critic Methods”. In: Proceedings of Machine Learning Research 80 (2018), pp. 1587–1596.
    [4] Peter Gunnarson et al. “Learning efficient navigation in vortical flow fields”. In: Nature Communications 12 (2021), p. 7143.
    [5] Chou Kao, Yu-Yu Liu, and Jack Xin. “A Semi-Lagrangian Computation of Front Speeds of G-Equation in ABC and Kolmogorov Flows with Estimation via Ballistic Orbits”. In: Multiscale Modeling & Simulation 20 (2022).
    [6] Yu-Yu Liu, Jack Xin, and Yifeng Yu. “A numerical study of turbulent flame speeds of curvature and strain G-equations in cellular flows”. In: Physica D: Nonlinear Phenomena 241.23–24 (2012), pp. 2045–2055.
    [7] Yu-Yu Liu and Jack Xin. “Synchronized Front Propagation and Delayed Flame Quenching in Strain G-equation and Time-Periodic Cellular Flows”. In: Minimax Theory and its Applications 8.1 (2023).
    [8] Yu-Yu Liu, Jack Xin, and Yifeng Yu. “Turbulent Flame Speeds of G-equation Models in Unsteady Cellular Flows”. In: Mathematical Modelling of Natural Phenomena 8.3 (2013), pp. 198–205.
    [9] Timothy P. Lillicrap et al. “Continuous control with deep reinforcement learning”. In: arXiv preprint arXiv:1509.02971 (2015).
    [10] Stanley Osher and Ronald Fedkiw. Level Set Methods and Dynamic Implicit Surfaces. Vol. 153. Applied Mathematical Sciences. Springer-Verlag, 2002.
    [11] A. M. Oberman. Level Set Motion by Advection, Growth, and Mean Curvature as a Model for Combustion. Ph.D. Thesis, University of Chicago, 2001.
    [12] Norbert Peters. Turbulent Combustion. Cambridge Monographs on Mechanics. Cambridge University Press, 2000.
    [13] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. 2nd ed. MIT Press, 2018.
    [14] Jack Xin and Yifeng Yu. “Sharp asymptotic growth laws of turbulent flame speeds in cellular flows by inviscid Hamilton-Jacobi models”. In: Annales de l'Institut Henri Poincaré C, Analyse non linéaire 30 (2013), pp. 1049–1068.
    [15] Shih-Hsiang Yen. Finding Optimal Trajectories in Front Propagation Problem by Descent through Stochastic Perturbation. Master's Thesis, National Cheng Kung University, 2022.
