
Author: Fan, Chia-Hsuan (范嘉軒)
Title: Sim-to-Real Framework: Deep Reinforcement Learning Based Autonomous Car Navigation Using Unity3D and ROS2 (虛實整合訓練框架:使用強化學習、Unity3D及ROS2的無人車自動導航)
Advisor: Su, Wen-Yu (蘇文鈺)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2023
Academic year of graduation: 111 (2022-2023)
Language: English
Number of pages: 38
Keywords: sim-to-real, end-to-end, autonomous car, robot, deep reinforcement learning training framework, deep reinforcement learning, navigation, ROS 2, Unity
Views: 115 / Downloads: 20
    This thesis presents a sim-to-real, end-to-end autonomous robot training framework that smoothly transfers an autonomous-car control model from a simulated environment to the real world. Under this framework, the task of navigating an autonomous car with a rear-wheel-drive powertrain to a specified target in a given environment is realized. A pre-trained model is first obtained through deep reinforcement learning in Unity3D®, and that model is then deployed on a physical car in the real world for further training. The car in this study uses only a single 2D LiDAR as its sensor.

    The framework comprises five parts: constructing the simulated environment in Unity3D®, bridging Unity3D® to deep reinforcement learning, pre-training with deep reinforcement learning in the simulated environment, deploying the model on a real-world car with ROS 2, and refining the model through training in the real-world environment. The model was trained for 3000 episodes in simulation and for an additional 500 episodes in the real world. Under consistent scenario settings, the pre-trained model and the real-world-refined model each performed 48 navigation runs in the real world, covering targets at varying positions and distances. The refined model reached the target with a success rate of 96%, 38 percentage points higher than the pre-trained model, and its average time to reach the target was 1 second shorter. These results show that, under this framework, an autonomous car pre-trained in simulation and further trained in the real world can navigate smoothly to its target.
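    The second of the five parts, the Unity3D®-to-DRL bridge, is easiest to picture with a small sketch. The Python fragment below shows one common way such a bridge can be realized: a client that exchanges newline-delimited JSON messages with the simulator over a local TCP socket. The host, port, and message fields (cmd, action, scan, reward, done) are illustrative assumptions, not the protocol actually used in this work.

    import json
    import socket

    # Hypothetical endpoint where the Unity3D simulation is listening; the
    # address and every JSON field below are illustrative assumptions.
    HOST, PORT = "127.0.0.1", 9000

    class UnityBridge:
        """Step-level bridge: send an action, receive the next observation."""

        def __init__(self):
            self.sock = socket.create_connection((HOST, PORT))
            self.buf = b""

        def _recv(self):
            # Messages are assumed to be newline-delimited JSON objects.
            while b"\n" not in self.buf:
                self.buf += self.sock.recv(4096)
            line, self.buf = self.buf.split(b"\n", 1)
            return json.loads(line)

        def reset(self):
            # Ask the simulator to restart the episode; return the first scan.
            self.sock.sendall(json.dumps({"cmd": "reset"}).encode() + b"\n")
            return self._recv()["scan"]

        def step(self, steering, throttle):
            # One environment step: action out, (scan, reward, done) back.
            msg = {"cmd": "step", "action": [steering, throttle]}
            self.sock.sendall(json.dumps(msg).encode() + b"\n")
            reply = self._recv()
            return reply["scan"], reply["reward"], reply["done"]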

    A sim-to-real, end-to-end autonomous robot training framework is presented in this thesis. It facilitates a seamless transfer of the trained model from the simulation environment to the physical environment. The framework is evaluated on the navigation task of an autonomous car with a given powertrain architecture in a given environment. The simulated car is first trained with deep reinforcement learning (DRL) in Unity3D to obtain a pre-trained model, which is then applied to the physical car for further training. The only sensor utilized in the experiment is a 2D LiDAR.
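    The related-works chapter listed below covers Deep Deterministic Policy Gradient (DDPG), which suggests a DDPG-style learner. As a rough illustration of the kind of policy being trained, here is a minimal PyTorch sketch of a deterministic actor that maps a fixed-length 2D LiDAR scan to bounded steering and throttle commands; the 360-beam scan length, hidden sizes, and action layout are assumptions made for this example, not details taken from the thesis.

    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        """DDPG-style deterministic policy: LiDAR ranges -> [steering, throttle].

        Scan length and hidden sizes are assumed for illustration; the final
        tanh bounds both outputs to [-1, 1].
        """

        def __init__(self, scan_dim=360, action_dim=2, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(scan_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, action_dim), nn.Tanh(),
            )

        def forward(self, scan):
            return self.net(scan)

    # One forward pass on a dummy normalized scan yields a (1, 2) action tensor.
    actor = Actor()
    action = actor(torch.rand(1, 360))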

    The framework encompasses modules for constructing Unity3D simulation environments, bridging Unity3D to DRL, pre-training with DRL, deploying the policy on a physical car running ROS 2, and refining performance through real-world navigation exercises. The policy is trained for 3000 episodes in simulation and an additional 500 episodes in the real world. The experiments are conducted with consistent scenario settings, consisting of 48 runs with varying target positions and distances in the real world. The policy refined in the real world exhibits a success rate of 96% (46 of 48 runs) in reaching the goal, surpassing the model pre-trained in simulation, which succeeded in 28 of 48 runs (58%), by 38 percentage points. Additionally, it reaches the goal 1 second faster on average. These results show that the physical car can navigate smoothly with the proposed framework.
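    For the deployment step, a minimal ROS 2 (rclpy) node of the kind described could look like the sketch below: it subscribes to the LiDAR scan, runs the policy on each message, and publishes velocity commands. The topic names (scan, cmd_vel) and the mapping of steering and throttle onto a Twist message are conventional assumptions rather than details taken from this thesis.

    import rclpy
    from rclpy.node import Node
    from geometry_msgs.msg import Twist
    from sensor_msgs.msg import LaserScan

    class PolicyDriver(Node):
        """Drives the car with a trained policy: LaserScan in, Twist out."""

        def __init__(self, policy):
            super().__init__("policy_driver")
            self.policy = policy  # callable: list of ranges -> (steering, throttle)
            self.cmd_pub = self.create_publisher(Twist, "cmd_vel", 10)
            self.create_subscription(LaserScan, "scan", self.on_scan, 10)

        def on_scan(self, msg):
            steering, throttle = self.policy(list(msg.ranges))
            cmd = Twist()
            cmd.linear.x = float(throttle)   # forward speed
            cmd.angular.z = float(steering)  # turning rate
            self.cmd_pub.publish(cmd)

    def main():
        rclpy.init()
        # Placeholder policy for the example: creep forward in a straight line.
        node = PolicyDriver(lambda ranges: (0.0, 0.2))
        rclpy.spin(node)
        rclpy.shutdown()

    if __name__ == "__main__":
        main()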

    Contents:
    Chinese Abstract i
    Abstract ii
    Acknowledgements iv
    Contents v
    List of Tables vii
    List of Figures viii
    1 Introduction 1
    2 Related Works 4
    2.1 Visual navigation 4
    2.2 Deep Deterministic Policy Gradient 5
    2.3 Unity3D® 7
    2.4 ROS 2 8
    3 Method 10
    3.1 Configuring the Physical Car's Hardware Components 11
    3.2 Creating Simulation Environment in Unity3D and Bridge to DRL 12
    3.3 Training in Simulated Environments 15
    3.4 Acquiring States for Real-World Training with ROS 2 20
    3.5 Training in the Physical Environment 23
    4 Experiments 26
    4.1 Navigation to Destination in the Real World 27
    4.2 Navigation to Destination with Wall Avoidance in Simulation 33
    5 Conclusion and Future Work 35
    References 36


    Full-text availability: On campus: available immediately. Off campus: available immediately.