| Graduate Student: | 何佳諭 Ho, Chia-Yu |
|---|---|
| Thesis Title: | 基於強化學習之無人機控制在多樣環境 Drone Control in Diverse Environments Based on Reinforcement Learning |
| Advisor: | 賴槿峰 Lai, Chin-Feng |
| Degree: | Master |
| Department: | College of Engineering, Department of Engineering Science |
| Year of Publication: | 2019 |
| Graduation Academic Year: | 107 (2018-2019) |
| Language: | Chinese |
| Number of Pages: | 39 |
| Chinese Keywords: | 四軸飛行器 (quadcopter), 強化學習 (reinforcement learning), 深度學習 (deep learning) |
| Foreign Keywords: | Drone, Reinforcement Learning, Deep Learning |
In recent years, as quadcopters and computing hardware have matured, quadcopters can take the place of humans in dangerous missions or missions requiring aerial imagery. However, training pilots to operate quadcopters demands substantial human resources, so many automatic-control methods have recently been proposed to lower the cost of operating them. This thesis proposes a reinforcement-learning approach to the autonomous landing of a quadcopter. Because reinforcement learning must learn from failed attempts, training is performed in Microsoft's open-source drone simulator; training in a virtual environment reduces both the wear on the quadcopter and the time cost. Reinforcement learning comprises value-based and policy-based methods. This thesis implements value-based Q-learning and policy-based REINFORCE, and evaluates the strengths and weaknesses of the two algorithms in diverse environments.
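The abstract names the two algorithm families the thesis implements: value-based Q-learning and policy-based REINFORCE. The record itself gives no implementation details, so the following is only a minimal generic sketch of the two update rules in Python with NumPy; every concrete name in it (the state and action counts, learning rates, and the tabular and softmax parameterizations) is an illustrative assumption, not something taken from the thesis.

```python
import numpy as np

# --- Tabular Q-learning (value-based), generic illustration ---
# n_states, n_actions, alpha, gamma are placeholder hyperparameters,
# not values from the thesis.
n_states, n_actions = 16, 4
alpha, gamma = 0.1, 0.99
Q = np.zeros((n_states, n_actions))

def q_learning_update(s, a, r, s_next):
    """One-step TD update: move Q(s, a) toward the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])

# --- REINFORCE (policy-based), generic illustration ---
# Softmax policy over per-state logits; theta is the policy parameter.
theta = np.zeros((n_states, n_actions))

def softmax(logits):
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

def reinforce_update(episode, lr=0.01):
    """episode: list of (state, action, reward) for one full rollout.
    Each step's log-policy gradient is weighted by the discounted
    return G from that step onward (Monte Carlo, no bootstrapping)."""
    G = 0.0
    for s, a, r in reversed(episode):
        G = r + gamma * G
        probs = softmax(theta[s])
        grad_log_pi = -probs               # d(log pi)/d(logits), all actions
        grad_log_pi[a] += 1.0              # plus 1 for the action taken
        theta[s] += lr * G * grad_log_pi   # gradient ascent on expected return
```

One contrast the sketch makes visible: Q-learning updates after every single transition and can reuse off-policy experience, whereas REINFORCE must wait for a complete episode and follows the on-policy gradient of the expected return; which trade-off wins in which setting is what the thesis evaluates across its environments.

The Microsoft open-source simulator mentioned in the abstract is AirSim. Below is a hedged sketch of how a landing episode could be driven through AirSim's Python client; the descent policy, reward shaping, and termination threshold are hypothetical placeholders, not the thesis's design.

```python
import airsim

# Connect to a running AirSim multirotor simulation (AirSim ships a
# Python client; the simulator itself runs inside Unreal Engine).
client = airsim.MultirotorClient()
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)
client.takeoffAsync().join()

# One hypothetical landing episode. AirSim uses NED coordinates, so
# z is negative above the ground and a positive z-velocity descends.
for step in range(50):
    client.moveByVelocityAsync(0, 0, 0.5, 1).join()  # descend 0.5 m/s for 1 s
    z = client.getMultirotorState().kinematics_estimated.position.z_val
    reward = -abs(z)      # placeholder reward: closer to the ground is better
    if z > -0.2:          # placeholder threshold: near-ground ends the episode
        break

client.landAsync().join()
client.armDisarm(False)
client.enableApiControl(False)
```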