| Student | 黃麒祐 Huang, Chi-Yu |
|---|---|
| Thesis Title | 基於強化學習與適應性逆向步進控制之帶臂四旋翼控制系統 (Reinforcement Learning Approach and Adaptive Backstepping Control for Aerial Manipulator System) |
| Advisor | 劉彥辰 Liu, Yen-Chen |
| Degree | Master |
| Department | College of Engineering, Department of Mechanical Engineering |
| Year of Publication | 2019 |
| Academic Year of Graduation | 107 |
| Language | Chinese |
| Pages | 284 |
| Keywords (Chinese) | 帶臂四旋翼系統、解耦控制架構、強化學習、適應控制、逆向步進控制、參數不確定性、軌跡追蹤 |
| Keywords (English) | aerial manipulator system, decoupled control structure, adaptive control, backstepping control, trajectory tracking, reinforcement learning |
Quadrotors are inexpensive and structurally simple, and their ability to take off and land vertically and to hover gives them high agility. Combined with a robotic arm, the resulting aerial manipulator system offers both high mobility and high manipulability, which has made it one of the most active research topics in aerial load-carrying in recent years. The quadrotor's dynamic model is strongly nonlinear, and this nonlinearity becomes even more pronounced once a robotic arm is attached; at the same time, the dynamic model of the aerial manipulator is considerably more complicated than that of a quadrotor alone. For such a nonlinear system, a nonlinear controller performs better than a linear PID controller. However, a controller designed from the complete aerial-manipulator dynamic model easily runs into computational-efficiency problems, while stable quadrotor control requires control commands to be issued at 50 Hz or higher. Moreover, because the system parameters of an aerial manipulator change in the middle of an object-grasping task, the quadrotor controller must be able to cope with such variation.
This thesis adopts a decoupled control structure, applying an adaptive backstepping controller and a DDPG reinforcement-learning controller to a quadrotor equipped with a robotic arm. The adaptive backstepping controller handles the quadrotor and mainly compensates for uncertainty in the overall system parameters; when those parameters change because an object is grasped, it continues the compensation and completes the trajectory-tracking task. The robotic arm is controlled with DDPG reinforcement learning: a complete, realistic model of the quadrotor with robotic arm is built in the V-REP simulation environment, and a reward function is designed so that DDPG learns to control the arm's end effector without significantly disturbing the quadrotor's balance. In addition, this thesis implements a quadrotor carrying a lightweight two-axis robotic arm and applies the controllers to the real system, verifying the feasibility and performance of the adaptive backstepping control and the DDPG reinforcement-learning control through field experiments.
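To make the adaptive backstepping idea concrete, here is a minimal single-axis sketch assuming a simplified altitude model $m\ddot{z} = u - mg$ with unknown mass $m$ (for instance, a mass that changes when an object is grasped). It only illustrates the parameter-adaptation mechanism; the control and adaptation laws derived in the thesis for the full quadrotor dynamics are more involved.

```latex
% Single-axis adaptive backstepping sketch: altitude dynamics m\ddot{z} = u - mg,
% with the unknown mass m estimated online as \hat{m}.
\begin{align*}
  z_1 &= z - z_d, \qquad \alpha = \dot{z}_d - k_1 z_1, \qquad z_2 = \dot{z} - \alpha, \\
  u   &= \hat{m}\,\varphi, \qquad \varphi = g + \dot{\alpha} - k_2 z_2 - z_1, \\
  \dot{\hat{m}} &= -\gamma\, z_2\, \varphi .
\end{align*}
% With V = \tfrac{1}{2}(z_1^2 + z_2^2) + \tilde{m}^2/(2\gamma m) and \tilde{m} = \hat{m} - m,
% this choice gives \dot{V} = -k_1 z_1^2 - k_2 z_2^2 \le 0: the tracking errors converge and
% the mass estimate stays bounded; after a grasp changes the payload mass, the same argument
% re-applies with the new constant mass.
```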
In this thesis, we propose a decoupled control structure for an aerial manipulator that combines adaptive control with a reinforcement-learning approach. Aerial manipulators can perform multiple tasks with high performance if a precise model is considered in the design of the control algorithms. Most previous studies exploit the Euler-Lagrange method to derive the dynamic model of an aerial manipulator; however, the resulting model is extremely complicated and highly nonlinear. Designing an efficient controller that lets an aerial manipulator follow a trajectory is therefore a crucial problem in aerial manipulation. We propose a decoupled method that controls the quadrotor and the robotic arm individually, using adaptive backstepping control and a reinforcement-learning approach, respectively. The adaptive backstepping controller stabilizes the quadrotor under parameter uncertainty, while the learning approach lets the robotic arm operate while minimizing its influence on the quadrotor. V-REP simulations and experimental results show that, with the reinforcement-learning controller, the motion of the manipulator causes only minor disturbance to the quadrotor.
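For reference, the sketch below shows the core of a standard DDPG update (actor and critic networks, target networks, replay buffer, and soft target updates), which is the learning algorithm named in the abstract. It is a generic PyTorch sketch: the state/action dimensions, network sizes, and hyperparameters are placeholders, and the thesis's actual networks, reward function, and V-REP interface are not reproduced here.

```python
# Minimal DDPG update sketch (PyTorch). Dimensions and hyperparameters are
# placeholders, not the values used in the thesis.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM = 12, 2   # hypothetical: coupled quadrotor/arm state, 2-DOF arm command
GAMMA, TAU, LR = 0.99, 0.005, 1e-3

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh())  # joint commands scaled to [-1, 1]

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_tgt, critic_tgt = Actor(), Critic()
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=LR)
critic_opt = torch.optim.Adam(critic.parameters(), lr=LR)
replay = deque(maxlen=100_000)   # stores (state, action, reward, next_state, done) tuples

def update(batch_size=64):
    """One DDPG gradient step on a random minibatch from the replay buffer."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s, a, r, s2, done = (torch.tensor(x, dtype=torch.float32) for x in zip(*batch))

    # Critic: regress Q(s, a) toward the bootstrapped target from the target networks.
    with torch.no_grad():
        q_target = r.unsqueeze(-1) + GAMMA * (1 - done.unsqueeze(-1)) * critic_tgt(s2, actor_tgt(s2))
    critic_loss = F.mse_loss(critic(s, a), q_target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: ascend the critic's value of the actor's own action, Q(s, pi(s)).
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Soft-update the target networks toward the learned networks.
    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1 - TAU).add_(TAU * p.data)
```

In the decoupled scheme described above, such an agent would output arm joint commands while the adaptive backstepping controller keeps the quadrotor stable, with a reward that penalizes end-effector error together with disturbance to the quadrotor's balance, as stated in the abstract.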