Graduate student: Lai, Shao-wei (賴劭韋)
Thesis title: Design and Implementation of Reinforce Learning Based Fuzzy Gait Controller for Humanoid Robot (人形機器人之加強式模糊步態控制法之設計與實現)
Advisor: Li, Tzuu-hseng (李祖聖)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Pages: 75
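Keywords (translated from Chinese): zero moment point, gait synthesis, gait training, fuzzy control, robot, humanoid, reinforcement learning
Keywords (English): Robot, gait synthesis system, fuzzy logic controller, Reinforce Learning, Humanoid, gait learning control, Zero Moment Position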
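  • This thesis investigates the design and implementation of gait training and gait
    synthesis for a small-sized humanoid robot using reinforcement learning and fuzzy
    control. The reinforcement learning method used here targets a known parameterized
    walking gait: during training, the algorithm automatically searches for candidate
    parameters that yield a faster gait. We use the robot's walking speed as the reward
    signal for learning, but our experiments showed that walking stability affects the
    learning process. We therefore incorporate the zero moment point concept into the
    reward function in order to obtain a walking gait that is both fast and stable. In the
    experiments, the robot's walking speed increased from 30.6 mm/s to 130.6 mm/s within
    about 1.3 hours of learning, far quicker than tuning the parameters by hand. In
    addition, so that the robot's gait can be combined with more complex strategies, we
    couple the robot's strategy with a fuzzy control system: the robot's vision signal
    serves as the input, and the output signal is passed through interpolation-based gait
    synthesis to obtain the desired walking direction and motion. Finally, we complete
    target-following strategies such as tracking a ball and walking along a line, and
    apply them in two major international competitions, FIRA and Robocup.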
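
The reward-driven parameter search described in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: it is a simplified finite-difference gradient ascent on two normalized gait parameters, and the reward form (speed minus a weighted ZMP tracking error), the function names, and the toy walking model `toy_trial` are all assumptions made for illustration. On the real robot, each call to `evaluate` would be a measured walking trial.

```python
def combined_reward(speed, zmp_error, zmp_weight=1.0):
    """Assumed reward form: walking speed minus a weighted ZMP tracking error."""
    return speed - zmp_weight * zmp_error


def finite_difference_gradient(params, evaluate, eps=0.01):
    """Estimate the reward gradient by perturbing one gait parameter at a time."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((evaluate(up) - evaluate(down)) / (2.0 * eps))
    return grad


def learn_gait(params, evaluate, step_size=0.002, iterations=50):
    """Repeatedly move the gait parameters along the estimated gradient."""
    params = list(params)
    for _ in range(iterations):
        grad = finite_difference_gradient(params, evaluate)
        params = [p + step_size * g for p, g in zip(params, grad)]
    return params


def toy_trial(params):
    """Hypothetical stand-in for one walking trial (not measured robot data)."""
    a, b = params  # two normalized gait parameters
    speed = 100.0 - 100.0 * ((a - 0.7) ** 2 + (b - 0.3) ** 2)
    zmp_error = 20.0 * (a - 0.5) ** 2  # pretend stability degrades away from 0.5
    return combined_reward(speed, zmp_error)


start = [0.2, 0.8]
learned = learn_gait(start, toy_trial)
print(learned, toy_trial(learned))
```

In this toy model the speed term alone would push the first parameter toward 0.7, while the ZMP penalty pulls it back toward 0.5; the learned value settles between the two, which mirrors the trade-off between fast and stable walking that the abstract describes.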

    This thesis presents gait learning control and a fuzzy-based gait synthesis system
    for a small-sized humanoid robot, implemented on a biped robot named aiRobot-3. The
    machine learning approach we apply is policy gradient reinforcement learning (PGRL),
    which can run in real time and adjusts the policy directly without computing an
    action-value function. Given a parameterized walking motion designed for our robot, the
    PGRL algorithm automatically searches the space of possible parameters for the fastest
    possible walking motion. The reward function is based mainly on the robot's velocity,
    which is estimated from its onboard vision system. However, our experiments show that
    stability problems arise during learning. To address them, we also use the desired
    Zero Moment Position (ZMP) trajectory as a second term in the reward function. The
    results show that the robot's walking speed increased from 30.6 mm/s to 130.6 mm/s in
    about 1.3 hours of learning, which is faster than the manual parameter tuning we used
    before. In addition, to support more advanced behaviors, we apply a fuzzy logic
    controller (FLC) in the strategy system. The FLC takes information from the vision
    system as its input, and its output is integrated with the robot's gait to perform
    tracking tasks. To obtain the motion that corresponds to a given output value, we use
    Lagrange polynomial interpolation to transform existing motions into the desired motion.
    Finally, we apply these fuzzy-based gait synthesis strategies to tasks such as chasing
    a ball and tracing a line in the FIRA and Robocup competitions.
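
As a rough illustration of the Lagrange polynomial interpolation step mentioned above, the sketch below blends a few stored key motions into a new motion for an intermediate controller output. The key values, joint angles, and function names are hypothetical and chosen only for the example; the thesis's actual motion representation is not given in this record.

```python
def lagrange_weights(nodes, x):
    """Return the Lagrange basis values L_k(x) for distinct nodes."""
    weights = []
    for k, xk in enumerate(nodes):
        w = 1.0
        for j, xj in enumerate(nodes):
            if j != k:
                w *= (x - xj) / (xk - xj)
        weights.append(w)
    return weights


def synthesize_motion(key_values, key_motions, target):
    """Interpolate each joint angle across the key motions at the target value."""
    weights = lagrange_weights(key_values, target)
    n_joints = len(key_motions[0])
    return [
        sum(w * motion[j] for w, motion in zip(weights, key_motions))
        for j in range(n_joints)
    ]


# Hypothetical key motions: two joint angles (degrees) stored for
# controller outputs -1 (turn one way), 0 (straight), +1 (turn the other way).
key_values = [-1.0, 0.0, 1.0]
key_motions = [[10.0, -20.0], [0.0, 0.0], [-10.0, 20.0]]

half_turn = synthesize_motion(key_values, key_motions, 0.5)
print(half_turn)
```

At a stored key value the interpolated motion reproduces the stored motion, so a controller output can move smoothly between existing motions without a separate motion being designed by hand for every possible output.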

    Chapter 1. Introduction
      1.1 Motivation
      1.2 Thesis Organization
    Chapter 2. Mechanism and Hardware of the Humanoid Robot
      2.1 Introduction
      2.2 Design of Mechanism
      2.3 The Hardware of aiRobot-3
        2.3.1 Actuators
        2.3.2 Central Process Unit
        2.3.3 Wireless Communication System
        2.3.4 Digital Compass Module
        2.3.5 Accelerometer
        2.3.6 Force Sensor
      2.4 Summary
    Chapter 3. Concept and Design of Gait Learning
      3.1 Introduction
      3.2 The Overview of System Structure
      3.3 The Design and Generation of Motion Patterns
        3.3.1 The Concept of Gait Design
        3.3.2 The Design Procedures of Motion Patterns
      3.4 The Realization of Gait Learning Control
        3.4.1 The Definition of Gait Parameters
        3.4.2 The Concept of PGRL Algorithm
        3.4.3 Gait Learning Control with Velocity Reward Only
        3.4.4 Calculation and Generation of Desired ZMP Trajectory
        3.4.5 Gait Learning Control with ZMP Reward
      3.5 Summary
    Chapter 4. Fuzzy Based Gait Synthesis Strategy
      4.1 Introduction
      4.2 The Concept of a Fuzzy Logic Controller (FLC)
      4.3 Lagrange Polynomial Interpolation for Motion Generation
      4.4 The Application of FLC in Strategy System
      4.5 Summary
    Chapter 5. Experimental Results
      5.1 Introduction
      5.2 Experimental Results
        5.2.1 Gait Learning with PGRL Algorithm
        5.2.2 Strategies for Tracing Target
        5.2.3 Strategies for the Robot Soccer Competition
    Chapter 6. Conclusions and Future Works
      6.1 Conclusions
      6.2 Future Works
    References
    Biography


    Full text, on campus: publicly available from 2012-07-28
    Full text, off campus: publicly available from 2014-07-28