
Author: Ho, Ya-Fang (何雅芳)
Thesis Title: Design and Implementation of Learning the Stair-Climbing Capability for a Humanoid Robot
Advisor: Li, Tzuu-Hseng S. (李祖聖)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2011
Graduation Academic Year: 99 (ROC calendar; 2010-2011)
Language: English
Number of Pages: 60
Keywords (Chinese): stair climbing, humanoid robot, reinforcement learning
Keywords (English): stair-climbing, humanoid robot, reinforcement learning, PGRL, FPGL
    This thesis employs a heuristic method to design a stair-climbing gait for a small-sized humanoid robot, and uses Fuzzy Policy Gradient Learning to realize the stair-climbing learning capability of the small-sized humanoid robot named aiRobots-VI-R. The heuristically designed stair-climbing gait is modeled on human stair-climbing behavior; it requires no overly complex mathematical modeling or analysis. Instead, the gait is divided into four states comprising thirteen poses in total, each pose derived from the preceding one. This design reduces the number of parameters required for learning to twelve, and the parameter reduction speeds up learning. Fuzzy Policy Gradient Learning combines fuzzy logic with policy gradient reinforcement learning; the fuzzy logic strengthens the reliability of the policy gradient and accounts for parameter relevance, accelerating the learning process. The goal of learning is to improve the robot's stability while climbing stairs and to minimize body sway during the climb. Accordingly, an accelerometer mounted on the robot's body measures the degree of body sway, and the reciprocal of the sum of squared accelerometer errors in the X and Y directions serves as the learning reward. Experiments confirm that learning improves the stability of the robot's motion: before learning, the robot could not climb the stairs successfully, whereas after learning it not only climbs the stairs but also exhibits greatly improved overall motion stability.

    This thesis describes a heuristic stair-climbing pattern and applies a machine learning method, named Fuzzy Policy Gradient Learning (FPGL), to optimize that pattern on a small-sized humanoid robot named aiRobots-VI-R. The heuristic stair-climbing pattern needs no complex mathematical models; it consists of four states and thirteen poses designed according to human stair-climbing behavior. Because each pose is defined relative to the preceding one, the number of parameters of the stair-climbing pattern is effectively reduced to twelve, and this reduction speeds up the learning procedure. FPGL integrates the Policy Gradient Reinforcement Learning (PGRL) method with fuzzy logic in order to improve the efficiency and speed of gait learning. The objective of pattern learning is to improve motion stability during stair climbing, so the extraneous acceleration of the trunk should be reduced as much as possible. Therefore, the reciprocal of the sum of squared acceleration errors in the X-Y plane is taken as the reward of the FPGL method. The experimental results show that, after learning, aiRobots-VI-R can climb stairs successfully; moreover, the training data show that the stability of the motion increases as well.
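The reward-driven update described above can be sketched as a plain PGRL parameter-perturbation loop over the twelve gait parameters. This is a minimal illustration only, not the thesis's actual FPGL implementation (it omits the fuzzy-logic and parameter-relevance extensions); the constants, function names, and the synthetic quadratic reward are illustrative assumptions, standing in for the real accelerometer-based reward (reciprocal of the summed squared X-Y acceleration error).

```python
import random

random.seed(0)    # reproducibility of this sketch

EPS = 0.05        # perturbation size applied to each gait parameter
STEP = 0.1        # gradient-ascent step length
N_POLICIES = 15   # perturbed test policies evaluated per iteration

# Synthetic stand-in for the robot's reward: on hardware, the reward is
# 1 / (sum of squared X-Y accelerometer errors); here a quadratic bowl
# around a hypothetical "stable" parameter set plays that role.
TARGET = [0.3] * 12

def reward(params):
    sse = sum((p - t) ** 2 for p, t in zip(params, TARGET))
    return 1.0 / (sse + 1e-6)

def pgrl_step(params):
    """One vanilla PGRL iteration: perturb, score, estimate gradient, step."""
    perturbs = [[random.choice((-EPS, 0.0, EPS)) for _ in params]
                for _ in range(N_POLICIES)]
    scores = [reward([p + d for p, d in zip(params, delta)])
              for delta in perturbs]

    def avg(xs):
        return sum(xs) / len(xs) if xs else 0.0

    grad = []
    for i in range(len(params)):
        plus = avg([s for s, d in zip(scores, perturbs) if d[i] > 0])
        zero = avg([s for s, d in zip(scores, perturbs) if d[i] == 0])
        minus = avg([s for s, d in zip(scores, perturbs) if d[i] < 0])
        # If leaving parameter i unchanged scores best, do not move it.
        grad.append(0.0 if zero > max(plus, minus) else plus - minus)

    norm = sum(g * g for g in grad) ** 0.5
    if norm == 0.0:
        return params
    return [p + STEP * g / norm for p, g in zip(params, grad)]

params = [0.0] * 12          # initial (unstable) gait parameters
for _ in range(200):
    params = pgrl_step(params)
```

After the loop, `params` should score a markedly higher reward than the initial parameter set, mirroring the thesis's observation that learning improves stability.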

    Contents

    Abstract
    Acknowledgment
    Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
        1.1 Motivation
        1.2 Thesis Organization
    Chapter 2. Fuzzy Policy Gradient Learning and Gait Pattern Design
        2.1 Introduction
        2.2 Concept of FPGL
            2.2.1 Basic Policy Gradient Reinforcement Learning
            2.2.2 PGRL with Parameter Relevance
            2.2.3 Fuzzy Policy Gradient Learning
        2.3 Concept of Gait Learning
            2.3.1 Design and Parameter of Gait Pattern for aiRobots-V-R
            2.3.2 Gait Pattern of aiRobots-VI-R
        2.4 Summary
    Chapter 3. Learning Strategy for Stair Climbing
        3.1 Introduction
        3.2 Concept of Heuristic Stair Climbing Pattern
        3.3 Design and Parameters of Stair Climbing Pattern
            3.3.1 Right Leg Climbing
            3.3.2 Body Barycenter Transferring
            3.3.3 Left Leg Climbing
            3.3.4 Standing-up Straight
            3.3.5 Parameters of Stair Climbing Pattern
        3.4 Configuration of Stair Climbing Pattern Learning for aiRobots-VI-R
        3.5 Pattern Learning Page of Human-Machine Interface
        3.6 Summary
    Chapter 4. Experimental Results
        4.1 Introduction
        4.2 Experimental Settings
        4.3 Experimental Results
    Chapter 5. Conclusions and Future Works
        5.1 Conclusions
        5.2 Future Works
    References
    Biography


    Full text available on campus: 2016-08-26
    Full text available off campus: 2016-08-26