National Cheng Kung University Electronic Theses and Dissertations System

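Graduate student:	Lai, Shao-wei (賴劭韋)
Thesis title:	Design and Implementation of Reinforce Learning Based Fuzzy Gait Controller for Humanoid Robot (人形機器人之加強式模糊步態控制法之設計與實現)
Advisor:	Li, Tzuu-hseng (李祖聖)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2009
Graduation academic year:	97 (ROC calendar)
Language:	English
Number of pages:	75
Keywords (translated from Chinese):	zero moment point, gait synthesis, gait training, fuzzy control, robot, humanoid, reinforcement learning
Keywords (English):	Robot, gait synthesis system, fuzzy logic controller, Reinforce Learning, Humanoid, gait learning control, Zero Moment Position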

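This thesis investigates gait training and gait synthesis for a small-sized humanoid
robot, designed and implemented with reinforcement learning and fuzzy control. The
reinforcement learning method starts from a known parameterized walking gait and,
during training, automatically searches for parameter values that yield a faster gait.
We use the robot's walking speed as the reward signal, but our experiments showed
that walking stability affects the learning process. We therefore incorporated the
zero moment point concept into the reward function to obtain a gait that is both fast
and stable. In the experiments, the robot's walking speed increased from 30.6 mm/s to
130.6 mm/s in about 1.3 hours of learning, much faster than tuning the parameters by
hand. In addition, so that the robot's gait can support more complex strategies, we
combined the robot's strategy with a fuzzy control system: the robot's vision signal
serves as the input, and the output signal is passed through interpolation-based gait
synthesis to produce the desired walking direction and motion. Finally, we implemented
target-following strategies such as chasing a ball and walking along a line, and
applied them in the two major international competitions, FIRA and Robocup.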
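
As an illustration of the parameter search described above, the following minimal Python sketch follows the finite-difference policy gradient procedure published by Kohl and Stone for quadruped gaits. Whether the thesis uses exactly this update rule is an assumption. The function toy_reward is a hypothetical stand-in for a real walking trial: its speed and ZMP-error formulas, the two gait parameters, the weight zmp_weight, and all step sizes are invented for the example and are not taken from the thesis.

```python
import math
import random


def toy_reward(params, zmp_weight=1.0):
    """Hypothetical stand-in for one walking trial (not the thesis's model).

    Combines a speed term with a penalty for deviating from a desired ZMP.
    """
    step_length, step_height = params
    speed = 2.0 * step_length - 0.02 * step_length ** 2
    zmp_error = 0.01 * step_length ** 2 + 0.1 * (step_height - 20.0) ** 2
    return speed - zmp_weight * zmp_error


def pgrl_step(theta, evaluate, epsilon, eta, n_policies, rng):
    """One iteration of finite-difference policy gradient search."""
    n = len(theta)
    # Each trial policy shifts every parameter by -eps, 0, or +eps.
    signs = [[rng.choice((-1, 0, 1)) for _ in range(n)]
             for _ in range(n_policies)]
    scores = [evaluate([t + s * e for t, s, e in zip(theta, row, epsilon)])
              for row in signs]
    adjust = []
    for j in range(n):
        # Group the trial scores by the perturbation applied to parameter j.
        groups = {-1: [], 0: [], 1: []}
        for row, score in zip(signs, scores):
            groups[row[j]].append(score)
        if not groups[-1] or not groups[1]:
            adjust.append(0.0)
            continue
        avg_minus = sum(groups[-1]) / len(groups[-1])
        avg_plus = sum(groups[1]) / len(groups[1])
        if groups[0]:
            avg_zero = sum(groups[0]) / len(groups[0])
            # Leaving the parameter unchanged looked best: do not move it.
            if avg_zero > avg_plus and avg_zero > avg_minus:
                adjust.append(0.0)
                continue
        adjust.append(avg_plus - avg_minus)
    norm = math.sqrt(sum(a * a for a in adjust))
    if norm == 0.0:
        return list(theta)
    # Move a fixed distance eta along the normalized adjustment vector.
    return [t + eta * a / norm for t, a in zip(theta, adjust)]


def learn(theta, evaluate, iterations=60, seed=0):
    """Repeat pgrl_step from the starting parameters theta."""
    rng = random.Random(seed)
    for _ in range(iterations):
        theta = pgrl_step(theta, evaluate, epsilon=[1.0, 1.0], eta=1.0,
                          n_policies=30, rng=rng)
    return theta
```

Calling learn([10.0, 10.0], toy_reward) moves the two toy parameters toward values with a higher combined reward. On a real robot, each call to evaluate would be one timed walking trial, so the number of trial policies per iteration directly determines the learning time.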

This thesis presents the implementation of gait learning control and a fuzzy-based gait
synthesis system for a small-sized humanoid robot. The whole system is realized on a
biped robot named aiRobot-3. The machine learning approach we apply is policy gradient
reinforcement learning (PGRL), which can run in real time and adjusts the policy directly
without computing an action-value function. Given a parameterized walking motion designed
for our robot, the PGRL algorithm automatically searches the set of possible parameters
for the fastest walking motion it can find. The main term of the reward function is the
velocity of the robot, which is estimated from its on-board vision system. However, our
experiments show that stability problems arise during learning. To address them, we add
the desired Zero Moment Point (ZMP) trajectory as a second term in the reward function.
The results show that the learned gait improved from 30.6 mm/s to 130.6 mm/s in about
1.3 hours, which is faster than the manual parameter tuning we used before. In addition,
to give the robot more advanced behaviors, we apply a fuzzy logic controller (FLC) in the
strategy system. The FLC takes information from the vision system as its input, and its
output is integrated with the robot's gait to perform tracking tasks. To obtain the
motion that corresponds to the FLC output value, we use Lagrange polynomial interpolation
to transform the existing motions into the desired motion. Finally, we apply these
fuzzy-based gait synthesis strategies to tasks such as chasing a ball and tracing a line
for the FIRA and Robocup competitions.
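
The fuzzy-based gait synthesis described above can be pictured with a minimal Python sketch, given here under stated assumptions rather than as the thesis's implementation. It assumes a single vision input (the horizontal offset of the target in the image, normalized to [-1, 1]), three triangular membership functions with singleton outputs combined by a weighted average, and three pre-designed key motions placed at the interpolation nodes -1, 0, and 1. The names RULES, NODES, KEY_MOTIONS and every numeric value are hypothetical.

```python
def triangle(x, left, peak, right):
    """Triangular membership value of x for the set (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical rule base: (membership shape of the offset, turn command).
RULES = [
    ((-2.0, -1.0, 0.0), -1.0),  # target on the left  -> turn left
    ((-1.0, 0.0, 1.0), 0.0),    # target centered     -> walk straight
    ((0.0, 1.0, 2.0), 1.0),     # target on the right -> turn right
]


def fuzzy_turn(offset):
    """Map the normalized image offset to a turn command in [-1, 1]."""
    x = max(-1.0, min(1.0, offset))  # clamp so at least one rule fires
    weights = [triangle(x, *shape) for shape, _ in RULES]
    weighted = sum(w * out for w, (_, out) in zip(weights, RULES))
    return weighted / sum(weights)


# Hypothetical key motions, one per node. Each entry holds
# [left hip yaw (deg), right hip yaw (deg), step length (mm)].
NODES = [-1.0, 0.0, 1.0]
KEY_MOTIONS = [
    [15.0, -15.0, 20.0],   # turn left
    [0.0, 0.0, 40.0],      # walk straight
    [-15.0, 15.0, 20.0],   # turn right
]


def lagrange_basis(t, nodes):
    """Values of the Lagrange basis polynomials of the nodes at t."""
    basis = []
    for i, xi in enumerate(nodes):
        term = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (t - xj) / (xi - xj)
        basis.append(term)
    return basis


def synthesize_motion(turn):
    """Blend the key motions with Lagrange weights for the turn command."""
    basis = lagrange_basis(turn, NODES)
    return [sum(b * motion[k] for b, motion in zip(basis, KEY_MOTIONS))
            for k in range(len(KEY_MOTIONS[0]))]
```

With these definitions, synthesize_motion(fuzzy_turn(offset)) reproduces a key motion exactly when the command falls on a node and blends the neighboring motions smoothly in between, which is the property that makes polynomial interpolation attractive for generating intermediate gaits from a small set of hand-designed ones.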

Chapter 1. Introduction
  1 Motivation
  2 Thesis Organization
Chapter 2. Mechanism and Hardware of the Humanoid Robot
  1 Introduction
  2 Design of Mechanism
  3 The Hardware of aiRobot-3
    3.1 Actuators
    3.2 Central Process Unit
    3.3 Wireless Communication System
    3.4 Digital Compass Module
    3.5 Accelerometer
    3.6 Force Sensor
  4 Summary
Chapter 3. Concept and Design of Gait Learning
  1 Introduction
  2 The Overview of System Structure
  3 The Design and Generation of Motion Patterns
    3.1 The Concept of Gait Design
    3.2 The Design Procedures of Motion Patterns
  4 The Realization of Gait Learning Control
    4.1 The Definition of Gait Parameters
    4.2 The Concept of PGRL Algorithm
    4.3 Gait Learning Control with Velocity Reward Only
    4.4 Calculation and Generation of Desired ZMP Trajectory
    4.5 Gait Learning Control with ZMP Reward
  5 Summary
Chapter 4. Fuzzy Based Gait Synthesis Strategy
  1 Introduction
  2 The Concept of a Fuzzy Logic Controller (FLC)
  3 Lagrange Polynomial Interpolation for Motion Generation
  4 The Application of FLC in Strategy System
  5 Summary
Chapter 5. Experimental Results
  1 Introduction
  2 Experimental Results
    2.1 Gait Learning with PGRL Algorithm
    2.2 Strategies for Tracing Target
    2.3 Strategies for the Robot Soccer Competition
Chapter 6. Conclusions and Future Works
  1 Conclusions
  2 Future Works
References
Biography
                                    


On campus: publicly available from 2012-07-28
Off campus: publicly available from 2014-07-28
