
Author: Wang, Yin-Hao (王英豪)
Title: Study of Backward Q-learning, Fuzzy Reinforcement Learning, and EP-Like Particle Swarm Optimization and Their Control Applications
Advisor: Li, Tzuu-Hseng S. (李祖聖)
Degree: Doctor
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2014
Graduation Academic Year: 102
Language: English
Pages: 108
Keywords: backward Q-learning, fuzzy Q-learning, fuzzy Sarsa learning, genetic algorithm, image segmentation, particle swarm optimization, reinforcement learning

    In this dissertation, we propose four novel algorithms: backward Q-learning, fuzzy Q-learning based on backward Q-learning (FQLBQ), a hybrid of genetic algorithm and fuzzy Sarsa learning (HGAFSL), and evolutionary-programming-like particle swarm optimization (EPSO). First, we focus on how to combine Q-learning with the Sarsa algorithm and present a new method, called backward Q-learning, which can be implemented with the Sarsa algorithm or other reinforcement learning (RL) algorithms. Backward Q-learning directly tunes the Q-values, which in turn indirectly affect the action selection policy. Second, backward Q-learning is integrated with fuzzy Q-learning (FQL): the FQL is applied to tune and learn the consequent part of the fuzzy control system, and backward Q-learning is employed to accelerate the learning of FQL. Third, we propose HGAFSL to rapidly tune the consequent part of the fuzzy rules, overcoming the conventional GA's random choice of crossover points. While the fitness of each individual is evaluated, fuzzy Sarsa learning (FSL) simultaneously computes the Q-value of every gene. The Q-value serves as predictive information that helps the GA distinguish better genes from worse ones within an individual or population. Hence, the crossover operation selects multiple crossover points and multiple parents by roulette wheel selection (RWS) according to the Q-values instead of choosing them at random.
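    The idea of backward Q-learning described above can be sketched as follows. This is a minimal illustration based only on the abstract, not the dissertation's actual implementation: the agent learns on-policy with Sarsa while storing each transition, and when the episode ends it replays the stored transitions in reverse order with Q-learning (max) updates, directly tuning the Q-values. The 5-state corridor environment and all parameter values are invented for the example.

```python
import random

ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2    # illustrative values, not from the thesis
N_STATES, GOAL = 5, 4                # states 0..4, reward on reaching state 4
ACTIONS = (-1, +1)                   # move left / move right

def choose(Q, s):
    """Epsilon-greedy action selection."""
    if random.random() < EPS:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[s][a])

def episode(Q):
    s, memory = 0, []
    while s != GOAL:
        a = choose(Q, s)
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        a2 = choose(Q, s2)
        # forward pass: on-policy Sarsa update, transition stored for later
        Q[s][a] += ALPHA * (r + GAMMA * Q[s2][a2] - Q[s][a])
        memory.append((s, a, r, s2))
        s = s2
    # backward pass: replay the episode in reverse with Q-learning updates
    for (s, a, r, s2) in reversed(memory):
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(50):
    episode(Q)
```

The backward pass lets the terminal reward propagate through the whole episode in a single sweep, which is the intuition behind the claimed speed-up over plain Sarsa or Q-learning.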
    Finally, a fast color information setup based on EPSO for a manipulator control system is presented. For a manipulator to grasp color objects and place them in the correct locations, it must first identify colors in the RGB or the corresponding HSV (hue, saturation, value) color model. The HSV thresholds are commonly determined by manual tuning, but this is time-consuming, and it is difficult to find good boundaries for segmenting the color image. Therefore, we propose a semi-automatic method to learn the color information. The watershed algorithm incorporates user interaction to segment the color image and obtain a target image. The target image is then compared with the original image to build a lookup table (LUT) of color information, and the three HSV thresholds are learned by the EPSO method. The experimental results show that EPSO not only learns the thresholds for segmenting a color image rapidly but also escapes local minima.
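    A rough sketch of threshold learning in this spirit is given below. The abstract does not specify EPSO's operators, so this is a generic PSO with an EP-style step bolted on as a stand-in: each particle's position is additionally perturbed by Gaussian mutation and the offspring survives only if it scores better. The one-dimensional synthetic "hue" data, the target labels, and every constant are invented for the example; the thesis learns three HSV thresholds from real images.

```python
import random

random.seed(1)
# synthetic pixels: hue values in [0, 180); the target mask marks
# pixels with hue in [60, 120] as the object to be segmented
pixels = [random.uniform(0, 180) for _ in range(300)]
target = [60 <= h <= 120 for h in pixels]

def error(bounds):
    """Number of pixels misclassified by the candidate [lo, hi] band."""
    lo, hi = min(bounds), max(bounds)
    return sum((lo <= h <= hi) != t for h, t in zip(pixels, target))

W, C1, C2 = 0.7, 1.5, 1.5            # standard PSO coefficients (illustrative)
swarm = [[random.uniform(0, 180), random.uniform(0, 180)] for _ in range(20)]
vel = [[0.0, 0.0] for _ in swarm]
pbest = [p[:] for p in swarm]
gbest = min(pbest, key=error)[:]

for _ in range(60):
    for i, p in enumerate(swarm):
        for d in range(2):
            # standard PSO velocity and position update
            vel[i][d] = (W * vel[i][d]
                         + C1 * random.random() * (pbest[i][d] - p[d])
                         + C2 * random.random() * (gbest[d] - p[d]))
            p[d] = min(max(p[d] + vel[i][d], 0.0), 180.0)
        # EP-style step: Gaussian mutation, offspring kept only if better
        child = [min(max(x + random.gauss(0, 5.0), 0.0), 180.0) for x in p]
        if error(child) < error(p):
            swarm[i] = p = child
        if error(p) < error(pbest[i]):
            pbest[i] = p[:]
    gbest = min(pbest, key=error)[:]

lo, hi = sorted(gbest)               # learned segmentation band
```

The mutation-and-selection step gives particles a chance to hop out of a poor region even after the swarm has contracted, which mirrors the abstract's claim that EPSO can escape local minima.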

    Contents:
    Chinese Abstract I
    Abstract III
    Acknowledgements V
    Contents VI
    List of Figures X
    List of Tables XIV
    Chapter 1 Introduction 1
      1.1 Preliminary 1
      1.2 Dissertation Contributions 5
      1.3 Dissertation Organization 6
    Chapter 2 Backward Q-learning: The Combination of Sarsa Algorithm and Q-learning 7
      2.1 Introduction 7
      2.2 Background and Previous Methods for Reinforcement Learning 9
        2.2.1 Q-learning Algorithm 9
        2.2.2 Sarsa Algorithm 10
        2.2.3 Adaptive Q-learning 11
        2.2.4 SA-Q-learning 13
        2.2.5 Adaptive Learning Rate and Fuzzy Balancer 14
      2.3 Backward Q-learning 16
      2.4 Simulations 21
        2.4.1 Cliff-Walking 21
        2.4.2 Mountain Car 26
        2.4.3 Cart-Pole Balancing Control System 29
      2.5 Discussion 32
      2.6 Summary 33
    Chapter 3 Fuzzy Q-learning Based on Backward Q-learning for Truck Backer-Upper Control System 34
      3.1 Introduction 34
      3.2 Fuzzy Q-learning and Fuzzy Sarsa Learning 35
      3.3 Fuzzy Q-learning Based on Backward Q-learning 36
      3.4 Simulation Results 38
      3.5 Summary 42
    Chapter 4 An Intelligent Crossover Operation Based on a Hybrid of Genetic Algorithm and Fuzzy Sarsa Learning 43
      4.1 Introduction 43
      4.2 GAs and FSL 44
        4.2.1 GAs 44
        4.2.2 FSL 46
      4.3 A Hybrid of GA and FSL Method (HGAFSL) 47
      4.4 Simulation Results 53
        4.4.1 Boat Problem 53
        4.4.2 Truck Backer-Upper Control 60
      4.5 Summary 67
    Chapter 5 A Fast Color Information Setup Using EP-Like PSO for Manipulator Grasping Color Objects 68
      5.1 Introduction 68
      5.2 Color Information 70
      5.3 Background and Various PSO Methods 72
        5.3.1 Standard PSO 72
        5.3.2 Various PSOs 73
      5.4 EPSO Method 74
      5.5 Image Segmentation Using EPSO 77
      5.6 Summary 91
    Chapter 6 Conclusions 92
      6.1 Conclusions 92
      6.2 Recommendations for Future Work 94
    References 95

    Full text available on campus: 2019-01-28
    Available off campus: 2019-01-28