| Graduate student: | 蘇紹軒 Su, Shao-Hsuan |
|---|---|
| Thesis title: | 以DQN演算法解工廠排程問題之可行性 Using Deep Q-learning Network Algorithm to solve the feasibility of workshop scheduling problem |
| Advisor: | 王明習 Wang, Ming-Shi |
| Degree: | Master |
| Department: | 工學院 - 工程科學系碩士在職專班 College of Engineering, Department of Engineering Science (in-service master's program) |
| Year of publication: | 2020 |
| Graduating academic year: | 108 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 54 |
| Chinese keywords: | 生產排程 (production scheduling), DQN, 人工智慧 (artificial intelligence) |
| English keywords: | Workshop Scheduling, Deep Q-learning Network, Artificial Intelligence |
Today, Artificial Intelligence is developing extremely rapidly worldwide. Obstacles encountered in the past have been overcome one after another, and new technologies and applications have gradually matured and begun to extend into different fields. This study uses reinforcement learning (RL) and artificial neural networks (ANN), following the DQN algorithm, to construct an automatic production-scheduling program, conduct experiments, and compare the experimental results with manual scheduling. The environment constructed in this study differs from the real production environment: due-date factors are ignored and only a single process step is scheduled over the whole production flow, so the program's schedules do not necessarily meet product due dates, and multi-process scheduling still requires further research. Although many issues remain open, the results confirm that production scheduling carried out by a Deep Q-learning Network (DQN) can match the performance of schedules planned by senior production managers, showing that DQN can effectively solve the production scheduling problem. The results also suggest that the DQN algorithm can replace manual effort in scheduling, and that in the future an intelligent management system could incorporate automatic schedule planning, even planning and forecasting automatically from orders and process parameters, optimizing production-line efficiency and opening the possibility of a truly unmanned factory.
Today, Artificial Intelligence is developing very rapidly. Many past obstacles have been overcome, and new technologies have gradually matured and been applied to different areas. In this research, we applied the Deep Q-learning Network (DQN) algorithm, a reinforcement learning (RL) method, to implement a production-scheduling program that solves the workshop scheduling problem, and compared the experimental results with those of manual scheduling. The environment set up in this research neglects delivery-date factors, and only a single process step is considered. The comparison shows that the schedules produced by the DQN method achieve the same efficiency as those of the manual method but with less time consumption, which means that DQN can effectively solve the workshop-scheduling problem. Because of the simplified environment, the resulting schedules may not meet the product delivery date, and the multi-process scheduling problem still requires further research.
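To make the approach described in the abstract concrete, the following is a minimal sketch of a DQN agent dispatching jobs on a single machine, assuming PyTorch. It is not the thesis's actual program: the environment (`SingleProcessSchedulingEnv`), the negative-flow-time reward, the network size, and all hyperparameters are illustrative assumptions; like the thesis, it restricts itself to a single process step and ignores due dates.

```python
# Minimal DQN sketch for a toy single-machine scheduling problem.
# NOT the thesis's actual program: environment, reward (negative total
# flow time), network size, and hyperparameters are illustrative
# assumptions. Requires numpy and torch.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn


class SingleProcessSchedulingEnv:
    """State: remaining processing time per job (0 once dispatched).
    Action: index of the next job to dispatch (already-dispatched jobs
    are masked out by the agent). Reward: minus the completion time of
    the dispatched job, so maximizing return minimizes total flow time
    (shortest-processing-time order is optimal)."""

    def __init__(self, processing_times):
        self.p = np.asarray(processing_times, dtype=np.float32)

    def reset(self):
        self.remaining = self.p.copy()
        self.clock = 0.0
        return self.remaining.copy()

    def step(self, action):
        self.clock += self.remaining[action]
        self.remaining[action] = 0.0
        done = bool((self.remaining == 0).all())
        return self.remaining.copy(), -self.clock, done


def greedy_action(qnet, state):
    """Pick the highest-Q job among those not yet dispatched."""
    with torch.no_grad():
        q = qnet(torch.tensor(state))
        q[torch.tensor(state <= 0)] = -1e9       # mask finished jobs
        return int(q.argmax())


def train(env, n_jobs, episodes=300, gamma=0.99, eps=0.2, batch=32):
    qnet = nn.Sequential(nn.Linear(n_jobs, 64), nn.ReLU(),
                         nn.Linear(64, n_jobs))  # one Q-value per job
    opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
    buffer = deque(maxlen=5000)                  # experience replay
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < eps:            # epsilon-greedy exploration
                a = int(random.choice(np.flatnonzero(s > 0)))
            else:
                a = greedy_action(qnet, s)
            s2, r, done = env.step(a)
            buffer.append((s, a, r, s2, done))
            s = s2
            if len(buffer) >= batch:             # one SGD step per transition
                sb, ab, rb, s2b, db = zip(*random.sample(buffer, batch))
                sb, s2b = torch.tensor(np.array(sb)), torch.tensor(np.array(s2b))
                ab = torch.tensor(ab)
                rb = torch.tensor(rb, dtype=torch.float32)
                db = torch.tensor(db, dtype=torch.float32)
                q = qnet(sb).gather(1, ab.unsqueeze(1)).squeeze(1)
                with torch.no_grad():            # bootstrapped TD target
                    target = rb + gamma * (1 - db) * qnet(s2b).max(1).values
                loss = nn.functional.mse_loss(q, target)
                opt.zero_grad()
                loss.backward()
                opt.step()
    return qnet


if __name__ == "__main__":
    jobs = [5.0, 2.0, 8.0, 3.0]                  # hypothetical processing times
    env = SingleProcessSchedulingEnv(jobs)
    qnet = train(env, n_jobs=len(jobs))
    s, done, order = env.reset(), False, []
    while not done:                              # roll out the learned policy
        a = greedy_action(qnet, s)
        order.append(a)
        s, _, done = env.step(a)
    print("learned dispatch order:", order)      # ideally SPT: jobs 1, 3, 0, 2
```

With total flow time as the (negated) return, the agent should converge toward dispatching the shortest remaining job first, which gives a simple sanity check against a hand-computed schedule, analogous to the thesis's comparison against manual scheduling.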