| Graduate Student: | 陳政熙 (Cheng, Zheng-Xi) |
|---|---|
| Thesis Title: | 使用人工智慧進行機械手臂故障報告文字分析 (Analysis of Robotic Fault Reporting Text Using Artificial Intelligence) |
| Advisor: | 鍾俊輝 (Chung, Chun-Hui) |
| Degree: | Master |
| Department: | Department of Mechanical Engineering, College of Engineering |
| Year of Publication: | 2020 |
| Graduation Academic Year: | 108 |
| Language: | Chinese |
| Number of Pages: | 61 |
| Chinese Keywords: | 人工智慧、類神經網路、自然語言處理、詞向量、word2vec、文章分類 |
| English Keywords: | Artificial intelligence, neural network, natural language processing, word vector, word2vec, text classification |
The main goal of this study is to build a system that automatically processes and classifies fault reports in which users describe failure situations and symptoms, and that then suggests a suitable troubleshooting method, so as to improve users' ability and efficiency in resolving faults on their own in the workplace. The system consists of two parts: word vectors and a text classifier. In the first part, the articles in the corpus are segmented into words, and a model built on the distributional hypothesis is trained as a shallow neural network to obtain a vector for each word, so that the computer can better capture the differences between words. The second part is the text classifier. After the fault reports are segmented and stop words are removed, the previously trained word vectors are embedded to produce a large numerical matrix. These matrices, together with fault types labeled manually in advance, are fed into five kinds of neural networks for training: MLP, CNN, RNN, LSTM, and GRU.
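The abstract does not reproduce the underlying code, but the word-vector step can be illustrated with a short sketch. The example below assumes jieba for Chinese word segmentation and gensim for the word2vec training (both tools appear in the thesis's reference list); the corpus file name, vector dimension, and other hyperparameters are illustrative only.

```python
# Minimal sketch: segment Chinese text with jieba, then train word2vec with gensim.
# The corpus path, vector size, and other hyperparameters are illustrative only.
import jieba
from gensim.models import Word2Vec

def segment(line):
    """Cut one line of Chinese text into a list of words."""
    return [w for w in jieba.cut(line) if w.strip()]

# Hypothetical corpus file: one document (e.g. one article or fault report) per line.
with open("corpus.txt", encoding="utf-8") as f:
    sentences = [segment(line) for line in f]

# Shallow neural network training (skip-gram here) producing one vector per word.
w2v = Word2Vec(sentences, vector_size=250, window=5, min_count=5, sg=1)
w2v.save("fault_w2v.model")

# The vectors let the computer compare words numerically, e.g.
# w2v.wv.most_similar("馬達") lists words that appear in similar contexts.
```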
In the experiment, 202 records were used as the training set and 87 records as the test set. The CNN model trained fastest, taking only 5.2 seconds, while the MLP took 40.7 seconds, the RNN 7.1 seconds, the LSTM 11.8 seconds, and the GRU 11.6 seconds. In terms of accuracy, the LSTM and GRU models achieved the highest test-set accuracy, at 45.8% and 46% respectively. When the models instead return the three most probable fault types and a prediction counts as correct if the true class is among them, the accuracy of the LSTM and GRU models rises to 83.2% and 78.9%, while the MLP, CNN, and RNN models improve to 73.6%, 75.2%, and 67.4% respectively. If more training data can be collected in the future, the performance of the models should improve further.
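The top-three comparison described above is the usual top-k (here top-3) accuracy measure. The following is a minimal sketch of how it could be computed from a trained classifier's class probabilities; `model`, `x_test`, and `y_test` are assumed to already exist and are not taken from the thesis.

```python
# Minimal sketch: top-3 accuracy from a classifier's predicted class probabilities.
# model, x_test, and y_test (integer fault-type labels) are assumed to exist.
import numpy as np

probs = model.predict(x_test)               # shape: (n_samples, n_classes)
top3 = np.argsort(probs, axis=1)[:, -3:]    # indices of the 3 most probable classes

# Standard accuracy: the single most probable class must match the label.
top1_acc = np.mean(np.argmax(probs, axis=1) == y_test)

# Top-3 accuracy: the true label only needs to appear among the 3 returned classes.
top3_acc = np.mean([y in row for y, row in zip(y_test, top3)])

print(f"top-1 accuracy: {top1_acc:.1%}, top-3 accuracy: {top3_acc:.1%}")
```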
The main purpose of this research is to develop a system that automatically processes and classifies different kinds of fault report text. The system also suggests a troubleshooting method to improve working efficiency in the factory. The developed system is divided into two parts: word2vec and a text classifier. Word2vec was used as a preprocessing step, and the word vectors were obtained through shallow neural network training so that the computer can recognize the differences between words. The second part is the text classifier. The fault text is first preprocessed, and the word vectors are then embedded into a sentence matrix. Five neural network models were used for automatic classification of the fault text: Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). The text classification performance of these five networks was compared.
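The sentence matrix mentioned here can be built by mapping each segmented report to a padded sequence of word indices and arranging the matching word2vec vectors into an embedding matrix. The sketch below uses the Keras preprocessing utilities (Keras is cited by the thesis) and reuses the hypothetical `w2v` model from the earlier sketch; `segmented_reports`, the sequence length, and the vocabulary handling are illustrative assumptions.

```python
# Minimal sketch: convert segmented fault reports into padded index sequences
# and arrange the word2vec vectors into an embedding matrix with the same indices.
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 50   # assumed fixed sentence length

# segmented_reports is a hypothetical list of word lists, one per fault report,
# produced by the segmentation step; w2v is the trained word2vec model.
texts = [" ".join(words) for words in segmented_reports]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
x = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=max_len)

# Row i holds the word2vec vector of the word with index i; words missing from
# the word2vec vocabulary keep a zero vector.
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, w2v.wv.vector_size))
for word, idx in tokenizer.word_index.items():
    if word in w2v.wv:
        embedding_matrix[idx] = w2v.wv[word]
```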
In the study, 202 Chinese texts were used to train the models and 87 to test them. The training times were 5.2 s, 40.7 s, 7.1 s, 11.8 s, and 11.6 s for the CNN, MLP, RNN, LSTM, and GRU respectively, so the CNN model trained the fastest. The LSTM and GRU models achieved the highest test accuracy, at 45.8% and 46% respectively. If the system is modified to return the three most probable failure types, the accuracy of the LSTM and GRU models increases to 83.2% and 78.9%, while the accuracy of the MLP, CNN, and RNN models increases to 73.6%, 75.2%, and 67.4% respectively.
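The five networks and the timing comparison can be sketched as follows, fitting each candidate on the same split and timing the run. The layer sizes, epoch count, and batch size are assumptions for illustration, since the abstract does not state the thesis's hyperparameters; `x_train`, `y_train`, `x_test`, `y_test`, and `embedding_matrix` are assumed to come from the preprocessing sketch above.

```python
# Minimal sketch: build, time, and evaluate the five classifiers on the same
# 202/87 split. x_train, y_train, x_test, y_test, and embedding_matrix are
# assumed to come from the preprocessing sketch above; all sizes are illustrative.
import time
from tensorflow.keras import layers, models, initializers

def build(kind, num_classes):
    """Return one of the five classifier variants on top of frozen word2vec vectors."""
    core = {
        "MLP":  [layers.Flatten(), layers.Dense(64, activation="relu")],
        "CNN":  [layers.Conv1D(64, 3, activation="relu"), layers.GlobalMaxPooling1D()],
        "RNN":  [layers.SimpleRNN(64)],
        "LSTM": [layers.LSTM(64)],
        "GRU":  [layers.GRU(64)],
    }[kind]
    m = models.Sequential([
        layers.Embedding(*embedding_matrix.shape,
                         embeddings_initializer=initializers.Constant(embedding_matrix),
                         trainable=False),
        *core,
        layers.Dense(num_classes, activation="softmax"),
    ])
    m.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
    return m

for kind in ["MLP", "CNN", "RNN", "LSTM", "GRU"]:
    model = build(kind, num_classes=len(set(y_train)))
    start = time.time()
    model.fit(x_train, y_train, epochs=20, batch_size=16, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{kind}: {time.time() - start:.1f} s, test accuracy {acc:.1%}")
```

Freezing the embedding layer keeps the word2vec vectors fixed during classifier training, a common choice when, as here, only a few hundred labeled reports are available.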