| Field | Value |
|---|---|
| Graduate Student | 溫淙傑 Wen, Tsung-Chieh |
| Thesis Title | 基於循環卷積式神經網路的文件分類模型實作 (Implementation of a Text Classification Model Based on a Recurrent Convolutional Neural Network) |
| Advisor | 王明習 Wang, Ming-Shi |
| Degree | Master |
| Department | Department of Engineering Science (in-service master's program), College of Engineering |
| Year of Publication | 2017 |
| Academic Year of Graduation | 105 |
| Language | Chinese |
| Pages | 57 |
| Keywords (Chinese) | 深度學習, 文本分類, 循環神經網路, 卷積神經網路 |
| Keywords (English) | Deep Learning, Text Classification, RNN, CNN |
| Access | Views: 94, Downloads: 9 |
With the launch of all kinds of social media platforms and the widespread adoption of smartphones, the way people use the Internet has changed. Users who once merely browsed web pages, searching for and retrieving the information they needed, have become information providers themselves. Many Internet users are now willing, even eager, to share their opinions, so the web holds an enormous amount of text data. More than a decade ago people were already saying, "This is the age of the information explosion"; now that everyone is a provider of information, the volume of data on the Internet is far larger than it was then.

This user-generated content often carries opinions, evaluations, and similar signals that can be turned into valuable information for individuals and organizations to exploit. The volume of text on the Internet, however, is far too large to collect and analyze manually, so enabling machines to help analyze these texts has become one of the important topics in information retrieval in recent years.

This study implements a deep learning architecture that combines a recurrent neural network with a convolutional neural network and examines whether, when paired with a pre-trained word-vector table, it can accomplish text classification. The challenge is that the word vectors must be treated as a static lookup table that is never updated, so during training the network itself must learn to ignore noise such as the semantic discontinuities caused by missing words and still classify correctly. The experimental results show that the proposed recurrent convolutional neural network achieves classification accuracy on each dataset consistent with that reported in the literature, demonstrating the feasibility of the architecture. It also has the following advantages: (1) its accuracy on all datasets is higher than that of the recurrent neural network architecture; (2) compared with the convolutional neural network architecture, its results after convergence are more stable; and (3) compared with the recurrent neural network architecture, it converges in fewer training epochs. Its drawback is that each epoch takes too long, which makes it ill-suited to datasets with longer documents.
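The two ideas the abstract relies on can be sketched in a few lines: pre-trained word vectors used as a static lookup table (so out-of-vocabulary words collapse to a fixed "unknown" vector, which is the noise the network must tolerate), and an RCNN-style forward pass in the spirit of Lai et al. (2015), where recurrent scans build a left and right context for each word, the contexts are concatenated with the word vector, and max-pooling over time yields a fixed-size document representation. This is a minimal illustrative sketch with made-up toy sizes and random weights, not the thesis implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB = 4  # toy embedding size, chosen only for illustration

# "Pre-trained" static lookup table; it is never updated during training.
lookup = {w: rng.standard_normal(EMB) for w in ["good", "bad", "movie", "plot"]}
UNK = np.zeros(EMB)  # missing words map to a fixed vector -> the "noise"

def embed(tokens):
    """Static lookup: no gradient ever flows back into these vectors."""
    return np.stack([lookup.get(t, UNK) for t in tokens])

def rcnn_features(x, H=3):
    """One forward pass of a minimal RCNN-style encoder (random weights)."""
    T = x.shape[0]
    Wl = rng.standard_normal((H, H)) * 0.1    # left-context recurrence
    Wr = rng.standard_normal((H, H)) * 0.1    # right-context recurrence
    We = rng.standard_normal((H, EMB)) * 0.1  # embedding -> context
    left = np.zeros((T, H))
    right = np.zeros((T, H))
    for t in range(1, T):           # left context: scan left to right
        left[t] = np.tanh(Wl @ left[t - 1] + We @ x[t - 1])
    for t in range(T - 2, -1, -1):  # right context: scan right to left
        right[t] = np.tanh(Wr @ right[t + 1] + We @ x[t + 1])
    # Concatenate [left; word; right] per position, then max-pool over
    # time to get a fixed-size vector regardless of document length.
    concat = np.concatenate([left, x, right], axis=1)
    return concat.max(axis=0)  # length 2*H + EMB

feats = rcnn_features(embed(["good", "movie", "unseen_word"]))
print(feats.shape)  # (10,) with H=3 and EMB=4
```

Because the classifier head only ever sees the pooled vector, an out-of-vocabulary token contributes nothing but the fixed UNK embedding at its position; the training described in the abstract must make the rest of the network robust to exactly that gap.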