| Graduate student: | 李兆健 Lee, Chao-Chien |
|---|---|
| Thesis title: | 卷積神經網路應用於中文字手寫風格辨識 Convolution Neural Network Applied to Chinese Handwriting Style Recognition |
| Advisor: | 王明習 Wang, Ming-Shi |
| Degree: | Master |
| Department: | College of Engineering - Department of Engineering Science (on the job class) |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 |
| Language: | Chinese |
| Number of pages: | 60 |
| Keywords (Chinese): | 卷積神經網路, 深度學習, 類神經網路, 電腦視覺 |
| Keywords (English): | Convolutional Neural Network, Deep Learning, Artificial Neural Network, Computer Vision |
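In recent years computer vision has produced impressive results in practice, in part thanks to improvements in neural network algorithms. The emergence of the Convolutional Neural Network (CNN) in particular has greatly improved recognition accuracy, and LeNet [1], proposed by Yann LeCun, paired with the MNIST handwritten digit database, has become the classic example for newcomers learning neural networks.
This study applies convolutional neural networks to the recognition of Chinese handwriting style. Recognition of handwritten Chinese characters is increasingly mature and is used routinely on mobile phones; even very sloppy writing is recognized with fairly high probability. The question is whether individual characters can be abstracted one level further, into a handwriting style that indicates whether they were written by the same person.
This thesis uses convolutional neural networks to recognize handwriting styles. After handwriting samples are collected from different people, the network determines which samples were written by the same person and which by different people, which is of great help for digital signatures and for handwriting examination in criminal and civil cases. In addition, a multi-classifier voting scheme is proposed to raise the recognition rate, and it also speeds up training considerably. Experimental results show that convolutional neural networks can indeed distinguish the handwriting styles of different people, that the proposed multi-classifier voting scheme improves the recognition rate, and that the more classes there are, the greater the improvement. The study collected Chinese characters in 7 different handwriting styles and achieved a recognition rate of 91%.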
Convolutional Neural Networks (CNNs) are a special kind of multi-layer neural network. Like almost all other neural networks, they are trained with a version of the back-propagation algorithm; where they differ is in the architecture. CNNs are designed to recognize visual patterns directly from pixel images with minimal preprocessing. They can recognize patterns with extreme variability (such as handwritten characters) and are robust to distortions and simple geometric transformations. In this thesis, CNNs are used to recognize the handwriting styles of traditional Chinese characters: given an input character, the recognizer should identify who wrote it. We call this the writing style of the character. Recognition results are determined by majority decision. The recognizer consists of a number of classifiers, each of which is a CNN. The data set was created by 7 users, each of whom was asked to write a paragraph of 100 traditional Chinese characters 30 times, giving 21,000 handwritten traditional Chinese characters in total. Each character is segmented and resized to a 28×28-pixel image, and these images form the data set. Of them, 16,800 characters are used as the training set and the remaining 4,200 as the testing set. By arranging the training data in a specific, appropriate order, each classifier can be trained to recognize most of the classes in the data set. In the testing phase, all classifiers process the same input pattern and their outputs are put to a vote; the class with the most votes is assigned as the final output. Different combination structures are evaluated, and the results show that the recognition rates are better than those of a single classifier.
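The majority-decision step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the names `majority_vote` and `ensemble_predict` are hypothetical, each classifier is assumed to be a callable that returns an integer writer label (a stand-in for a trained CNN), and the tie-breaking rule (the earliest classifier wins) is an assumption because the abstract does not specify one.

```python
from collections import Counter
from typing import Any, Callable, Sequence


def majority_vote(predictions: Sequence[int]) -> int:
    """Return the class label predicted by the most classifiers.

    Assumption: ties go to the label that appears first in
    `predictions`; the abstract does not say how ties are resolved.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    # Scan in classifier order so the tie-break is deterministic.
    return next(label for label in predictions if counts[label] == best)


def ensemble_predict(classifiers: Sequence[Callable[[Any], int]], image: Any) -> int:
    """Run every classifier on the same input and vote on the result."""
    return majority_vote([clf(image) for clf in classifiers])


# Data set sizes stated in the abstract.
NUM_WRITERS, CHARS_PER_PARAGRAPH, REPETITIONS = 7, 100, 30
TOTAL_CHARS = NUM_WRITERS * CHARS_PER_PARAGRAPH * REPETITIONS
TRAIN_CHARS = 16800
TEST_CHARS = TOTAL_CHARS - TRAIN_CHARS

if __name__ == "__main__":
    # Stub classifiers standing in for trained CNNs (hypothetical).
    stubs = [lambda img: 3, lambda img: 5, lambda img: 3]
    print(ensemble_predict(stubs, image=None))
    print(TOTAL_CHARS, TRAIN_CHARS, TEST_CHARS)
```

With an odd number of classifiers and only two competing labels a tie cannot occur, but with 7 writer classes ties are possible, so any real implementation needs an explicit rule such as the one assumed here.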
[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, pp. 2278-2324, 1998.
[2] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in neural information processing systems, pp. 1097-1105, 2012.
[3] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," arXiv preprint arXiv:1409.4842, 2014.
[4] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556, 2015.
[5] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," arXiv preprint arXiv:1512.03385, 2015.
[6] G. Huang, Z. Liu, and K. Q. Weinberger, "Densely Connected Convolutional Networks," arXiv preprint arXiv:1608.06993, 2016.
[7] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," arXiv preprint arXiv:1311.2524, 2014.
[8] R. Girshick, "Fast R-CNN," arXiv preprint arXiv:1504.08083, 2015.
[9] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," arXiv preprint arXiv:1506.01497, 2016.
[10] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," arXiv preprint arXiv:1703.06870, 2017.
[11] V. Dumoulin and F. Visin, "A guide to convolution arithmetic for deep learning," arXiv preprint arXiv:1603.07285, 2016.
[12] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, et al., "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems," 2015.
[13] C. J. C. H. Watkins, "Learning from Delayed Rewards," 1989.
[14] C. J. C. H. Watkins and P. Dayan, "Q-learning," Machine Learning, 1992.
[15] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, et al., "Mastering the Game of Go with Deep Neural Networks and Tree Search," 2016.
[16] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with Deep Reinforcement Learning," arXiv preprint arXiv:1312.5602, 2013.
[17] L. A. Gatys, A. S. Ecker, and M. Bethge, "A Neural Algorithm of Artistic Style," arXiv preprint arXiv:1508.06576, 2015.
[18] TechNews. (2017). 機器人會開飛機了,DARPA Alias在模擬飛行中駕駛波音 737成功著陸 [A robot can now fly a plane: DARPA ALIAS successfully lands a Boeing 737 in a flight simulation]. [online] Available at: http://technews.tw/2017/06/09/darpa-alias-the-robotic-co-pilot/
[19] V. Nair and G. E. Hinton, "Rectified Linear Units Improve Restricted Boltzmann Machines," 2010.
[20] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," 2014.
[21] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.
[22] D. P. Kingma and J. L. Ba, "Adam: A Method for Stochastic Optimization," arXiv preprint arXiv:1412.6980, 2015.
[23] CS231n Convolutional Neural Networks for Visual Recognition. (2017). Commonly used activation functions. [online] Available at: http://cs231n.github.io/neural-networks-1 [Accessed Jun. 2017].
[24] 數位時代. (2017). AI陪你一起聊天購物會讓你買更多嗎?電商拚服務力,跟風導入人工智慧 [Will chatting and shopping with AI make you buy more? E-commerce firms compete on service and follow the trend of adopting artificial intelligence]. [online] Available at: https://www.bnext.com.tw/article/43326/ecommerce-implement-ai-chatbot
[25] WIRED. (2017). The Rise of the Artificially Intelligent Hedge Fund. [online] Available at: https://www.wired.com/2016/01/the-rise-of-the-artificially-intelligent-hedge-fund/#slide-1
[26] 泛科技. (2017). 2017 The AI Summit London 倫敦人工智慧峰會,醫療新境界 [2017 The AI Summit London: new frontiers in healthcare]. [online] Available at: https://panx.asia/archives/57828
[27] The Asimov Institute. (2017). The Neural Network Zoo. [online] Available at: http://www.asimovinstitute.org/neural-network-zoo/
[28] YouTube. (2017). TensorFlow Frontiers (Google I/O '17). [online] Available at: https://www.youtube.com/watch?v=OzAdKMPgUt4
[29] ARM Community. (2017). When Parallelism Gets Tricky: Accelerating Floyd-Steinberg on the Mali GPU. [online] Available at: https://community.arm.com/graphics/b/blog/posts/when-parallelism-gets-tricky-accelerating-floyd-steinberg-on-the-mali-gpu
[30] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, "Striving for Simplicity: The All Convolutional Net," arXiv preprint arXiv:1412.6806, 2015.
[31] X. Glorot, A. Bordes, and Y. Bengio, "Deep Sparse Rectifier Neural Networks," in AISTATS, 2011, p. 275.
[32] Wikipedia. (2017). Softmax函數 [Softmax function]. [online] Available at: https://zh.wikipedia.org/wiki/Softmax%E5%87%BD%E6%95%B0#cite_note-bishop-1
[33] Google. (2017). Google搜尋趨勢 [Google Trends]. [online] Available at: https://trends.google.com.tw/trends/explore?q=tensorflow
[34] Blog. (2017). Single-Layer Neural Networks and Gradient Descent. [online] Available at: http://sebastianraschka.com/Articles/2015_singlelayer_neurons.html
[35] CS231n Convolutional Neural Networks for Visual Recognition. (2017). Visualizing what ConvNets learn. [online] Available at: http://cs231n.github.io/understanding-cnn/
[36] Z. Cai, Q. Fan, R. S. Feris, and N. Vasconcelos, "A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection," arXiv preprint arXiv:1607.07155, 2016.
[37] GitHub. (2017). Keras. [online] Available at: https://github.com/fchollet/keras
[38] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551, 1989.
[39] Blog. (2017). Image Classification. [online] Available at: http://book.paddlepaddle.org/03.image_classification/
[40] M. D. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," CoRR, abs/1311.2901v3, 2013.
[41] M. D. Zeiler, G. W. Taylor, and R. Fergus, "Adaptive Deconvolutional Networks for Mid and High Level Feature Learning," in ICCV, 2011.
[42] Google Research Blog. (2017). How Google Translate squeezes deep learning onto a phone. [online] Available at: https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html
[43] Business Insider. (2017). The world's first artificially intelligent lawyer was just hired at a law firm. [online] Available at: http://www.businessinsider.com/the-worlds-first-artificially-intelligent-lawyer-gets-hired-2016-5
[44] MIT Technology Review. (2017). DeepMind Will Use AI to Streamline Targeted Cancer Treatment. [online] Available at: https://www.technologyreview.com/s/602277/deepmind-will-use-ai-to-streamline-targeted-cancer-treatment/
[45] Amazon. (2017). Amazon EC2 執行個體類型 [Amazon EC2 instance types]. [online] Available at: https://aws.amazon.com/tw/ec2/instance-types/
[46] F. Rosenblatt, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain," Cornell Aeronautical Laboratory, Psychological Review, vol. 65, no. 6, pp. 386-408, 1958.
[47] Wikipedia. (2017). 梯度下降法 [Gradient descent]. [online] Available at: https://zh.wikipedia.org/wiki/%E6%A2%AF%E5%BA%A6%E4%B8%8B%E9%99%8D%E6%B3%95#cite_note-1