
Author: Liu, Hou-Tim (廖灝添)
Thesis title: Language Identification of Handwritten Documents Based on Convolutional Neural Networks
Advisor: Hu, Min-Chun (胡敏君)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2016
Graduation academic year: 104
Language: English
Number of pages: 20
Keywords: Handwritten Document, Convolutional Neural Networks, Language Identification, Local Binary Pattern
  • In this thesis, we propose a convolutional neural network (CNN) approach with a
    voting mechanism for language identification of handwritten documents. To capture
    finer-grained information, we segment each document into text lines and then
    split each line into sliding windows. CNN features are extracted from each
    sliding window and classified by a softmax layer. In experiments, our method
    achieves a high recognition rate on the public ICDAR2011 dataset and also
    performs well on the more complex public Multilingual Handwritten Dataset.
    Compared with conventional feature extraction over the full page, our method
    extracts more useful feature information and achieves a clearly higher
    recognition rate.
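
    The pipeline named in the abstract (line extraction, sliding windows, softmax
    classification per window, voting) can be sketched roughly as follows. This is a
    minimal illustration, not the thesis implementation. It assumes the text lines
    are already segmented into grayscale arrays; `score_fn` is a hypothetical
    stand-in for the trained CNN; the window width and stride are assumed values;
    and majority voting over windows is one plausible reading of the voting
    mechanism, which the abstract does not specify.

```python
import numpy as np

def sliding_windows(line_img, width, stride):
    """Split one text-line image (H x W array) into fixed-width windows."""
    h, w = line_img.shape
    if w < width:
        # Assumption: zero-pad short lines on the right so one window exists.
        line_img = np.pad(line_img, ((0, 0), (0, width - w)))
        w = width
    starts = range(0, w - width + 1, stride)
    return np.stack([line_img[:, s:s + width] for s in starts])

def softmax(logits):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def majority_vote(probs):
    """Each window votes for its most probable class; the top count wins."""
    votes = probs.argmax(axis=1)
    counts = np.bincount(votes, minlength=probs.shape[1])
    return int(counts.argmax())

def identify_language(line_imgs, score_fn, width=32, stride=16):
    """Predict one language label for a document from its text-line images.

    score_fn is a hypothetical placeholder for the CNN: it maps an array of
    windows (N x H x width) to class scores (N x num_classes).
    """
    windows = np.concatenate(
        [sliding_windows(line, width, stride) for line in line_imgs]
    )
    probs = softmax(score_fn(windows))
    return majority_vote(probs)
```

    Voting over many small windows means a few misclassified windows do not decide
    the label for the whole document, which is the usual motivation for this kind
    of aggregation over a single full-page prediction.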

    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    Chapter 1. Introduction
        1.1 Motivation
        1.2 Background
        1.3 Contributions
    Chapter 2. Related Work
    Chapter 3. Convolutional Neural Networks
        3.1 Introduction of ANNs and CNNs
        3.2 CNN Architecture
        3.3 Training of CNNs
    Chapter 4. Methodology
        4.1 Overview
        4.2 Preprocessing
    Chapter 5. Experimental Results
        5.1 Dataset: ICDAR2011
        5.2 Dataset: Multilingual HW Dataset
    Chapter 6. Conclusions
        6.1 Conclusions
        6.2 Future Work
    References

    Full-text availability: on campus, public from 2021-06-30; off campus, public from 2021-06-30