Author: Tseng, Hua-Hsi (曾華璽)
Thesis title: Improvement of Taiwanese Speech Recognition with Automatic Tone Sandhi and Word Segmentation and Construction of Taiwanese Humorous Conversation Pattern Based on Homophonic Words (以變調與斷詞改善台語語音辨識並以諧音建置台語幽默對話特徵)
Advisor: Young, Chung-Ping (楊中平)
Co-advisor: Lu, Wen-Hsiang (盧文祥)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2018
Academic year of graduation: 106
Language: English
Number of pages: 32
Keywords: Kaldi, Speech Recognition, Natural Language Processing, Humor Recognition
  • Taiwan does not yet have a complete, reliable Taiwanese (Southern Min) speech recognition system. Taiwanese is difficult to write in Chinese characters, which in turn makes word segmentation and tone sandhi annotation difficult. Its tone sandhi rules are also more complicated than those of Mandarin and vary by region. We therefore use the Kaldi Speech Recognition Toolkit to build a Taiwanese speech recognition system and to improve it. First, we add Taiwanese tone sandhi rules and Taiwanese word segmentation to the text annotation stage that precedes training of the speech recognition model. For romanization we choose the Taiwanese Romanization System, the phonetic notation officially promoted by Taiwan's Ministry of Education, often referred to as Tâi-lô. Next, we combine the humor techniques of Taiwanese quyi (talk-and-sing performance art) with natural language processing. Working from quyi videos on YouTube and following the humor strategies of pun ambiguity and homophony, we semi-automatically build a humorous homophonic pattern database that pairs Taiwanese and Mandarin words by homophonic correspondence.
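The tone sandhi step mentioned in the abstract can be illustrated with a minimal sketch. This record does not reproduce the thesis's actual rule set, so the code below is not the author's method. It assumes numeric-tone Tâi-lô with hyphen-separated syllables (for example `tai5-gi2`), uses the commonly taught general sandhi table, and treats each word as its own tone group, which is a simplification. The function names `sandhi_syllable` and `sandhi_word` and the `tone5_to` parameter are hypothetical.

```python
import re

# Commonly taught sandhi targets for non-checked tones (assumption: general
# rule table, not the rule set used in the thesis). Tone 5 is handled
# separately because it differs by region.
SANDHI = {"1": "7", "2": "1", "3": "2", "7": "3"}

# Assumed input format: lowercase Tâi-lô letters followed by one tone digit.
SYLLABLE = re.compile(r"^([a-z]+)([1-8])$")


def sandhi_syllable(syl, tone5_to="7"):
    """Return the sandhi form of one numeric-tone syllable.

    tone5_to="7" is the southern variant; northern varieties use "3".
    """
    m = SYLLABLE.match(syl)
    if m is None:
        return syl  # leave anything unexpected untouched
    base, tone = m.groups()
    if tone in ("4", "8"):
        # Checked tones: the result depends on the final stop.
        if base.endswith("h"):
            new = "2" if tone == "4" else "3"
        else:  # finals -p, -t, -k
            new = "8" if tone == "4" else "4"
    elif tone == "5":
        new = tone5_to
    else:
        new = SANDHI.get(tone, tone)
    return base + new


def sandhi_word(word, tone5_to="7"):
    """Apply sandhi to every syllable except the last one in the word."""
    syls = word.split("-")
    changed = [sandhi_syllable(s, tone5_to) for s in syls[:-1]]
    return "-".join(changed + syls[-1:])
```

For example, `sandhi_word("tai5-uan5")` changes only the first syllable and keeps the final one in its citation tone. Passing `tone5_to="3"` switches to the northern treatment of tone 5, which reflects the regional variation the abstract mentions.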

    Abstract
    摘要 (Abstract in Chinese)
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Background
        1.2 Motivation
        1.3 Contributions
        1.4 Organization of this Dissertation
    Chapter 2 Related Work
        2.1 Taiwanese Speech Recognition
        2.2 Humor Recognition
    Chapter 3 Methodology
        3.1 Taiwanese speech recognition
            3.1.1 Kaldi Speech Recognition Toolkit
            3.1.2 Preprocessing
            3.1.3 Taiwanese Word Segmentation
            3.1.4 Tone Sandhi Processing
            3.1.5 Acoustic Speech Recognition
        3.2 Humor Pattern Management
            3.2.1 Manual Labeling
            3.2.2 Word Segmentation
            3.2.3 Taiwanese Humor Word Pairing
            3.2.4 Humor Pattern Extraction
    Chapter 4 Experiment Results
        4.1 Taiwanese speech recognition evaluation
            4.1.1 Preprocessing step evaluation
            4.1.2 Taiwanese Word Segmentation evaluation
            4.1.3 Tone Sandhi evaluation
    Chapter 5 Conclusion and Future work
    Reference

    Full-text availability: on campus from 2019-09-01; off campus from 2019-09-01