| Graduate student: | 劉昭宏 Liu, Chao-Hong | 
|---|---|
| Thesis title: | 非母語書寫文句與語音辨識輸出錯誤修正之研究 A Study on Error Correction for Non-native Written Sentences and Speech Recognition Outputs | 
| Advisor: | 吳宗憲 Wu, Chung-Hsien | 
| Degree: | Doctor | 
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering | 
| Year of publication: | 2012 | 
| Academic year of graduation: | 100 | 
| Language: | English | 
| Number of pages: | 76 | 
| Chinese keywords: | 錯誤修正 、非母語書寫文句 、語音辨識輸出 | 
| English keywords: | Error Correction, Non-native Written Sentences, Speech Recognition Outputs | 
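Various kinds of errors frequently arise in natural language and speech processing. Among them, errors made by second language learners when constructing sentences and errors in automatic speech recognizer outputs are the two most common types. This dissertation reviews detection and correction techniques for both error types and proposes an integrated framework for correcting errors in non-native written sentences and speech recognition outputs.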
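Errors in non-native written sentences arise because second language learners are influenced by their native language, which causes variation in the sentences they write. Language transfer leads to four types of errors in learners' sentences: incorrect word order, wrong lexical choice, redundant words, and omission of required words. Speech recognition output errors occur when an automatic speech recognizer, affected by varying recognition conditions, fails to convert speech into the correct text. These errors fall into three types: insertion, deletion, and substitution.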
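The three ASR error types can be made concrete by aligning a recognizer output against its reference transcript. The following is a minimal sketch using a standard edit-distance alignment; the function name `align_errors` and the example sentences are illustrative assumptions, not taken from the dissertation.

```python
def align_errors(reference, hypothesis):
    """Label ASR errors by aligning the hypothesis to the reference."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = minimum edits turning reference[:i] into hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + diff,  # match or substitution
                             cost[i - 1][j] + 1,          # deletion
                             cost[i][j - 1] + 1)          # insertion
    # Trace back from the end to recover the error labels.
    errors, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diff = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + diff:
                if diff:
                    errors.append(("substitution", reference[i - 1], hypothesis[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors.append(("deletion", reference[i - 1], None))
            i -= 1
        else:
            errors.append(("insertion", None, hypothesis[j - 1]))
            j -= 1
    return errors[::-1]

# Hypothetical example: one substituted word and one inserted word.
ref = "we recognize speech today".split()
hyp = "we wreck speech today now".split()
print(align_errors(ref, hyp))
```

Each returned tuple holds the error type, the reference word (or `None` for an insertion), and the recognized word (or `None` for a deletion).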
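For errors in non-native written sentences, this dissertation proposes a language modeling method built on relative position information. For each of the four error types (incorrect word order, wrong lexical choice, redundant words, and omission of required words), a corresponding method for finding correction candidate words is proposed to generate possible corrected sentences. Finally, the scores of an N-gram language model and of the proposed relative position language model are combined, and a dynamic programming algorithm yields the best corrected sentence.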
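For speech recognition output errors, this dissertation proposes using prosodic words as the unit of correction. Possible correction pairs are collected from a parallel corpus of speech recognition outputs and their correct sentences, and these pairs serve as the source of candidate correction words. Substitution errors are additionally corrected using a weighted kernel feature matrix over syllable clusters. Three scores are then proposed to evaluate candidate corrected sentences: the substitution score, and the concatenation score and fitness score, both defined on contextual information. Finally, the three scores are combined, and a dynamic programming algorithm yields the best corrected sentence for the speech recognition output.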
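This study discusses a range of techniques for correcting errors in non-native written sentences and speech recognition outputs. The main approaches adopted are language modeling, prosodic information, and contextual information. Experimental results show that the proposed model-based correction techniques select the best correction from the candidate sentences more effectively than previously proposed methods.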
Sentence correction has been an important emerging issue in computer-assisted language learning and automatic speech recognition post-editing. However, existing approaches such as correction grammars and templates or statistical machine translation are still not robust enough to handle the common errors in sentences produced by second language learners and in speech recognition outputs. In this dissertation, techniques based on language models, prosodic information, and contextual information are proposed to address the error correction problem for these two kinds of erroneous text in natural language processing.
For non-native sentence correction, we present an approach that uses the proposed language modeling method based on relative positional information, which is suited to the errors made by learners of Chinese as a Second Language. The four error types considered for correction in this dissertation are Lexical Choice, Redundancy, Omission, and Word Order. Methods for generating correction candidates for these four error types are proposed. Dynamic programming is then applied to yield the best corrected sentence from the generated candidates.
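As a rough illustration of scoring word-order candidates with relative position statistics, the sketch below counts how often one word appears a given number of positions after another and picks the best-scoring reordering. The abstract does not define the model, so the counting scheme, the add-one smoothing, the window `MAX_DIST`, the romanized toy corpus, and the exhaustive selection (in place of the dissertation's dynamic programming) are all assumptions.

```python
import math
from collections import Counter

MAX_DIST = 3  # assumed window: word pairs up to three positions apart

def train(corpus):
    """Count how often word b appears d positions after word a."""
    pair_counts, word_counts = Counter(), Counter()
    for sent in corpus:
        for i, a in enumerate(sent):
            word_counts[a] += 1
            for d in range(1, MAX_DIST + 1):
                if i + d < len(sent):
                    pair_counts[(a, sent[i + d], d)] += 1
    return pair_counts, word_counts

def score(sent, pair_counts, word_counts):
    """Add-one smoothed log score over all word pairs within MAX_DIST."""
    vocab = len(word_counts)
    total = 0.0
    for i, a in enumerate(sent):
        for d in range(1, MAX_DIST + 1):
            if i + d < len(sent):
                num = pair_counts[(a, sent[i + d], d)] + 1
                total += math.log(num / (word_counts[a] + vocab))
    return total

def word_order_candidates(sent):
    """All sentences obtained by moving one word to another position."""
    cands = {tuple(sent)}
    for i, w in enumerate(sent):
        rest = sent[:i] + sent[i + 1:]
        for j in range(len(rest) + 1):
            cands.add(tuple(rest[:j] + [w] + rest[j:]))
    return cands

# Toy corpus of correct sentences (romanized Mandarin, hypothetical data).
corpus = [s.split() for s in [
    "wo mingtian qu xuexiao",     # I tomorrow go school
    "ta mingtian qu gongsi",      # he tomorrow go office
    "wo jintian qu xuexiao",      # I today go school
    "women mingtian qu xuexiao",  # we tomorrow go school
]]
model = train(corpus)

# Learner sentence with English-like word order: "I go school tomorrow".
learner = "wo qu xuexiao mingtian".split()
best = max(word_order_candidates(learner), key=lambda s: score(s, *model))
print(" ".join(best))  # prints: wo mingtian qu xuexiao
```

Only the Word Order error type is covered here; candidates for Lexical Choice, Redundancy, and Omission would need their own generators.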
For speech recognition outputs, a prosodic word-based correction candidate generation method is proposed. The prosodic words and their corresponding misrecognized word fragments are obtained from a speech database to construct a misrecognized word fragment table for the extracted prosodic words. For each word fragment in a recognized word sequence, the prosodic words that are likely to have been misrecognized as that fragment are retrieved from the table for prosodic word candidate expansion. Prosodic word-based contextual information, combining a substitution score, a concatenation score, and a fitness score, is then used with dynamic programming to find the best word fragment sequence over the whole sentence as the corrected output.
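The search over expanded candidates can be sketched as a Viterbi-style dynamic program. The abstract does not give the score definitions, so the sketch below treats the substitution and fitness scores as per-candidate log-scores and the concatenation score as a log-score on adjacent candidates; the function `correct`, the `UNSEEN` penalty, and all numbers are hypothetical.

```python
UNSEEN = -5.0  # hypothetical log-penalty for an adjacent pair with no score

def correct(fragments, table, concat, fitness):
    """Pick the best candidate per fragment with Viterbi-style DP.

    table[f]  : dict mapping candidate -> substitution log-score
    concat    : dict mapping (previous, current) -> concatenation log-score
    fitness   : dict mapping candidate -> fitness log-score
    """
    best = {}  # candidate -> (score of best path ending there, the path)
    for i, frag in enumerate(fragments):
        # Keep the recognized fragment itself when the table has no entry.
        options = table.get(frag, {frag: 0.0})
        new_best = {}
        for cand, sub in options.items():
            local = sub + fitness.get(cand, 0.0)
            if i == 0:
                new_best[cand] = (local, [cand])
                continue
            # Choose the predecessor that maximizes the accumulated score.
            prev, (score, path) = max(
                best.items(),
                key=lambda kv: kv[1][0] + concat.get((kv[0], cand), UNSEEN),
            )
            link = concat.get((prev, cand), UNSEEN)
            new_best[cand] = (score + link + local, path + [cand])
        best = new_best
    return max(best.values(), key=lambda sp: sp[0])[1]

# Hypothetical table: "whether" may be a misrecognition of "weather".
table = {"whether": {"whether": -0.5, "weather": -1.0}}
concat = {
    ("the", "weather"): -0.5, ("weather", "is"): -0.5,
    ("the", "whether"): -4.0, ("whether", "is"): -3.0,
    ("is", "nice"): -0.5,
}
result = correct("the whether is nice".split(), table, concat, fitness={})
print(" ".join(result))  # prints: the weather is nice
```

Keeping only the best path per candidate at each position makes the search linear in sentence length rather than exponential in the number of candidate combinations.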
Specifically for the substitution errors in ASR outputs, the distances between ASR outputs and the potentially correct alternatives are estimated based on a weighted context-dependent syllable cluster-based kernel feature matrix, followed by multidimensional scaling (MDS)-based distance rescaling. These distances are then used to construct an alternative syllable lattice, and dynamic programming is used to obtain the most likely corrections of the substitution errors with respect to the original ASR results.
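The abstract does not say which MDS variant is used. As one common possibility, classical (Torgerson) MDS rescales distances as follows, where the symbols $D$, $J$, $B$, $X$, and the target dimension $m$ are notation introduced here for illustration. Given an $n \times n$ matrix $D$ of pairwise syllable dissimilarities with element-wise squares $D^{(2)}$:

```latex
\begin{aligned}
J &= I_n - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^{\top}
  && \text{(centering matrix)} \\
B &= -\tfrac{1}{2}\, J\, D^{(2)} J
  && \text{(double-centered squared dissimilarities)} \\
B &= V \Lambda V^{\top}, \qquad
X = V_m \Lambda_m^{1/2}
  && \text{(keep the $m$ largest positive eigenvalues)} \\
\hat{d}_{ij} &= \lVert x_i - x_j \rVert_2
  && \text{(rescaled distance between syllables $i$ and $j$)}
\end{aligned}
```

Here $x_i$ is the $i$-th row of $X$. Under this reading, the rescaled distances $\hat{d}_{ij}$ would be the quantities used to weight arcs in the alternative syllable lattice.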
Experimental results show that the proposed approaches significantly improve error correction performance over a state-of-the-art phrase-based statistical machine translation method for non-native sentences and over a correction-pairs method for ASR outputs.