
Graduate Student: 湯士民 (Tang, Shih-Min)
Thesis Title: 應用錯誤型態分析於英語發音輔助學習 (Error Pattern Analysis for Computer Assisted English Pronunciation Learning)
Advisor: 吳宗憲 (Wu, Chung-Hsien)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2005
Graduation Academic Year: 93 (2004-2005)
Thesis Language: Chinese
Number of Pages: 49
Keywords (Chinese): 語音辨識 (speech recognition), 電腦輔助語言學習 (computer assisted language learning)
Keywords (English): Computer assisted language learning, Speech recognition
Language teaching today is dominated by communicative language teaching, which is grounded in interactionist theories. Under this approach, however, correcting each student's individual problems takes a great deal of time, which makes truly two-way interactive instruction difficult in practice. Computer Assisted Language Learning (CALL) systems are a feasible way to address this problem. A Computer Assisted Pronunciation Training (CAPT) system based on Automatic Speech Recognition (ASR) not only provides a pressure-free environment in which students can practice repeatedly, but can also give feedback and correction targeted at each student's individual pronunciation problems. This thesis applies speech recognition, error pattern analysis, and 3D lip animation techniques to build a pronunciation teaching and correction system tailored to Taiwanese learners of English.
      
The main techniques of this thesis are: (1) Speech recognition is used to convert the user's input speech into a phoneme sequence for pronunciation error analysis. (2) A pronunciation network is built around the error patterns commonly produced by Taiwanese students; it is used to detect where a pronunciation error occurs and which error pattern it belongs to, and then to provide appropriate correction for the erroneous pronunciation. (3) Test sentences are selected dynamically according to the entropy of the training sentences and the user's personal pronunciation error patterns. (4) Image processing and 3D animation synthesis are combined to build a 3D virtual-character feedback system that shows the user the correct articulation, with particular attention to lip shape and tongue position during pronunciation. The system developed in this thesis can be applied in practice to English pronunciation learning and correction for native Mandarin speakers.
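To make step (2) concrete, the following is a minimal sketch, not the thesis implementation, of how a recognized phoneme sequence could be aligned with the canonical pronunciation and checked against a table of learner error patterns. The phoneme symbols, the ERROR_PATTERNS table, and all function names are illustrative assumptions.

```python
# Sketch only: align recognized phonemes against the canonical pronunciation
# and flag substitutions that match a small, hypothetical table of error
# patterns often reported for Taiwanese learners (e.g. /th/ -> /s/).
from typing import List, Tuple

# Hypothetical error-pattern table: canonical phoneme -> likely substitutions.
ERROR_PATTERNS = {
    "th": {"s", "t"},   # e.g. "think" produced with /s/ or /t/
    "dh": {"z", "d"},   # e.g. "this" produced with /z/ or /d/
    "v":  {"b", "w"},   # /v/-/b/ or /v/-/w/ confusion
    "l":  {"r"},        # /l/-/r/ confusion
}

def align(canonical: List[str], recognized: List[str]) -> List[Tuple[str, str]]:
    """Levenshtein alignment; returns (canonical, recognized) pairs, '-' = gap."""
    n, m = len(canonical), len(recognized)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if canonical[i - 1] == recognized[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # match / substitution
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (
                0 if canonical[i - 1] == recognized[j - 1] else 1):
            pairs.append((canonical[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            pairs.append((canonical[i - 1], "-"))    # deletion
            i -= 1
        else:
            pairs.append(("-", recognized[j - 1]))   # insertion
            j -= 1
    return list(reversed(pairs))

def detect_error_patterns(canonical: List[str], recognized: List[str]):
    """Return (position, canonical, recognized) for known error substitutions."""
    reports = []
    for pos, (ref, hyp) in enumerate(align(canonical, recognized)):
        if ref != hyp and hyp in ERROR_PATTERNS.get(ref, set()):
            reports.append((pos, ref, hyp))
    return reports

if __name__ == "__main__":
    # "think" /th ih ng k/ recognized as /s ih ng k/
    print(detect_error_patterns(["th", "ih", "ng", "k"], ["s", "ih", "ng", "k"]))
    # -> [(0, 'th', 's')]
```

The alignment step here is plain edit distance; the thesis instead builds the candidate substitutions directly into an extended pronunciation network used during recognition, which the table above only approximates.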

Communicative language teaching, grounded in interactionist theories, is the dominant method of second language instruction. Using this method, the trainer has to spend a great deal of time checking the students' problems one by one, so it is difficult for the trainer to interact with every student during training. To address this problem, a Computer Assisted Language Learning (CALL) system is proposed. By applying Automatic Speech Recognition (ASR) technology, a Computer Assisted Pronunciation Training (CAPT) system can provide a low-pressure practice environment for students. Furthermore, a CAPT system can report each student's pronunciation problems individually and help students learn efficiently and easily. This thesis constructs an English pronunciation training and corrective assistance system for native Mandarin Chinese speakers by applying ASR, error pattern analysis, and computer animation technology.
     
The main technologies in this thesis include: (1) Using ASR technology, the system converts the input speech signal into a phoneme sequence for pronunciation error detection. (2) Using a pronunciation network built from Taiwanese students' common pronunciation errors, the types of pronunciation errors are detected and appropriate corrective courses are selected. (3) Using sentence entropy and the learner's personalized pronunciation error probabilities, a dynamic test-sentence selection method is proposed. (4) A computer animation module with a 3D virtual character is built for system feedback; by capturing the lip and tongue motion of speakers, the 3D virtual character presents a realistic speaking animation. In the future, this system can serve not only as an assistive tool for English pronunciation training, but also as an easy-to-use tool for pronunciation correction.
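As an illustration of the dynamic sentence selection in step (3), the sketch below ranks candidate practice sentences by the entropy of their phoneme distribution plus how strongly they exercise the phonemes the learner currently mispronounces. The scoring formula, the weight, and the sample data are assumptions for demonstration, not the scoring function used in the thesis.

```python
# Sketch only: combine phoneme-distribution entropy with a learner-specific
# error profile to pick the next practice sentence.
import math
from collections import Counter
from typing import Dict, List

def phoneme_entropy(phonemes: List[str]) -> float:
    """Shannon entropy (bits) of the phoneme distribution in one sentence."""
    counts = Counter(phonemes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def personalized_score(phonemes: List[str],
                       error_prob: Dict[str, float],
                       weight: float = 1.0) -> float:
    """Entropy plus the average error probability of the sentence's phonemes."""
    coverage = sum(error_prob.get(p, 0.0) for p in phonemes) / len(phonemes)
    return phoneme_entropy(phonemes) + weight * coverage

def pick_sentence(candidates: Dict[str, List[str]],
                  error_prob: Dict[str, float]) -> str:
    """Return the candidate sentence with the highest personalized score."""
    return max(candidates,
               key=lambda s: personalized_score(candidates[s], error_prob))

if __name__ == "__main__":
    # Hypothetical learner profile: struggles with /th/ and /v/.
    learner = {"th": 0.6, "v": 0.4}
    candidates = {
        "this is a vivid thought": ["dh", "ih", "s", "ih", "z", "ah",
                                    "v", "ih", "v", "ih", "d", "th", "ao", "t"],
        "see the cat":             ["s", "iy", "dh", "ah", "k", "ae", "t"],
    }
    print(pick_sentence(candidates, learner))  # prefers the /th/- and /v/-rich sentence
```

In this sketch the error probabilities would be re-estimated after each practice round, so the selected sentences gradually concentrate on the phonemes the learner still gets wrong, which mirrors the adaptive intent described above.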

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction  1
  1.1  Background and Motivation  1
  1.2  Research Objectives  2
  1.3  Literature Review  3
  1.4  Overview of the Research Method  6
  1.5  Chapter Outline  7
Chapter 2  System Architecture  9
Chapter 3  Pronunciation Content Verification  11
  3.1  Acoustic Models  11
  3.2  Speech Content Verification  14
Chapter 4  Pronunciation Error Pattern Detection  16
  4.1  Common Pronunciation Error Patterns of Taiwanese Learners  16
  4.2  Detection of Pronunciation Error Patterns  18
Chapter 5  Selection of Optimal Training Data  20
  5.1  Scoring and Selection of Training Sentences  21
  5.2  Estimation of Error Pattern Probabilities  23
Chapter 6  Intonation Error Detection  26
  6.1  Pitch Feature Extraction  26
  6.2  Intonation Error Detection  27
Chapter 7  Visual Feedback for the User  30
  7.1  3D Lip Animation  31
  7.2  3D Tongue Position Animation  33
Chapter 8  Experimental Results and Discussion  37
  8.1  English Speech Segmentation Experiments  37
  8.2  Speech Content Verification Experiments  38
  8.3  Pronunciation Error Detection Experiments  39
  8.4  Evaluation of Optimal Training Sentence Selection  40
Chapter 9  Conclusions and Future Work  43
References  45

