| Author: | 湯士民 Tang, Shih-Min |
|---|---|
| Title: | 應用錯誤型態分析於英語發音輔助學習 Error Pattern Analysis for Computer Assisted English Pronunciation Learning |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of Publication: | 2005 |
| Graduation Academic Year: | 93 (ROC calendar, i.e. 2004-2005) |
| Language: | Chinese |
| Pages: | 49 |
| Keywords (Chinese): | speech recognition, computer-assisted language learning |
| Keywords (English): | Computer assisted language learning, Speech recognition |
| Access Count: | Views: 54, Downloads: 2 |
Language teaching today is dominated by communicative language teaching, which is grounded in interactionist theories. Because this approach requires the teacher to address each student's problems individually, it demands a great deal of time, and two-way interactive instruction is difficult to carry out in practice. A Computer Assisted Language Learning (CALL) system is a feasible solution to this problem. A Computer Assisted Pronunciation Training (CAPT) system built on Automatic Speech Recognition (ASR) technology not only provides a pressure-free environment in which students can practice repeatedly, but can also give feedback and correction targeted at each student's individual pronunciation problems. This thesis applies speech recognition, error pattern analysis, and 3D lip animation to build a pronunciation teaching and correction system suited to Taiwanese learners.
The main techniques of this thesis include: (1) using speech recognition to convert the user's input speech signal into a phoneme sequence for pronunciation error analysis; (2) building a pronunciation network from the error types common among Taiwanese students, detecting the positions and types of pronunciation errors, and providing appropriate correction for the erroneous pronunciations; (3) dynamically selecting test sentences according to the entropy of the training sentences and the user's personal error types; (4) combining image processing and 3D animation synthesis to build a 3D virtual-character feedback system that shows the user the correct articulatory movements, particularly lip shape and tongue position. The system developed in this thesis can serve practical applications in English pronunciation learning and correction for native Mandarin speakers.
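To illustrate how error positions and types in step (2) might be recovered once the learner's speech has been decoded into a phoneme sequence, the sketch below aligns the recognized sequence against the canonical one with a standard edit-distance dynamic program. This is a minimal stand-in, not the thesis's actual method (which decodes over an error-augmented pronunciation network); the phoneme labels and tuple format are illustrative assumptions.

```python
def align_phonemes(canonical, recognized):
    """Align the canonical phoneme sequence with the recognizer output
    via edit distance; return (position, error_type, expected, got)
    tuples for every mismatch. A simplified stand-in for decoding over
    an error-augmented pronunciation network."""
    n, m = len(canonical), len(recognized)
    # dp[i][j] = min edits to align canonical[:i] with recognized[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if canonical[i - 1] == recognized[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match/substitution
                           dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1)         # insertion
    # Backtrack to recover the error positions and types
    errors, i, j = [], n, m
    while i > 0 or j > 0:
        cost = 1
        if i > 0 and j > 0 and canonical[i - 1] == recognized[j - 1]:
            cost = 0
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + cost:
            if cost:
                errors.append((i - 1, "substitution",
                               canonical[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            errors.append((i - 1, "deletion", canonical[i - 1], None))
            i -= 1
        else:
            errors.append((i, "insertion", None, recognized[j - 1]))
            j -= 1
    return list(reversed(errors))
```

For example, a learner who says "sing" when prompted with "think" yields a substitution (th → s) at position 0 and a deletion of the final k, the kind of error pattern the thesis maps to corrective feedback.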
Language teaching today is dominated by communicative language teaching, which is based on interactionist theories. With this method, the trainer has to spend a great deal of time checking students' problems one by one, so it is difficult for the trainer to interact with every student during training. To solve this problem, a Computer Assisted Language Learning (CALL) system is a feasible solution. By applying Automatic Speech Recognition (ASR) technology, a Computer Assisted Pronunciation Training (CAPT) system can provide a stress-free practice environment for students. Furthermore, a CAPT system can report each student's pronunciation problems individually, helping students learn efficiently and easily. This thesis constructs an English pronunciation training and corrective assistance system for native Mandarin Chinese speakers by applying ASR, error pattern analysis, and computer animation technology.
The main technologies in this thesis include: (1) using ASR, the system transforms the input speech signal into a phoneme sequence for pronunciation error detection; (2) using a pronunciation network built from Taiwanese students' common pronunciation errors, the positions and types of pronunciation errors are detected and appropriate corrective lessons are selected; (3) using sentence entropy together with the user's personalized error profile, test sentences are selected dynamically; (4) a computer animation module with a 3D virtual character is built for system feedback: by capturing the lip and tongue motion of speakers, the character presents realistic speaking animation. In the future, this system can serve not only as an assistance tool for English pronunciation training but also as a convenient tool for pronunciation correction.
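The dynamic sentence selection in step (3) can be sketched as scoring each candidate sentence by the Shannon entropy of its phoneme distribution plus the learner's per-phoneme error rates. This is a minimal illustration under assumed definitions: the thesis does not specify how the two criteria are combined, so the weighting `alpha` and the data layout here are hypothetical.

```python
import math
from collections import Counter

def sentence_score(phonemes, error_rates, alpha=0.5):
    """Score a candidate sentence: Shannon entropy of its phoneme
    distribution, blended with the learner's average error rate over
    those phonemes. `alpha` is a hypothetical mixing weight."""
    counts = Counter(phonemes)
    total = len(phonemes)
    # Shannon entropy of the sentence's phoneme distribution (bits)
    entropy = -sum((c / total) * math.log2(c / total)
                   for c in counts.values())
    # Mean error rate over the phonemes this learner tends to miss
    personal = sum(error_rates.get(p, 0.0) for p in phonemes) / total
    return alpha * entropy + (1 - alpha) * personal

def pick_sentence(candidates, error_rates):
    """Return the candidate sentence with the highest score."""
    return max(candidates,
               key=lambda s: sentence_score(s["phonemes"], error_rates))
```

A learner who often mispronounces /th/ would thus be steered toward phonetically varied sentences rich in that phoneme, which matches the thesis's goal of adapting the test set to the individual's error types.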