| Author: | 蔡沛任 Tsai, Pei-Jen |
|---|---|
| Thesis title: | 應用語音屬性分析於構音障礙者之發音錯誤與修正回饋 Error Correction and Feedback Using Phonetic Attribute Analysis for Articulation Disorders |
| Advisor: | 吳宗憲 Wu, Chung-Hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2007 |
| Academic year of graduation: | 95 (ROC calendar) |
| Language: | Chinese |
| Pages: | 55 |
| Keywords (Chinese): | 構音障礙、語音辨識、語音屬性 |
| Keywords (English): | Articulation Disorders, Speech Attribute, Speech Recognition |
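Because their speech is unclear, people with articulation disorders often run into barriers when communicating with others. This affects their personality development, schooling, and employment, and in turn causes serious problems of social adjustment. The aim of this thesis is to apply speech recognition and natural language processing techniques to develop an articulation assistive system that helps users correctly and quickly identify the causes of their faulty articulatory movements and that corrects the sentences they utter, thereby improving their everyday communication.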
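The framework developed in this thesis has three parts. 1) Using speech attributes, the system analyzes the speaker's personal phone-level confusion matrix and integrates it with a general model trained on 453 articulation disorder cases. It uses the result to predict the user's likely syllable variations and to revise the manual transcript file, and then performs speaker adaptation that combines Maximum a Posteriori (MAP) estimation with Maximum Likelihood Linear Regression (MLLR). 2) Starting from error-pair data, which includes both grammatical and ungrammatical syllables, the system encodes each pair by its attributes and uses the Apriori algorithm to infer the speaker's stable pronunciation error patterns. This inference mechanism also substantially reduces the number of test sentences the user has to record, which eases the difficulty of collecting speaker data. 3) Based on the error pattern rules found for the speaker, the system expands the recognition lattice of the language model, performs speech recognition under a strategy that relaxes the grammar constraints, and corrects the recognized text before output. A distinguishing feature of the system is that it gives feedback tailored to each user. The useful information it returns includes the personalized error patterns and the correct text of the utterance, which leads to individually optimized training and test results. The design also follows the assessment and treatment process that speech-language therapists use in current clinical practice.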
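The error-pattern mining in step 2 can be illustrated with a small Apriori-style search for frequent attribute combinations. This is only a minimal sketch, not the thesis implementation: the attribute names, the toy error pairs, and the `min_count` threshold are all assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy error pairs (intended phone vs. produced phone), each encoded as a set
# of attribute-change items. Names and values are hypothetical, not thesis data.
error_pairs = [
    {"place:retroflex->alveolar", "manner:same", "aspiration:same"},
    {"place:retroflex->alveolar", "manner:same", "aspiration:lost"},
    {"place:retroflex->alveolar", "manner:same", "aspiration:same"},
    {"place:velar->alveolar", "manner:same", "aspiration:lost"},
    {"place:same", "manner:fricative->stop", "aspiration:same"},
]

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset seen in >= min_count transactions."""
    # Level 1: frequent single items.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in counts.items() if c >= min_count}
    frequent = {s: counts[next(iter(s))] for s in current}
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = set()
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_count:
                frequent[c] = count
                current.add(c)
        k += 1
    return frequent

patterns = apriori(error_pairs, min_count=3)
for itemset, count in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), count)
```

On this toy data the only frequent pair is "retroflex realized as alveolar with the manner unchanged", which is the kind of stable, speaker-specific pattern the paragraph describes.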
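In a case evaluation, the proposed speaker adaptation method raised the syllable recognition rate from 55% to 67% without losing the speaker's pronunciation characteristics. Adapting with manually labeled results gave an accuracy of 79%, which we take as the theoretical upper bound. Experiments with the general model also show that the system's assessment of the subject is highly correlated with the general model, so using the general model as prior knowledge is a sound and feasible choice. The subject's error patterns found by the system were also compared with the evaluation made by speech-language therapists. Overall, the system follows the user's articulation characteristics, finds valid error patterns, and uses them to correct the recognition results for sentences, meeting the needs of people with articulation disorders.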
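To put the three reported rates (55%, 67%, 79%) in perspective, the short calculation below derives the absolute gain, the relative error reduction, and how much of the gap to the manually labeled upper bound the method closes. Only the three percentages come from the abstract; the derived measures are standard ways of reading such figures and are not reported in the source.

```python
# Syllable recognition rates in percent, as reported in the abstract.
baseline, adapted, upper_bound = 55, 67, 79

absolute_gain = adapted - baseline                            # percentage points
relative_error_reduction = absolute_gain / (100 - baseline)   # share of errors removed
gap_closed = absolute_gain / (upper_bound - baseline)         # share of the gap to the bound

print(f"absolute gain: {absolute_gain} points")                        # 12 points
print(f"relative error reduction: {relative_error_reduction:.1%}")     # 26.7%
print(f"gap to manual-label bound closed: {gap_closed:.0%}")           # 50%
```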
Because of their impaired oral communication, people with articulation disorders face difficulties in employment, education, and socialization. Articulation disorders have several causes, such as hearing impairment and disabilities of the speech-related organs. The purpose of this thesis is to develop Augmentative and Alternative Communication (AAC) technology for people with articulation disorders by applying speech and language processing methods.
Unlike conventional speech processing, this study proposes several specific methods to achieve this aim: 1) Semi-supervised adaptation for articulation disorders using speech attributes. Speech attributes are used to predict the pronunciations that a speaker with an articulation disorder is likely to have produced for each phone, so that acoustic model adaptation is not misled by mispronounced phones. 2) Mining latent error patterns from error pairs encoded by phonetic attributes. 3) Extending the syllable lattice of the speech recognizer with personalized error patterns, so that the decoded output approximates the user's intent.
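The lattice extension in step 3 can be pictured as adding, at each position, the syllables the speaker may have intended according to their personal error rules. The sketch below is a simplified illustration under stated assumptions: the rules, the pinyin spelling of syllables, and the flat one-list-per-position lattice are hypothetical, and a real system would still filter invalid syllables and rescore the paths with acoustic and language models.

```python
from itertools import product

# Pinyin initials, two-letter ones first so "sh" is matched before "s".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]

# Hypothetical personalized rules: produced initial -> initials possibly intended.
error_rules = {"d": ["g", "k"], "s": ["sh"]}

def split_initial(syllable):
    """Split a pinyin syllable into (initial, final); the initial may be empty."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable

def expand_syllable(syllable, rules):
    """Return the recognized syllable followed by rule-based alternatives."""
    initial, final = split_initial(syllable)
    return [syllable] + [alt + final for alt in rules.get(initial, [])]

def expand_lattice(syllables, rules):
    """Build one list of candidate syllables per position."""
    return [expand_syllable(s, rules) for s in syllables]

recognized = ["dou", "san", "ma"]
lattice = expand_lattice(recognized, error_rules)
paths = [" ".join(p) for p in product(*lattice)]
print(lattice)      # [['dou', 'gou', 'kou'], ['san', 'shan'], ['ma']]
print(len(paths))   # 6
```

Keeping the recognized syllable as the first candidate means the extension only adds hypotheses; it never removes what the recognizer originally produced.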
To evaluate the performance of our approach, we worked with one subject, a speech- and hearing-impaired student at the National University of Tainan. With the prediction algorithm, the syllable recognition rate improved from 55% to 67% while the subject's error characteristics were preserved. We also analyzed the relationship between the general error patterns and the personalized error patterns and found it to be positive. Furthermore, we obtained an evaluation of the subject from speech-language pathologists, which allowed us to gauge the performance of our system.