| Graduate student: | 陳昱儒 Chen, Yu-Ju |
|---|---|
| Thesis title: | 數位助聽器應用之強化語音辨識度噪音抑制與頻率壓縮演算法 Noise Reduction and Frequency Compression to Enhance Speech Intelligibility for Digital Hearing Aids |
| Advisor: | 雷曉方 Lei, Sheau-Fang |
| Degree: | Master |
| Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 104 |
| Language: | Chinese |
| Number of pages: | 104 |
| Keywords (Chinese): | 數位助聽器、噪音抑制、頻率壓縮 |
| Keywords (English): | Digital Hearing Aids, Noise Reduction, Frequency Compression |
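This thesis proposes two algorithms for digital hearing aids: A Dual-Microphone Noise Reduction Algorithm Based on Power Level Difference and Polar Pattern, and A Nonlinear Frequency Compression Algorithm to Enhance High Frequency Speech Intelligibility Designed for Moderate and Severe Sloping Hearing Loss.

The former is an endfire dual-microphone system. Because hearing aids must be small, the spacing between the two microphones has to be extremely small, which degrades methods based on the coherence function. The power level difference (PLD) and the polar pattern are used instead, and the concept of the ideal binary mask is introduced to avoid over-suppressing noise that is already masked by speech, which would distort the speech signal and impair recognition. As a result, both under idealized simulation assumptions and in real recording environments, the proposed noise reduction algorithm yields considerable improvements in signal-to-noise ratio (SNR) and the coherence-based speech intelligibility index (CSII), even under interference from various types of noise sources, which supports the stability of the algorithm. The proposed algorithm also compares favorably in computational complexity, so it can offer hearing-impaired users an option that is stable in performance, small in size, and more cost-effective.

The latter, the nonlinear frequency compression algorithm, targets listeners with moderate and severe sloping hearing loss. Merely amplifying the high-frequency bands, which carry much important information, does not bring an equivalent benefit and may also cause feedback noise that affects listening comfort. This thesis therefore uses the listener's audiogram and speech intelligibility weights to customize the distribution of speech across frequency bands, preserving high-frequency information in lower bands. This saves the cost of feedback cancellation and improves participants' performance on the Mandarin Speech Recognition Test. In summary, this thesis combines numerical improvements with a match to users' actual needs, in the hope of helping more people with hearing loss.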
This thesis presents a noise reduction algorithm and a frequency compression method designed for digital hearing aids. The proposed noise reduction algorithm uses a dual-microphone endfire configuration. We adopt the power level difference and the polar pattern instead of the coherence function, which becomes inaccurate in small-sized applications. An adaptive threshold mechanism and the concept of the ideal binary mask are also added. The algorithm improves speech intelligibility, as measured by the objective metrics SNR and CSII, in environments corrupted by different types of noise. Moreover, it requires fewer adders and multipliers in the computational complexity comparison.
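As a rough illustration of how a power level difference can drive a binary mask, the following sketch computes a normalized PLD per frequency bin and keeps only the bins where the front channel clearly dominates. It is a minimal sketch under stated assumptions, not the thesis's algorithm: the normalized form of the PLD, the fixed `threshold`, the `floor` gain, and the function names are all illustrative choices, and the polar-pattern term and the adaptive threshold mentioned in the abstract are not modeled.

```python
def normalized_pld(p_front, p_rear, eps=1e-12):
    """Per-bin normalized power level difference, roughly in [-1, 1].

    p_front, p_rear: power spectra of the front and rear microphones
    for one frame. The normalized form is an assumption for illustration.
    """
    return [(pf - pr) / (pf + pr + eps) for pf, pr in zip(p_front, p_rear)]


def binary_gain_mask(pld, threshold=0.2, floor=0.1):
    """Keep bins whose PLD exceeds the threshold; attenuate the rest.

    `threshold` and `floor` are hypothetical constants, not values
    taken from the thesis.
    """
    return [1.0 if d > threshold else floor for d in pld]


# Hypothetical power spectra for one frame (four frequency bins).
p_front = [4.0, 1.0, 9.0, 1.0]
p_rear = [1.0, 1.0, 8.0, 3.0]

mask = binary_gain_mask(normalized_pld(p_front, p_rear))
print(mask)
```

With these made-up inputs only the first bin is kept at full gain; the other three fall to the floor gain. Using a nonzero floor rather than zero is one simple way to limit the distortion that hard masking can introduce.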
The frequency compression method is designed for a specific group of people with sloping hearing loss. We propose a customized nonlinear frequency compression method in place of the amplification used in previous work. The method adjusts the distribution of frequency content according to the user's audiogram and speech intelligibility weights. Because information in both the low and high frequencies is preserved, participants achieve higher scores on the Mandarin speech recognition test.
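To make the idea of nonlinear frequency compression concrete, the sketch below maps input frequencies to output frequencies with a generic log-domain rule: frequencies at or below a cutoff pass unchanged, and frequencies above it are compressed toward the cutoff. This is only an assumed, commonly used form of the mapping; the thesis's customized mapping, which depends on the audiogram and speech intelligibility weights, is not reproduced here, and `cutoff_hz` and `ratio` are hypothetical parameters.

```python
def compress_frequency(f_in_hz, cutoff_hz=2000.0, ratio=2.0):
    """Map an input frequency to its compressed output frequency.

    Below the cutoff the mapping is the identity. Above it, the
    distance from the cutoff is divided by `ratio` on a log-frequency
    axis. Both default values are illustrative assumptions.
    """
    if f_in_hz <= cutoff_hz:
        return f_in_hz
    p = 1.0 / ratio
    return (cutoff_hz ** (1.0 - p)) * (f_in_hz ** p)


# Show how a few input frequencies are relocated.
for f in (500.0, 2000.0, 4000.0, 8000.0):
    print(f, round(compress_frequency(f), 1))
```

With the default values, an 8000 Hz component lands near 4000 Hz while everything at or below 2000 Hz is untouched. The mapping is continuous at the cutoff and monotonic, so the ordering of spectral components is preserved.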
Combining noise reduction and frequency compression in a hearing aid system, this thesis offers stable speech perception performance as well as an economical choice for people with hearing impairment.
On campus: publicly available from 2021-08-01