
Author: 陳昱儒 (Chen, Yu-Ju)
Thesis title: Noise Reduction and Frequency Compression to Enhance Speech Intelligibility for Digital Hearing Aids (數位助聽器應用之強化語音辨識度噪音抑制與頻率壓縮演算法)
Advisor: 雷曉方 (Lei, Sheau-Fang)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Number of pages: 104
Keywords: Digital Hearing Aids, Noise Reduction, Frequency Compression
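    This thesis proposes two algorithms for digital hearing aids: "A Dual-Microphone Noise Reduction Algorithm Based on Power Level Difference and Polar Pattern" and "A Nonlinear Frequency Compression Algorithm to Enhance High Frequency Speech Intelligibility Designed for Moderate and Severe Sloping Hearing Loss".
    The former is an endfire dual-microphone system. Because hearing aids must be small, the spacing between the two microphones has to be extremely small, which degrades methods based on the coherence function. The proposed algorithm therefore replaces the coherence function with the power level difference (PLD) and the microphones' polar pattern, and further introduces the concept of the ideal binary mask to avoid over-suppressing noise that is already masked by speech, which would distort the speech signal and impair recognition. As a result, both under idealized simulation assumptions and in real recording environments, and across various types of noise sources, the proposed algorithm yields considerable improvements in signal-to-noise ratio (SNR) and in the coherence-based speech intelligibility index (CSII), which supports the stability of the algorithm. It also compares favorably in computational complexity, so it offers hearing-impaired users a choice with stable performance, small size, and better cost-effectiveness.
    The latter, the nonlinear frequency compression algorithm, targets listeners with moderate to severe sloping hearing loss. Simply amplifying the high-frequency bands, which carry much important information, does not bring an equivalent benefit and may also produce feedback noise that degrades the listening experience. The thesis therefore uses the listener's audiogram and speech intelligibility weights to customize the distribution of speech across frequency bands, preserving high-frequency information in lower bands. This saves the cost of feedback cancellation and improves participants' performance on the Mandarin Speech Recognition Test.
    In summary, this thesis combines numerical improvements with a design that meets users' actual needs, in the hope of helping more people with hearing loss.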

    This thesis presents a noise reduction algorithm and a frequency compression method designed for digital hearing aids. The proposed noise reduction algorithm uses a dual-microphone endfire configuration. We adopt the power level difference and the polar pattern instead of the coherence function to reduce the inaccuracy that arises in small-sized applications. An adaptive threshold mechanism and the concept of the ideal binary mask are also added. Together these enhance speech intelligibility, as measured objectively by SNR and CSII, in environments corrupted by different types of noise. Moreover, the algorithm uses fewer adders and multipliers in the computational complexity comparison.
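This record does not give the thesis's equations, so the following is only a minimal sketch of how a power-level-difference feature and a binary-style mask can be combined. It assumes the front channel carries the target with more power than the rear channel, uses a normalized PLD, and applies a fixed threshold with a gain floor. The function name `pld_binary_mask`, the smoothing factor, the threshold, and the floor are all illustrative assumptions; the thesis's adaptive threshold and polar-pattern modeling are not reproduced.

```python
import numpy as np

def pld_binary_mask(X1, X2, alpha=0.8, threshold=0.2, floor=0.1):
    """Hypothetical PLD-based mask (not the thesis's exact method).

    X1, X2: complex STFTs (frames x bins) of the front and rear channels.
    Returns a real-valued gain per time-frequency bin.
    """
    P1 = np.zeros(X1.shape[1])  # smoothed power, front channel
    P2 = np.zeros(X2.shape[1])  # smoothed power, rear channel
    gains = np.empty(X1.shape, dtype=float)
    for t in range(X1.shape[0]):
        # First-order recursive smoothing of the per-bin power estimates.
        P1 = alpha * P1 + (1 - alpha) * np.abs(X1[t]) ** 2
        P2 = alpha * P2 + (1 - alpha) * np.abs(X2[t]) ** 2
        # Normalized power level difference in [-1, 1].
        pld = (P1 - P2) / (P1 + P2 + 1e-12)
        # Binary decision: keep front-dominant bins, attenuate the rest
        # to a floor (assumed value) instead of removing them entirely.
        gains[t] = np.where(pld > threshold, 1.0, floor)
    return gains
```

Bins where the front channel dominates pass unchanged, while the rest are attenuated to the floor rather than zeroed, which is one simple way to limit speech distortion.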
    The frequency compression method is designed for a specific group of people with sloping hearing loss. We propose a customized nonlinear frequency compression method rather than the amplification used in previous work. The method adjusts the frequency distribution according to the user's audiogram and speech intelligibility weights. Because information in both the low and the high frequencies is preserved, participants score higher on the Mandarin speech recognition test.
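As a rough illustration of nonlinear frequency compression in general, the sketch below leaves frequencies at or below a cutoff unchanged and compresses those above it on a logarithmic scale. The mapping, the function name `compress_frequency`, and the default cutoff and ratio are assumptions chosen for illustration. How the thesis derives its cutoff and per-band ratios from the audiogram and speech intelligibility weights is not described in this record.

```python
import numpy as np

def compress_frequency(f_in, cutoff_hz=2000.0, ratio=2.0):
    """Hypothetical input-to-output frequency map for frequency compression.

    Frequencies at or below cutoff_hz are unchanged; frequencies above it
    are compressed by `ratio` on a log-frequency axis (assumed form).
    """
    f_in = np.asarray(f_in, dtype=float)
    f_out = f_in.copy()
    above = f_in > cutoff_hz
    # Log-domain compression: the log distance from the cutoff is divided
    # by the compression ratio, so the map stays continuous and monotonic.
    f_out[above] = cutoff_hz * (f_in[above] / cutoff_hz) ** (1.0 / ratio)
    return f_out
```

With these assumed defaults, `compress_frequency([1000, 2000, 8000])` maps 8000 Hz to 4000 Hz and leaves the other two values unchanged.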
    By combining noise reduction and frequency compression in one hearing aid system, this thesis offers stable speech perception performance as well as an economical choice for people with hearing impairment.

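    Chinese Abstract
    Extended Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1. Motivation and Objectives
      1.2. Introduction to the Auditory System of the Ear
      1.3. Introduction to Hearing Aid Architecture
        1.3.1. Noise Reduction Algorithms
        1.3.2. Frequency-Lowering Techniques
      1.4. Thesis Organization
    Chapter 2 Review and Analysis of Related Literature
      2.1. Literature on Noise Reduction Algorithms
        2.1.1. Spectral Subtraction
        2.1.2. Minima Controlled Recursive Averaging
        2.1.3. Ideal Binary Mask Theory
        2.1.4. Coherence-Function-Based Noise Reduction Algorithms
        2.1.5. Power-Level-Difference-Based Noise Reduction Algorithms
      2.2. Objective Measures for Noise Reduction Algorithms
        2.2.1. Signal-to-Noise Ratio (SNR)
        2.2.2. Coherence-Based Speech Intelligibility Index (CSII)
      2.3. Literature on Nonlinear Frequency Compression Algorithms
        2.3.1. Importance of High-Frequency Speech
        2.3.2. Characteristics of Nonlinear Frequency Compression Algorithms
        2.3.3. Simpson's Nonlinear Compression Algorithm [40]
        2.3.4. Liang's Nonlinear Frequency Compression Algorithm [41]
    Chapter 3 A Dual-Microphone Noise Reduction Algorithm Based on Power Level Difference and Polar Pattern
      3.1. Overview of the Noise Reduction Algorithm Architecture
      3.2. Environmental Assumptions of the Noise Reduction Algorithm
        3.2.1. Microphone Placement
        3.2.2. Microphone Characteristics
      3.3. Definition and Properties of Power Spectral Density
        3.3.1. Definition and Derivation of Power Spectral Density
        3.3.2. Verification by Recording Experiments
      3.4. Speech Power Spectral Density Estimation and Adaptive Threshold Mechanism
        3.4.1. Speech Power Spectral Density Estimation
        3.4.2. Adaptive Threshold Mechanism
      3.5. Summary of the Proposed Noise Reduction Algorithm
    Chapter 4 Results, Analysis, and Comparison of the Noise Reduction Algorithm
      4.1. MATLAB Simulation Performance Analysis and Comparison
        4.1.1. MATLAB Simulation Setup
        4.1.2. Performance Comparison in MATLAB Simulation
      4.2. Recording Experiment Performance Analysis and Comparison
        4.2.1. Real Recording Environment Setup and Procedure
        4.2.2. Performance Comparison on Real Recordings
      4.3. Computational Complexity Analysis and Comparison
    Chapter 5 A Nonlinear Frequency Compression Algorithm for Moderate and Severe Sloping Hearing Loss
      5.1. Overview of the Frequency Compression Algorithm Architecture
      5.2. Computing the Compressed Frequencies
        5.2.1. Conversion of Fourier Transform Input Frequencies
        5.2.2. Speech Intelligibility Weights and Band Distribution
        5.2.3. User Audiogram Analysis and Mode Selection
        5.2.4. Cutoff Frequency Calculation
        5.2.5. Band Compression Ratio and Output Frequency Calculation
        5.2.6. Worked Example
      5.3. Spectral Interpolation
        5.3.1. Concept of Spectral Interpolation and Frequency Index Calculation
        5.3.2. Spectral Interpolation Computation
      5.4. Summary of the Proposed High-Frequency Compression Algorithm
    Chapter 6 Results, Analysis, and Comparison of the Frequency Compression Algorithm
      6.1. Mandarin Speech Recognition Test Procedure
      6.2. Experimental Methods of the Mandarin Speech Recognition Test
        6.2.1. Pilot-Phase Experimental Method
        6.2.2. Formal-Phase Experimental Method
      6.3. Results and Discussion of the Mandarin Speech Recognition Test
        6.3.1. Participant Data
        6.3.2. Pilot-Phase Results
        6.3.3. Formal-Phase Results
    Chapter 7 Conclusions and Future Work
    References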

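    [1] 吳晉祥, 黃盈翔, and 張智仁, "老年人的預防性健康照護-從指引到臨床實務 [Preventive health care for the elderly: from guidelines to clinical practice]," 台灣老年醫學雜誌, vol. 2, pp. 145-163, 2007.
    [2] 陳美后, "台灣身心障礙的長期照護風險之分析 [An analysis of long-term care risk for people with disabilities in Taiwan]," 2015.
    [3] 科林助聽器. (2013). 認識聽力損失 [Understanding hearing loss]. Available: http://www.ear.com.tw/Hearing1.php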
    [4] Auris Medical. (2014, accessed 2016-04-21). The Inner Ear. Available: http://www.aurismedical.com/inner-ear-disorders/the-inner-ear
    [5] H. Puder, "Hearing aids: an overview of the state-of-the-art, challenges, and future trends of an interesting audio signal processing application," in Image and Signal Processing and Analysis, 2009. ISPA 2009. Proceedings of 6th International Symposium on, 2009, pp. 1-6.
    [6] 行政院環境保護署 [Environmental Protection Administration, Executive Yuan]. (2015). 噪音小百科 [Noise encyclopedia]. Available: http://ncs.epa.gov.tw/BB/B-04-01.htm
    [7] R. Plomp, "A signal-to-noise ratio model for the speech-reception threshold of the hearing impaired," Journal of Speech, Language, and Hearing Research, vol. 29, pp. 146-154, 1986.
    [8] C. C. Crandell, "Speech recognition in noise by children with minimal degrees of sensorineural hearing loss," Ear and hearing, vol. 14, pp. 210-216, 1993.
    [9] H. Fastl and E. Zwicker, Psychoacoustics: facts and models vol. 22: Springer Science & Business Media, 2006.
    [10] H. Levitt, "Noise reduction in hearing aids: a review," Journal of rehabilitation research and development, vol. 38, p. 111, 2001.
    [11] World Health Organization, "International classification of impairments, disabilities, and handicaps: a manual of classification relating to the consequences of disease, published in accordance with resolution WHA29.35 of the Twenty-ninth World Health Assembly, May 1976," 1980.
    [12] WHO Media Centre. (2015, accessed 2016-04-29). Deafness and hearing loss.
    [13] L.-M. Tseng, G.-S. Lee, E. Yang, N. Young, and C.-Y. Hsu, "Cochlear dead region and word recognition of Mandarin Chinese in Taiwan," Chinese Journal of Physiology, vol. 56, pp. 129-137, 2013.
    [14] B. C. Moore, "Dead regions in the cochlea: Diagnosis, perceptual consequences, and implications for the fitting of hearing aids," Trends in amplification, vol. 5, p. 1, 2001.
    [15] D. Stapells and P. Oates, "Estimation of the pure-tone audiogram by the auditory brainstem response: a review," Audiology and Neurotology, vol. 2, pp. 257-280, 1997.
    [16] T. Baer, B. C. Moore, and K. Kluk, "Effects of low pass filtering on the intelligibility of speech in noise for people with and without dead regions at high frequencies," The Journal of the Acoustical Society of America, vol. 112, pp. 1133-1144, 2002.
    [17] J. M. Alexander, J. G. Kopun, and P. G. Stelmachowicz, "Effects of frequency compression and frequency transposition on fricative and affricate perception in listeners with normal hearing and mild to moderate hearing loss," Ear and hearing, vol. 35, pp. 519-532, 2013.
    [18] A. Simpson, "Frequency-lowering devices for managing high-frequency hearing loss: A review," Trends in Amplification, vol. 13, pp. 87-106, 2009.
    [19] R. H. Gifford, M. F. Dorman, A. J. Spahr, and S. A. McKarns, "Effect of digital frequency compression (DFC) on speech recognition in candidates for combined electric and acoustic stimulation (EAS)," Journal of Speech, Language, and Hearing Research, vol. 50, pp. 1194-1202, 2007.
    [20] J. M. Alexander, "20Q: The highs and lows of frequency lowering amplification," AudiologyOnline. 2013b Article, vol. 11772, 2013.
    [21] S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 27, pp. 113-120, 1979.
    [22] I. Cohen and B. Berdugo, "Noise estimation by minima controlled recursive averaging for robust speech enhancement," Signal Processing Letters, IEEE, vol. 9, pp. 12-15, 2002.
    [23] D. S. Brungart, P. S. Chang, B. D. Simpson, and D. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," The Journal of the Acoustical Society of America, vol. 120, pp. 4007-4018, 2006.
    [24] D. Wang, "Time-frequency masking for speech separation and its potential for hearing aid design," Trends in amplification, 2008.
    [25] N. Yousefian, K. Kokkinakis, and P. C. Loizou, "A coherence-based algorithm for noise reduction in dual-microphone applications," in Proc. Eur. Signal Process. Conf.(EUSIPCO’10), 2010, pp. 1904-1908.
    [26] N. Yousefian and P. C. Loizou, "A dual-microphone speech enhancement algorithm based on the coherence function," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 20, pp. 599-609, 2012.
    [27] S.-C. Lai, H.-C. Lai, F.-C. Hong, H.-R. Lin, and S.-F. Lei, "A Novel Coherence-Function-Based Noise Suppression Algorithm by Applying Sound-Source Localization and Awareness-Computation Strategy for Dual Microphones," in Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), 2014 Tenth International Conference on, 2014, pp. 313-316.
    [28] N. Yousefian, A. Akbari, and M. Rahmani, "Using power level difference for near field dual-microphone speech enhancement," Applied Acoustics, vol. 70, pp. 1412-1421, 2009.
    [29] M. Jeub, C. Nelke, H. Kruger, C. Beaugeant, and P. Vary, "Robust dual-channel noise power spectral density estimation," in Signal Processing Conference, 2011 19th European, 2011, pp. 2304-2308.
    [30] M. Jeub, C. Herglotz, C. Nelke, C. Beaugeant, and P. Vary, "Noise reduction for dual-microphone mobile phones exploiting power level differences," in Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference On, 2012, pp. 1693-1696.
    [31] T. Esch and P. Vary, "Efficient musical noise suppression for speech enhancement system," in Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on, 2009, pp. 4409-4412.
    [32] R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," Speech and Audio Processing, IEEE Transactions on, vol. 9, pp. 504-512, 2001.
    [33] W. Lee, J.-H. Song, and J.-H. Chang, "Minima-controlled speech presence uncertainty tracking method for speech enhancement," Signal Processing, vol. 91, pp. 155-161, 2011.
    [34] R. Drullman, "Speech intelligibility in noise: relative contribution of speech elements above and below the noise level," The Journal of the Acoustical Society of America, vol. 98, pp. 1796-1798, 1995.
    [35] D. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech separation by humans and machines, ed: Springer, 2005, pp. 181-197.
    [36] M. Brandstein and D. Ward, Microphone arrays: signal processing techniques and applications: Springer Science & Business Media, 2001.
    [37] J. M. Kates and K. H. Arehart, "Coherence and the speech intelligibility index," The Journal of the Acoustical Society of America, vol. 117, pp. 2224-2237, 2005.
    [38] J. Ma, Y. Hu, and P. C. Loizou, "Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions," The Journal of the Acoustical Society of America, vol. 125, pp. 3387-3405, 2009.
    [39] ANSI, "S3.5-1997, Methods for the calculation of the speech intelligibility index," New York: American National Standards Institute, vol. 19, pp. 90-119, 1997.
    [40] A. Simpson, A. A. Hersbach, and H. J. McDermott, "Improvements in speech perception with an experimental nonlinear frequency compression hearing device," International journal of audiology, vol. 44, pp. 281-292, 2005.
    [41] R. Liang, J. Xi, J. Zhou, C. Zou, and L. Zhao, "An improved method to enhance high-frequency speech intelligibility in noise," Applied Acoustics, vol. 74, pp. 71-78, 2013.
    [42] J. M. Kates, "Speech intelligibility enhancement," ed: Google Patents, 1984.
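    [43] 黃佩芬, 黃桂君, 王小川, and 劉惠美, "以語音聽力檢測系統輔助聽障兒童發音教學實驗 [An experiment on using a speech audiometry system to assist pronunciation teaching for hearing-impaired children]," 國立臺灣師範大學特殊教育學系 特殊教育研究學刊, p. 115, 2006.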
    [44] M. P. Moeller, B. Hoover, C. Putman, K. Arbataitis, G. Bohnenkamp, B. Peterson, et al., "Vocalizations of infants with hearing loss compared with infants with normal hearing: Part I–Phonetic development," Ear and hearing, vol. 28, pp. 605-627, 2007.
    [45] U. Kornagel, "Method and device for frequency compression," ed: Google Patents, 2014.
    [46] M. Henshall. (2015, accessed 2016-05-02). Microphones: Polar pattern / Directionality. Available: http://www.shure.co.uk/support_download/educational_content/microphones-basics/microphone_polar_patterns
    [47] F. G. Stremler, Introduction to Communication Systems, 3rd ed.: Addison-Wesley, 1990, ISBN 0201184982.
    [48] E. Rothauser, W. Chapman, N. Guttman, K. Nordby, H. Silbiger, G. Urbanek, et al., "IEEE recommended practice for speech quality measurements," IEEE Trans. Audio Electroacoust, vol. 17, pp. 225-246, 1969.
    [49] Y. Hu and P. C. Loizou, "Subjective comparison and evaluation of speech enhancement algorithms," Speech communication, vol. 49, pp. 588-601, 2007.
    [50] P. C. Loizou, Speech enhancement: theory and practice: CRC press, 2013.
    [51] M. Gasior and J. Gonzalez, "Improving FFT frequency measurement resolution by parabolic and Gaussian spectrum interpolation," in Beam Instrumentation Workshop 2004, 2004, pp. 276-285.
    [52] A. Epstein, "Speech audiometry," Otolaryngologic Clinics of North America, vol. 11, pp. 667-676, 1978.
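    [53] 工業技術研究院資訊與通訊研究所 [Industrial Technology Research Institute, Information and Communications Research Laboratories]. (2011). 工研院文字轉語音Web服務 [ITRI text-to-speech web service]. Available: http://tts.itri.org.tw/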
    [54] W. Rivers and H. Webber, "The action of caffeine on the capacity for muscular work," The Journal of physiology, vol. 36, pp. 33-47, 1907.
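    [55] 蔣燿宇, "華語雙字詞語音辨識力測驗之設計與評估 [Design and evaluation of a Mandarin disyllabic-word speech recognition test]," unpublished master's thesis, 國立陽明大學, 臺北市, 2005.
    [56] 施婉婷, "非線性頻率壓縮及非線性高頻壓縮對高頻聽損者華語語音聽辨之影響 [Effects of nonlinear frequency compression and nonlinear high-frequency compression on Mandarin speech recognition in listeners with high-frequency hearing loss]," master's thesis, 國立臺北護理健康大學聽語障礙科學研究所, 2014.
    [57] 葉旭輝, "華語各頻帶訊息對語音理解之重要度分析 [Analysis of the importance of each frequency band to Mandarin speech understanding]," master's thesis, 國立陽明大學醫學工程研究所, 2005.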
    [58] W. F. Rintelmann, Contemporary perspectives in hearing assessment vol. 1: Allyn & Bacon, 1999.
    [59] J. R. Dubno, F.-S. Lee, A. J. Klein, L. J. Matthews, and C. F. Lam, "Confidence limits for maximum word-recognition scores," Journal of Speech, Language, and Hearing Research, vol. 38, pp. 490-502, 1995.
    [60] L. M. Thibodeau. (2012). SPRINTCHART.

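    Full-text availability: on campus, public from 2021-08-01; off campus, not public.
    The electronic thesis has not been authorized for public release; for the print copy, consult the library catalog.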