| Graduate student: | 黃尹鐶 Huang, Yin-Huan |
|---|---|
| Thesis title: | 應用於遠場麥克風陣列之改良型廣義旁瓣消除器語音增強演算法 Speech Enhancement Algorithm Based on Modified Generalized Sidelobe Canceller for Far-field Microphone Array Application |
| Advisor: | 邱瀝毅 Chiou, Lih-Yih |
| Co-advisor: | 雷曉方 Lei, Sheau-Fang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 103 |
| Keywords (translated from Chinese): | Microphone Array, Speech Enhancement, Beamforming, Adaptive Filter, Noise Suppression, Speech Leakage |
| Keywords (English): | Microphone Array, Speech Enhancement, Adaptive Filter, Speech Leakage, Noise Reduction |
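Environmental noise is one of the main factors that degrade speech recognition in human-machine interaction and communication systems. This thesis proposes a speech enhancement algorithm based on a modified generalized sidelobe canceller for far-field microphone arrays. Using the spatial arrangement of multiple microphones, the algorithm computes the time differences with which the speech arrives at each microphone and applies time compensation to each channel, so that the speech signal is time aligned across channels. Averaging the channels then gives the delay-and-sum beamforming output, which serves as the preliminarily enhanced speech signal. In parallel, the time-compensated channel signals pass through a blocking matrix to obtain an initial noise estimate, and an adaptive filter suppresses the speech leaked by the blocking matrix, yielding a more accurate noise estimate. This noise estimate is then passed through a second adaptive filter to further suppress the noise remaining in the delay-and-sum beamforming output, producing the final enhanced speech signal. In addition, the thesis modifies the update equation of the adaptive filter so that the converged filter does not drift away from the optimal coefficients when speech is present. Experiments are carried out on real recordings. After processing by the proposed algorithm, the objective measures signal-to-noise ratio (SNR), coherence and speech intelligibility index (CSII), and perceptual evaluation of speech quality (PESQ) all improve considerably. This indicates that the output speech has good intelligibility and quality, so that human-machine interaction devices can recognize the user's commands more reliably.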
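The front end described above (time alignment, delay-and-sum averaging, and a blocking matrix) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's implementation: the delays are assumed to be known non-negative integers in samples, the blocking matrix is assumed to be the classic adjacent-channel difference, and the function names `time_align`, `delay_and_sum`, and `blocking_matrix` are hypothetical.

```python
from typing import List, Sequence


def time_align(channels: Sequence[Sequence[float]],
               delays: Sequence[int]) -> List[List[float]]:
    """Advance each channel by its (assumed known, integer) delay in samples
    so that the target speech lines up across channels."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [list(ch[d:d + n]) for ch, d in zip(channels, delays)]


def delay_and_sum(aligned: Sequence[Sequence[float]]) -> List[float]:
    """Average the time-aligned channels sample by sample."""
    m = len(aligned)
    return [sum(samples) / m for samples in zip(*aligned)]


def blocking_matrix(aligned: Sequence[Sequence[float]]) -> List[List[float]]:
    """Assumed blocking matrix: differences of adjacent aligned channels.
    Perfectly aligned speech cancels, leaving noise-only references."""
    return [[a - b for a, b in zip(aligned[i], aligned[i + 1])]
            for i in range(len(aligned) - 1)]


# Toy example: the same "speech" reaches three microphones 0, 1 and 2 samples late.
speech = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
channels = [
    speech + [0.0, 0.0],
    [0.0] + speech + [0.0],
    [0.0, 0.0] + speech,
]
aligned = time_align(channels, delays=[0, 1, 2])
beam = delay_and_sum(aligned)          # equals the speech in this noise-free toy case
noise_refs = blocking_matrix(aligned)  # all zeros here: no noise and no leakage
```

In real recordings the alignment is never perfect, so some speech leaks into the blocking-matrix outputs; that leakage is what the first adaptive filter in the abstract is meant to suppress.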
By placing a number of microphones in space, an array can collect spatial samples of a propagating wave. Using the spatial information in the microphone array signals, a spatial filter can extract the speech arriving along the desired path from the noisy signal. The conventional generalized sidelobe canceller (CGSC) is composed of a delay-and-sum beamformer, a blocking matrix, and a multi-channel adaptive filter, and it can perform well with only a few microphones. However, the algorithm has some problems, such as speech leakage. Another problem is that, after the adaptive filter of the CGSC has converged, its coefficients deviate from the optimal coefficients when speech is present, which causes excess mean square error. This thesis proposes a speech enhancement algorithm based on a modified generalized sidelobe canceller. It modifies the NLMS update equation so that the excess mean square error is suppressed effectively, giving better performance after convergence than the conventional update equation.
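For reference, the conventional NLMS update that the thesis takes as its starting point can be sketched as below. This is the textbook form only; the modified update equation proposed in the thesis is not given in this abstract and is not reproduced here. The function name `nlms_cancel`, the step size `mu`, the regularizer `eps`, and the two-tap noise path in the demo are all illustrative assumptions.

```python
import random
from typing import List, Sequence, Tuple


def nlms_cancel(primary: Sequence[float], reference: Sequence[float],
                taps: int = 2, mu: float = 0.5,
                eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Textbook NLMS noise canceller.

    primary   : signal containing the noise to remove (e.g. a beamformer output)
    reference : noise reference (e.g. a blocking-matrix output)
    Returns the error signal (the enhanced output) and the final weights.
    """
    w = [0.0] * taps
    buf = [0.0] * taps                       # most recent reference samples
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))    # estimated noise
        e = d - y                                     # enhanced sample
        norm = eps + sum(xi * xi for xi in buf)       # input power normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        out.append(e)
    return out, w


# Noise-only demo: the primary is the reference filtered by an assumed path [0.5, -0.3].
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
primary = [0.5 * reference[n] - 0.3 * (reference[n - 1] if n else 0.0)
           for n in range(2000)]
residual, weights = nlms_cancel(primary, reference)
```

In this noise-only demo the weights settle at the assumed path. When speech is present in `primary`, the error `e` also contains speech, so the same update keeps pushing the weights away from that solution; this is the post-convergence deviation the abstract refers to.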
We recorded noisy microphone array signals with four different kinds of noise and compared the results with the double affine projection generalized sidelobe canceller (DAP-GSC) and the CGSC. The proposed algorithm achieves a higher signal-to-noise ratio (SNR) than DAP-GSC and CGSC; that is, its enhanced speech contains less noise. It also outperforms the other algorithms on two further important measures, the perceptual evaluation of speech quality (PESQ) and the coherence and speech intelligibility index (CSII). This means that the enhanced speech has better quality and can be recognized clearly by a person or a machine. Hence, the algorithm is worth applying to communication systems and human-machine interaction devices.
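As a reminder of what the SNR figure measures, the basic definition is the ratio of speech power to noise power in decibels. The sketch below assumes the speech and noise components are available separately, which is an assumption for illustration; the abstract does not state how the thesis separates them in its recordings, and the function name `snr_db` is hypothetical.

```python
import math
from typing import Sequence


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Signal-to-noise ratio in dB: 10 * log10(speech energy / noise energy)."""
    speech_energy = sum(s * s for s in speech)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10(speech_energy / noise_energy)


# Noise with one tenth of the speech amplitude gives an SNR of about 20 dB.
example = snr_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1])
```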
On campus: publicly available from 2024-07-09