
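Author: 李炯彣
Li, Chiung-Wen
Thesis title: 針對非穩態環境之噪音估算與信號子空間語音增強方法
Signal Subspace Speech Enhancement with Noise Estimation for Non-stationary Environments
Advisor: 雷曉方
Lei, Sheau-Fang
Degree: Master
College and department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 76
Keywords (Chinese, translated): speech enhancement, subspace
Keywords (English): speech enhancement, subspace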
  • Communication quality is easily degraded by noise, and speech enhancement seeks to remove that noise across a variety of changing environments. After more than thirty years of research, no perfect solution to this problem exists. The objective of this thesis is to develop a speech enhancement algorithm that offers better noise estimation and uses a signal subspace approach to remove the estimated noise.
    Because the actual noise can never be removed completely, all speech enhancement systems suffer from signal distortion and residual noise. Several methods have been developed to address these problems, and signal subspace speech enhancement is one of them. The system designed in this thesis takes the subspace method as its basis and adds a robust and accurate noise estimation algorithm that updates the noise estimate throughout the signal, not only during speech-free intervals. Experimental results show that the proposed algorithm outperforms the other methods tested alongside it.
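
    As a rough illustration of the signal subspace idea mentioned in the abstract, the sketch below builds a linear estimator in the eigen-domain for one frame. It is not the thesis implementation: it assumes white noise with a known variance (the thesis instead estimates the noise and handles non-stationary conditions), and the names `subspace_estimator`, `noise_var`, and `mu` are hypothetical. The parameter `mu` plays the role of a trade-off weight between residual noise and signal distortion.

```python
import numpy as np


def subspace_estimator(noisy_cov, noise_var, mu=1.0):
    """Build a linear estimator H for one frame (white-noise assumption).

    noisy_cov : covariance matrix of the noisy frame vectors
    noise_var : assumed-known variance of the white noise
    mu        : trade-off weight (larger = less residual noise, more distortion)
    """
    if noise_var <= 0:
        raise ValueError("noise_var must be positive")
    # Force exact symmetry so eigh receives a valid symmetric matrix.
    noisy_cov = 0.5 * (noisy_cov + noisy_cov.T)
    eigvals, eigvecs = np.linalg.eigh(noisy_cov)
    # Estimated clean-signal eigenvalues; directions at or below the noise
    # floor are treated as the noise subspace and receive zero gain.
    clean_eigvals = np.maximum(eigvals - noise_var, 0.0)
    gains = clean_eigvals / (clean_eigvals + mu * noise_var)
    # H = U diag(gains) U^T
    return (eigvecs * gains) @ eigvecs.T


# Toy example: a rank-2 "speech" covariance in an 8-dimensional frame space.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((8, 2)))  # 8x2 orthonormal columns
clean_cov = basis @ np.diag([4.0, 1.0]) @ basis.T
noise_var = 0.5
noisy_cov = clean_cov + noise_var * np.eye(8)

H = subspace_estimator(noisy_cov, noise_var)
# An enhanced frame would then be computed as H @ noisy_frame.
```

    In this toy setup the estimator keeps only the two signal directions and attenuates them according to their signal-to-noise ratio, while the remaining six directions are nulled. Colored noise, which the thesis also covers, would need prewhitening or a generalized eigendecomposition instead of this plain `eigh` call.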

    ABSTRACT
    ACKNOWLEDGMENT
    LIST OF TABLES
    LIST OF FIGURES
    CHAPTER 1 INTRODUCTION
      1.1 Background
      1.2 The Speech Enhancement Concept
      1.3 Overview of Speech Enhancement Techniques
        1.3.1 Spectral Subtraction (SS)
        1.3.2 Bayesian Spectral Estimation
        1.3.3 Wiener Filtering
        1.3.4 Enhancement Based on Speech Modeling
        1.3.5 Signal Subspace Approaches (SSA)
        1.3.6 Other Methods
      1.4 Motivation
      1.5 Organization of Thesis
    CHAPTER 2 SIGNAL SUBSPACE TECHNIQUES FOR SPEECH ENHANCEMENT
      2.1 Introduction
      2.2 Signal and Noise Models
      2.3 Linear Signal Estimation
        2.3.1 Least-Squares Estimator
        2.3.2 The Linear Minimum Mean Squared Error Estimator
        2.3.3 The Time-Domain Constrained Estimator
        2.3.4 The Spectral-Domain Constrained Estimator
      2.4 Handling Color Noise
        2.4.1 Prewhitening
        2.4.2 Generalized Eigenvalue Decomposition Method
        2.4.3 The Rayleigh Quotient Method
      2.5 Implementation Issues
        2.5.1 Estimating the Covariance Matrix
        2.5.2 Parameter Analysis
    CHAPTER 3 NOISE ESTIMATION
      3.1 Noise Power Estimator Based on Minimum Statistics
      3.2 Improved Minima Controlled Recursive Averaging (IMCRA) Estimator
      3.3 Nonlinear Estimator
    CHAPTER 4 THE PROPOSED ALGORITHM FOR SPEECH ENHANCEMENT
      4.1 Noise Estimation
        4.1.1 Compute Smooth Speech Power Spectrum
        4.1.2 Find the Local Minimum of Noisy Speech
        4.1.3 Speech-presence Probability
        4.1.4 VAD Decision
        4.1.5 Update of Noise Spectrum Estimation
      4.2 Signal Subspace Approach for Speech Enhancement
        4.2.1 Frame Classification
        4.2.2 Signal KLT Approach for Speech Dominated Frames
        4.2.3 Signal KLT Approach for Noise Dominated Frames
    CHAPTER 5 EXPERIMENTAL RESULTS AND PERFORMANCE EVALUATION
      5.1 Implementation details
      5.2 Objective Measure
        5.2.1 Modified Bark Spectral Distortion measure (MBSD)
        5.2.2 Segmental Signal-to-Noise Ratio (SSNR)
      5.3 Performance Evaluation
    CHAPTER 6 CONCLUSION AND FUTURE WORK
    REFERENCES


    Full text availability: on campus from 2007-09-01; off campus from 2008-09-01