Graduate student: | 陳威廷 Chen, Wei-Ting |
---|---|
Thesis title: | 多層卡爾曼濾波器於雙麥克風語音降噪之應用 Dual-Microphone Applications of Multi-Layer Kalman Filter in Speech Enhancement |
Advisor: | 陳永裕 Chen, Yung-Yu |
Degree: | 碩士 Master |
Department: | College of Engineering - 系統及船舶機電工程學系 Department of Systems and Naval Mechatronic Engineering |
Year of publication: | 2016 |
Academic year of graduation: | 104 (ROC calendar) |
Language: | English |
Number of pages: | 154 |
Keywords (Chinese): | 卡爾曼濾波器、自適應濾波器、語音降噪、雙麥克風降噪 |
Keywords (English): | Kalman filters, adaptive filter, background noise cancellation, dual-microphone noise reduction |
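Digital speech signals are now widely applied in telecommunications, video conferencing, artificial intelligence systems, and many other fields. In real environments, however, the speech signal is inevitably disturbed by the various sound sources around the microphone during acoustic-to-electric conversion, which often makes the speech difficult to make out at the receiving end. How to reduce the surrounding environmental noise while preserving a low-distortion speech signal has therefore become a major topic in modern speech signal processing.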
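This study proposes a multilayer Kalman filter, based on two microphones with different pickup capabilities (an omnidirectional microphone and a directional microphone), to estimate a speech signal with low noise and low distortion. First, the directional microphone is placed facing away from the main speech source to collect the environmental background noise, which is built into an extended state-space representation model; the first-layer Kalman filter is applied to this model to estimate the background noise more accurately. Next, this estimate is used to whiten all the environmental sound (main speech source plus environmental background noise) collected by the omnidirectional microphone, which is placed facing the main speech source, and an extended state-space representation model of the main speech source, still containing slight noise, is built in turn. The second-layer Kalman filter is then applied to this model to estimate a cleaner main speech source. By repeating similar steps as the number of layers increases, the already-estimated main speech signal is modeled again and a still cleaner main speech signal is estimated, which achieves the goal of reducing environmental noise while preserving low-distortion speech.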
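For orientation, one common way to put a speech or noise signal into state-space form is an autoregressive (AR) model. The abstract does not specify the thesis's extended state-space representation, so the block below is only an illustrative sketch: the AR order $p$, the coefficients $a_i$, the driving noise $w(n)$, the observation noise $v(n)$, and the observed sample $y(n)$ are assumed symbols, not the author's notation.

```latex
\begin{aligned}
s(n) &= \sum_{i=1}^{p} a_i\, s(n-i) + w(n), \\
\mathbf{x}(n) &= \begin{bmatrix} s(n-p+1) & \cdots & s(n-1) & s(n) \end{bmatrix}^{\mathsf T}, \\
\mathbf{x}(n) &= \mathbf{F}\,\mathbf{x}(n-1) + \mathbf{g}\,w(n), \qquad
\mathbf{F} = \begin{bmatrix}
0 & 1 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1 \\
a_p & a_{p-1} & \cdots & a_1
\end{bmatrix}, \quad
\mathbf{g} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \\
y(n) &= \mathbf{g}^{\mathsf T}\mathbf{x}(n) + v(n).
\end{aligned}
```

In this form the last state component is the current sample $s(n)$, so a Kalman filter running on $\mathbf{x}(n)$ yields an estimate of the signal directly.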
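In the study, multiple sets of signals with SNR below 0 dB are used as input, and analysis tools such as improved SNR, cross-correlation, PESQ, and spectrograms are used to judge the quality of the estimated main speech signal. With practical feasibility in mind, the number of filter layers and the filter order that give the best main speech signal quality are identified under the condition that real-time processing remains possible. The results show that the method proposed in this thesis can indeed greatly enhance the quality of the main speech signal.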
Digital speech signals are now widely used in many fields, such as telecommunications, video conferencing, and artificial intelligence systems. In real communication environments, however, the speech signal is inevitably disturbed by surrounding sound sources during the microphone's acoustic-to-electric conversion, which often makes the speech hard to recognize at the receiving end. Reducing environmental noise while keeping speech distortion low has therefore become a major issue in speech signal processing and attracts considerable attention.
This study proposes a multilayer Kalman filter design based on two microphones with different pickup characteristics (an omnidirectional microphone and a unidirectional microphone) to estimate the speech signal and remove background noise. First, the unidirectional microphone is placed on the rear side to collect the ambient background noise, and the first-layer Kalman filter is applied to a state-space model of that noise to estimate the background noise accurately. The estimated background noise is then used to whiten the ambient sound (main sound source plus background noise) collected by the omnidirectional microphone, which directly faces the main sound source, yielding a rough model of the main sound source. The second-layer Kalman filter is applied to this rough model to extract a cleaner main sound source. By repeating a similar filtering process as layers are added, the proposed multilayer Kalman filter reduces ambient background noise while keeping the distortion of the main source low.
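As a rough illustration of the kind of building block each layer relies on, the sketch below runs a standard Kalman filter on a signal modeled as an autoregressive (AR) process observed in white noise. It is not the thesis's multilayer, dual-microphone design: the function name `ar_kalman_filter`, the AR coefficients, and the variances `q` and `r` are all assumptions chosen for the example, and the AR parameters are taken as known rather than estimated from a reference microphone.

```python
import numpy as np


def ar_kalman_filter(y, ar_coeffs, q, r):
    """Estimate an AR(p) signal s(n) from noisy samples y(n) = s(n) + v(n).

    Assumed model (illustrative, not taken from the thesis):
        s(n) = a_1 s(n-1) + ... + a_p s(n-p) + w(n),  var(w) = q,  var(v) = r
    State vector: x(n) = [s(n-p+1), ..., s(n)].
    """
    a = np.asarray(ar_coeffs, dtype=float)
    p = a.size

    # Companion-form transition matrix: shift the state, then predict s(n).
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)
    F[-1, :] = a[::-1]

    # The observation picks out the newest sample (last state component).
    h = np.zeros(p)
    h[-1] = 1.0

    x = np.zeros(p)   # state estimate
    P = np.eye(p)     # estimate covariance
    out = np.empty(len(y))

    for n, yn in enumerate(y):
        # Prediction step.
        x = F @ x
        P = F @ P @ F.T
        P[-1, -1] += q  # process noise enters only the newest sample

        # Update step with scalar observation yn.
        k = P @ h / (h @ P @ h + r)
        x = x + k * (yn - h @ x)
        P = P - np.outer(k, h @ P)

        out[n] = x[-1]
    return out


# Small synthetic demo: an AR(2) signal buried in stronger white noise.
rng = np.random.default_rng(0)
coeffs = [1.6, -0.8]
num = 2000
clean = np.zeros(num)
drive = rng.standard_normal(num)
for i in range(2, num):
    clean[i] = coeffs[0] * clean[i - 1] + coeffs[1] * clean[i - 2] + drive[i]
noisy = clean + 5.0 * rng.standard_normal(num)

estimate = ar_kalman_filter(noisy, coeffs, q=1.0, r=25.0)
print("MSE before filtering:", np.mean((noisy - clean) ** 2))
print("MSE after filtering: ", np.mean((estimate - clean) ** 2))
```

In a layered scheme like the one described, the output of one such filter would feed the model for the next layer; how the whitening and re-modeling are done between layers is specific to the thesis and is not reproduced here.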
To test the robustness of the proposed method, several signals whose initial SNRs are all below 0 dB are created, and the quality of the estimated results is verified with improved SNR, cross-correlation, PESQ, and spectrograms. The number of filter layers and the filter order are chosen to give the best speech quality under the constraint of practical real-time processing. The results show that the proposed method significantly enhances the quality of the speech signal.
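Two of the metrics named above can be sketched in a few lines. The abstract does not give the exact definitions used in the thesis, so the functions below follow common conventions and are assumptions: SNR improvement is taken as output SNR minus input SNR against a known clean reference, and cross-correlation as the zero-lag normalized correlation coefficient. PESQ (ITU-T P.862) needs a dedicated implementation and is not covered.

```python
import numpy as np


def snr_db(clean, signal):
    """SNR in dB of `signal` against a clean reference of the same length."""
    clean = np.asarray(clean, dtype=float)
    signal = np.asarray(signal, dtype=float)
    residual = signal - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))


def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR (assumed meaning of 'improved SNR')."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)


def normalized_cross_correlation(x, y):
    """Zero-lag correlation coefficient between two signals, in [-1, 1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    return float(np.sum(x * y) / np.sqrt(np.sum(x ** 2) * np.sum(y ** 2)))


# Synthetic check: the "enhanced" signal carries a quarter of the noise amplitude.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2 * np.pi * 220.0 * t)
noise = rng.standard_normal(t.size)
noisy = clean + 2.0 * noise
enhanced = clean + 0.5 * noise

print("input SNR (dB):", snr_db(clean, noisy))
print("SNR improvement (dB):", snr_improvement_db(clean, noisy, enhanced))
print("correlation with clean:", normalized_cross_correlation(clean, enhanced))
```

Both metrics require the clean reference, which is available in simulation (as with the test signals described here) but not in live recordings.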