Graduate student: | 賴旭謙 Lai, Hsu-Chien |
---|---|
Thesis title: | 應用聲源定位與感知計算來決策以相干性為基準之雙麥克風噪音抑制演算法與其實現 Novel Coherence-Based Noise Reduction Algorithm By Applying Sound-Source-Localization And Awareness-Computation Strategy For Dual Microphone |
Advisor: | 雷曉方 Lei, Sheau-Fang |
Co-advisor: | 賴信志 Lai, Shin-Chi |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2014 |
Academic year of graduation: | 102 (ROC calendar) |
Language: | Chinese |
Number of pages: | 86 |
Keywords (Chinese): | 噪音抑制(NR)、聲源定位(SSL)、感知計算(AC)、快速傅立葉轉換(FFT) |
Keywords (English): | Noise Reduction (NR), Sound Source Localization (SSL), Awareness-Computation (AC), Fast Fourier Transform (FFT) |
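This thesis proposes integrating a sound source localization (SSL) system with a coherence-based noise reduction algorithm. The aim is to remedy two shortcomings of existing methods: they cannot suppress noise arriving from the front, and they suffer estimation errors when the noise arrives from behind. The SSL system uses the cross-correlation between the two microphones: the lag at which the cross-correlation reaches its maximum gives the time difference of arrival (TDOA) of the signal, from which the direction of the noise source is determined. An adaptive threshold estimate is also added to reduce the probability of misjudgment. The proposed approach can handle complex, changing noise sources in the environment, and coherence-based noise reduction has the further advantage of being simple to implement. For integration, we propose awareness computation (AC) and module sharing between the algorithms to reduce computational complexity, so that the multiplication count increases by only 9.09%. In both stationary and non-stationary noise, the noise reduction algorithm needs no noise estimation and no voice activity detector (VAD), which greatly reduces computational complexity. Because of the physical limit on microphone spacing, traditional coherence-based noise reduction algorithms cannot be applied to or implemented on hearing aids. The noise reduction algorithm adopted in this thesis is instead derived from the SNR between the two microphones, so it can be applied effectively to hearing aids. Software simulation results show that the proposed noise reduction achieves an SNR 3 dB higher than existing methods. In summary, the algorithm offers three advantages (low computational complexity, effective suppression of both stationary and non-stationary noise, and no need for any noise estimation) and meets the hardware constraints of digital hearing aids. It is therefore well suited to digital hearing aids and to small devices such as smartphones.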
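The TDOA step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function name `estimate_tdoa`, the `max_lag` search window, the 16 kHz sample rate, and the synthetic test signal are all hypothetical, and the adaptive threshold is omitted because the abstract does not specify how it is computed.

```python
import random


def estimate_tdoa(x, y, fs, max_lag):
    """Return (delay_seconds, lag_samples) where the cross-correlation peaks.

    A positive lag means y is a delayed copy of x, i.e. the sound reached
    the microphone that recorded x first.
    """
    n = min(len(x), len(y))
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # r[lag] = sum over i of x[i] * y[i + lag], overlapping samples only
        start = max(0, -lag)
        stop = min(n, n - lag)
        r = sum(x[i] * y[i + lag] for i in range(start, stop))
        if r > best_val:
            best_lag, best_val = lag, r
    return best_lag / fs, best_lag


# Synthetic example (assumed values): white noise reaching the second
# microphone 3 samples later than the first, sampled at 16 kHz.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(512)]
y = [0.0] * 3 + x[:-3]
tau, lag = estimate_tdoa(x, y, fs=16000, max_lag=8)
print(lag, tau)
```

The brute-force search over a small lag window keeps the sketch dependency-free. With closely spaced microphones the physically possible delay spans only a few samples, so a narrow `max_lag` is a reasonable choice here.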
This thesis presents a novel coherence-function-based noise suppression algorithm (NSA) with a weighted overlap-add (WOLA) filterbank for dual microphones. It addresses two issues: traditional methods cannot efficiently suppress noise arriving from the front, and they may produce estimation errors when the noise arrives from the back. To keep algorithmic complexity low, the proposed method remedies these drawbacks with a simple sound source localization (SSL) algorithm and an awareness-computation (AC) strategy instead of a complex voice activity detector (VAD). The cross-correlation of the two microphone signals yields the time difference of arrival (TDOA), from which the direction of the noise source can be estimated effectively. An adaptive threshold is introduced to reduce the error rate in identifying the noise source. For system integration, AC and a module-sharing scheme are also adopted to reduce computational complexity. The results show that the proposed method increases the number of multiplications by only 9.09%, and that its SNR is at least 3 dB higher than that of other approaches. In the FPGA implementation, the proposed SSL design can operate at 25 MHz, which easily meets the real-time requirement of 72.625 kHz. Overall, it is well suited to future integration with Fourier-transform-based WOLA hearing aid designs.
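As a rough check on the real-time claim, and assuming both figures are rates in Hz that can be compared directly (the abstract does not say what the 72.625 kHz requirement measures), the headroom works out to:

$$
\frac{25 \times 10^{6}\ \text{Hz}}{72.625 \times 10^{3}\ \text{Hz}} \approx 344.2
$$

Under that assumption, the SSL design runs roughly 344 times faster than the stated requirement.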