| Graduate Student: | 孫政葦 Sun, Zheng-wei |
|---|---|
| Thesis Title: | 基於AMDF之遠場聲源辨位系統設計與FPGA實現 (FPGA Implementation of Far-Field Sound Localization System Based on AMDF) |
| Advisor: | 王駿發 Wang, Jhing-fa |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Publication Year: | 2008 |
| Graduation Academic Year: | 96 (ROC calendar, i.e., 2007-2008) |
| Language: | English |
| Number of Pages: | 53 |
| Keywords (Chinese): | 平均差值函式 (average magnitude difference function), 聲源辨位 (sound localization), 遠場 (far-field), 場效可程式邏輯陣列 (field-programmable gate array) |
| Keywords (English): | Far-field, FPGA, AMDF, sound localization |
Sound source localization is a system that identifies the position of a sound source. Such a system can be applied in many fields, for example toys and robots. Its principle is to estimate the time difference with which the sound arrives at the microphones and, through an angle conversion, determine the direction of the source. Sound localization comprises sound detection (deciding whether a sound source is present in the environment) and a localization method (estimating the time difference between the signals); integrating these two sound-processing stages yields the sound localization system.
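The abstract does not spell out the angle conversion; under the usual far-field (plane-wave) assumption, with microphone spacing $d$, sound speed $c$, and measured time difference $\tau$, the direction angle $\theta$ follows from

$$\tau = \frac{d\,\sin\theta}{c} \qquad\Longrightarrow\qquad \theta = \arcsin\!\left(\frac{c\,\tau}{d}\right).$$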
In the existing literature on sound localization systems, identification is performed in the near field between the speaker and the microphones; as the distance between them grows, the identification accuracy degrades. The identification distance reported in the literature is typically about one to two meters, which does not necessarily meet the needs of practical applications. For this reason, this study extends the identification distance to within five meters and tests with hand-clap sounds; the results still maintain a high identification rate.
The developed sound localization system must have low cost, low computational complexity, high identification accuracy, and a small hardware area. This thesis therefore focuses on time-domain signal processing and builds a far-field sound localization system that estimates the time difference between signals based on the AMDF. In an enclosed space, far-field identification is prone to echoes; to prevent echo interference, this thesis extracts only part of the sound signal for processing. The overall system is first verified on a PC and finally implemented and verified on an FPGA using the DE2-70 platform.
Compared with other work, this system is fully implemented on a single FPGA chip, occupying only about 200,000 logic gates; the identification distance is extended to a 5 m range, and the experimental results show an identification rate of almost 90% within a ±5° angular error.
The aim of a sound localization system (SLS) is to identify the direction of a sound source. An SLS is often required in applications such as toys and robot hearing. The underlying principle is to estimate the time difference (TD) between the signals received by a microphone pair and then convert that time difference into a directional angle. An SLS consists of input sound detection (ISD) processing and source localization estimation (SLE) processing: the ISD decides whether the input signal is the desired sound or noise, while the SLE estimates the direction (angle) of the source.
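As a rough illustration of this principle, the Python sketch below sweeps candidate integer-sample delays, takes the delay that minimizes the AMDF between the two microphone channels as the TD (the AMDF is the criterion the thesis names, but this is not its FPGA implementation), and converts the TD to an angle with the far-field relation shown earlier. The sampling rate, microphone spacing, and search range are assumed values for illustration only.

```python
import numpy as np

def amdf_tdoa(mic_left, mic_right, fs, max_delay_samples):
    """Estimate the TDOA (in seconds) between two microphone signals
    (NumPy arrays) by minimizing the average magnitude difference
    function (AMDF) over candidate integer-sample delays."""
    best_delay, best_score = 0, np.inf
    n = min(len(mic_left), len(mic_right))
    for delay in range(-max_delay_samples, max_delay_samples + 1):
        # Shift one channel against the other and compare the overlapping part.
        if delay >= 0:
            a, b = mic_left[delay:n], mic_right[:n - delay]
        else:
            a, b = mic_left[:n + delay], mic_right[-delay:n]
        score = np.mean(np.abs(a - b))        # AMDF value for this delay
        if score < best_score:
            best_score, best_delay = score, delay
    return best_delay / fs                    # TDOA in seconds

def tdoa_to_angle(tdoa, mic_spacing=0.2, c=343.0):
    """Convert a TDOA to a direction angle (degrees) with the far-field
    (plane-wave) model tau = d*sin(theta)/c; spacing d is an assumption."""
    s = np.clip(c * tdoa / mic_spacing, -1.0, 1.0)
    return np.degrees(np.arcsin(s))
```

For example, at an assumed 48 kHz sampling rate and 0.2 m spacing, the largest physically meaningful delay is about d/c ≈ 0.58 ms, i.e., roughly 28 samples, which bounds `max_delay_samples`.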
To date, a great deal of effort has been devoted to building better sound localization systems, and most of it focuses on near-field identification. However, as the distance between the sound source and the receiver grows, the accuracy degrades. In much of the literature, the identification range of an SLS is between 1 and 2 m, which may not satisfy practical application requirements. For this reason, this study extends the identification range to 5 m.

The developed SLS should have low cost, low design complexity, and a small hardware area. We therefore focus on time-domain signal processing and construct a far-field sound localization system based on the average magnitude difference function (AMDF). In a reverberant environment, echoes arise during far-field identification, so we extract only the onset of each signal to alleviate the reverberation problem. The far-field SLS is first verified on a PC and then implemented on a field-programmable gate array (FPGA).

Compared with other work, our far-field SLS is implemented on a single FPGA chip. The hardware occupies only about 200,000 logic gates, and the identification range reaches 5 m. In our experiments, the system achieves almost 90% accuracy for hand claps within a ±5° error.
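The abstract states that only the onset of each signal is used to cope with reverberation but does not give the detection rule. The Python sketch below shows one common way to realize such onset extraction: find the first frame whose short-time energy clearly exceeds the background level and keep only a short window starting there, so that later reflections are excluded from the time-difference estimation. The frame length, threshold ratio, and window length are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def extract_onset(signal, fs, frame_ms=10, threshold_ratio=4.0, keep_ms=20):
    """Return only a short segment starting at the detected onset of `signal`
    (a NumPy array sampled at `fs` Hz), or None if no onset is found."""
    frame_len = int(fs * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    # Short-time energy per frame.
    energies = np.array([
        np.sum(signal[i * frame_len:(i + 1) * frame_len] ** 2)
        for i in range(n_frames)
    ])
    background = np.median(energies) + 1e-12          # rough noise-floor estimate
    onset_frames = np.where(energies > threshold_ratio * background)[0]
    if len(onset_frames) == 0:
        return None                                   # no sound event detected
    start = onset_frames[0] * frame_len
    # Keep only the first keep_ms after the onset, before echoes arrive.
    return signal[start:start + int(fs * keep_ms / 1000)]
```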