
Author: Lee, Sheng-Chieh (李聖捷)
Title: A Study on Speech Source Localization and Noise Reduction for Automatic Speech Recognition System in Noisy Environments
Advisor: Wang, Jhing-Fa (王駿發)
Degree: Doctoral
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Graduation Academic Year: 107 (2018–2019)
Language: English
Pages: 95
Keywords: microphone array, speech source localization (direction of arrival), noise reduction, deep neural network, convolutional-recurrent neural network

    This study integrates speech source localization with speech recognition and proposes an automatic speech recognition (ASR) system that additionally provides the angular position of the speaker. By analyzing the received speech signals, the proposed ASR system estimates the direction of the speech source and displays the recognition results even in low signal-to-noise ratio (SNR) environments. The system comprises two stages: speech source localization and the speech recognition procedure. The proposed methods improve both the accuracy of the angular estimation and the speech recognition rate in noisy environments.
    In the speech source localization stage, building on an investigation of the average magnitude difference function (AMDF), minimum variance distortionless response (MVDR), and multiple signal classification (MUSIC) methods, this study proposes a preprocessing scheme that reduces the error of direction of arrival (DOA) estimation. The preprocessing applies linear phase approximation: it uses linear regression to predict the ideal inter-channel phase line in the absence of noise and then reconstructs the covariance matrix of the received speech signal. To further increase the accuracy of the DOA estimate, a method based on eigenvalue decomposition (EVD) analyzes the ratio of speech to noise eigenvalues and filters out the noise-dominated frequency bins.
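The linear phase approximation idea can be illustrated with a two-microphone sketch: for a single far-field source, the ideal inter-channel phase difference is a straight line in frequency, so a least-squares line fit yields the inter-microphone time delay and hence the arrival angle. The code below is a minimal, hypothetical NumPy illustration of that principle (with a simple energy threshold standing in for the eigenvalue-ratio frequency bin selection), not the dissertation's implementation; it assumes a microphone spacing small enough that the phase does not wrap inside the selected band.

```python
import numpy as np

def estimate_doa_phase_fit(sig_ref, sig_mic, fs, mic_dist, c=343.0):
    """Estimate the arrival angle of one far-field source from a
    two-microphone pair by fitting a straight line to the inter-channel
    phase difference across frequency (linear phase approximation)."""
    n = len(sig_ref)
    win = np.hanning(n)
    spec_ref = np.fft.rfft(sig_ref * win)
    spec_mic = np.fft.rfft(sig_mic * win)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Per-bin phase difference between the two channels.
    phase = np.angle(spec_ref * np.conj(spec_mic))

    # Keep only speech-band bins with sufficient energy -- a crude
    # stand-in for the eigenvalue-ratio frequency bin selection.
    power = np.abs(spec_ref) ** 2
    keep = (freqs > 300.0) & (freqs < 3400.0) & (power > 0.01 * power.max())

    # Least-squares line through the origin: phase ~= 2*pi*f*tau.
    f, p = freqs[keep], phase[keep]
    tau = (f @ p) / (2.0 * np.pi * (f @ f))  # estimated inter-mic delay (s)

    # Far-field geometry: tau = mic_dist * sin(theta) / c.
    return np.degrees(np.arcsin(np.clip(tau * c / mic_dist, -1.0, 1.0)))
```

For a simulated far-field source, the fitted slope recovers the true angle to within a few degrees, because the regression averages the per-bin phase deviations over the whole speech band.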
    In the speech recognition procedure, this study presents a threshold-based noise detection method that continuously estimates and records the current SNR of the speech signal. Based on this SNR value, the ASR system decides whether the collected speech needs enhancement, which avoids over-filtering clean speech and the resulting loss in recognition rate. In the noise reduction stage, independent component analysis (ICA) and subspace speech enhancement (SSE) are combined to remove noise from the received speech and to strengthen the speech for the recognition process. The hidden Markov model toolkit (HTK), developed by the Machine Intelligence Laboratory of the Cambridge University Engineering Department, serves as the speech recognizer: it analyzes the enhanced speech signal and outputs the recognition result.
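The threshold-based gating described above can be sketched in a few lines: estimate the SNR from frame energies (treating the quietest frames as the noise floor) and call an enhancer only when the estimate drops below a threshold. The frame length, the 15 dB threshold, and the percentile noise floor below are illustrative assumptions rather than the dissertation's exact procedure; `enhance_fn` is a placeholder for the ICA/SSE chain.

```python
import numpy as np

def estimate_snr_db(signal, frame_len=400, noise_pct=10):
    """Rough SNR estimate from frame energies: the quietest frames are
    taken as the noise floor (an illustrative heuristic, not the
    dissertation's exact estimator)."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energies = np.mean(frames ** 2, axis=1)
    noise_power = max(np.percentile(energies, noise_pct), 1e-12)
    speech_power = max(energies.mean() - noise_power, 1e-12)
    return 10.0 * np.log10(speech_power / noise_power)

def maybe_enhance(signal, enhance_fn, snr_threshold_db=15.0):
    """Run the enhancer only when the estimated SNR is below the
    threshold, so clean speech is not distorted by over-filtering."""
    snr = estimate_snr_db(signal)
    if snr >= snr_threshold_db:
        return signal, snr, False           # clean enough: pass through
    return enhance_fn(signal), snr, True    # noisy: denoise first
```

The pass-through branch is the point of the method: an aggressive enhancer applied to already-clean speech introduces distortion that lowers the recognition rate.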
    The experiments indicate that the proposed ASR system can effectively estimate the direction of the speech signal and recognize its content in noisy environments. With the proposed preprocessing scheme, the mean DOA estimation error is reduced by about 4.98° relative to the conventional MVDR method and by about 7.61° relative to the conventional MUSIC method. Regarding noise reduction and recognition performance, the SNR of the enhanced speech exceeds that of the noise-contaminated speech by approximately 10 dB to 15 dB, and the proposed noise detection and reduction methods improve the speech recognition rate by around 12% to 17%. This study also investigates artificial intelligence (AI) techniques, in particular deep learning: a deep neural network (DNN) is introduced, and a convolutional-recurrent neural network (CRNN) is proposed for the noise reduction process. In the experiments, the CRNN improves the perceptual evaluation of speech quality (PESQ) score by 0.83 over the noisy speech and reduces the word error rate to 15.83%. These results demonstrate that the CRNN model effectively suppresses noise and improves speech quality for the recognition process in the ASR system.
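The word error rate quoted above is the standard metric: the Levenshtein distance between the reference and hypothesis word sequences, divided by the number of reference words. A minimal self-contained implementation:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / #reference words,
    computed by dynamic-programming (Levenshtein) alignment over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match / substitution
    return d[-1][-1] / max(len(ref), 1)
```

For example, `word_error_rate("turn the light on", "turn light off")` counts one deletion and one substitution against four reference words, giving 0.5.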

    Table of Contents

    Abstract (Chinese)
    Abstract (English)
    Content
    Figure List
    Table List
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Background and Literature Review
      1.3 Summarization of Contribution
      1.4 Organization of Dissertation
    Chapter 2 Acoustical Model in Noisy Environments and Acoustical Transducers for Proposed System
      2.1 Acoustical Environments
        2.1.1 Model of Acoustical Environments
        2.1.2 Additive Noise
        2.1.3 Channel Distortion
      2.2 Acoustical Transducers
        2.2.1 Frequency Response of Microphones
        2.2.2 Microphone Sensitivity
        2.2.3 Sound Pressure Level (SPL)
      2.3 Categories of Microphones
        2.3.1 Dynamic Microphones
        2.3.2 Condenser Microphones
        2.3.3 Directional Microphones
      2.4 Framework of the Proposed System
    Chapter 3 Speech DOA Estimation with Linear Phase Approximation and Frequency Bin Selection Preprocessing Schemes
      3.1 Linear Microphone Array Signal Processing
      3.2 DOA Algorithms
        3.2.1 Generalized Cross Correlation (GCC)
        3.2.2 Average Magnitude Difference Function (AMDF)
        3.2.3 Minimum Variance Distortionless Response (MVDR)
        3.2.4 Multiple Signal Classification (MUSIC)
      3.3 Framework of the Proposed DOA Estimation
      3.4 Proposed Preprocessing Schemes for DOA Estimation
        3.4.1 Uniform Linear Array in Noisy Environments
        3.4.2 Linear Phase Approximation
        3.4.3 Frequency Bin Selection
      3.5 Experimental Results
        3.5.1 Experimental Setup
        3.5.2 Various Subspace-Based Thresholds in Frequency Bin Selection
        3.5.3 Compared Results with Conventional MUSIC and MVDR Methods
        3.5.4 Experimental Results using Related and Proposed Methods
        3.5.5 Evaluation of Estimated Error for Speech with Various SNRs
    Chapter 4 Threshold-Based Noise Detection and Noise Reduction for ASR System
      4.1 Framework of the ASR System
      4.2 Proposed Threshold-Based Noise Detection
      4.3 Combined Noise Reduction Procedure
        4.3.1 Independent Component Analysis (ICA)
        4.3.2 Subspace Speech Enhancement (SSE)
      4.4 Speech Recognition Process
      4.5 Experimental Results
        4.5.1 Experimental Setup
        4.5.2 Comparison Results with Spectral Subtraction and Proposed Method
        4.5.3 Evaluation Results of Proposed Method
        4.5.4 Recognition Results using Related and Proposed Method
    Chapter 5 Noise Reduction Method using DNN and CRNN for ASR System
      5.1 System Overview
      5.2 Architecture of Deep Neural Networks (DNN)
        5.2.1 Forward-Propagation
        5.2.2 Loss Function
        5.2.3 Regularization
        5.2.4 Back-Propagation
      5.3 Convolutional-Recurrent Neural Networks (CRNN)
        5.3.1 Convolutional Component and Concatenation
        5.3.2 Long Short Term Memory (LSTM) Component
        5.3.3 Bidirectional Recurrent Neural Networks (BRNN)
        5.3.4 Fully-Connected Component and Optimization
      5.4 Experimental Results
        5.4.1 Microphone Array
        5.4.2 Experimental Database Collection
        5.4.3 Experimental Setup
        5.4.4 PESQ Score using Related Methods and Proposed Methods
        5.4.5 Experimental Results of Word Error Rate
    Chapter 6 Conclusions
    Appendix
      A.1 System Procedure
      A.2 Far-Field Sound Detection
      A.3 Sound Localization System Architectures
        A.3.1 SAR ADC Architectures
        A.3.2 Digital Computing Core Architectures
      A.4 Experimental Results
        A.4.1 Experimental Setup
        A.4.2 Evaluation Results
        A.4.3 SOC Implementation Results
    Bibliography
    Publication List


    Full-text access: available on campus from 2023-01-16; available off campus from 2023-01-16.