| Graduate Student: | Lin, Jui-Chieh (林瑞傑) |
|---|---|
| Thesis Title: | Virtual Surround Sound Algorithm by Stereo Upmix and Low Complexity Binaural Synthesis Architecture |
| Advisor: | Lei, Sheau-Fang (雷曉方) |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2018 |
| Graduation Academic Year: | 106 |
| Language: | Chinese |
| Number of Pages: | 77 |
| Keywords (Chinese): | spatial audio, binaural auralization algorithm, stereo upmix, head-related transfer function, common-acoustical-poles and zeros model |
| Keywords (English): | 3D audio, binaural auralization, stereo upmix, Head-Related Transfer Function (HRTF), Common-Acoustical-Poles and Zeros (CAPZ) model |
The binaural auralization algorithm is mainly used to enhance the sound-field performance of a headphone system. It consists of three parts: 1) spatial analysis, 2) spatial synthesis, and 3) binaural synthesis; this processing yields a binaural audio signal with an enhanced sound field. In the binaural synthesis stage, however, using measured head-related transfer functions (HRTFs) to reconstruct source positions inside the headphones requires a very large amount of computation and storage, which places a heavy burden on the system. This thesis therefore proposes a low-complexity binaural synthesis for the binaural auralization algorithm. Exploiting the properties of HRTFs, each HRTF is first converted into a minimum-phase filter plus an interaural-time-difference delay; the minimum-phase part is then approximated with a common-acoustical-poles and zeros (CAPZ) model, and finally a shuffler filter structure is used to reduce the number of filters needed when rendering symmetric sources. Experiments show that, compared with the original algorithm, the proposed binaural synthesis saves 86.25% of the storage requirement, 86.25% of the multiplications, and 83.88% of the additions. Finally, PEAQ is used for objective sound-quality analysis; the evaluation of this metric shows that the proposed binaural synthesis architecture achieves the same effect as using the measured HRTFs.
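As a rough, self-contained illustration of the first step described above (splitting each measured HRIR into a pure delay plus a minimum-phase filter), the Python sketch below uses the standard real-cepstrum method for the minimum-phase part and a simple onset threshold for the delay. The names `hrir_left`, `hrir_right` and the parameter values are hypothetical and are not taken from the thesis.

```python
import numpy as np

def min_phase_plus_delay(hrir, nfft=1024, onset_ratio=0.1):
    """Split a measured HRIR into an integer onset delay and a minimum-phase part.

    The minimum-phase reconstruction uses the real-cepstrum folding trick; the
    delay is a simple onset estimate (first sample whose magnitude exceeds
    `onset_ratio` of the peak), standing in for the pure interaural-time-
    difference delay described in the abstract.
    """
    # Pure-delay (onset) estimate.
    peak = np.max(np.abs(hrir))
    delay = int(np.argmax(np.abs(hrir) >= onset_ratio * peak))

    # Minimum-phase part via the real cepstrum.
    spec = np.fft.fft(hrir, nfft)
    log_mag = np.log(np.maximum(np.abs(spec), 1e-12))  # guard against log(0)
    cep = np.real(np.fft.ifft(log_mag))

    fold = np.zeros(nfft)                 # fold the cepstrum onto [0, nfft/2]
    fold[0] = 1.0
    fold[1:nfft // 2] = 2.0
    fold[nfft // 2] = 1.0                 # nfft is assumed even
    h_min = np.real(np.fft.ifft(np.exp(np.fft.fft(fold * cep))))[:len(hrir)]
    return delay, h_min

# Hypothetical usage with a measured left/right pair (e.g. 512-tap HRIRs):
# d_l, h_l = min_phase_plus_delay(hrir_left)
# d_r, h_r = min_phase_plus_delay(hrir_right)
# itd_samples = d_r - d_l   # interaural time difference kept as a pure delay
```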
The binaural auralization algorithm is used to enhance the performance of the sound field in a headphone system. The algorithm involves three parts: 1) spatial analysis, 2) spatial synthesis, and 3) binaural synthesis. Through this processing, a binaural audio signal with an enhanced sound field is obtained. However, the computational cost and memory usage of measured HRTFs are so large that synthesizing 3D audio in real time becomes difficult. In this thesis, we propose a low-complexity, low-memory binaural synthesis architecture for the binaural auralization algorithm to overcome this problem.
First, we represent each HRTF as a pure delay followed by a minimum-phase system. The minimum-phase HRTFs are then used to estimate the common acoustical poles and direction-dependent zeros of an IIR filter model. Finally, for a symmetric acoustical system, the shuffler filter architecture can be used to further reduce the computational cost. The experiments show that, compared with the original framework, the binaural synthesis framework presented in this thesis saves 86.25% of the memory usage and 85% of the computational cost. Objective sound quality is analyzed with PEAQ; the evaluation of this metric shows that the proposed architecture achieves the same quality as using the measured HRTFs.
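The CAPZ approximation mentioned above can be sketched with a simplified equation-error fit: the denominator (common acoustical poles) is estimated jointly from the tails of all minimum-phase HRIRs, after which each direction keeps only its own numerator (zeros). This is only an illustrative variant of the idea, not the exact procedure used in the thesis; `hrirs`, `P`, and `Q` are assumed names and orders.

```python
import numpy as np

def fit_capz(hrirs, P=12, Q=30):
    """Equation-error fit of a Common-Acoustical-Poles and Zeros (CAPZ) model.

    hrirs : (n_directions, n_samples) array of minimum-phase HRIRs
    P     : number of common poles (shared denominator order)
    Q     : number of direction-dependent zeros (numerator order)

    Returns the shared denominator `a` (length P+1, a[0] = 1) and the
    per-direction numerators `b` of shape (n_directions, Q+1).
    """
    n_dir, n = hrirs.shape

    # Stack the "tail" relations  h_i(m) + sum_k a_k * h_i(m-k) ~ 0  (m > Q)
    # over every direction and solve them jointly for the common a_k.
    rows, rhs = [], []
    for h in hrirs:
        for m in range(Q + 1, n):
            rows.append([h[m - k] for k in range(1, P + 1)])
            rhs.append(-h[m])
    a_tail, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    a = np.concatenate(([1.0], a_tail))

    # With the common poles fixed, each numerator follows directly from the
    # first Q+1 equation-error relations of that direction.
    b = np.zeros((n_dir, Q + 1))
    for i, h in enumerate(hrirs):
        for m in range(Q + 1):
            acc = h[m]
            for k in range(1, P + 1):
                if m - k >= 0:
                    acc += a[k] * h[m - k]
            b[i, m] = acc
    return a, b
```

For a left/right-symmetric source pair, the remaining per-direction filters can then be halved with the shuffler structure: filtering the sum and the difference of the two input channels with (H_ipsi + H_contra)/2 and (H_ipsi - H_contra)/2 reproduces the same binaural output with half the filter count, which is where the further reduction for symmetric sources comes from.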
Campus access: publicly available from 2023-07-31.