| Graduate student: | 莊士緯 Chuang, Shih-Wei |
|---|---|
| Thesis title: | 新穎低複雜度且彈性化心理聲學模型應用於有效率音訊等化系統 Novel low complexity and flexible psycho-acoustic model design for efficient audio equalization system |
| Advisor: | 雷曉方 Lei, Sheau-Fang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2015 |
| Academic year of graduation: | 103 |
| Language: | Chinese |
| Number of pages: | 110 |
| Chinese keywords: | 心理聲學 (psychoacoustics), 遮蔽效應 (masking effect), 聽力圖 (audiogram), 音訊等化器 (audio equalizer) |
| English keywords: | Psychoacoustic, Audiogram, Audio equalizer |
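This thesis proposes a novel low-complexity, flexible psychoacoustic model together with an efficient audiogram-based equalization system developed to work with it. The equalization system has the following features: 1) Through the psychoacoustic model, the masking effect of the human ear reduces the number of points that must be equalized in each frame by about 70%. In addition, because the application differs, the model design omits unnecessary steps to lower computational complexity, cutting operations by about 30% to 40% relative to the original model without affecting sound quality. Moreover, the proposed model does not need to be computed over the full band; this flexible design further lightens the burden that the psychoacoustic model places on the system. 2) The equalization algorithm is audiogram-based: it takes the hearing loss present in the (normal) human ear as the basis for equalization, which improves the equalizer's effect. 3) Because of the psychoacoustic masking effect, redundant signal components are discarded in the computation, which lowers the output power of each frame. As a complete equalizer system, the proposed architecture matches the characteristics of the human ear. In the comparison data it has the lowest computational load with sound quality that is not inferior, and it also has the lowest output power, making it better suited than the other architectures to implementation on portable devices.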
This thesis presents a novel low-complexity and flexible psychoacoustic model together with a proposed audiogram-based efficient audio equalization system. The proposed structure has the following characteristics: 1) With the psychoacoustic model, the number of points processed in each frame can be reduced by about 70% thanks to the masking effect in human hearing. In addition, because the application differs, only the necessary steps of the psychoacoustic model are kept, which reduces operations by 30% to 40% compared with the original model in the MPEG standard without affecting sound quality. Moreover, the flexible design, with a band-selectable strategy, lightens the burden on the system when the full-band model does not need to be calculated. 2) The equalization algorithm is based on the audiogram and takes into account the hearing loss present in normal listeners: the gain is adjusted in the frequency domain using the information given by the audiogram, making the equalization better and adaptive. 3) Owing to the psychoacoustic masking effect, redundant signals are discarded, which lowers the output power of each frame. Compared with other audio equalizers based on filter-bank designs, the proposed structure has the lowest computational complexity and the lowest output power. Furthermore, because it takes the properties of the human ear into account, it is more suitable for everyday use. On this basis, the proposed structure could be a new solution for future applications in audio processing.
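The idea in characteristics 1) to 3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the `gain_fraction` parameter, and the linear interpolation of the audiogram are all assumptions made for illustration, and the masking threshold is taken as a given input rather than computed by a psychoacoustic model. The sketch only shows how skipping masked bins reduces the number of points that receive an audiogram-derived gain.

```python
from bisect import bisect_left


def interpolate_loss(audiogram, freq_hz):
    """Linearly interpolate hearing loss (dB) at freq_hz.

    audiogram: list of (frequency_hz, loss_db) pairs sorted by frequency.
    Values outside the measured range are clamped to the nearest endpoint.
    """
    freqs = [f for f, _ in audiogram]
    if freq_hz <= freqs[0]:
        return audiogram[0][1]
    if freq_hz >= freqs[-1]:
        return audiogram[-1][1]
    i = bisect_left(freqs, freq_hz)
    (f0, l0), (f1, l1) = audiogram[i - 1], audiogram[i]
    return l0 + (l1 - l0) * (freq_hz - f0) / (f1 - f0)


def equalize_frame(levels_db, bin_freqs_hz, masking_threshold_db,
                   audiogram, gain_fraction=0.5):
    """Apply audiogram-based gain only to bins above the masking threshold.

    Bins at or below the threshold are treated as inaudible and discarded
    (represented as None), so they cost no gain computation and add no
    output power. gain_fraction is a hypothetical tuning parameter.
    Returns (output_levels_db, number_of_bins_processed).
    """
    output = []
    processed = 0
    for level, freq, threshold in zip(levels_db, bin_freqs_hz,
                                      masking_threshold_db):
        if level <= threshold:
            output.append(None)  # masked: skip equalization entirely
            continue
        output.append(level + gain_fraction * interpolate_loss(audiogram, freq))
        processed += 1
    return output, processed


if __name__ == "__main__":
    # Illustrative numbers only; not data from the thesis.
    audiogram = [(250, 10.0), (1000, 20.0), (4000, 40.0)]
    levels = [60.0, 30.0, 50.0, 20.0]
    freqs = [250, 625, 1000, 2500]
    thresholds = [40.0, 35.0, 45.0, 25.0]
    print(equalize_frame(levels, freqs, thresholds, audiogram))
```

In this toy frame, two of the four bins fall below their masking threshold and are skipped, so only the remaining two receive a gain.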