| Graduate student: | 鄭裕成 Cheng, Yu-Cheng |
|---|---|
| Thesis title: | 低複雜度心理聲學模型及壓縮感知技術應用於音訊壓縮系統 Audio Compression System based on Low Complexity Psychoacoustic Model Design and Compressive Sensing Techniques |
| Advisor: | 雷曉方 Lei, Sheau-Fang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 105 |
| Language: | Chinese |
| Number of pages: | 83 |
| Chinese keywords: | 心理聲學, 遮蔽效應, 音訊, 壓縮感知, 壓縮 |
| English keywords: | psychoacoustic, masking effect, audio, compressive sensing, compression |
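This thesis proposes a low-complexity psychoacoustic model and uses compressive sensing to build a new type of audio compression system. The proposed system has the following features. 1) The psychoacoustic model is built on physical properties of the human ear; its masking effect lets the system cut the number of points in each frame by about 75%. For audio compression, the thesis also modifies the original psychoacoustic model and offers two selectable band-compression options, omitting unnecessary steps and simplifying the computation of the remaining ones; compared with the original model, this saves half of the steps without affecting audio quality. 2) Compressive sensing is a newer compression concept that differs from traditional compression methods in its data sampling rate. Unlike traditional sampling governed by the Nyquist theorem, compressive sensing can preserve the integrity of the original signal while sampling at less than twice the signal frequency, and a recovery algorithm restores the original signal at reconstruction time. 3) Because of the masking effect caused by the physical structure of the cochlea, redundant signal components that the human ear cannot hear can be discarded, which reduces the number of data points to be processed in each frame and fits the sparsity that compressive sensing requires. Compared with traditional compression methods, the proposed audio compression system, which combines the low-complexity psychoacoustic model with compressive sensing, maintains a reasonable compression ratio and audio quality.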
This thesis presents a novel audio compression system that combines a low-complexity, flexible psychoacoustic model with compressive sensing techniques. The proposed system has three characteristics. 1) The psychoacoustic model is built on the physical properties of the human ear, and its masking effect reduces the number of points processed in each frame by about 75%. Moreover, our model differs from the original psychoacoustic model: we keep the necessary steps of the original model and simplify them, reducing operations by about 10% compared to the model in the original MPEG standard without affecting sound quality. 2) Compressive sensing is a novel approach to compression whose data sampling rate differs from that of conventional methods based on the Nyquist theorem: it can preserve the integrity of the original signal and reconstruct it even when the sampling rate is lower than twice the signal frequency. 3) Because of the psychoacoustic masking effect, points that cannot be heard by the human ear can be discarded, which reduces the number of points to process and provides the signal sparsity that compressive sensing requires. Discarding these frame points also decreases the computational complexity of the system. The proposed audio compression system, built on the low-complexity psychoacoustic model and compressive sensing, achieves a better compression ratio and audio quality recovery.
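The interplay between sparsity and sub-Nyquist measurement described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the frame length `n`, the number of measurements `m`, the sparsity `k`, the random Gaussian measurement matrix, and the choice of orthogonal matching pursuit (OMP) as the recovery algorithm. The sparse vector merely stands in for the frame components that would survive a masking threshold; no psychoacoustic model is implemented.

```python
import numpy as np


def omp(A, y, k):
    """Orthogonal matching pursuit: estimate a k-sparse x from y = A @ x."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the chosen columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat


def demo(n=256, m=128, k=8, seed=0):
    """Measure a synthetic k-sparse frame with m < n samples and recover it."""
    rng = np.random.default_rng(seed)
    # Hypothetical frame: k nonzero coefficients stand in for the
    # components that remain audible after masking (assumed, not modelled).
    x = np.zeros(n)
    idx = rng.choice(n, size=k, replace=False)
    x[idx] = rng.uniform(1.0, 2.0, size=k) * rng.choice([-1.0, 1.0], size=k)
    # Random Gaussian measurement matrix with fewer rows than columns.
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    y = A @ x  # compressed measurements
    x_hat = omp(A, y, k)
    return x, x_hat, y


if __name__ == "__main__":
    x, x_hat, y = demo()
    print("measurements kept:", y.size, "of", x.size)
    print("max reconstruction error:", np.max(np.abs(x - x_hat)))
```

The fewer nonzero coefficients a frame has, the fewer measurements are needed for recovery, which is why discarding masked components before measurement helps in a scheme of this kind.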
On campus: publicly available from 2021-08-30