Graduate student: 鄭裕成 (Cheng, Yu-Cheng)
Thesis title: 低複雜度心理聲學模型及壓縮感知技術應用於音訊壓縮系統
Audio Compression System based on Low Complexity Psychoacoustic Model Design and Compressive Sensing Techniques
Advisor: 雷曉方 (Lei, Sheau-Fang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2016
Academic year of graduation: 105
Language: Chinese
Number of pages: 83
Keywords (Chinese): 心理聲學, 遮蔽效應, 音訊, 壓縮感知, 壓縮
Keywords (English): psychoacoustic, masking effect, audio, compressive sensing, compression
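  • This thesis proposes a low-complexity psychoacoustic model and uses compressive sensing to build a new type of audio compression system. The system has the following features: 1) It uses a psychoacoustic model built on the physical characteristics of the human ear; the masking effect lets the system cut the number of points per frame by about 75%. For audio compression, the thesis improves on the original psychoacoustic model: it offers a choice between two band-compression options, omits unnecessary steps, and simplifies the computation in the steps that remain, lowering computational complexity. Compared with the original model, it saves half of the steps without affecting audio quality. 2) Compressive sensing is a new compression concept that differs from conventional methods in its data sampling rate. Unlike conventional sampling based on the Nyquist theorem, compressive sensing can sample at less than twice the signal frequency while preserving the integrity of the original signal, which a recovery algorithm restores at reconstruction time. 3) Because of the masking effect that arises from the physical structure of the cochlea, redundant signal components that the human ear cannot hear can be discarded, reducing the number of data points computed per frame. Combined with the sparsity that compressive sensing requires, the proposed audio compression system, based on the low-complexity psychoacoustic model and compressive sensing, maintains a reasonable compression ratio and audio quality compared with conventional compression methods.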
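The abstract's idea of discarding inaudible points can be illustrated with a minimal sketch. It is not the thesis's model: it applies only the threshold in quiet (Terhardt's well-known approximation), not the full masking computation, and it assumes that a full-scale sine corresponds to 96 dB SPL. The names `threshold_in_quiet_db`, `prune_inaudible`, `demo`, and `full_scale_db` are hypothetical.

```python
import numpy as np


def threshold_in_quiet_db(freq_hz):
    """Terhardt's approximation of the absolute hearing threshold, in dB SPL."""
    f = np.maximum(freq_hz, 20.0) / 1000.0  # kHz; clamp so the DC bin stays finite
    return 3.64 * f ** -0.8 - 6.5 * np.exp(-0.6 * (f - 3.3) ** 2) + 1e-3 * f ** 4


def prune_inaudible(frame, fs, full_scale_db=96.0):
    """Zero the spectral bins whose level falls below the threshold in quiet.

    Assumption (not from the thesis): a full-scale sine maps to `full_scale_db` dB SPL.
    """
    n = len(frame)
    window = np.hanning(n)
    spectrum = np.fft.rfft(frame * window)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Normalise so a full-scale sine reads 0 dB, then shift to the assumed SPL.
    ref = window.sum() / 2.0
    level_db = 20.0 * np.log10(np.abs(spectrum) / ref + 1e-12) + full_scale_db
    keep = level_db > threshold_in_quiet_db(freqs)
    return spectrum * keep, keep


def demo(fs=44100, n=512):
    """One frame with a clearly audible 1 kHz tone and a very weak 15 kHz tone."""
    t = np.arange(n) / fs
    frame = 0.5 * np.sin(2 * np.pi * 1000 * t) + 1e-6 * np.sin(2 * np.pi * 15000 * t)
    _, keep = prune_inaudible(frame, fs)
    return keep


if __name__ == "__main__":
    keep = demo()
    print(f"bins kept: {keep.sum()} of {keep.size}")
```

Only the bins around the strong tone survive, so the pruned spectrum is sparse; a full psychoacoustic model would additionally drop components masked by louder neighbours.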

    This thesis presents a novel audio compression system that combines a low-complexity, flexible psychoacoustic model with compressive sensing techniques. The proposed system has three characteristics: 1) The psychoacoustic model is built on the physical properties of the human ear; based on the masking effect, it reduces the number of points processed in each frame by about 75%. Our aim also differs from that of the original psychoacoustic model: we keep only the necessary steps of the original model and simplify them, reducing operations by about 10% relative to the model in the original MPEG standard without affecting sound quality. 2) Compressive sensing is a novel approach to compression whose data sampling rate differs from that of conventional methods based on the Nyquist theorem: it can preserve and reconstruct the original signal even when the sampling rate is below twice the signal frequency. 3) Owing to the psychoacoustic masking effect, points that the human ear cannot hear can be dropped, which also provides the signal sparsity that compressive sensing requires. Discarding these frame points in the sparsified signal further reduces the computational complexity of the system. The proposed system, built on the low-complexity psychoacoustic model and compressive sensing, is applied to audio compression and achieves a better compression ratio and audio quality recovery.
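The sampling-and-recovery idea in point 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's system: this record does not say which sparse basis, measurement matrix, or recovery algorithm the thesis uses, so the sketch assumes a signal that is sparse in the identity basis, a random Gaussian measurement matrix, and orthogonal matching pursuit (OMP) for recovery. The names `omp` and `demo` and the sizes `n`, `m`, `k` are hypothetical.

```python
import numpy as np


def omp(Phi, y, sparsity, tol=1e-10):
    """Orthogonal matching pursuit: greedily recover a sparse x from y = Phi @ x."""
    n = Phi.shape[1]
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat


def demo(n=256, m=100, k=5, seed=0):
    """Measure a k-sparse length-n signal with only m < n samples, then recover it."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    positions = rng.choice(n, size=k, replace=False)
    x[positions] = rng.uniform(1.0, 2.0, size=k) * rng.choice([-1.0, 1.0], size=k)
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # assumed Gaussian measurement matrix
    y = Phi @ x  # m compressed measurements
    return x, omp(Phi, y, sparsity=k)


if __name__ == "__main__":
    x, x_hat = demo()
    print("max recovery error:", np.max(np.abs(x - x_hat)))
```

Recovery works here because the signal has far fewer nonzero entries than measurements; this is why the sparsity created by discarding inaudible points matters for the compressive sensing stage.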

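    Chinese Abstract
    Extended Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Research Background
        1.2 Motivation and Ideas
        1.3 Thesis Organization
    Chapter 2 Literature Review
        2.1 Psychoacoustics
            2.1.1 Cochlea and Critical Bands
            2.1.2 Absolute Threshold of Hearing
            2.1.3 Auditory Masking Effect
            2.1.4 Psychoacoustic Model
        2.2 Compressive Sensing
            2.2.1 Compressive Sensing Sampling Theory
            2.2.2 Sparse Representation Domain of Signals
            2.2.3 Measurement Matrix
            2.2.4 Recovery Algorithms
    Chapter 3 Low-Complexity Psychoacoustic Model and Compressive Sensing Applied to an Audio Compression System
        3.1 System Architecture and Algorithm Flow
        3.2 Improvements for a Low-Complexity Psychoacoustic Model
            3.2.1 Selectable Band Model
            3.2.2 Retaining the Necessary Model Steps
            3.2.3 Simplifying the Formulas of the Retained Steps
        3.3 Signal Sparsification and Recovery
            3.3.1 Signal Pre-processing before Compression
            3.3.2 Signal Recovery Algorithm
    Chapter 4 Algorithm Analysis and Comparison
        4.1 System Compression Ratio
        4.2 System Output Audio Quality
            4.2.1 Objective Evaluation
            4.2.2 Subjective Quality Assessment
            4.2.3 Signal Recovery Error
            4.2.4 Conclusions of the Audio Quality Analysis
        4.3 Comparison of Psychoacoustic Model Computational Complexity
        4.4 Comparison of Compression Methods
            4.4.1 Comparison with MPEG Compression
            4.4.2 Comparison with Speech Compression
            4.4.3 Conclusions of the Compression Method Comparison
    Chapter 5 Conclusions and Future Work
    References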

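    Full-text availability: on campus, public from 2021-08-30; off campus, not public.
    The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.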