| Graduate student: | 黃彥彰 Huang, Yen-Chung |
|---|---|
| Thesis title: | 一MPEG-4 CELP位元率可調混合編碼方法 A Hybrid Bit Rate Scalable Coding Method for MPEG-4 CELP |
| Advisor: | 陳進興 Chen, Jin-Xing |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
| Year of publication: | 2002 |
| Academic year of graduation: | 90 |
| Language: | English |
| Number of pages: | 86 |
| Chinese keywords: | 位元率可調 (bit rate scalability), 語音 (speech) |
| English keywords: | Speech, Scalability, MPEG-4 CELP |
Most current speech coders are built around code-excited linear prediction (CELP) and increasingly use layered coding to improve signal quality; MPEG-4 CELP Bit Rate Scalability (MCBRS) is one example.
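This thesis proposes a new hybrid bit rate scalable coding method. Like MCBRS, it raises speech quality layer by layer. The method has two modes: when the base-layer bit rate is high enough, differential position waveform-based coding (DPWBC) compresses the residual signal; when the base-layer bit rate is lower, partial decoding of MCBRS (PDMCBRS) compresses the residual. DPWBC does not synthesize the signal from LPC coefficients. Instead, it gathers statistics on the residual signal to find suitable waveform bases for coding it, and the coded data comprise each waveform's position, sign, width, and amplitude. The waveforms are first grouped by energy and reordered for coding, and then the position differences are coded, which substantially lowers the bit rate and improves coding accuracy. PDMCBRS keeps the original MCBRS coding method and adds partial decoding of the bitstream. Both modes have smaller bit rate steps than MCBRS, so they come closer to successive refinement.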
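The grouping and position-difference step described above can be illustrated with a small sketch. It is not the thesis's actual algorithm: the abstract does not state the grouping rule, so the `thresholds` parameter, the function names, and the choice to sort positions within each group are all assumptions made for illustration.

```python
from typing import List, Tuple


def group_and_delta_positions(
    positions: List[int],
    energies: List[float],
    thresholds: List[float],
) -> List[Tuple[int, List[int]]]:
    """Group waveforms by energy, then delta-code positions in each group.

    `thresholds` is a hypothetical, descending list of energy boundaries;
    group 0 holds the highest-energy waveforms.
    """
    groups: List[List[int]] = [[] for _ in range(len(thresholds) + 1)]
    for pos, energy in zip(positions, energies):
        # The first threshold the energy reaches decides the group;
        # anything below every threshold falls into the last group.
        g = next(
            (i for i, t in enumerate(thresholds) if energy >= t),
            len(thresholds),
        )
        groups[g].append(pos)

    coded: List[Tuple[int, List[int]]] = []
    for g, members in enumerate(groups):
        members.sort()  # reorder by position inside the group
        prev = 0
        deltas = []
        for pos in members:
            deltas.append(pos - prev)  # store the gap, not the absolute index
            prev = pos
        coded.append((g, deltas))
    return coded


def decode_positions(coded: List[Tuple[int, List[int]]]) -> List[Tuple[int, int]]:
    """Invert the delta coding, returning (group, position) pairs."""
    out = []
    for g, deltas in coded:
        pos = 0
        for d in deltas:
            pos += d
            out.append((g, pos))
    return out
```

Within a sorted group, each stored value is the gap to the previous position rather than an absolute index, so the numbers to be coded tend to be smaller. How many bits that saves depends on the entropy coder, which this sketch does not model.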
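Experiments show that when the base-layer bit rate is high enough and the speech is noisy, DPWBC achieves a higher signal-to-noise ratio than MCBRS, while PDMCBRS matches the coding performance of MCBRS in all cases. The hybrid method that combines the two modes therefore gives better speech quality and smaller bit rate steps than standard MCBRS.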
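For reference, a signal-to-noise ratio between an original and a decoded signal can be computed as in the minimal sketch below. It uses the plain whole-signal SNR in decibels; the abstract does not say whether the thesis uses this definition or a variant such as segmental SNR, so the function name and the definition are assumptions.

```python
import math
from typing import Sequence


def snr_db(reference: Sequence[float], decoded: Sequence[float]) -> float:
    """Whole-signal SNR in dB: 10 * log10(signal energy / error energy)."""
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, decoded))
    if noise == 0:
        return math.inf  # perfect reconstruction
    if signal == 0:
        return -math.inf  # silent reference with a non-zero error
    return 10.0 * math.log10(signal / noise)
```

A higher value means the decoded signal is closer to the original, which is the sense in which one coder can be said to outperform another at a given bit rate.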
Many speech coding standards are based upon code-excited linear prediction (CELP), and it is desirable to enhance signal performance by using layered coding methods that are compatible with this base coder.
In this thesis, a hybrid bit rate scalable coding method is proposed for MPEG-4 CELP. It provides functionality similar to MPEG-4 CELP Bit Rate Scalability (MCBRS) and likewise uses a layered structure to enhance signal performance. It offers two coding modes for different situations: differential position waveform-based coding (DPWBC) when the base layer is coded at a high bit rate, and partial decoding of MPEG-4 CELP Bit Rate Scalability (PDMCBRS) when the base layer is coded at a low bit rate. Instead of using an analysis-by-synthesis coding method, DPWBC codes the waveform of the residual signal. In the residual signal, each run of consecutive positive (or negative) samples is regarded as one signal to be coded. Each signal is coded by its position, sign, width, and magnitude. The coding order of the signals is arranged according to each signal's group and position, which reduces the coding bit rate and increases coding accuracy. The PDMCBRS method employs partial decoding of MCBRS to enhance speech performance gradually. Each mode of the proposed method has a smaller bit rate step than MCBRS, so the concept of successive refinement is approached more closely.
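The idea of treating each run of same-sign residual samples as one signal can be sketched as follows. This is an illustrative reading of the abstract, not the thesis's implementation: the `Run` type, the function name, the use of the peak absolute value as the magnitude, and the handling of exact zeros are all assumptions.

```python
from typing import List, NamedTuple, Sequence


class Run(NamedTuple):
    position: int     # index of the first sample in the run
    sign: int         # +1 for a positive run, -1 for a negative run
    width: int        # number of samples in the run
    magnitude: float  # peak absolute value in the run (assumed definition)


def extract_runs(residual: Sequence[float]) -> List[Run]:
    """Split a residual signal into runs of consecutive same-sign samples."""
    runs: List[Run] = []
    i, n = 0, len(residual)
    while i < n:
        if residual[i] == 0:
            i += 1  # exact zeros are skipped and belong to no run (assumption)
            continue
        sign = 1 if residual[i] > 0 else -1
        j = i
        # Extend the run while the samples keep the same sign.
        while j < n and residual[j] * sign > 0:
            j += 1
        peak = max(abs(x) for x in residual[i:j])
        runs.append(Run(i, sign, j - i, peak))
        i = j
    return runs
```

For example, `extract_runs([0.0, 0.5, 1.2, 0.3, -0.4, -0.9, 0.0, 0.7])` yields three runs: a positive run of width 3 starting at index 1, a negative run of width 2 starting at index 4, and a positive run of width 1 at index 7.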
Experiments show that DPWBC enhances performance more effectively than MCBRS at high bit rates and with noisy backgrounds, and that PDMCBRS performs the same as MCBRS. The proposed coding method, which employs both DPWBC and PDMCBRS, therefore performs better than MCBRS and has a smaller bit rate step between enhancement layers.