
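Author: 林威成
Lin, Wei-Cheng
Thesis title: 嵌入式系統平台上視訊解碼器之記憶體存取最佳化技術
The Memory Access Optimization Techniques for Video Decoders on Embedded System Platforms
Advisor: 郭致宏
Kuo, Chih-Hung
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar, i.e. 2008-2009)
Language: English
Number of pages: 102
Keywords (Chinese): 視訊解碼器 (video decoders), 基因演算法 (genetic algorithms), 嵌入式系統 (embedded system), 軟體優化 (software optimization)
Keywords (English): Genetic Algorithms, Source Code Optimization, Embedded System, Video Decoders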
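  • This thesis studies software optimization techniques that minimize the number of processor accesses to external memory after a video decoder has been ported to an embedded system platform. We use the H.264/AVC and VC-1 video decoders, together with embedded system platforms built around ARM processors, as case studies for these embedded software optimization techniques.
    Once a video decoder is ported to an embedded system platform, its decoding performance is poor. The main problem is the excessive number of external memory accesses made during the decoder's software computations. To address this, we propose several software optimization techniques that optimize the decoding flow of the H.264/AVC and VC-1 video decoders. In addition, we apply a genetic algorithm search to find a near-optimal processing order for the H.264/AVC de-blocking filter; this near-optimal order further minimizes the number of external memory accesses. The proposed techniques not only make full use of the processor's registers for complex arithmetic, but also reduce the number of processor accesses to external memory.
    Both the software simulation results and the results obtained after porting to embedded system development boards show that the proposed memory access optimization techniques clearly improve the performance of video decoders on embedded systems.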

    In this thesis, we focus on software optimization techniques that reduce memory accesses after video decoders are implemented on embedded system platforms. We use the H.264/AVC and VC-1 video decoders, and embedded system platforms with ARM processors, as examples to illustrate these techniques. After the video decoders are implemented on embedded system platforms, their decoding performance falls short of expectations, and excessive memory access is the major problem. To solve this problem, we propose several software optimization techniques that optimize the decoding flow of the H.264/AVC and VC-1 video decoders. Moreover, we use genetic algorithms to search for a near-optimal filtering order for the H.264/AVC de-blocking filter; this near-optimal filtering order reduces the external memory accesses made by the de-blocking filter. The proposed optimization techniques not only make full use of the registers for complicated computations, but also reduce memory accesses in the decoding flow of the video decoders. Simulation results and implementation results on embedded system platforms show that the proposed software optimization techniques improve the decoding performance of video decoders on embedded system platforms.
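    The genetic-algorithm search over filtering orders described in the abstract can be illustrated with a toy sketch. This is not the thesis's chromosome representation, fitness function, or operators, which this record does not give. The sketch assumes a 4 x 4 grid of 4x4 blocks, an LRU buffer of four blocks standing in for the register file, and a cost equal to the number of block loads. It also ignores the data dependencies between edges that a real H.264/AVC de-blocking filter must respect. All function names and parameters are hypothetical.

```python
import random

def edges_of_grid(n):
    """Edges between adjacent blocks of an n x n grid, as (block, block) pairs."""
    edges = []
    for r in range(n):
        for c in range(n):
            b = r * n + c
            if c + 1 < n:
                edges.append((b, b + 1))  # edge shared with the right neighbour
            if r + 1 < n:
                edges.append((b, b + n))  # edge shared with the neighbour below
    return edges

def load_count(order, edges, capacity):
    """Toy cost: block loads needed with an LRU buffer holding `capacity` blocks."""
    cache, loads = [], 0  # most recently used block is kept last
    for idx in order:
        for block in edges[idx]:
            if block in cache:
                cache.remove(block)
            else:
                loads += 1
                if len(cache) == capacity:
                    cache.pop(0)  # evict the least recently used block
            cache.append(block)
    return loads

def order_crossover(p1, p2, rng):
    """Copy a slice of p1 in place; fill the rest in the order the genes appear in p2."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[i:j + 1]
    taken = set(middle)
    rest = [g for g in p2 if g not in taken]
    return rest[:i] + middle + rest[i:]

def swap_mutation(perm, rng, rate):
    """With probability `rate`, swap two positions of a copy of the permutation."""
    perm = perm[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(perm)), 2)
        perm[a], perm[b] = perm[b], perm[a]
    return perm

def search_order(edges, capacity, pop_size=30, generations=60, seed=0):
    """Small GA with truncation selection; the better half survives each generation."""
    rng = random.Random(seed)
    n = len(edges)
    cost = lambda p: load_count(p, edges, capacity)
    # Seed the population with the natural order so the result is never worse than it.
    population = [list(range(n))] + [rng.sample(range(n), n) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng, 0.3))
        population = parents + children
    best = min(population, key=cost)
    return best, cost(best)

edges = edges_of_grid(4)
baseline = load_count(list(range(len(edges))), edges, 4)
best, loads = search_order(edges, capacity=4)
print("loads in natural order:", baseline, "| loads in best order found:", loads)
```

    Because the better half of each generation is carried over unchanged and the natural order is in the initial population, the order returned never costs more than that baseline under this toy model. A realistic fitness function would count the loads and stores of the target processor and reject orders that violate the standard's filtering dependencies.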

    CONTENTS
    Chinese Abstract
    Abstract
    Acknowledgement
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Backgrounds and Motivation
      1.2 Why Source Code Optimization at High-level languages?
      1.3 Contributions
      1.4 Organization of Thesis
    Chapter 2 Review of H.264/AVC and VC-1 Decoders
      2.1 H.264/AVC Decoder
        2.1.1 H.264/AVC Inverse Discrete Cosine Transform
        2.1.2 H.264/AVC Inter Prediction
        2.1.3 H.264/AVC Deblocking Filter
      2.2 VC-1 Decoder
        2.2.1 VC-1 Inverse Transformation
        2.2.2 VC-1 Interpolation
        2.2.3 VC-1 Deblocking Filter
    Chapter 3 Review of Optimization Techniques for Embedded Software
      3.1 Methodology to Reduce Memory Access for the Embedded System Platforms
      3.2 Optimization on the Loop-Dominated Data Flow and Expression at the Source-Code-Level
    Chapter 4 Efficient Optimization Techniques to Reduce Memory Access on Video Decoder
      4.1 Profiling
      4.2 Software Optimization on Inverse Transformation
        4.2.1 Packetization on H.264/AVC IDCT
        4.2.2 Optimization on VC-1 Inverse Transformation
      4.3 Software Optimization on Interpolations
        4.3.1 Optimization on H.264/AVC Interpolation
        4.3.2 Optimization on VC-1 Interpolation
      4.4 Software Optimization on De-blocking Filter
        4.4.1 Optimization on H.264/AVC De-blocking Filter
        4.4.2 Optimization on VC-1 De-blocking Filter
      4.5 Optimization on the Decoding Flow of H.264/AVC Decoder
    Chapter 5 Memory Access Optimization for H.264/AVC De-blocking Filter Using Genetic Algorithms
      5.1 Introduction to Genetic Algorithms
        5.1.1 Selection
        5.1.2 Crossover
        5.1.3 Mutation
      5.2 Proposed Genetic Algorithms
        5.2.1 Chromosome Representation in the Proposed GAs
        5.2.2 Definition of Fitness Function in the Proposed GAs
        5.2.3 Definition of Genetic Operators in the Proposed GAs
          5.2.3.1 Crossover in the Proposed GAs
          5.2.3.2 Mutation in the Proposed GAs
      5.3 Simulation Result of the Proposed GAs
    Chapter 6 Experimental Results
      6.1 Introduction to Embedded System Platforms
        6.1.1 XScale PXA255 Embedded System Platforms
        6.1.2 Cheetah Development Kits
      6.2 The Simulation Results of Optimized H.264/AVC Decoder
      6.3 The Simulation Results of Optimized VC-1 Decoder
      6.4 Comparison
    Chapter 7 Conclusion and Future Work
      7.1 Conclusion
      7.2 Future Work
    Reference

    Full text available on campus: 2011-08-17
    Full text available off campus: 2011-08-17