| Graduate student: | 郭漢衿 Kuo, Hang-Chin |
|---|---|
| Thesis title: | 適用於RISC處理器之歷程快取記憶體架構 Trace Reuse Cache for RISC Processor Architecture |
| Advisor: | 陳中和 Chen, Chung-Ho |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2012 |
| Graduation academic year: | 100 |
| Language: | Chinese |
| Number of pages: | 38 |
| Keywords (Chinese): | 歷程快取記憶體、功率效能、精簡指令集處理器 |
| Keywords (English): | Trace Reuse Cache, power consumption, RISC |
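To improve the runtime performance of a RISC CPU and reduce its power consumption, this thesis proposes a new instruction cache mechanism, together with supporting circuitry embedded in the CPU's instruction delivery stage, so that instructions are used more effectively. It also analyzes the performance of the overall system architecture.
The architecture is built on an ARM-based RISC CPU. With minor modifications to accommodate the Trace Reuse Cache, a low-power instruction delivery mechanism, the processor can support this delivery mechanism without excessive area overhead. The goals are to reduce the energy consumed by the instruction cache and to improve execution performance as far as possible. The complete architecture is evaluated with EDA design tools to measure the improvement and is compared against the original architecture to analyze the trade-offs and obtain more accurate experimental data.
According to the experimental results, a system that pairs the Trace Reuse Cache with a conventional instruction cache performs better than a system that uses only a conventional instruction cache, and it further contributes to reducing the energy consumption of the whole system. Besides giving designers an additional option, it also leaves room to enhance the functionality of the overall system.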
In this thesis, we propose a new instruction delivery mechanism, the Trace Reuse Cache (TRC), to reduce the power consumption of RISC processor architectures. With the TRC and additional circuitry in the CPU fetch stage, suitable instructions can be selected for reuse, which improves performance. We also analyze the whole system and evaluate its IPC, hit rate, and power consumption.
Based on an ARM-compatible 5-stage RISC core, we attach the TRC to the cache memory system and modify part of the pipeline to support TRC instruction delivery with small area overhead. The goal is to lower the power consumption of the cache itself and to improve instruction fetch performance.
Experimental results show that our design dissipates less power than a conventional design that uses only an instruction cache. Moreover, the TRC gives designers an additional option and has the potential to enhance the overall system.
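The general idea of serving repeated fetches from a small buffer instead of the instruction cache can be illustrated with a toy model. The sketch below is not the thesis's design: the LRU replacement policy, the buffer size, the function name `simulate_fetch`, and the relative energy costs `e_trc` and `e_icache` are all hypothetical values chosen only for illustration.

```python
from collections import OrderedDict


def simulate_fetch(pcs, trc_entries=8, e_trc=1.0, e_icache=10.0):
    """Toy fetch model; returns (hit_rate, energy_with_trc, energy_icache_only).

    All parameters are illustrative assumptions, not values from the thesis.
    """
    trc = OrderedDict()  # hypothetical LRU buffer keyed by instruction address
    hits = 0
    energy = 0.0
    for pc in pcs:
        energy += e_trc  # the small buffer is probed on every fetch
        if pc in trc:
            hits += 1
            trc.move_to_end(pc)  # mark as most recently used
        else:
            energy += e_icache  # miss: fall back to the instruction cache
            trc[pc] = True
            if len(trc) > trc_entries:
                trc.popitem(last=False)  # evict the least recently used entry
    baseline = e_icache * len(pcs)  # every fetch goes to the instruction cache
    hit_rate = hits / len(pcs) if pcs else 0.0
    return hit_rate, energy, baseline


# A 4-instruction loop body executed 10 times.
loop_trace = [0x100, 0x104, 0x108, 0x10C] * 10
hit_rate, with_trc, icache_only = simulate_fetch(loop_trace)
```

In this toy trace only the first pass through the loop misses, so 36 of the 40 fetches are served from the buffer. Real savings depend on the actual energy per access and on how the hardware selects instructions for reuse, neither of which this sketch models.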