
Author: 羅健倫
Lo, Chien-Lun
Thesis title: 針對混合最後級緩存之記憶體到最後級緩存的自適應放置策略
AMLP: Adaptive Memory-to-LLC Placement Policy for Hybrid Last-Level Cache
Advisor: 林英超
Lin, Ing-Chao
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 40
Keywords (Chinese): 自旋轉移力矩隨機存取記憶體, 混合式快取記憶體, 多核心架構, 非對稱式記憶體, 最後級緩存失效率, 記憶體到最後級緩存放置策略
Keywords (English): Spin-Transfer Torque Magnetoresistive Random Access Memory (STT-MRAM), Hybrid Cache, Multi-Core System, Asymmetric Memory, LLC Miss Rate, Memory-to-LLC Placement Policy
  • In advanced nanoscale process nodes, SRAM suffers from low density and high leakage power. Emerging non-volatile memories (NVMs) have therefore been widely discussed as replacements for the traditional SRAM-based last-level cache (LLC). Among NVMs, Spin-Transfer Torque Magnetoresistive Random Access Memory (STT-MRAM) is regarded as the most promising candidate to replace SRAM, because its read overhead is similar to that of SRAM while it offers higher density and lower leakage power. Its drawback is a high write overhead. To give the LLC both large capacity and low write overhead, hybrid SRAM/STT-MRAM LLC architectures have been proposed, combining the large capacity of STT-MRAM with the low write overhead of SRAM. To realize these advantages, a hybrid LLC relies on an intelligent block placement policy that places write-intensive blocks in SRAM and read-intensive blocks in STT-MRAM.

    Much research has addressed migration and replacement policies for hybrid LLCs, but the memory-to-LLC placement policy has not been examined in detail. In this thesis, we explore the memory-to-LLC placement policy, which decides whether each block arriving from main memory is placed in the SRAM or the STT-MRAM region of a hybrid LLC. Previous hybrid LLC work placed blocks from main memory into SRAM or STT-MRAM intuitively, without analyzing cache access behavior, and so could not fully exploit the large capacity and low write cost of a hybrid LLC. We also observe the access behavior of a quad-core processor system in which the four CPUs run four different benchmarks, and find that each CPU's access behavior changes over simulation time. We therefore propose the Adaptive Memory-to-LLC Placement Policy (AMLP), which assigns blocks to SRAM or STT-MRAM according to each CPU's access behavior in different time phases.

    Energy per instruction (EPI) experiments show that AMLP reduces EPI by 14.28%, 7.63%, 3.79%, and 7.61% compared with the Read-Write Aware Region-Based Hybrid Cache Architecture (RWHCA), Access-Aware Policies (AAP), the Adaptive Placement and Migration Policy (APM), and Repetitive Access Aware Placement and Migration (RAP), respectively.
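The abstract describes placement that adapts to each CPU's access behavior per time phase, but this record does not give the mechanism itself. The following is only a minimal sketch of the general idea, not the thesis's Placement Counter or Placement Prediction Table design. The class `PhasePlacement`, the parameters `phase_length` and `write_threshold`, and the write-ratio rule are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Region(Enum):
    SRAM = "SRAM"
    STT_MRAM = "STT-MRAM"


@dataclass
class CoreState:
    writes: int = 0                    # writes seen in the current phase
    accesses: int = 0                  # all accesses seen in the current phase
    target: Region = Region.STT_MRAM   # assumed default region for fills


class PhasePlacement:
    """Hypothetical per-core, per-phase placement chooser (illustration only)."""

    def __init__(self, num_cores, phase_length=4, write_threshold=0.5):
        # phase_length and write_threshold are assumed values, not from the thesis.
        self.phase_length = phase_length
        self.write_threshold = write_threshold
        self.cores = [CoreState() for _ in range(num_cores)]

    def record_access(self, core, is_write):
        """Count one LLC access; re-evaluate the core's target at phase end."""
        state = self.cores[core]
        state.accesses += 1
        state.writes += int(is_write)
        if state.accesses == self.phase_length:
            ratio = state.writes / state.accesses
            # Write-heavy phase: steer fills to SRAM; otherwise to STT-MRAM.
            if ratio >= self.write_threshold:
                state.target = Region.SRAM
            else:
                state.target = Region.STT_MRAM
            state.writes = 0
            state.accesses = 0

    def place_fill(self, core):
        """Region for the next block this core brings in from main memory."""
        return self.cores[core].target


policy = PhasePlacement(num_cores=2)
for is_write in (True, True, True, False):    # core 0: write-heavy phase
    policy.record_access(0, is_write)
for is_write in (False, False, True, False):  # core 1: read-heavy phase
    policy.record_access(1, is_write)
print(policy.place_fill(0).value, policy.place_fill(1).value)  # SRAM STT-MRAM
```

Keeping one counter set per core lets each core's fills follow its own recent behavior, which mirrors the per-CPU, per-phase adaptation the abstract describes. A real design would also need to account for reuse distance and hardware cost, which this sketch ignores.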

    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction
      1.1 Background
      1.2 Main Contributions
      1.3 Thesis Organization
    Chapter 2. Preliminaries and Motivation
      2.1 STT-MRAM Fundamentals
      2.2 Reuse Distance
      2.3 Hybrid SRAM/STT-MRAM LLC Policies
      2.4 Motivation Example
        2.4.1 Write Block Distribution
        2.4.2 Short/Medium Reuse Distance Block Distribution
        2.4.3 Summary
    Chapter 3. Adaptive Memory-to-LLC Placement Policy
      3.1 Placement Counter and Placement Prediction Table
      3.2 Adaptive Memory-to-LLC Placement Policy
    Chapter 4. Evaluation Setup and Results
      4.1 Evaluation Setup
      4.2 Short/Medium Reuse Distance Block Distribution
      4.3 Evaluation Results
        4.3.1 Reductions of LLC Energy and STT-MRAM Writes
        4.3.2 Performance on IPC and LLC Miss Rate
        4.3.3 Energy Per Instruction (EPI)
      4.4 Sensitivity Studies
        4.4.1 Threshold Settings
        4.4.2 8-Core System
        4.4.3 STT-MRAM Size
      4.5 Hardware Overhead
    Chapter 5. Conclusion
    References


    Publicly available from 2024-03-04