| Graduate student: | 翁明瀚 Weng, Ming-Han |
|---|---|
| Thesis title: | 混合式互聯應用於多核心平台之效能評估 Evaluating the Performance of a Hybrid Interconnect in Many-Core Platform |
| Advisor: | 陳中和 Chen, C. H. |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 電腦與通信工程研究所 Institute of Computer & Communication Engineering |
| Year of publication: | 2012 |
| Academic year of graduation: | 100 |
| Language: | Chinese |
| Number of pages: | 77 |
| Keywords (Chinese): | 晶片網路、快取一致性、混合式互聯、目錄式快取一致性協定 |
| Keywords (English): | Cache Coherency, Directory-based Cache Coherence Protocol, Hybrid Interconnect, Network-on-Chip (NoC) |
傳統上多核心系統是以匯流排(bus)作為互聯(interconnect),它的傳遞延遲較低。但是它一次只能服務一個核心(core),隨著core的數量越來越多,它便開始成為效能上的瓶頸。因此,有人提出晶片網路(Network on Chip, NoC)作為多核心系統的interconnect。由於以NoC為interconnect的多核心系統廣播(broadcast)較為困難,無法以傳統bus上的偵聽式協定(snoop-based protocol)去維持快取一致性(cache coherency)。因此,在NoC上多採用目錄式協定(directory-based protocol)去維持晶片上的cache coherency。
在本論文中,我們以SystemC語言實作NoC的時間近似模型(approximate-timed model),將其應用於多核心平台,並且在NoC上實作MESI directory-based cache coherence protocol,並且以此平台針對SPLASH-2驗證程式(benchmark)進行分析。此外,我們提出了一種混合式互聯(hybrid interconnect)以減少NoC上的traffic。我們以SPLASH-2對此hybrid interconnect進行效能評估,實驗結果顯示,相較於8核、16核,以及32核的純NoC-based interconnect,traffic分別可以減少53%,45%與39%。
Traditionally, multi-core systems use a bus as the interconnect because of its low transmission delay. However, a bus can serve only one master core at a time, so it becomes a performance bottleneck as the number of cores grows. Network-on-Chip (NoC) has therefore been proposed as the interconnect for many-core platforms. Broadcast is costly in a NoC-based many-core system, so the snoop-based cache coherence protocols used on buses are impractical for maintaining cache coherency. For this reason, a directory-based cache coherence protocol is commonly used to maintain on-chip cache coherency.
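To make the broadcast argument concrete, the following is a minimal Python sketch and not the SystemC model from the thesis. It compares how many invalidation messages a single write generates under snooping and under a directory. The function names and the cost model (one message per destination core, ignoring the request to the directory and the acknowledgements) are illustrative assumptions.

```python
def snoop_invalidation_messages(num_cores):
    """Snooping (assumed cost model): the write must reach every other core,
    whether or not that core holds a copy of the cache line."""
    return num_cores - 1


def directory_invalidation_messages(sharers, writer):
    """Directory (assumed cost model): invalidations go only to the cores
    that the directory lists as sharers, excluding the writer itself."""
    return len(set(sharers) - {writer})


if __name__ == "__main__":
    # Hypothetical case: 32 cores, the line is cached by cores 3 and 7,
    # and core 3 wants to write it.
    print(snoop_invalidation_messages(32))
    print(directory_invalidation_messages({3, 7}, writer=3))
```

Under these assumptions the snooping cost grows with the core count, while the directory cost depends only on how many cores actually share the line.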
In this thesis, we use SystemC to implement an approximate-timed model of a NoC, apply it to a many-core platform, and implement the MESI directory-based cache coherence protocol on it. We use this platform to analyze the SPLASH-2 benchmarks. In addition, we propose a hybrid interconnect that clusters a small number of cores on a bus to reduce traffic in the NoC, and we evaluate it with the SPLASH-2 benchmarks. According to the experimental results, the hybrid interconnect reduces NoC traffic by 53%, 45%, and 39% with 8, 16, and 32 cores, respectively, compared to a pure NoC-based interconnect without clustering.
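The clustering idea can be illustrated with a toy Python model. This is a sketch under stated assumptions and does not reproduce the thesis's simulator or its results: it assumes cores are grouped into equal-sized clusters by index, and that a message between two cores in the same cluster stays on the cluster bus and never enters the NoC. The names `cluster_of` and `count_noc_messages` and the message trace are hypothetical.

```python
def cluster_of(core, cluster_size):
    """Map a core index to its cluster index (assumed contiguous grouping)."""
    return core // cluster_size


def count_noc_messages(messages, cluster_size):
    """Count the (src, dst) messages that must cross the NoC.

    A message is assumed to use the NoC only when source and destination
    are in different clusters. cluster_size=1 models the pure NoC baseline,
    where every message between distinct cores uses the NoC.
    """
    return sum(
        1
        for src, dst in messages
        if cluster_of(src, cluster_size) != cluster_of(dst, cluster_size)
    )


if __name__ == "__main__":
    # Hypothetical trace of core-to-core messages on an 8-core system.
    trace = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5), (0, 7)]
    print(count_noc_messages(trace, cluster_size=1))  # baseline, no clustering
    print(count_noc_messages(trace, cluster_size=2))  # two cores per bus
```

How much traffic clustering removes in this model depends entirely on how often communicating cores fall in the same cluster, which is why the measured reduction varies with the core count.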