National Cheng Kung University Electronic Theses and Dissertations System

Author:	林鼎原 Lin, Ding-Yuan
Thesis title:	基於不同平行化層級之多核處理器架構研究與分析 Study and Analysis of Multi-Processor Architecture for Various Levels Parallelism
Advisor:	周哲民 Jou, Jer-Min
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2015
Academic year of graduation:	103
Language:	Chinese
Number of pages:	104
Chinese keywords:	亂序執行、暫存器重命名、平行處理
English keywords:	Out-Of-Order Execution, Register Renaming, Parallelism

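Traditional uniprocessors raise performance through techniques such as out-of-order execution, superscalar issue, and speculative execution, but once the clock frequency reaches a certain level, problems such as energy consumption and heat dissipation arise. Constrained by memory access latency and by the inherent instruction-level parallelism (ILP) of programs, today's processors adopt multithreading to exploit thread-level parallelism (TLP): multiple threads execute concurrently, and the potential for parallel execution is sought among threads. With energy consumption and related issues in mind, the multiprocessor system-on-chip has become the mainstream trend for the new generation of designs.
This thesis studies and analyzes processor architectures and their execution behavior at different levels of parallelism. Once the configuration of the gem5 simulator and its system simulation methods are understood, gem5 can be used to simulate a target platform quickly and completely. gem5 is a cycle-accurate simulator that can model the activity of the processor pipeline in every cycle. Using MiBench as the test programs for the target platform, we first use the simulator to evaluate the performance of three different processor architectures, analyze the possible causes of performance bottlenecks in the overall system, and make corresponding improvements and corrections for the problems identified.
Next, we study and analyze the hardware architecture and execution behavior of the control processor. At run time, the control processor dynamically analyzes and records the dependencies between tasks, extracts the tasks that can execute independently, and dynamically assigns them to idle underlying processing units (PUs) for parallel execution. Finally, we compare the differences between instruction-level and task-level processor architectures.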

Traditional single-core processors use out-of-order execution, superscalar issue, speculative execution, and other techniques to improve performance. Raising the clock speed further is not the answer, because it leads to energy consumption and heat dissipation problems. Limited by memory access latency and by the inherent parallelism of program instructions, known as instruction-level parallelism, modern processors use multithreading techniques, which allow concurrent processing and exploit the potential parallelism between threads (thread-level parallelism), even on a single-core processor. In consideration of energy efficiency, the trend in processor design is toward single-chip multiprocessors.

In this thesis, we study and analyze multi-processor architectures for various levels of parallelism. By using gem5, a cycle-accurate simulator that simulates pipeline stages cycle by cycle, we can configure and simulate the target platform quickly. We show that MiBench applications run much faster on an out-of-order processor than on in-order and non-pipelined processors.

In addition, we study and analyze the hardware architecture and behavior of a task-level control processor. The control processor keeps tracking dependencies between tasks, automatically extracts parallelism among coarse-grain tasks, and schedules them for execution on the underlying processors. Finally, we compare instruction-level and task-level processor architectures.
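
The control processor behavior summarized in the abstract (track dependencies between tasks, pick out the tasks that are ready, hand them to idle processing units) can be illustrated with a small software sketch. This is not the thesis's hardware design: the `Task` record, the `schedule` function, and the tick-based timing are hypothetical names and simplifications introduced here. The sketch tracks only true (read-after-write) dependencies and assumes each output name is written once, as it would be after renaming.

```python
from collections import namedtuple

# Hypothetical task record: name, duration in ticks, data names read, data names written.
Task = namedtuple("Task", "name duration reads writes")


def schedule(tasks, num_pus):
    """Issue tasks to idle PUs as their producers finish.

    `tasks` is given in program order. Returns {task name: (start tick, PU index)}.
    """
    # Dependency tracking: each task depends on the latest earlier writer
    # of every data name it reads.
    last_writer = {}
    deps = {}
    for t in tasks:
        deps[t.name] = {last_writer[r] for r in t.reads if r in last_writer}
        for w in t.writes:
            last_writer[w] = t.name

    finished = set()
    running = {}          # PU index -> (task name, finish tick)
    waiting = list(tasks)
    placement = {}
    tick = 0
    while waiting or running:
        # Retire tasks whose finish tick has arrived, freeing their PUs.
        for pu, (name, end) in list(running.items()):
            if end <= tick:
                finished.add(name)
                del running[pu]
        # Assign ready tasks (all producers finished) to idle PUs, oldest first.
        idle = [pu for pu in range(num_pus) if pu not in running]
        for t in list(waiting):
            if not idle:
                break
            if deps[t.name] <= finished:
                pu = idle.pop(0)
                running[pu] = (t.name, tick + t.duration)
                placement[t.name] = (tick, pu)
                waiting.remove(t)
        tick += 1
    return placement


# Illustrative workload: C needs the results of A and B; D is independent.
tasks = [
    Task("A", 2, [], ["x"]),
    Task("B", 1, [], ["y"]),
    Task("C", 1, ["x", "y"], ["z"]),
    Task("D", 1, [], ["w"]),
]
placement = schedule(tasks, num_pus=2)
print(placement)
```

In this workload, D starts before C even though it comes later in program order, because C must wait for A to finish while a PU is already idle. That is the task-level counterpart of out-of-order instruction issue.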

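Abstract (Chinese)
ABSTRACT
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1 Research Background
2 Research Motivation and Objectives
3 Thesis Organization
Chapter 2 Background and Related Work
1 Levels of Parallelism
1.1 Instruction-Level Parallelism
1.2 Thread-Level Parallelism
1.3 Task- and Data-Level Parallelism
2 Parallel Computing Architectures
2.1 Uniprocessor Systems
2.2 Multiprocessor Systems
2.3 Classification of Parallel Computers
3 Related Work on Processor Simulation Tools
3.1 VMware
3.2 QEMU
3.3 SimpleScalar
3.4 Gem5 Simulator
Chapter 3 Design Concepts of the Gem5 Computer System Simulation Architecture
1 Analysis of the Gem5 System Architecture
1.1 CPU Model
1.2 Supported ISAs
1.3 Memory Systems
1.4 Timebuffers
2 Building a Gem5 System
2.1 Target System Configuration
2.2 System Execution Flow
3 Study and Analysis of the Gem5 Instruction-Level O3CPU Architecture
3.1 Instruction Fetch
3.2 Instruction Decode
3.3 Register Rename
3.4 Issue/Writeback/Execute
3.5 Instruction Commit
3.6 Backwards Communication
Chapter 4 Study and Analysis of the Task Level Control Processor Architecture
1 MLCA System Architecture Design Concepts
2 Overview of the Control Processor Hardware Units
3 Control Processor Front End
3.1 Fetch unit
3.2 Decode and Rename unit
3.3 Dispatch unit
4 Control Processor Scheduling, Dispatch, and Execution
4.1 Task Queue
4.2 Task Scheduler
4.2.1 Wakeup unit
4.2.2 Ready Pool, Select and Assign unit
4.3 Issue unit
5 Control Processor Back End: Writeback and Commit
5.1 Writeback
5.2 Retire
6 Comparison of Instruction-Level and Task-Level Processors
Chapter 5 Experimental Environment and Data Analysis
1 Experimental Environment
2 Processor Architecture Simulation Flow
2.1 Pipeline Simulation Method
3 Test Programs
4 Experimental Data and Result Analysis
Chapter 6 Conclusions and Future Work
1 Conclusions
2 Future Work
References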
                                    

Made publicly available: 2020-08-26
