| Graduate student: | Chen, Wei-Tso (陳威佐) |
|---|---|
| Thesis title: | Reduce Resource Contention by Task Scheduling on Hybrid SPM and Cache MPSoC |
| Advisor: | Su, Wen-Yu (蘇文鈺) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2012 |
| Academic year of graduation: | 100 (ROC calendar) |
| Language: | English |
| Number of pages: | 44 |
| Keywords (translated from Chinese): | Multi-core processor, resource contention, scheduling, scratchpad memory |
| Keywords (English): | MPSoC, Resource Contention, Scheduling, Scratchpad Memory |
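Modern multi-core processors have begun to adopt architectures that combine cache with scratchpad memory. Such processors are already widely used in smartphones and tablet computers. At the same time, the number of cores keeps growing in pursuit of higher computing performance. As the core count rises, resource contention inside the processor becomes increasingly severe, which degrades task performance and erodes the benefit of having multiple cores. In this thesis, we study the problem of resource contention in multi-core systems. For multi-core processors with hybrid scratchpad and cache memory, we propose an operating system scheduler that reduces resource contention. For programs that use scratchpad memory as local storage, we characterize their behavior through offline analysis; for programs that use cache as local storage, we observe them at run time with a bus monitor. The operating system scheduler then schedules tasks according to the information available for these two types of programs, so as to avoid resource contention. Simulation results show that, on an eight-core processor and compared with scheduling that does not consider resource contention, we reduce task waiting time by up to 23.3% and task execution time by up to 13.5%.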
Modern MPSoCs adopt scratchpad memory (SPM) in conjunction with cache because SPM is efficient in area and power. Such MPSoCs are widely used in smartphones and tablet PCs. To provide more computing power, the number of cores on a single chip keeps increasing, which inevitably results in resource contention. Once resource contention becomes severe, the performance gained from additional cores may suffer. In this thesis, we propose contention-reduction scheduling (CRS) for hybrid SPM/cache MPSoCs. Tasks that use SPM as local storage are profiled in advance. Non-profiled tasks use cache. In addition, a bus monitor (BM) is designed to provide bus activity information. At run time, our contention-reduction scheduler (CRSer) uses the profiling information as well as the bus information to relieve resource contention. Simulation results show that CRS performs better than existing scheduling algorithms that do not consider contention. CRS reduces waiting time by up to 23.3% and average execution time by up to 13.5%.
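To make the idea in the abstract concrete, the following is a minimal sketch of how a scheduler might combine offline profiles with run-time bus readings. It is not the CRS algorithm from the thesis, whose details the abstract does not give. The `Task` class, the `estimated_demand` and `pick_tasks` functions, the notion of "bus demand" as a fraction of bus bandwidth, the default demand of 0.5, and the greedy admission rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    name: str
    # Hypothetical offline profile: fraction of bus bandwidth an SPM task needs.
    # None marks a non-profiled task that uses cache instead.
    profiled_demand: Optional[float] = None


def estimated_demand(task: Task, monitor: Dict[str, float],
                     default: float = 0.5) -> float:
    """Use the offline profile if present, else the bus monitor reading.

    `monitor` stands in for bus-monitor output (assumed format: task name
    mapped to recently observed bus usage); `default` is an assumed guess
    for cache tasks that have not been observed yet.
    """
    if task.profiled_demand is not None:
        return task.profiled_demand
    return monitor.get(task.name, default)


def pick_tasks(ready: List[Task], monitor: Dict[str, float],
               cores: int, bus_capacity: float = 1.0) -> List[Task]:
    """Greedily fill cores while keeping total estimated bus demand in budget."""
    chosen: List[Task] = []
    load = 0.0
    for task in ready:  # walk the ready queue in FIFO order
        if len(chosen) == cores:
            break
        demand = estimated_demand(task, monitor)
        # Always admit the first task so that the system makes progress.
        if not chosen or load + demand <= bus_capacity:
            chosen.append(task)
            load += demand
    return chosen


# Hypothetical workload: three profiled SPM tasks and one cache task.
ready_queue = [Task("fft", 0.6), Task("jpeg", 0.5), Task("sha", 0.1), Task("qsort")]
bus_readings = {"qsort": 0.2}
selected = [t.name for t in pick_tasks(ready_queue, bus_readings, cores=4)]
```

In this sketch a bus-hungry task is passed over for the current round when co-running it would exceed the assumed bus budget, and lighter tasks further back in the queue are admitted instead. A real scheduler would also need a fairness mechanism so that skipped tasks are not starved.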
On-campus access: publicly available from 2015-09-05