| Graduate Student: | 洪洋 (Hung, Yang) |
|---|---|
| Thesis Title: | 估算MapReduce模型在GPU叢集下的程式執行時間 (Estimation of Job Execution Time in MapReduce on GPU clusters) |
| Advisor: | 鄭憲宗 (Cheng, Sheng-Tzong) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of Publication: | 2014 |
| Graduation Academic Year: | 102 (2013-2014) |
| Language: | English |
| Pages: | 47 |
| Keywords: | GPU, MapReduce, Stochastic Petri Net, Execution time estimation |
Recent GPUs carry hundreds of cores on a single card, far more than a four- or eight-core CPU, opening a new direction for high-speed computing: GPU acceleration delivers unprecedented application performance on parallel workloads. Combined with the rapid growth of cloud computing, GPU clusters can complete graphics processing and general-purpose applications even faster. For developers, however, this is a new challenge. When writing new applications on GPU clusters, they must manage MapReduce [1] resources as well as GPU device parameters, which often diverts them into performance tuning [2], studying the execution details of MapReduce and GPU computing, and hunting for the bottlenecks in their programs.
This thesis examines in detail the execution of each stage of GPU-accelerated MapReduce across multiple nodes. We use a Stochastic Petri Net [4] to analyze the influence of GPU computing in the cloud and propose a model named SPN-GC. The model defines formulas for the execution time and delay of every stage, and it can quickly estimate how long an application will take under different input data sizes, helping developers locate the time bottleneck sooner.
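The thesis's actual SPN-GC formulas are not reproduced on this page, so the following is only a minimal sketch of the estimation idea: model the job as a serial chain of MapReduce-on-GPU stages whose mean delays grow linearly with input size and which fire with exponentially distributed delays, in the style of a stochastic Petri net's timed transitions. All stage names, latencies, and throughputs below are hypothetical placeholders, not the SPN-GC definitions.

```python
import random

# Hypothetical per-stage cost model: mean delay = fixed_latency + data_size / throughput.
# These stages and parameters are illustrative only; the real SPN-GC model defines
# its own places, transitions, and timing formulas for MapReduce on GPU clusters.
STAGES = {
    "input_split":  {"latency_s": 0.5, "throughput_mb_s": 400.0},
    "map_gpu":      {"latency_s": 1.0, "throughput_mb_s": 900.0},
    "shuffle":      {"latency_s": 2.0, "throughput_mb_s": 150.0},
    "reduce_gpu":   {"latency_s": 1.0, "throughput_mb_s": 700.0},
    "output_write": {"latency_s": 0.5, "throughput_mb_s": 200.0},
}

def sample_stage_delay(cfg, data_mb):
    """Sample one firing delay: exponentially distributed around the mean,
    as for a timed transition in a stochastic Petri net."""
    mean = cfg["latency_s"] + data_mb / cfg["throughput_mb_s"]
    return random.expovariate(1.0 / mean)

def estimate_job_time(data_mb, runs=10_000):
    """Monte Carlo estimate of total job time for a serial chain of stages."""
    total = 0.0
    for _ in range(runs):
        total += sum(sample_stage_delay(cfg, data_mb) for cfg in STAGES.values())
    return total / runs

if __name__ == "__main__":
    for size in (128, 512, 2048):  # input sizes in MB
        print(f"{size:5d} MB -> estimated {estimate_job_time(size):7.2f} s")
```

Because the per-stage means are closed-form, such a model can answer "how long for this input size?" immediately, which is the property the abstract emphasizes for finding time bottlenecks.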
The experimental results compare the estimated execution time of GPU-accelerated jobs in the cloud against actual measurements under different input data sizes. The average error across the tested inputs is within 10%, so the estimates can serve as a reference when developers analyze and tune their applications.
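For concreteness, accuracy of this kind is commonly computed as a mean relative error between estimated and measured job times, as sketched below; the numbers are placeholders, not measurements from the thesis.

```python
# Mean relative error between estimated and measured job times.
# The values below are placeholders, not the thesis's experimental data.
estimated = [41.2, 88.5, 170.3]   # model predictions (seconds)
measured  = [39.0, 95.1, 181.9]   # actual runs (seconds)

errors = [abs(e - m) / m for e, m in zip(estimated, measured)]
mean_error = sum(errors) / len(errors)
print(f"mean relative error: {mean_error:.1%}")
```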
[1] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," OSDI '04, pages 137–150, 2004.
[2] G. Wang, A. R. Butt, P. Pandey, and K. Gupta, "A Simulation Approach to Evaluating Design Decisions in MapReduce Setups," IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), pages 1–11, 2009.
[3] T. White, Hadoop: The Definitive Guide, Chapter 6: How MapReduce Works, O'Reilly Media, 2009.
[4] M. K. Molloy, "Performance Analysis Using Stochastic Petri Nets," IEEE Transactions on Computers, vol. C-31, no. 9, 1982.
[5] N. J. Dingle, W. J. Knottenbelt, and T. Suto, "PIPE2: A Tool for the Performance Evaluation of Generalized Stochastic Petri Nets," ACM SIGMETRICS Performance Evaluation Review, vol. 36, no. 4, pages 34–39, 2009.
[6] T. Murata, "Petri Nets: Properties, Analysis and Applications," Proceedings of the IEEE, vol. 77, no. 4, pages 541–580, 1989.
[7] E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, vol. 28, no. 2, pages 39–55, 2008.
[8] W. Fang, B. He, Q. Luo, and N. K. Govindaraju, "Mars: Accelerating MapReduce with Graphics Processors," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 4, pages 608–620, 2011.
[9] M. Elteir, H. Lin, W.-c. Feng, and T. Scogland, "StreamMR: An Optimized MapReduce Framework for AMD GPUs," IEEE International Conference on Parallel and Distributed Systems (ICPADS), pages 364–371, 2011.
[10] Y. Guo, W. Liu, G. Voss, and W. Mueller-Wittig, "GCMR: A GPU Cluster-Based MapReduce Framework for Large-Scale Data Processing," IEEE International Conference on High Performance Computing and Communications & Embedded and Ubiquitous Computing (HPCC & EUC), pages 580–586, 2013.
[11] M. Xie, K.-D. Kang, and C. Basaran, "Moim: A Multi-GPU MapReduce Framework," IEEE International Conference on Computational Science and Engineering (CSE), pages 1279–1286, 2013.
[12] J. A. Stuart and J. D. Owens, "Multi-GPU MapReduce on GPU Clusters," IEEE International Parallel & Distributed Processing Symposium (IPDPS), pages 1068–1079, 2011.
[13] H. Gao, J. Tang, and G. Wu, "A MapReduce Computing Framework Based on GPU Cluster," IEEE International Conference on High Performance Computing and Communications & Embedded and Ubiquitous Computing (HPCC & EUC), pages 1902–1907, 2013.
[14] J. Lai and A. Seznec, "Break Down GPU Execution Time with an Analytical Method," RAPIDO '12, ACM, pages 33–39, 2012.
[15] H. Wong, M.-M. Papadopoulou, M. Sadooghi-Alvandi, and A. Moshovos, "Demystifying GPU Microarchitecture through Microbenchmarking," IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), pages 235–246, 2010.
[16] S. J. Park, "An Analysis of GPU Parallel Computing," DoD High Performance Computing Modernization Program Users Group Conference, pages 365–369, 2009.
[17] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, "Evaluating MapReduce for Multi-core and Multiprocessor Systems," IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 13–24, 2007.
[18] H.-C. Wang, "Using Petri Net to Estimate Job Execution Time in MapReduce Model," Institute of Computer Science and Information Engineering, National Cheng Kung University (NCKU), 2013.
Full text available on campus from 2019-08-18.