
Author: 蘇進發 (Su, Chin-Fa)
Thesis title: ATGMR: An Adaptive Task Granularity Scheme for GPU-CPU MapReduce Clusters (應用於異質性雲端MapReduce叢集之可適性粒度策略)
Advisor: 鄭憲宗 (Cheng, Sheng-Tzong)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2014
Academic year of graduation: 102
Language: English
Number of pages: 56
Keywords (Chinese): 雲端運算 (cloud computing), MapReduce, Hadoop, GPU, CUDA
Keywords (English): Cloud computing, MapReduce, Hadoop, GPU, CUDA
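  • With the arrival of the big data era, the spread of cloud computing, and the maturing of GPU parallel computing, integrating the massively parallel capability of GPUs into MapReduce, a cloud computing framework traditionally run on CPUs, can greatly increase system performance, and this approach has become a trend in large-scale data processing. Hadoop is currently the most widely used open-source platform that implements the MapReduce framework. Although Hadoop was not designed with GPU resources in mind, GPUs can be used through the JNI interface. To study a distributed heterogeneous computing platform that integrates CPU and GPU resources, this work therefore uses Hadoop as its test platform.
    Because CPUs and GPUs perform differently across applications and neither processor is better than the other across the board, dispatching work is difficult. This work investigates a strategy for dynamically adjusting CPU and GPU task granularity under the MapReduce distributed computing framework, so that the workload of all processors is balanced and resource utilization increases.
    This work adopts the MapReduce model of SPN-MR [19] as the simulated system architecture, and measures and simulates the execution performance of MapReduce programs according to the seven execution stages that model defines. Experimental results show that the simulated figures deviate from actual execution results by at most 6.5 % and by 1.9 % on average. The results further verify that the proposed method achieves several times the performance of the conventional approach and effectively reduces workload imbalance.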

    Big data processing is one of the most crucial topics today. One of the most popular approaches to big data is MapReduce, which takes advantage of distributed computing and is highly scalable. Several packages have been implemented and are widely used to enable agile analysis of big data. Hadoop is one of the best-known packages that implement the MapReduce framework to ease the complexity of developing distributed computing applications.
    However, poor performance on a single node can become the bottleneck of the whole system. GPUs have therefore become a promising way to improve the performance of MapReduce tasks on a single node. Because the default Hadoop package does not make use of GPUs, JNI is adopted to leverage GPUs in the MapReduce runtime.
    Nevertheless, since the relative performance of GPUs and CPUs varies with the application, workload balancing is crucial to maximizing system performance. In this work, we propose ATGMR, which adjusts the task granularity for each processor so that all processors take the same amount of time to consume a task, minimizing the idle time of all processors. Results show that ATGMR makes jobs finish up to 10 times faster than native strategies in matrix multiplication and up to 3 times faster in k-means.
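    The idea of sizing each processor's task so that all processors finish a task in about the same time can be illustrated with a minimal sketch. This is not the ATGMR algorithm from the thesis, whose details are not given in this record. The function names, the damping factor, the size limits, and the fixed processing rates are all hypothetical and chosen only for illustration.

```python
def next_block_size(prev_block_mb, elapsed_s, target_s,
                    damping=0.5, min_mb=1.0, max_mb=1024.0):
    """Return the next task size (MB) for one processor.

    Assumption: throughput observed on the previous task predicts the next one.
    The damping factor and the size limits are illustrative, not from the thesis.
    """
    if prev_block_mb <= 0 or elapsed_s <= 0 or target_s <= 0:
        raise ValueError("sizes and times must be positive")
    throughput = prev_block_mb / elapsed_s        # MB per second on the last task
    ideal_mb = throughput * target_s              # size that would take target_s
    # Move only part of the way toward the ideal size to damp oscillation.
    smoothed_mb = prev_block_mb + damping * (ideal_mb - prev_block_mb)
    return max(min_mb, min(max_mb, smoothed_mb))


def simulate(rates_mb_per_s, target_s, rounds=30, start_mb=64.0):
    """Toy simulation: each processor has a fixed (hypothetical) processing rate."""
    blocks = {name: start_mb for name in rates_mb_per_s}
    for _ in range(rounds):
        for name, rate in rates_mb_per_s.items():
            elapsed_s = blocks[name] / rate       # time the last task took
            blocks[name] = next_block_size(blocks[name], elapsed_s, target_s)
    return blocks


if __name__ == "__main__":
    # A slow and a fast processor with made-up rates and a 4-second target.
    print(simulate({"cpu": 10.0, "gpu": 100.0}, target_s=4.0))
```

    In this toy setting the faster processor ends up with proportionally larger tasks, so both need roughly the target time per task and neither sits idle waiting for the other.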
    In addition, SPN-MR [19] is used in our simulation to properly simulate MapReduce jobs with variable block sizes. Experimental results show that our simulation is consistent with real-world MapReduce jobs.
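    One common way to quantify how closely simulated job times match measured ones is the relative error per job, summarized by its maximum and mean. The sketch below assumes that metric. The record does not state how the thesis computes its error figures, and the helper names are hypothetical.

```python
def relative_errors(simulated_s, measured_s):
    """Per-job relative error |simulated - measured| / measured."""
    if len(simulated_s) != len(measured_s):
        raise ValueError("both lists must have the same length")
    if not measured_s:
        raise ValueError("at least one job is required")
    return [abs(s - m) / m for s, m in zip(simulated_s, measured_s)]


def summarize_errors(simulated_s, measured_s):
    """Return (maximum, mean) relative error over all jobs."""
    errors = relative_errors(simulated_s, measured_s)
    return max(errors), sum(errors) / len(errors)
```

    Calling `summarize_errors` with two equal-length lists of job times in seconds returns the two summary figures as fractions, which can be multiplied by 100 to report percentages.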

    Abstract (Chinese)
    Abstract
    Acknowledgement
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction and Motivation
      1.1. Introduction
      1.2. Motivation
    Chapter 2. Background and Related Work
      2.1. MapReduce
        2.1.1. Framework
        2.1.2. Execution Flow
      2.2. Hadoop
      2.3. GPGPU
      2.4. SPN-MR
      2.5. Related Work
    Chapter 3. System Design
      3.1. Problem Description and Scenario
      3.2. Problem Formulation
        3.2.1. Objective Function
        3.2.2. Parameter Definition
        3.2.3. Mapping to SPN-MR
      3.3. ATGMR
        3.3.1. Overview
        3.3.2. Oscillation Avoidance
        3.3.3. Empirical setting
    Chapter 4. Performance Evaluation
      4.1. Experiment environment and simulation settings
      4.2. Evaluation matrices
        4.2.1. Benchmark
        4.2.2. Evaluation matrices
      4.3. Results and evaluation
        4.3.1. Simulation accuracy
        4.3.2. Performance evaluation
        4.3.3. Load imbalance
        4.3.4. Effect of empirical setting
    Chapter 5. Conclusions and Future Work
    References
    Appendix

    [1] Miao Xin and Hao Li, “An Implementation of GPU Accelerated MapReduce: Using Hadoop with OpenCL for Data- and Compute-Intensive Jobs,” In Proceedings of the 2012 International Joint Conference on Service Sciences (IJCSS), May 24-26, 2012.
    [2] Wenbin Fang, Bingsheng He, Qiong Luo, and Naga K. Govindaraju, “Mars: Accelerating MapReduce with Graphics Processors,” IEEE Trans. Parallel and Distributed Systems, vol. 22, no. 4, pp. 608-620, April 2011.
    [3] Trevor G. Reid and Jian Wei Gan, “Hadoop EKG: Using Heartbeats to Propagate Resource Utilization Data,” Duke University.
    [4] Koichi Shirahata, Hitoshi Sato, and Satoshi Matsuoka, “Hybrid Map Task Scheduling for GPU-Based Heterogeneous Clusters,” In Proceedings of CloudCom, pp. 733-740, 2010.
    [5] Linchuan Chen, Xin Huo, and G. Agrawal, “Accelerating MapReduce on a coupled CPU-GPU architecture,” High Performance Computing, Networking, Storage and Analysis (SC), pp. 1-11, Nov. 2012.
    [6] Yu Shyang Tan, Bu-Sung Lee, Bingsheng He, and R.H. Campbell, “A Map-Reduce Based Framework for Heterogeneous Processing Element Cluster Environments,” In Proceedings of Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium, pp. 57-64, 13-16 May 2012.
    [7] “Hadoop”, http://hadoop.apache.org/
    [8] “Nvidia CUDA”, http://www.nvidia.com/object/numerical-packages.html
    [9] “OpenCL”, https://www.khronos.org/opencl/
    [10] Grossman, M., Breternitz, M. and Sarkar, V., “HadoopCL: MapReduce on Distributed Heterogeneous Platforms through Seamless Integration of Hadoop and OpenCL,” In Proceedings of Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International, pp. 1918 – 1927, May 2013
    [11] Abbasi, A, Khunjush, F. and Azimi, R., “A Preliminary Study of Incorporating GPUs in the Hadoop Framework,” In Proceedings of Computer Architecture and Digital Systems (CADS), 2012 16th CSI International Symposium, pp. 178 - 185, May 2012.
    [12] Reza Mokhtari, Amin Abbasi, Farshad Khunjush, and Reza Azimi., “Soren: Adaptive MapReduce for Programmable GPUs,” In Proceedings of HiPEAC’11 the 6th International Conference on High Performance and Embedded Architectures and Compilers, 2011.
    [13] Reza Farivar, Abhishek Verma, Ellick M. Chan, and Roy H. Campbell. “MITHRA: Multiple data independent tasks on a heterogeneous resource architecture,” In Proceedings of 2009 IEEE International Conference on Cluster Computing and Workshops, pp. 1–10, 2009.
    [14] C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, “Evaluating mapreduce for multi-core and multiprocessor systems,” In HPCA ’07: Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pages 13–24, Washington, DC, USA, 2007. IEEE Computer Society.
    [15] C.-K. Luk, S. Hong, and H. Kim, “Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping,” In Proceedings of MICRO ’09, pp. 45–55, 2009.
    [16] Yanlong Zhai, Mbarushimana, E., Wei Li, Jing Zhang, Ying Guo, “Lit: A high performance massive data computing framework based on CPU/GPU cluster,” In Proceedings of Cluster Computing (CLUSTER), 2013 IEEE International Conference, pp. 1-8, Sept. 2013.
    [17] Kai Ma, Xue Li, Wei Chen, Chi Zhang, Xiaorui Wang, “GreenGPU: A Holistic Approach to Energy Efficiency in GPU-CPU Heterogeneous Architectures,” In Proceedings of Parallel Processing (ICPP), 2012 41st International Conference, pp. 48-57, 2012
    [18] “Google MapReduce”, http://research.google.com/archive/mapreduce.html
    [19] Hsi-Chuan Wang, “Using Petri Net to Estimate Job Execution Time in MapReduce Model,” Institute of Computer Science and Information Engineering, NCKU, 2013

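    On campus: available from 2019-08-06
    Off campus: not available
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.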