| Author: | 黃宗智 Huang, Tsung-Chih | 
|---|---|
| Thesis title: | 圖形處理器環境基於OpenCL之MapReduce框架之設計與實作 The Design and Implementation of the MapReduce Framework based on OpenCL in GPU Environment | 
| Advisor: | 陳敬 Chen, Jing | 
| Degree: | Master | 
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering | 
| Year of publication: | 2013 | 
| Academic year of graduation: | 101 (ROC calendar; academic year 2012-2013) | 
| Language: | Chinese | 
| Number of pages: | 94 | 
| Keywords (Chinese): | GPGPU, OpenCL, MapReduce | 
| Keywords (English): | GPGPU, OpenCL, MapReduce | 
With advances in technology, the processing power of both central processing units (CPUs) and graphics processing units (GPUs) keeps increasing. GPUs in particular offer excellent parallel processing capability, which led to the concept of General-Purpose Computation on the GPU (GPGPU). This thesis presents the design and implementation of a MapReduce software framework built on the Open Computing Language (OpenCL) for GPU environments. For users who want to develop parallel applications that run on the GPU with OpenCL, the framework simplifies development, hides the complicated details of parallel computing, and considerably reduces the developer's burden.
The framework consists of a set of application programming interfaces (APIs), and its system architecture is divided into two parts. The first part is the framework API that runs on the CPU: initialization, obtaining input data, building the program, querying device resources, configuring threads and partitioning data, setting up and preparing kernels, adding input records, allocating GPU memory, copying results back to the host, and releasing resources. The second part is the framework API that runs on the GPU: Map, Map count, Reduce, Reduce count, Group, and memory size calculation. Some modules of the implementation use the APIs provided by the OpenCL library, adding the computation on application data, the calculation of memory space, and the conversion and preparation of the application's pending data. Users can therefore concentrate on designing the processing logic; the framework automatically invokes the OpenCL functions, passes the appropriate parameter values, and coordinates processing between the CPU and the GPU.
The main contribution of this thesis is a MapReduce software framework implemented with OpenCL, so that programs developed with the framework are cross-platform and easy to port. The framework also provides many APIs for use during development, and the way these APIs operate fully demonstrates how OpenCL works and its processing flow.
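The GPU-side interfaces named in the abstract (Map count, Map, Group, Reduce count, Reduce, and a memory size calculation) suggest a two-pass design: a count pass reports how much output each work-item will produce, a prefix sum over those counts gives each work-item its write offset and the total buffer size, and the actual Map or Reduce pass then writes into a buffer allocated once. The abstract does not describe the internals, so the following is only a minimal CPU-only sketch of that idea in Python, modeled on the count/prefix-sum scheme known from GPU MapReduce frameworks such as Mars. Every function name here (`exclusive_scan`, `run_phase`, `map_count`, `map_fn`, `reduce_count`, `reduce_fn`, `group`, `map_reduce`) is hypothetical and not taken from the thesis, and no OpenCL calls are made.

```python
def exclusive_scan(counts):
    """Return (offsets, total); offsets[i] is where work-item i starts writing."""
    offsets, total = [], 0
    for c in counts:
        offsets.append(total)
        total += c
    return offsets, total

# Hypothetical user-supplied callbacks for a word-count job.
def map_count(record):
    # Number of (key, value) pairs that map_fn will emit for this record.
    return len(record.split())

def map_fn(record, emit):
    for word in record.split():
        emit(word, 1)

def reduce_count(key, values):
    # Each key group produces exactly one output pair.
    return 1

def reduce_fn(key, values, emit):
    emit(key, sum(values))

def run_phase(items, count_fn, work_fn):
    """Count pass, prefix sum, single allocation, then the write pass."""
    counts = [count_fn(*item) for item in items]
    offsets, total = exclusive_scan(counts)
    out = [None] * total  # stands in for one preallocated GPU output buffer
    for item, start in zip(items, offsets):  # one iteration models one work-item
        cursor = [start]

        def emit(key, value):
            out[cursor[0]] = (key, value)
            cursor[0] += 1

        work_fn(*item, emit)
    return out

def group(pairs):
    """Sort by key and collect the values that share a key."""
    grouped = []
    for key, value in sorted(pairs):
        if grouped and grouped[-1][0] == key:
            grouped[-1][1].append(value)
        else:
            grouped.append((key, [value]))
    return grouped

def map_reduce(records):
    intermediate = run_phase([(r,) for r in records], map_count, map_fn)
    return run_phase(group(intermediate), reduce_count, reduce_fn)

result = map_reduce(["gpu cpu gpu", "opencl gpu"])
print(result)
```

The point of the count pass is that GPU kernels cannot grow a buffer while they run: knowing every work-item's output size in advance lets the host allocate device memory once and lets all work-items write to disjoint regions without locking.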