National Cheng Kung University Theses and Dissertations System

Graduate student:	呂學儒 Lyu, Syue-Ru
Thesis title:	A Simple Cluster-Scaling Policy for MapReduce Clouds (一個用於MapReduce雲計算之簡易叢集規模調整策略)
Advisor:	謝錫堃 Shieh, Ce-Kuen
Co-advisor:	黃祖基 Huang, Tzu-Chi
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication:	2012
Academic year of graduation:	100 (ROC academic year)
Language:	English
Number of pages:	36
Keywords:	MapReduce, policy, cluster-scaling

The rise of cloud computing has driven the development of many cloud services. Google proposed MapReduce, a programming model for processing large amounts of data. After Yahoo! released Hadoop, an open-source implementation of MapReduce, many companies and enterprises adopted this model and built their own clusters to process their large volumes of data.
The computing resources within a cluster are often not fully used. Many studies on cluster scaling have therefore been proposed. Some of them reduce the size of the cluster to save power, while others add computing nodes to obtain better performance. However, these studies do not achieve power saving and performance at the same time.
To obtain both performance and power saving, we propose a simple policy. We analyze the characteristics of MapReduce and use them to develop our cluster-scaling policy. The policy can effectively identify how many computing nodes can be removed from a cluster without affecting a job's execution time. We test the policy in a variety of cases to show that it works well under different configurations while serving both performance and power saving.
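
The idea of removing nodes without lengthening a job can be illustrated with a small sketch. It is not the thesis's policy, whose details this record does not give. It is a simplified illustration that assumes homogeneous nodes, map tasks of equal duration, and a map phase that runs in whole "waves", and it ignores the reduce phase, data replication, and stragglers. The function names and the parameters `num_map_tasks`, `num_nodes`, and `slots_per_node` are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def map_waves(num_map_tasks: int, num_nodes: int, slots_per_node: int) -> int:
    """Number of map waves, assuming every task takes the same time
    and every node offers the same number of map slots (assumption)."""
    if num_map_tasks < 1 or num_nodes < 1 or slots_per_node < 1:
        raise ValueError("all arguments must be positive integers")
    return ceil_div(num_map_tasks, num_nodes * slots_per_node)


def removable_nodes(num_map_tasks: int, num_nodes: int, slots_per_node: int) -> int:
    """How many nodes could be removed while keeping the same number of
    map waves, under the simplifying assumptions stated above."""
    waves = map_waves(num_map_tasks, num_nodes, slots_per_node)
    # Smallest node count whose slots still fit all tasks into `waves` waves.
    min_nodes = ceil_div(num_map_tasks, waves * slots_per_node)
    return num_nodes - min_nodes


if __name__ == "__main__":
    # 41 map tasks on 10 nodes with 2 map slots each need 3 waves;
    # 7 nodes (14 slots) also finish in 3 waves, but 6 nodes would need 4.
    print(removable_nodes(41, 10, 2))
```

Under this model the last wave is often only partly filled, so some nodes contribute nothing to the wave count. Real clusters violate the assumptions above, so the result is at best a rough starting estimate.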

Outline
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Subset
2.2 All-In
Chapter 3 Proposed Policy
3.1 System Overview
3.2 System Analysis
Chapter 4 Emulation Verification
4.1 Emulator
4.2 Experiment Setup
4.3 Benchmarks
4.4 Different waves and various mappers
4.5 Different computational size
4.6 Evaluation on Hadoop
Chapter 5 Discussion
5.1 Block size
5.2 Execution time of reduce phase
5.3 Data replications
5.4 Straggler
Chapter 6 Conclusion and Future Work
References

                                    


Made publicly available on 2017-08-31
