
Author: 蘇鴻偉 (Su, Hong-Wei)
Title: 雲端分散式檔案系統的負載平衡 (Load Balancing in Distributed Cloud File Systems)
Advisor: 蕭宏章 (Hsiao, Hung-Chang)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2011
Academic year of graduation: 99
Language: English
Pages: 50
Chinese keywords: 負載平衡, 分散式檔案系統, 雲端
Foreign keywords: Load balance, distributed file systems, cloud

    Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In such file systems, a file is partitioned into a number of chunks that are allocated to distinct storage nodes (or nodes) so that MapReduce tasks can be performed in parallel over the storage nodes. However, in a cloud computing environment, failure is the norm, and nodes may arbitrarily join, leave, and rejoin the system. Files can also be dynamically created, deleted, and appended. The net effect is load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible over the storage nodes. While distributed load balancing algorithms exist in the literature, emerging distributed file systems in production strongly depend on a central node for chunk reallocation. This is clearly inadequate in a large-scale, failure-prone environment: the central load balancer bears a considerable reallocation workload and may become both the performance bottleneck and a single point of failure. In this thesis, a novel, fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem. In our proposal, storage nodes balance their loads spontaneously with only partial knowledge, introducing no performance bottleneck and no single point of failure. Our algorithm is compared against a centralized approach in a production system and a competing distributed solution from the literature. The performance results indicate that our proposal is comparable to the centralized approach and considerably outperforms the distributed algorithm in terms of load imbalance factor, movement cost, and/or algorithmic overhead.
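The core idea of the abstract — nodes rebalancing chunk loads spontaneously from only a partial view of the system, with no central coordinator — can be illustrated with a small simulation. This is a minimal sketch of the general gossip-style sampling idea, not the thesis's actual algorithm; the function name, sample size, and chunk counts are all illustrative assumptions.

```python
import random

def rebalance_round(loads, sample_size=4, seed=None):
    """One rebalancing round over node loads (chunk counts).

    Illustrative sketch only: each node samples a few peers, estimates
    the average load from that partial view, and chunks migrate from the
    heaviest sampled node toward the lightest until both approach the
    estimate. No node ever needs global knowledge of the system.
    """
    rng = random.Random(seed)
    nodes = list(loads)
    for _ in nodes:
        sample = rng.sample(nodes, min(sample_size, len(nodes)))
        estimate = sum(loads[n] for n in sample) / len(sample)
        heavy = max(sample, key=lambda n: loads[n])
        light = min(sample, key=lambda n: loads[n])
        # Move surplus chunks from the heaviest to the lightest sampled
        # node, never overshooting the locally estimated average.
        surplus = loads[heavy] - estimate
        deficit = estimate - loads[light]
        moved = int(max(0, min(surplus, deficit)))
        loads[heavy] -= moved
        loads[light] += moved
    return loads

loads = {f"node{i}": c for i, c in enumerate([40, 2, 30, 1, 25, 3, 20, 5])}
total = sum(loads.values())
for _ in range(5):
    rebalance_round(loads, seed=1)
# Chunks are only moved between nodes, never created or lost.
assert sum(loads.values()) == total
```

Because every migration is bounded by the locally estimated average, the global load spread can never grow, which is one reason sampling-based schemes converge without a coordinator; the thesis additionally weighs movement cost and network locality, which this sketch ignores.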

    1 Introduction
    2 Related Work
    3 Load Rebalancing Problem
    4 Our Proposal
        4.1 Architecture
        4.2 Load Rebalancing Algorithm
            4.2.1 Concept
            4.2.2 Basic Algorithms
            4.2.3 Exploiting Physical Network Locality
            4.2.4 Taking Advantage of Node Heterogeneity
            4.2.5 Managing Replicas
    5 Performance Study
        5.1 Simulation Setup and Workloads
        5.2 Simulation Results
            5.2.1 Comparative Study
            5.2.2 Effects of Varying the Number of Samples
            5.2.3 Effects of Varying Algorithmic Rounds
            5.2.4 Effects of System Dynamics
    6 Summary and Future Work
    Bibliography
    Appendices
        A Performance of Estimating A based on Gossiping
        B Theoretical Analysis for Exploiting Physical Network Locality
        C Theoretical Analysis for Impacts of Different Numbers of Samples


    Full-text access: on campus, available from 2016-07-28; off campus, not available.
    The electronic thesis has not been authorized for public release; for the print copy, please consult the library catalog.