
Author: Wang, Yi-Chen (王懿宸)
Thesis title: Astraea Partitioner: Dynamic Load Balancing with Prior Global Information for Data Streaming in Kafka (original title: Astraea Partitioner: 基於過去全局資訊實現Kafka叢集節點動態負載平衡)
Advisor: Hsiao, Hung-Chang (蕭宏章)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Number of pages: 27
Keywords: distributed stream processing system, load balancing, Apache Kafka, cluster node metrics parameter processing
  • Kafka, a distributed stream processing system, can develop inconsistent load across cluster nodes when its containers are unevenly distributed among them. This imbalance not only leaves the hardware resources of some nodes idle; the backlog of data on heavily loaded nodes also caps the data-processing bandwidth of the whole cluster.
    This thesis proposes Astraea Partitioner, a system based on cluster condition monitoring: it processes the collected data to compute a load score for each node and uses those scores to plan how much data load each node should carry. The design aims to adjust the load across cluster nodes in real time so that it stays in dynamic balance. As a result, throughput improves on unevenly loaded clusters without taking running services offline, and the more uneven the cluster load, the more pronounced the improvement.
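    The abstract describes planning how much load each node should carry, and the thesis outline names Smoothly Weighted Round-Robin as the scheduling step. The following is a minimal sketch of the generic smooth weighted round-robin technique, not the thesis's implementation. The broker names and integer weights are hypothetical, and how the thesis converts load scores into weights is not stated in this record, so the weights are simply given as input here.

```python
from collections import Counter


def smooth_weighted_round_robin(weights, rounds):
    """Yield node names in smooth weighted round-robin order.

    weights: dict mapping node name -> positive integer weight
             (hypothetical values, not taken from the thesis).
    """
    current = {node: 0 for node in weights}
    total = sum(weights.values())
    for _ in range(rounds):
        # Every node gains its own weight each round.
        for node, weight in weights.items():
            current[node] += weight
        # Pick the node with the highest running value
        # (ties go to the first node in dict order).
        chosen = max(current, key=current.get)
        # The chosen node gives back the total weight so others can catch up.
        current[chosen] -= total
        yield chosen


# Assumption: a lightly loaded broker is given a larger weight.
weights = {"broker-1": 5, "broker-2": 1, "broker-3": 1}
sequence = list(smooth_weighted_round_robin(weights, 7))
counts = Counter(sequence)
print(sequence)
print(counts)
```

    Over one full cycle (seven picks for these weights) each broker is chosen in proportion to its weight, and the picks for the heaviest-weighted broker are interleaved with the others rather than sent in a single burst.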

    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    Chapter 1. Introduction
    Chapter 2. Background
      2.1 Apache Kafka
      2.2 Kafka Producer Architecture
      2.3 Default Partitioner
      2.4 Kafka Metrics
    Chapter 3. Research Problem
    Chapter 4. Astraea Partitioner
      4.1 Defining the Broker Load Score
        4.1.1 Selecting Kafka Metrics and Their Respective Scores in Each Broker
        4.1.2 Weighting Each Kafka Metric with the Entropy Weight Method
        4.1.3 Computing the Node Load Score
      4.2 Smoothly Weighted Round-Robin
        4.2.1 Weighted Round-Robin
        4.2.2 Smoothly Weighted Round-Robin algorithm
        4.2.3 Smoothly Weighted Round-Robin with effective weights
    Chapter 5. Experimental Validation
      5.1 Test Environment
      5.2 Performance Comparison of Astraea Partitioner
    Chapter 6. Related Work
    Chapter 7. Conclusion
    References


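    Full-text availability: on campus, open from 2027-08-08; off campus, open from 2027-08-08.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.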