Graduate student: 鄧智懋
Teng, Zhi-Mao
Thesis title: 基於 Partition 負載資訊實現 Kafka Consumer 負載平衡
Kafka Consumer load balancing based on partition load information
Advisor: 蕭宏章
Hsiao, Hung-Chang
Degree: Master
College and department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 39
Keywords: Kafka, Consumer Assignor, Load balance
  • Apache Kafka is distributed data streaming software known for low latency and high throughput. It can process thousands to tens of thousands of records per second, which is why many well-known international companies use it as a data pipeline connecting their service applications. The volume of data flowing into Kafka varies with the popularity of each service. When that volume is uncontrolled, the load across Kafka consumers can become imbalanced, which degrades throughput and end-to-end latency and delays downstream data processing.

    This thesis proposes a Cost-Aware Assignor. An assignor is the component that distributes partition load among the consumers in the same consumer group; the proposed assignor is designed to counter the effect that imbalanced producer traffic has on consumers. By distributing the volume of data to be read evenly across the consumers in a consumer group, it reduces average end-to-end latency, increases consumer group throughput, and lets downstream services process data in real time, keeping the data flow running smoothly.

    Our experimental results show that, in scenarios with uneven partition traffic, a Consumer Group using the proposed Assignor outperforms one using Kafka's default Assignor, with 15% higher throughput and average end-to-end latency reduced by a factor of 12.
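
    As a rough illustration of load-aware assignment, the Python sketch below places partitions on consumers by measured load rather than by partition count. It is a generic greedy least-loaded heuristic, not the thesis's cost function or its random assignment algorithm, whose details this record does not give. The names `assign_by_load` and `total_load`, the partition and consumer names, and the load figures are all hypothetical.

```python
import heapq
from typing import Dict, List


def assign_by_load(partition_load: Dict[str, float],
                   consumers: List[str]) -> Dict[str, List[str]]:
    """Greedy least-loaded assignment (hypothetical helper, not a Kafka API)."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment: Dict[str, List[str]] = {c: [] for c in consumers}
    # Min-heap of (accumulated load, consumer): the lightest consumer pops first.
    heap = [(0.0, c) for c in sorted(consumers)]
    heapq.heapify(heap)
    # Place the heaviest partitions first; break ties by name for determinism.
    for partition, load in sorted(partition_load.items(),
                                  key=lambda kv: (-kv[1], kv[0])):
        total, consumer = heapq.heappop(heap)
        assignment[consumer].append(partition)
        heapq.heappush(heap, (total + load, consumer))
    return assignment


def total_load(assignment: Dict[str, List[str]],
               partition_load: Dict[str, float]) -> Dict[str, float]:
    """Sum the load each consumer ends up with."""
    return {c: sum(partition_load[p] for p in parts)
            for c, parts in assignment.items()}


if __name__ == "__main__":
    # Assumed per-partition load figures (arbitrary units), one hot partition.
    loads = {"t-0": 90.0, "t-1": 10.0, "t-2": 10.0,
             "t-3": 10.0, "t-4": 40.0, "t-5": 40.0}
    result = assign_by_load(loads, ["c1", "c2", "c3"])
    print(result)
    print(total_load(result, loads))
```

    With one hot partition carrying 90 of the 200 load units, the heuristic leaves that partition alone on one consumer and spreads the rest over the other two. An assignment that only equalizes partition counts would give each consumer two partitions, so the hot partition would share a consumer with another one.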

    Abstract
    Extended Abstract (English)
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction
    Chapter 2. Background
      2.1. Kafka
        2.1.1. Broker
        2.1.2. Topic
        2.1.3. Partition
      2.2. Kafka Consumer
        2.2.1. Kafka Consumer Fetch Process
        2.2.2. Consumer Group
        2.2.3. Consumer Assignor
      2.3. Rebalance Protocol
        2.3.1. Find Coordinator
        2.3.2. Join Group
        2.3.3. Sync Group
      2.4. Kafka Default Assignors
      2.5. Kafka Consumer Resource-Saving Mechanism
      2.6. Java Management Extensions
        2.6.1. MBean
      2.7. KRaft
    Chapter 3. System Architecture
      3.1. Problem Description
      3.2. Research Method
        3.2.1. Cost-Aware Assignor Architecture
        3.2.2. Execution Flow
      3.3. Method Details and Implementation
        3.3.1. Cost Function
        3.3.2. Random Assignment Algorithm
      3.4. Conditions for Using the Cost-Aware Assignor
      3.5. Overhead of Using the Cost-Aware Assignor
        3.5.1. Time Overhead
        3.5.2. Network Bandwidth Overhead
    Chapter 4. Experiments
      4.1. Experimental Environment
      4.2. Experiment 1: Uneven Partition Traffic
        4.2.1. Experimental Method
        4.2.2. Experimental Results
    Chapter 5. Related Work
      5.1. Research on Optimizing Kafka Performance
      5.2. Research on Load-Balancing Techniques
    Chapter 6. Conclusion
    References


    Full-text availability: on campus, open immediately; off campus, open immediately.