| Graduate student: | 鄧智懋 (Teng, Zhi-Mao) |
|---|---|
| Thesis title: | Kafka Consumer Load Balancing Based on Partition Load Information (基於 Partition 負載資訊實現 Kafka Consumer 負載平衡) |
| Advisor: | 蕭宏章 (Hsiao, Hung-Chang) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of publication: | 2023 |
| Graduation academic year: | 111 (ROC calendar) |
| Language: | Chinese |
| Pages: | 39 |
| Keywords: | Kafka, Consumer Assignor, Load balance |
Apache Kafka is distributed data streaming software known for low latency and high throughput. It can process thousands to tens of thousands of records per second, which has led many well-known international companies to use it as a data pipeline connecting a variety of services. The volume of data flowing into Kafka varies with the popularity of each service. When that inflow is uncontrolled, the load on Kafka consumers can become unbalanced, which degrades throughput and end-to-end latency and delays downstream data processing.
This thesis proposes a Cost-Aware Assignor. An assignor is the Kafka component that distributes partition load among the consumers in a consumer group. The Cost-Aware Assignor targets the effect that unbalanced producer traffic has on consumers: by distributing the volume of data to be read evenly across the consumers in a group, it reduces end-to-end latency, raises consumer group throughput, and lets downstream services process data in a timely manner, keeping the data flow running smoothly.
Our experiments show that, when partition traffic is uneven, a consumer group using the proposed assignor achieves 15% higher throughput than a consumer group using Kafka's default assignor and reduces average end-to-end latency by a factor of 12.
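The abstract does not spell out how the Cost-Aware Assignor computes its assignment, so the following is only an illustrative sketch of the general idea of load-aware partition assignment, not the thesis's algorithm. It assumes each partition has a known load figure (the numbers, the partition names `t-0` to `t-5`, and the consumer names `c1` and `c2` are made up), uses a greedy heaviest-first heuristic, and compares it with a simplified range-style split of one topic's partitions into contiguous blocks. All function names are hypothetical and none of this is Kafka API.

```python
import heapq
from typing import Dict, List


def range_style(partitions: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Simplified range-style split: contiguous blocks, ignoring load."""
    ordered = sorted(consumers)
    base, extra = divmod(len(partitions), len(ordered))
    assignment, start = {}, 0
    for i, consumer in enumerate(ordered):
        size = base + (1 if i < extra else 0)  # first `extra` consumers get one more
        assignment[consumer] = partitions[start:start + size]
        start += size
    return assignment


def assign_by_load(partition_load: Dict[str, float],
                   consumers: List[str]) -> Dict[str, List[str]]:
    """Greedy sketch: give the heaviest remaining partition to the
    consumer with the smallest accumulated load so far."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment: Dict[str, List[str]] = {c: [] for c in consumers}
    # Min-heap of (accumulated load, consumer); ties break by consumer name.
    heap = [(0.0, c) for c in sorted(consumers)]
    heapq.heapify(heap)
    heaviest_first = sorted(partition_load.items(), key=lambda kv: (-kv[1], kv[0]))
    for partition, load in heaviest_first:
        total, consumer = heapq.heappop(heap)
        assignment[consumer].append(partition)
        heapq.heappush(heap, (total + load, consumer))
    return assignment


def consumer_totals(assignment: Dict[str, List[str]],
                    partition_load: Dict[str, float]) -> Dict[str, float]:
    """Sum the load of the partitions assigned to each consumer."""
    return {c: sum(partition_load[p] for p in ps) for c, ps in assignment.items()}


if __name__ == "__main__":
    # Hypothetical per-partition load (for example, records per second).
    load = {"t-0": 50, "t-1": 40, "t-2": 10, "t-3": 5, "t-4": 5, "t-5": 10}
    group = ["c1", "c2"]
    print(consumer_totals(range_style(sorted(load), group), load))
    print(consumer_totals(assign_by_load(load, group), load))
```

With these made-up loads, the range-style split leaves `c1` with a total of 100 and `c2` with 20, while the greedy split gives each consumer 60. A real assignor would also have to obtain the load figures at runtime and limit partition movement between rebalances, which this sketch ignores.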