| 研究生: |
陳儁 Chen, Jun |
|---|---|
| 論文名稱: |
在同質平行網路中排程協同流以最小化最大完成時間 Scheduling Coflows for Minimizing the Makespan in Identical Parallel Networks |
| 指導教授: |
陳奇業
Chen, Chi-Yeh |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2022 |
| 畢業學年度: | 110 |
| 語文別: | 英文 |
| 論文頁數: | 71 |
| 中文關鍵詞: | 協同流排程 、同質的平行網絡 、最大完成時間 、數據中心 、近似演算法 |
| 外文關鍵詞: | coflow scheduling, identical parallel networks, makespan, data center, approximation algorithm |
| 相關次數: | 點閱:102 下載:23 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
隨著科技的發展,平行計算應用程式己普遍在大型數據中心中執行。這些平行計算應用程式包含計算階段與通訊階段,透過反覆執行這二個階段來完成工作。然而由於計算需求日益增加,大型數據中心需要負擔大量的通訊需求。協同流 (Coflow) 是最近提出的一種網路抽象概念,用於描述此平行計算中的通訊模式。本論文研究協同流的排程問題,目的於如何在同質的平行網絡中最小化最大完成時間。大型數據中的協同流排程問題被認為是最重要的NP-hard問題之一。本論文考慮協同流可被分為分割與不可分割的二種情況,在可分割的協同流中,不同流可以被安排在不同的網絡核心傳輸,而不可分割的協同流中,不同流只能被安排在相同的網絡核心傳輸。對於可分割的協同流排程問題,我們提出了一個(3 − 2/m)-近似演算法以及另一個( 8/3 − 2/(3m))-近似演算法,m 代表所有網絡核心的數量。對於不可分割的協同流排程問題,我們提出了一個(2m)-近似演算法。最後,我們對提出的演算 法與 Weaver [Huang et al., In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1071–1081, 2020.] 進行模擬實驗,並將我們提出的演算法 以及 Weaver 的效能進行比較。
With the development of technology, parallel computing applications have been commonly executed in large data centers. These parallel computing applications include computation phase and communication phase, and work is completed by repeatedly executing these two phases. However, due to the ever-increasing computing demands, large data centers are burdened with massive communication demands. Coflow is a recently proposed networking abstraction to capture communication patterns in data-parallel computing frameworks. This thesis focuses on the coflow scheduling problem in identical parallel networks, where the goal is to minimize makespan, the maximum completion time of coflows. The coflow scheduling problem in huge data center is considered one of the most significant NP-hard problems. In this thesis, coflow can be considered as either divisible or indivisible case. Distinct flows in a divisible coflow can be transferred through different network cores, while those in an indivisible coflow can only be transferred through the same network core. In the divisible coflow scheduling problem, this thesis proposes a (3 − 2/m)-approximation algorithm, and a ( 8/3 − 2/(3m))-approximation algorithm, where m is the number of network cores. In the indivisible coflow scheduling problem, this thesis proposes a (2m)-approximation algorithm. Finally, we simulate our proposed algorithm and Weaver’s [Huang et al., In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1071–1081, 2020.] and compare the performance of our algorithms with that of Weaver’s.
[1] Saba Ahmadi, Samir Khuller, Manish Purohit, and Sheng Yang. On scheduling coflows. Algorithmica, 82(12):3604–3629, 2020.
[2] Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. A scalable, commodity data center network architecture. ACM SIGCOMM computer communication review, 38(4):63–74, 2008.
[3] Dhruba Borthakur. The hadoop distributed file system: Architecture and design. Hadoop Project Website, 11(2007):21, 2007.
[4] Chi-Yeh Chen. Scheduling coflows for minimizing the total weighted completion time in heterogeneous parallel networks. arXiv preprint arXiv:2204.07799, 2022.
[5] Chi-Yeh Chen. Scheduling coflows for minimizing the total weighted completion time in identical parallel networks. CoRR, abs/2204.02651, 2022.
[6] Chi-Yeh Chen. Scheduling coflows with precedence constraints for minimizing the total weighted completion time in identical parallel networks. arXiv preprint arXiv:2205.02474, 2022.
[7] Mosharaf Chowdhury. Coflow-benchmark. https://github.com/coflow/ coflow-benchmark.
[8] Mosharaf Chowdhury. Coflowsim. https://github.com/coflow/coflowsim.
[9] Mosharaf Chowdhury and Ion Stoica. Coflow: A networking abstraction for cluster applications. In Proceedings of the 11th ACM Workshop on Hot Topics in Networks, pages 31–36, 2012.
[10] Mosharaf Chowdhury and Ion Stoica. Efficient coflow scheduling without prior knowledge. ACM SIGCOMM Computer Communication Review, 45(4):393–406, 2015.
[11] Mosharaf Chowdhury, Yuan Zhong, and Ion Stoica. Efficient coflow scheduling with varys. In Proceedings of the 2014 ACM conference on SIGCOMM, pages 443–454, 2014.
[12] Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
[13] Albert Greenberg, James R Hamilton, Navendu Jain, Srikanth Kandula, Changhoon Kim, Parantap Lahiri, David A Maltz, Parveen Patel, and Sudipta Sengupta. Vl2: A scalable and flexible data center network. In Proceedings of the ACM SIGCOMM 2009 conference on Data communication, pages 51–62, 2009.
[14] Asif Hasnain and Holger Karl. Coflow scheduling with performance guarantees for data center applications. In 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pages 850–856. IEEE, 2020.
[15] Xin Sunny Huang, Yiting Xia, and TS Eugene Ng. Weaver: Efficient coflow scheduling in heterogeneous parallel networks. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 1071–1081. IEEE, 2020.
[16] Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, pages 59–72, 2007.
[17] Loris Marchal. Lecture 2: Scheduling on parallel machines. 2012.
[18] Zhen Qiu, Cliff Stein, and Yuan Zhong. Minimizing the total weighted completion time of coflows in datacenter networks. In Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures, pages 294–303, 2015.
[19] Sushant Sachdeva and Rishi Saket. Optimal inapproximability for scheduling problems via structural hardness for hypergraph vertex cover. In 2013 IEEE Conference on Computational Complexity, pages 219–229. IEEE, 2013.
[20] Mehrnoosh Shafiee and Javad Ghaderi. Scheduling coflows in datacenter networks: Improved bound for total weighted completion time. ACM SIGMETRICS Performance Evaluation Review, 45(1):29–30, 2017.
[21] Mehrnoosh Shafiee and Javad Ghaderi. An improved bound for minimizing the total weighted completion time of coflows in datacenters. IEEE/ACM Transactions on Networking, 26(4):1674–1687, 2018.
[22] Dian Shen, Junzhou Luo, Fang Dong, and Junxue Zhang. Virtco: joint coflow scheduling and virtual machine placement in cloud data centers. Tsinghua Science and Technology, 24(5):630–644, 2019.
[23] Arjun Singh, Joon Ong, Amit Agarwal, Glen Anderson, Ashby Armistead, Roy Bannon, Seb Boving, Gaurav Desai, Bob Felderman, Paulie Germano, et al. Jupiter rising: A decade of clos topologies and centralized control in google’s datacenter network. ACM SIGCOMM computer communication review, 45(4):183–197, 2015.
[24] David P. Williamson and David B. Shmoys. Greedy Algorithms and Local Search, page 27–56. Cambridge University Press, 2011.