簡易檢索 / 詳目顯示

研究生: 李兆恆
Lee, Chao-Heng
論文名稱: 公平性與工作流排程
Fairness and Workflow Scheduling
指導教授: 蕭宏章
Hsiao, Hung-Chang
學位類別: 碩士
Master
系所名稱: 電機資訊學院 - 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2024
畢業學年度: 112
語文別: 中文
論文頁數: 67
中文關鍵詞: 公平性工作流排程
外文關鍵詞: fairness, workflow, scheduling
相關次數: 點閱:57下載:14
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 工作流排程旨在將一執行流程切分為多個任務並利用有限資源中並行運算的優勢降低整體所需執行時間。為了提供各個單位運行工作流,工作流平台提供了多租戶使用者提交各自的工作流並透過資源調度的方式執行各個工作流任務,在此情境下如何排程不同使用者間的工作流除了整體系統運行時間外,個別工作流的公平性成為一個需要關注的議題。

    如何制定工作流排程公平性指標為本論文欲解決之問題。透過討論先前研究中提出之公平性指標所存在的探討面向有限性,本研究制定另一公平性指標,其貢獻在於提供工作流排程對於資源使用的公平性展現。此外我們也制定了一排程演算法框架並透過六個演算法作為示例展示,根據此框架可以簡化判斷不同演算法間對於公平性之影響,最後我們透過實驗驗證前述討論結果並從中展現此指標對於資源使用的公平性關係。

    In the context of the increasing computational demands of large data sets, establishing data analysis and machine learning pipelines through workflows has become a primary operational method. Workflows offer users advantages such as computational logic, reusability of results, scalability, version control, and bottleneck tracking. To support workflow execution for various units, workflow platforms enable multi-tenant users to submit workflows and execute tasks via resource scheduling. However, ensuring fairness in resource allocation among users under limited resources has become a key concern, necessitating relevant metrics for evaluation and validation.

    This paper first examines fairness metrics for workflow scheduling proposed in previous literature to understand their focus and limitations. It then introduces a new fairness metric, offering a different perspective on fairness discussions compared to the original metric. The aim is to provide users with options to select and utilize metrics based on their needs.

    This paper formulates a workflow scheduling framework and discusses its impact on fairness through six workflow scheduling examples. Through this discussion, the differences between the two metrics are highlighted, and the results are validated in the final experiment.

    摘要 i 英文延伸摘要 ii 誌謝 v 目錄 vi 表格 viii 圖片 ix Chapter 1. 緒論 1 1.1. 問題與挑戰 1 1.2. 提出方法 2 1.3. 實驗驗證 2 1.4. 本文貢獻 3 1.5. 論文結構 3 Chapter 2. 背景 4 2.1. 工作流 4 2.2. 工作流排程 5 2.3. 工作流屬性 6 2.3.1. top level(t-level) 6 2.3.2. bottom level(b-level) 6 2.3.3. As Late As Possible(ALAP) 6 Chapter 3. 公平性指標 8 3.1. 問題描述 8 3.2. 文獻中公平性指標 9 3.2.1. 示例 9 3.3. 限制 11 3.3.1. 延遲放大導致公平性數值降低 11 3.3.2. 觀察面向受限於完成時間 11 3.4. 提出之公平性指標 12 3.4.1. 示例 13 3.5. 比較 14 Chapter 4. 公平性之於排程演算法 15 4.1. 工作流排程演算法框架 15 4.2. 公平性之於排程演算法 16 4.2.1. Highest Level First with Estimated Times 16 4.2.2. Insertion Scheduling Heuristic 18 4.2.3. Modified Critical Path 20 4.2.4. Earliest Time First 22 4.2.5. Dynamic Level Scheduling 24 4.2.6. Fairness Policy 26 Chapter 5. 實驗 30 5.1. 隨機工作流生成 30 5.1.1. 示例 31 5.2. 實驗情境 31 5.3. 實驗一 31 5.3.1. 實驗結果 32 5.4. 實驗二 35 5.4.1. 平均工作流任務數量 35 5.4.2. 處理單元數量 38 5.4.3. 任務權重隨機範圍 41 5.4.4. 工作流層數隨機範圍 44 Chapter 6. 相關研究 47 6.1. 公平性相關研究 47 6.2. 排程演算法框架相關研究 47 6.3. 排程演算法相關研究 48 6.3.1. 完成時間最佳化 48 6.3.2. 多目標最佳化 48 Chapter 7. 結論 52 參考文獻 53

    [1] Farzaneh Abazari, Morteza Analoui, Hassan Takabi, and Song Fu. Mows: Multi-objective workflow scheduling in cloud computing based on heuristic algorithm. Simulation Modelling Practice and Theory, 93:119–132, 2019. Modeling and Simulation of Cloud Computing and Big Data.
    [2] Saeid Abrishami, Mahmoud Naghibzadeh, and Dick H.J. Epema. Deadline-constrained workflow scheduling algorithms for infrastructure as a service clouds. Future Generation Computer Systems, 29(1):158–169, 2013. Including Special section: AIRCC-NetCoM 2009 and Special section: Clouds and Service-Oriented Architectures.
    [3] Thomas L. Adam, K. M. Chandy, and J. R. Dickson. A comparison of list schedules for parallel processing systems. Commun. ACM, 17(12):685–690, dec 1974.
    [4] Hamid Arabnejad, Jorge G. Barbosa, and Frédéric Suter. Fair Resource Sharing for Dynamic Scheduling of Workflows on Heterogeneous Systems, pages 145–167. 2014.
    [5] Vahid Arabnejad, Kris Bubendorfer, and Bryan Ng. Budget and deadline aware e-science workflow scheduling in clouds. IEEE Transactions on Parallel and Distributed Systems, 30(1):29–44, 2019.
    [6] Anne Benoit, Ümit V. Çatalyürek, Yves Robert, and Erik Saule. A survey of pipelined workflow scheduling: Models and algorithms. ACM Comput. Surv., 45(4), aug 2013.
    [7] Shishir Bharathi, Ann Chervenak, Ewa Deelman, Gaurang Mehta, Mei-Hui Su, and Karan Vahi. Characterization of scientific workflows. In 2008 Third Workshop on Workflows in Support of Large-Scale Science, pages 1–10, 2008.
    [8] Jian Cao, Yan Yao, and Yi Wang. Mining change operations for workflow platform as a service. World Wide Web, 18(4):1071–1092, jul 2015.
    [9] Huangke Chen, Xiaomin Zhu, Dishan Qiu, and Ling Liu. Uncertainty-aware real-time workflow scheduling in the cloud. In 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), pages 577–584, 2016.
    [10] Wei Chen, Young Choon Lee, Alan Fekete, and Albert Y. Zomaya. Adaptive multiple-workflow scheduling with task rearrangement. J. Supercomput., 71(4):1297–1317, apr 2015.
    [11] Wei-Neng Chen and Jun Zhang. An ant colony optimization approach to a grid workflow scheduling problem with various qos requirements. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 39(1):29–43, 2009.
    [12] Zong-Gan Chen, Zhi-Hui Zhan, Ying Lin, Yue-Jiao Gong, Tian-Long Gu, Feng Zhao, Hua-Qiang Yuan, Xiaofeng Chen, Qing Li, and Jun Zhang. Multiobjective cloud workflow scheduling: A multiple populations ant colony system approach. IEEE Transactions on Cybernetics, 49(8):2912–2926, 2019.
    [13] Tainã Coleman, Henri Casanova, and Rafael Ferreira da Silva. Wfchef: Automated generation of accurate scientific workflow generators. In 2021 IEEE 17th International Conference on eScience (eScience), pages 159–168, 2021.
    [14] Juan J. Durillo, Hamid Mohammadi Fard, and Radu Prodan. Moheft: A multi-objective list-based method for workflow scheduling. In 4th IEEE International Conference on Cloud Computing Technology and Science Proceedings, pages 185–192, 2012.
    [15] Hamid Mohammadi Fard, Radu Prodan, Juan Jose Durillo Barrionuevo, and Thomas Fahringer. A multi-objective approach for workflow scheduling in heterogeneous environments. In 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), pages 300–309, 2012.
    [16] R. L. Graham. Bounds on multiprocessing anomalies and related packing algorithms. In Proceedings of the May 16-18, 1972, Spring Joint Computer Conference, AFIPS’72 (Spring), page 205–217, New York, NY, USA, 1971. Association for Computing Machinery.
    [17] Jing-Jang Hwang, Yuan-Chieh Chow, Frank D. Anger, and Chung-Yee Lee. Scheduling precedence graphs in systems with interprocessor communication times. SIAM J. Comput., 18(2):244–257, apr 1989.
    [18] Ghazaleh Khojasteh Toussi, Mahmoud Naghibzadeh, Saeid Abrishami, Hoda Taheri, and Hamid Abrishami. Edqws: an enhanced divide and conquer algorithm for workflow scheduling in cloud. J. Cloud Comput., 11(1), may 2022.
    [19] B. Kruatrachue and T. Lewis. Grain size determination for parallel processing. IEEE Software, 5(1):23–32, 1988.
    [20] Yu-Kwong Kwok and Ishfaq Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Comput. Surv., 31(4):406–471, dec 1999.
    [21] Young Choon Lee, Hyuck Han, Albert Y. Zomaya, and Mazin Yousif. Resource-efficient workflow scheduling in clouds. Know.-Based Syst., 80(C):153–162, may 2015.
    [22] Weiling Li, Yunni Xia, Mengchu Zhou, Xiaoning Sun, and Qingsheng Zhu. Fluctuation-aware and predictive workflow scheduling in cost-effective infrastructure-as-a-service clouds. IEEE Access, 6:61488–61502, 2018.
    [23] Chee Sun Liew, Malcolm P. Atkinson, Michelle Galea, Tan Fong Ang, Paul Martin, and Jano I. Van Hemert. Scientific workflows: Moving across paradigms. ACM Comput. Surv., 49(4), dec 2016.
    [24] Xiaojin Ma, Huahu Xu, Honghao Gao, and Minjie Bian. Real-time multiple-workflow scheduling in cloud environments. IEEE Transactions on Network and Service Management, 18(4):4002–4018, 2021.
    [25] Fabrizio Marozzo, Domenico Talia, and Paolo Trunfio. A workflow management system for scalable data mining on clouds. IEEE Transactions on Services Computing, 11(3):480–492, 2018.
    [26] Deepak Poola, Kotagiri Ramamohanarao, and Rajkumar Buyya. Fault-tolerant workflow scheduling using spot instances on clouds. Procedia Computer Science, 29:523–533, 2014. 2014 International Conference on Computational Science.
    [27] Shaghayegh Sharif, Javid Taheri, Albert Y. Zomaya, and Surya Nepal. Online multiple workflow scheduling under privacy and deadline in hybrid cloud environment. In 2014 IEEE 6th International Conference on Cloud Computing Technology and Science, pages 455–462, 2014.
    [28] G.C. Sih and E.A. Lee. A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures. IEEE Transactions on Parallel and Distributed Systems, 4(2):175–187, 1993.
    [29] Ying-Lin Tsai, Hsiao-Ching Liu, and Kuo-Chan Huang. Adaptive dual-criteria task group allocation for clustering-based multi-workflow scheduling on parallel computing platform. J. Supercomput., 71(10):3811–3831, oct 2015.
    [30] Yuxin Wang, Shijie Cao, Guan Wang, Zhen Feng, Chi Zhang, and He Guo. Fairness scheduling with dynamic priority for multi workflow on heterogeneous systems. In 2017 IEEE 2nd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), pages 404–409, 2017.
    [31] MY Wu and Danijel Gajski. Hypertool: a programming aid for message-passing systems. IEEE transactions on parallel and distributed systems, 1(3):330–343, 1990.
    [32] Qishi Wu, Daqing Yun, Xiangyu Lin, Yi Gu, Wuyin Lin, and Yangang Liu. On workflow scheduling for end-to-end performance optimization in distributed network environments. In Walfredo Cirne, Narayan Desai, Eitan Frachtenberg, and Uwe Schwiegelshohn, editors, Job Scheduling Strategies for Parallel Processing, pages 76–95, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg.
    [33] Xuewen Xia, Huixian Qiu, Xing Xu, and Yinglong Zhang. Multi-objective workflow scheduling based on genetic algorithm in cloud environment. Inf. Sci., 606(C):38–59, aug 2022.
    [34] Doris Xin, Hui Miao, Aditya Parameswaran, and Neoklis Polyzotis. Production machine learning pipelines: Empirical analysis and optimization opportunities. In Proceedings of the 2021 International Conference on Management of Data, SIGMOD ’21, page 2639–2652, New York, NY, USA, 2021. Association for Computing Machinery.
    [35] Lingfang Zeng, Bharadwaj Veeravalli, and Xiaorong Li. Saba. J. Parallel Distrib. Comput., 75(C):141–151, jan 2015.
    [36] Henan Zhao and R. Sakellariou. Scheduling multiple dags onto heterogeneous systems. In Proceedings 20th IEEE International Parallel Distributed Processing Symposium, pages 14 pp.–, 2006.
    [37] Laiping Zhao, Yizhi Ren, and Kouichi Sakurai. Reliable workflow scheduling with less resource redundancy. Parallel Comput., 39(10):567–585, oct 2013.
    [38] Yong Zhao, Youfu Li, Shiyong Lu, Ioan Raicu, and Cui Lin. Devising a cloud scientific workflow platform for big data. In 2014 IEEE World Congress on Services, pages 393–401, 2014.
    [39] Junlong Zhou, Tian Wang, Peijin Cong, Pingping Lu, Tongquan Wei, and Mingsong Chen. Cost and makespan-aware workflow scheduling in hybrid clouds. J. Syst. Archit., 100(C), nov 2019.
    [40] Zhaomeng Zhu, Gongxuan Zhang, Miqing Li, and Xiaohui Liu. Evolutionary multi-objective workflow scheduling in cloud. IEEE Transactions on Parallel and Distributed Systems, 27(5):1344–1357, 2016.

    下載圖示 校內:立即公開
    校外:立即公開
    QR CODE