| Graduate student: | 黃智垣 Huang, Chih-Yuan |
|---|---|
| Thesis title: | 雲原生工作流之比對探討 A Comparative Study for Cloud-Native Workflows |
| Advisor: | 蕭宏章 Hsiao, Hung-Chang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2022 |
| Academic year of graduation: | 110 |
| Language: | Chinese |
| Number of pages: | 21 |
| Keywords (Chinese): | Kubernetes, Cloud Orchestration, Flyte, TPC-DS, Spark |
| Keywords (English): | Kubernetes, Cloud Orchestration, Flyte, TPC-DS, Spark |
With the rise of microservices, system operators must now manage hundreds or thousands of containers distributed across the cloud rather than a handful of applications. To manage and use these containers more effectively, enterprises are paying increasing attention to cloud orchestration. Kubernetes (K8s) is one such cloud orchestration tool: it provides a system for automated deployment, scaling, and management of multiple containers, and controls all nodes through a master management interface.
Deploying an application to K8s is not easy, however; many YAML files must be written before the application can run smoothly. This thesis examines Flyte, an open-source software platform that provides real-time, scalable, and maintainable workflows. To compare Flyte with the traditional way of deploying to K8s, this thesis uses TPC-DS, the current industry-standard data warehousing benchmark, as the target application and examines the differences between the two orchestration approaches from both qualitative and quantitative perspectives.