| Graduate student: | Li, Chia-Chi (李家齊) |
|---|---|
| Thesis title: | The Data Relaying Service (資料中繼與轉傳服務系統) |
| Advisor: | Hsiao, Hung-Chang (蕭宏章) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 104 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 23 |
| Keywords (Chinese): | file system, HDFS, HBase, distributed system (分散式系統), big data (大數據), Hadoop |
| Keywords (English): | file system, HDFS, HBase, distributed system, BigData, Hadoop |
Building big-data services usually requires a storage and transfer system designed for big data. The goal of this thesis is to build a distributed system, the Data Relaying Service (DRS), that exposes a common client-side Application Programming Interface (API), contains its own distributed storage system, and can be used to operate on external file systems, so that files can be transferred between two different file systems. Because the client API is common, other systems can be developed directly on top of DRS: adding a file system framework on top turns it into a file system, and a computing platform can be layered on it as well (e.g., Massively R Data Parallel Computation over Hadoop without MapReduce [1]). So as not to hurt the performance of systems built on top, and to keep costs down, DRS needs to be cross-platform, easy to scale out, and easy to build on. Its implementation can therefore use storage systems that are well suited to developing big-data analysis programs (e.g., Hadoop Distributed File System (HDFS) [2] and HBase [3]). Finally, its read and write performance should not differ much from that of the external file systems, and it should not consume too many resources.
[1] Yan-Jhou Huang. Massively R Data Parallel Computation over Hadoop without MapReduce, National Cheng Kung University, 2016
[2] HDFS User Guide. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
[3] HBase. https://hbase.apache.org/
[4] Network-attached storage (NAS). http://searchstorage.techtarget.com/definition/network-attached-storage
[5] GREENPLUM DATABASE. http://greenplum.org/
[6] J. Postel and J. Reynolds. File Transfer Protocol (FTP), RFC 959, 1985
[7] Common Internet File System. Microsoft. https://technet.microsoft.com/zh-tw/library/cc939973.aspx
[8] SMB: The Server Message Block Protocol. http://ubiqx.org/cifs/SMB.html
[9] Roy Thomas Fielding. Architectural Styles and the Design of Network-based Software Architectures, 2000
[10] MapReduce Tutorial. https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
[11] Vinod Kumar Vavilapalli, Arun C Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar, Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, Bikas Saha, Carlo Curino, Owen O’Malley, Sanjay Radia, Benjamin Reed, and Eric Baldeschwieler. Apache Hadoop YARN: Yet Another Resource Negotiator, 2013
[12] Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. Spark: Cluster Computing with Working Sets, 2010
[13] Patrick Hunt, Mahadev Konar, Flavio P. Junqueira, and Benjamin Reed. ZooKeeper: Wait-free coordination for Internet-scale systems, 2012
[14] Getting Started with Amazon S3. https://aws.amazon.com/tw/s3/getting-started/
[15] Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy. Hive - A Warehousing Solution Over a Map-Reduce Framework, 2009
[16] Apache Pig. https://pig.apache.org/
[17] HFS ~ Http File Server. http://www.rejetto.com/hfs/
[18] Hung-Chang Hsiao, Hsueh-Yi Chung, Haiying Shen, and Yu-Chang Chao. Load Rebalancing for Distributed File Systems in Clouds. IEEE Transactions on Parallel and Distributed Systems, 24(5):951–962, 2013
[19] A. Hastings. Distributed lock management in a transaction processing environment. In Proceedings of IEEE 9th Symposium on Reliable Distributed Systems, 1990
[20] Avinash Lakshman and Prashant Malik. Cassandra - A Decentralized Structured Storage System, 2009
[21] ICINGA. https://www.icinga.org/
[22] John Howard, Michael Kazar, Sherri Menees, David Nichols, Mahadev Satyanarayanan, Robert Sidebotham, and Michael West. Scale and performance in a distributed file system. ACM Transactions on Computer Systems, 6(1):51–81, 1988
[23] Network Working Group. HTTP Over TLS, RFC 2818, 2000