
Author: Li, Chia-Chi (李家齊)
Thesis title: The Data Relaying Service (資料中繼與轉傳服務系統)
Advisor: Hsiao, Hung-Chang (蕭宏章)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2016
Graduation academic year: 104 (ROC calendar)
Language: Chinese
Number of pages: 23
Keywords: file system, HDFS, HBase, distributed system, big data, Hadoop
  • Services built on big data often require storage and transfer systems designed for big data. The goal of this thesis is to build a distributed system, the Data Relaying Service (DRS), that exposes a general-purpose client Application Programming Interface (API), contains its own distributed storage system, and can operate external file systems through that storage system in order to transfer files between two different file systems. Because the client API is general-purpose, other systems can readily be built on top of DRS: adding a file system framework on top turns it into a file system, and a computing platform can likewise be layered on top (e.g., Massively R Data Parallel Computation over Hadoop without MapReduce [1]). So as not to hurt the performance of systems built on top, and for cost reasons, DRS must be cross-platform, easy to scale out, and easy to build on. Its implementation can therefore use storage systems that lend themselves to developing big data analysis programs (e.g., Hadoop Distributed File System (HDFS) [2] and HBase [3]). In terms of performance, DRS should not differ much from reading and writing the external file systems directly, and it should not consume many resources.
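    The idea of one common interface over different storage backends, with files relayed between two of them, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `Storage`, `InMemoryStorage`, `read_chunks`, `write_chunks`, and `relay` are hypothetical and are not taken from the thesis, and the in-memory backend merely stands in for real systems such as HDFS or HBase.

```python
from typing import Dict, Iterable, Iterator, Protocol


class Storage(Protocol):
    """Hypothetical minimal storage interface (not the thesis's actual API)."""

    def read_chunks(self, path: str, chunk_size: int) -> Iterator[bytes]: ...

    def write_chunks(self, path: str, chunks: Iterable[bytes]) -> int: ...


class InMemoryStorage:
    """Stand-in backend for illustration; a real backend would wrap an
    external file system behind the same two methods."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def read_chunks(self, path: str, chunk_size: int) -> Iterator[bytes]:
        data = self.files[path]
        # Yield the file piece by piece so large files are never
        # handed to the caller as one block.
        for offset in range(0, len(data), chunk_size):
            yield data[offset:offset + chunk_size]

    def write_chunks(self, path: str, chunks: Iterable[bytes]) -> int:
        data = b"".join(chunks)
        self.files[path] = data
        return len(data)


def relay(src: Storage, src_path: str, dst: Storage, dst_path: str,
          chunk_size: int = 64 * 1024) -> int:
    """Copy one file from src to dst in chunks; returns bytes written."""
    return dst.write_chunks(dst_path, src.read_chunks(src_path, chunk_size))


if __name__ == "__main__":
    a, b = InMemoryStorage(), InMemoryStorage()
    a.files["/data/input.bin"] = b"x" * 150_000
    written = relay(a, "/data/input.bin", b, "/backup/input.bin")
    print(written, b.files["/backup/input.bin"] == a.files["/data/input.bin"])
```

    Because `relay` depends only on the two interface methods, a layer built on top (a file system framework or a computing platform, as the abstract suggests) would not need to know which backend holds the data.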

    Table of contents:
    Abstract (Chinese)
    Extended Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    List of Symbols and Abbreviations
    Chapter 1 Introduction
    Chapter 2 System Architecture
      2.1 Data Storage Interface
      2.2 Storage
        2.2.1 Architecture
      2.3 Metrics
      2.4 Task Monitor
      2.5 Lock Manager
      2.6 Load Balancer
      2.7 Connection Pool
      2.8 Client REST API
    Chapter 3 Performance Evaluation
      3.1 Single-File Performance
      3.2 Resource Consumption
      3.3 Throughput
      3.4 Load Balance
    Chapter 4 Related Work
    Chapter 5 Conclusion
      5.1 Future Work
    References

    [1] Yan-Jhou Huang. Massively R Data Parallel Computation over Hadoop without MapReduce. National Cheng Kung University, 2016.
    [2] HDFS User Guide. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
    [3] HBase. https://hbase.apache.org/
    [4] Network-attached storage (NAS). http://searchstorage.techtarget.com/definition/network-attached-storage
    [5] Greenplum Database. http://greenplum.org/
    [6] J. Postel and J. Reynolds. RFC 959: File Transfer Protocol (FTP), 1985.
    [7] Common Internet File System. Microsoft. https://technet.microsoft.com/zh-tw/library/cc939973.aspx
    [8] SMB: The Server Message Block Protocol. http://ubiqx.org/cifs/SMB.html
    [9] Roy Thomas Fielding. Architectural Styles and the Design of Network-based Software Architectures, 2000.
    [10] MapReduce Tutorial. https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
    [11] Vinod Kumar Vavilapalli, Arun C. Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar, Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, Bikas Saha, Carlo Curino, Owen O'Malley, Sanjay Radia, Benjamin Reed, and Eric Baldeschwieler. Apache Hadoop YARN: Yet Another Resource Negotiator, 2013.
    [12] Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. Spark: Cluster Computing with Working Sets, 2010.
    [13] Patrick Hunt, Mahadev Konar, Flavio P. Junqueira, and Benjamin Reed. ZooKeeper: Wait-free Coordination for Internet-scale Systems, 2012.
    [14] Getting Started with Amazon S3. https://aws.amazon.com/tw/s3/getting-started/
    [15] Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy. Hive - A Warehousing Solution Over a Map-Reduce Framework, 2009.
    [16] Apache Pig. https://pig.apache.org/
    [17] HFS ~ Http File Server. http://www.rejetto.com/hfs/
    [18] Hung-Chang Hsiao, Hsueh-Yi Chung, Haiying Shen, and Yu-Chang Chao. Load Rebalancing for Distributed File Systems in Clouds. IEEE Transactions on Parallel and Distributed Systems, 24(5):951–962, 2013.
    [19] A. Hastings. Distributed Lock Management in a Transaction Processing Environment. In Proceedings of the IEEE 9th Symposium on Reliable Distributed Systems, 1990.
    [20] Avinash Lakshman and Prashant Malik. Cassandra - A Decentralized Structured Storage System, 2009.
    [21] Icinga. https://www.icinga.org/
    [22] John Howard, Michael Kazar, Sherri Menees, David Nichols, Mahadev Satyanarayanan, Robert Sidebotham, and Michael West. Scale and Performance in a Distributed File System. ACM Transactions on Computer Systems, 6(1):51–81, 1988.
    [23] Network Working Group. HTTP Over TLS, 2000.

    Full text availability: on campus, public from 2021-01-01; off campus, public from 2021-01-01