
Graduate Student: 林霆鈞
Lin, Ting-Jyun
Thesis Title: 在NoSQL資料庫上實現分散式快取系統:設計考量及效能影響
Distributed In-Memory Caches for NoSQL Persistent Stores: Design Considerations and Performance Impacts
Advisor: 蕭宏章
Hsiao, Hung-Chang
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2013
Graduation Academic Year: 101
Language: English
Number of Pages: 44
Chinese Keywords: NoSQL, distributed in-memory cache system
Foreign Keywords: NoSQL, distributed in-memory cache
  • NoSQL key-value persistent data stores are an emerging class of storage designed to handle rapidly growing volumes of data. Notable NoSQL products include Google Bigtable/Spanner, Hadoop HBase, Cassandra, MongoDB, and Couchbase. However, typical key-value stores manipulate data objects on disk, which is inefficient. We therefore propose a caching model that distributes data across multiple cache servers and holds it temporarily in their memories. We identify several design issues for such a caching model, concerning consistency, scalability, and availability. In particular, our study shows that developing a cache cluster requires an in-depth understanding of the backend persistent key-value store. We present our caching model assuming Hadoop HBase as the backend NoSQL database. Moreover, most NoSQL databases offer range search as a primary operation; our caching model performs range search in cooperation with the HBase backend, and we use computer simulations to investigate its performance.

    NoSQL key-value persistent data stores are emerging; they are designed to accommodate potentially large volumes of data that grow rapidly from a variety of sources. Examples of such key-value stores include Google Bigtable/Spanner, Hadoop HBase, Cassandra, MongoDB, and Couchbase. Because typical key-value stores manipulate data objects stored on disk, their accesses are inefficient. In this paper we present a caching model that stores data objects temporarily in memories distributed across a number of cache servers. We reveal a number of design issues involved in developing such a caching model, relevant to consistency, scalability, and availability. Specifically, our study notes that developing a cache cluster requires an in-depth understanding of the backend persistent key-value store. We then present a caching model, assuming that Hadoop HBase serves as our backend. In addition, we illustrate how range search, one of the major operations offered by most NoSQL key-value stores, is performed in our caching model together with the HBase backend. Computer simulations demonstrate our performance results.
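The core ideas in the abstract — objects hashed across cache servers, cache-aside reads backed by a persistent store, and range search that must be reconciled with hash partitioning — can be sketched as follows. This is a minimal illustrative model, not the thesis's implementation: the `CacheNode`/`CacheCluster` names are invented for this sketch, and the backend is modeled as a plain Python dict standing in for an HBase table.

```python
import bisect
import hashlib

class CacheNode:
    """One in-memory cache server: a dict plus a sorted key list for range scans."""
    def __init__(self):
        self.store = {}
        self.sorted_keys = []

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        if key not in self.store:
            bisect.insort(self.sorted_keys, key)
        self.store[key] = value

class CacheCluster:
    """Cache-aside layer in front of a backend key-value store.

    Hashing each key to a node mimics how data objects are spread over a
    number of cache servers; a real deployment would back this with HBase.
    """
    def __init__(self, backend, num_nodes=4):
        self.backend = backend          # stand-in for the persistent store
        self.nodes = [CacheNode() for _ in range(num_nodes)]

    def _node_for(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def get(self, key):
        node = self._node_for(key)
        value = node.get(key)
        if value is None:               # cache miss: fetch from the backend
            value = self.backend.get(key)
            if value is not None:
                node.put(key, value)    # populate the cache on the way out
        return value

    def put(self, key, value):
        # Write-through: update the backend first, then the cached copy,
        # so the cache never holds a value the backend lacks.
        self.backend[key] = value
        self._node_for(key).put(key, value)

    def range_search(self, lo, hi):
        """Hashing destroys key locality, so a range scan must consult every
        cache node and merge the results. (A full system would also fall back
        to the backend for sub-ranges not resident in any cache.)"""
        out = {}
        for node in self.nodes:
            i = bisect.bisect_left(node.sorted_keys, lo)
            j = bisect.bisect_right(node.sorted_keys, hi)
            for k in node.sorted_keys[i:j]:
                out[k] = node.store[k]
        return sorted(out.items())
```

The sketch makes one of the thesis's design tensions concrete: hash placement balances load across cache servers but scatters contiguous key ranges, which is why range search over the cache requires extra machinery (e.g., the partial result cache discussed in Chapter 4).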

    1 Introduction 5
    1.1 Goals and Challenges 7
    1.2 Our Contributions 9
    1.3 Roadmap 11
    2 Background 12
    2.1 Fundamental Operations in a Persistent Data Store 13
    2.2 Scalability and Self-Management 14
    2.3 Consistency 15
    2.4 Transactions 16
    2.5 Availability and Fault Tolerance 17
    3 Related Work 19
    3.1 Distributed Hash Tables 19
    3.2 Range Indexing and Search 20
    3.3 Consistency Model 21
    3.4 Hadoop HBase 24
    3.5 memcached 25
    4 Proposed Distributed In-Memory Caches 26
    4.1 Overview 26
    4.2 Data Object Placement and Replacement 26
    4.3 Range Indexing and Lookup 27
    4.3.1 Data Structures and Existing Algorithms 28
    4.3.2 Partial Result Cache 29
    4.3.3 Atomicity and FIFO/Total Ordering 30
    4.4 Read-One-Write-All and Two-Phase Commit Protocol 30
    4.5 Handling Failures 31
    5 Performance Analysis 33
    5.1 Parameters 33
    5.2 Performance Metrics 35
    5.3 Workloads 36
    5.4 Simulation Results 36
    6 Summary 40


    Full text not available for download.
    On campus: available from 2018-07-31.
    Off campus: not publicly available.
    The electronic thesis has not been authorized for public release; please consult the library catalog for the print copy.