| Graduate student: | 鄭凱元 Cheng, Kai-Yuan |
|---|---|
| Thesis title: | 在Apache HBase上高效能交易處理結合Scale-Out的快取系統:設計、實作、效能測試 High Performance Transactions Processing in Apache HBase with Scale-Out Caches: Design, Implementation and Performance Benchmarking |
| Advisor: | 蕭宏章 Hsiao, Hung-Chang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2014 |
| Academic year of graduation: | 102 |
| Language: | English |
| Pages: | 32 |
| Chinese keywords: | 分散式 (distributed), 快取系統 (cache system), 交易 (transaction) |
| English keywords: | HBase, Cache, transaction |
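Although Apache HBase™ is already an excellent distributed big data store, it lacks multi-row transactions. This thesis therefore studies an HBase system that provides transactions and adds scale-out caches to speed up transaction execution, which in turn improves overall system performance. In this study we find that adding caches alone cannot maintain transactional consistency across the whole system and violates the consistency property of the underlying database. We therefore use a centralized server to ensure that data, whether read from the caches or from the database, is consistent and up to date. We also provide an API that lets users decide whether to access the cache. Finally, in our experiments we use TPC-C to verify that adding scale-out caches improves the overall system's performance in executing transactions.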
Although Apache HBase™ is an emerging distributed key-value persistent data store, it lacks support for multi-row transactions. In this thesis we explore how HBase can be enabled to provide transaction processing. Specifically, we suggest a scale-out caching mechanism to improve overall system throughput when performing transactions. With caches, we observe that application programmers may have difficulty dealing with consistency issues between the caches and the database servers. We thus suggest a centralized approach to guarantee that data items stored in caches are up to date and consistent with their persistent database stores. In addition, we provide APIs that let programmers determine whether their application data are cached. Finally, using TPC-C, our experimental results in real environments validate that scale-out caches are efficient and effective in accelerating data access to NoSQL databases.