| 研究生: |
石允齊 Shih, Yun-Chi |
|---|---|
| 論文名稱: |
基於GPU之資料庫快速重建技術 Restoring Databases with GPU |
| 指導教授: |
蕭宏章
Hsiao, Hung-Chang |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2019 |
| 畢業學年度: | 107 |
| 語文別: | 中文 |
| 論文頁數: | 26 |
| 中文關鍵詞: | LevelDB 、CUDA 、GPU 、Skiplist |
| 外文關鍵詞: | LevelDB, CUDA, GPU, Skiplist |
| 相關次數: | 點閱:32 下載:1 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
Compute Unified Device Architecture (CUDA) 是由Nvidia 所推出的一種整合計算,透過Nvidia GPU多核的特性去解決資料密集的計算。而LevelDB是Google設計的一個開放原始碼 key-value 儲存的資料庫函式庫。本論文研究基於CUDA及LevelDB下提出一具平行處理的資料結構,設計的目的乃針對大量資料快速建立可查詢之資料庫,以及快速恢復資料庫之使用。我們提出GPU in-memory平行索引 (in-memory parallelized indexing) 建造方法並 透過實現於LevelDB 驗證。實驗結果發現,我們透過GPU來平行構建in-memory indexing可較傳統基於CPU的LevelDB在執行速度上達將近700倍。
The thesis presents a novel database restoring algorithm and its implementation with a massive number of computing units in GPU. Compared with traditional CPU-based database restoring schema. Our proposed solution outperforms the CPU-based one over 700x in terms of execution time. Specifically, we present a novel algorithm in constructing in-memory indices with GPU. Given a set of data items, we exploit potential parallelism for building such in-memory indices while typical solutions attack the problem inefficiently by injecting data items into an DB one at a time. We demonstrate our idea by augmenting LevelDB, a well-know database serving as a library. Our present implementation is basedon Nvidia GPU CUDA (Compute Unified Device Architecture).
[1] "Graphics processing unit," [Online]. Available: https://en.wikipedia.org/wiki/Graphics_processing_unit.
[2] C. Bishop, Pattern recognition and machine learning, 2006.
[3] "Hadoop," Apache, [Online]. Available: https://hadoop.apache.org/.
[4] K. Shvachko, H. Kuang, S. Radia and R. Chansler, "The Hadoop Distributed File System," 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 2010.
[5] "HBase," Apache, [Online]. Available: https://hbase.apache.org/.
[6] "LevelDB," google, [Online]. Available: https://github.com/google/leveldb.
[7] "Google Chrome," [Online]. Available: https://en.wikipedia.org/wiki/Google_Chrome.
[8] M. Swan, Blockchain: Blueprint for a new economy, O'REILLY.
[9] "BitcoinCore," Bitcoin, [Online]. Available: https://bitcoin.org/en/bitcoin-core/.
[10] "Go Ethereum," [Online]. Available: https://geth.ethereum.org/.
[11] M. F. SANNER, "Python: a programming language for software integration and development," The Scripps Research Institute.
[12] “R (programming language),” [線上]. Available: https://en.wikipedia.org/wiki/R_(programming_language).
[13] C.-P. Tsai, H.-C. Hsiao, Y.-C. Chao, M. Hsu and A. R. Chang, "Bridging the Gap between Big Data System Software Stack and Application," in IEEE International Conference on Big Data, The Case of Semiconductor Wafer Fabrication Foundries, 2018.
[14] W. Pugh, "Skip Lists: A Probabilistic Alternative to Balanced Trees".
[15] "Cuda Toolkit," Nvidia, [Online]. Available: https://developer.nvidia.com/cuda-downloads.
[16] "Cuda Toolkit Documentation," Nvidia, [Online]. Available: https://docs.nvidia.com/cuda/.
[17] D. R. BW Kernighan, The C programming language, 2006.
[18] "Stream processing," Wikipedia, [Online]. Available: https://en.wikipedia.org/wiki/Stream_processing.
[19] "Single Instruction Multiple Thread," [Online]. Available: https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads.
[20] R. Cattell, "Scalable SQL and NoSQL data stores," Association for Computing Machinery, pp. 12-27, 4 12 2010.
[21] P. O’Neil, E. Cheng, D. Gawlick and E. O’Neil, "The log-structured merge-tree (LSM-tree)," Acta Informatica, pp. 351-385, 7 1996.
[22] P. Misra and M. Chaudhuri, "Performance Evaluation of Concurrent Lock-free Data Structures on GPUs," 2012 IEEE 18th International Conference on Parallel and Distributed Systems, 17-19 12 2012.
[23] N. Moscovici, N. Cohen and E. Petrank, "A GPU-Friendly Skiplist Algorithm," in 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2017.
[24] "Garbage collection," wikipedia, [Online]. Available: https://en.wikipedia.org/wiki/Tracing_garbage_collection.
[25] A. H. Team, "Apache HBase ™ Reference Guide," [Online]. Available: https://hbase.apache.org/book.html#arch.bulk.load.
[26] J. Dean and S. Ghemawat, "MapReduce:simplifed data processing on large clusters," Communications of the ACM - 50th anniversary issue: 1958 - 2008, pp. 107-113, 1 1 2008.
[27] "Kinetica," Kinetica, [Online]. Available: https://www.kinetica.com/.
[28] "Thrust," [Online]. Available: https://thrust.github.io/.
[29] "Radix sort," Wekipedia, [Online]. Available: https://en.wikipedia.org/wiki/Radix_sort.
[30] "Memory Allocation Overhead," virginia university department of computer science, [Online]. Available: https://www.cs.virginia.edu/~mwb7w/cuda_support/memory_management_overhead.html.