| 研究生: |
呂明翰 Lyu, Ming-Han |
|---|---|
| 論文名稱: |
InnoDB大量資料寫入之磁碟頻寬效能提升探討 Exploiting the Disk Bandwidth for Heavy Write Traffic in InnoDB |
| 指導教授: |
蕭宏章
Hsiao, Hung-Chang |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2017 |
| 畢業學年度: | 105 |
| 語文別: | 中文 |
| 論文頁數: | 38 |
| 中文關鍵詞: | MariaDB 、InnoDB 、磁碟 、Flush |
| 外文關鍵詞: | MariaDB, InnoDB, Disk, Flush |
| 相關次數: | 點閱:130 下載:2 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
在面對大數據的時代,由於資料龐大,儲存資料的伺服器因為記憶體往往不敷所需,使得資料得仰賴傳統磁碟系統來負責保存。當今的儲存元件技術,記憶體的儲存空間與磁碟系統有這數個量級的差異,分別是Gbytes及Tbytes (甚至Pbytes) 等級。再加上對資料的計算分析必須仰賴記憶體空間,因此資料存取的一個核心議題為如何有效率地在伺服器裡的磁碟與記憶體之間做資料移動或複製。其中資料讀取乃指定的資料從磁碟系統讀出並存在記憶體,而資料儲存則為將暫存在記憶體的資料移至磁碟永久保存。
本論文研究著重探討如何有效率地進行資料儲存,即將資料盡可能快速的從記憶體移至磁碟保存。儲存資料的效能瓶頸議題其中一個是面對於磁碟的存取速度遠遠不及記憶體的資料處理能力,並且大部分的應用程式對於傳統磁碟未能達到良好的使用。我們聚焦在當前被廣泛使用的傳統關聯式資料庫MariaDB (前身為MySQL) 所使用的儲存引擎InnoDB,探討在面對大量資料寫入的前提下將InnoDB記憶體與磁碟同步機制始能最大化使用硬碟的寫入頻寬。我們調查可能影響硬碟寫入頻寬的因素,包含連續寫的程度,每次寫入的page數量及page的大小設定。同時釐清這些影響面向對於磁碟頻寬使用的個別程度,我們進一步將影響相對重大的因素提出對應的演算法。將這些想法被實現於InnoD裡使進行實際評測。評測結果發現我們的想法能有效開發硬碟頻寬最高可達80%。並提升原本InnoDB磁碟寫入效能約90%。
Emerging big data applications demand to store data in size of tera-/peta-bytes. While the storage space of traditional hard disk drives (HDD's) increases exponentially per year, HDD's are often the performance bottlenecks in storage systems including file systems and databases. We investigate in this thesis study performance issues for relational databases when accommodating vast data that are generated timely. Specifically, we aim at MariaDB with InnoDB storage engine, a popular database used widely by industry. As traditional databases are designed with mostly read data traffic, they may not perform well for data written heavily. We identify potential performance issues in MariaDB/InnoDB for processing enormous data. Solutions attacking the issues are also suggested. Extensive performance studies based on our augmentation to MariaDB/InnoDB are also performed to validate our proposals. Moreover, we discuss side effects due to our solutions.
[1] A. F. Conn, J. V. Cavanaugh, and S. Cho, “A Theoretical Design for SSD Texture Storage,” in 2012 41st International Conference on Parallel Processing Workshops, 2012, pp. 581–587.
[2] P. O’Neil, E. Cheng, D. Gawlick, and E. O’Neil, “The Log-structured Merge-Tree (LSM Tree),” Acta Inf., vol. 33, no. 4, pp. 351–385, Jun. 1996.
[3] D. S. Batory, “B+ Trees and Indexed Sequential Files: A Performance Comparison,” in Proceedings of the 1981 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 1981, pp. 30–39.
[4] MariaDB. Available: https://mariadb.org/
[5] Storage Engines of MariaDB. Available: https://mariadb.com/kb/en/mariadb/storage-engines/
[6] InnoDB. Available: https://mariadb.com/kb/en/mariadb/xtradb-and-innodb/
[7] Transaction of InnoDB. Available: https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-model.html
[8] The join and foreign key between tables. Available: https://dev.mysql.com/doc/refman/5.7/en/innodb-foreign-key-constraints.html
[9] MVCC in InnoDB. Available: https://dev.mysql.com/doc/refman/5.7/en/innodb-multi-versioning.html
[10] Lock mode in InnoDB. Available: https://mariadb.com/kb/en/mariadb/xtradbinnodb-lock-modes/
[11] Recovery mode in InnoDB. Available: https://mariadb.com/kb/en/mariadb/xtradbinnodb-recovery-modes/
[12] Buffer Pool of InnoDB. Available: https://mariadb.com/kb/en/mariadb/xtradbinnodb-buffer-pool/
[13] Doublewrite of InnoDB. Available: https://mariadb.com/kb/en/mariadb/xtradbinnodb-doublewrite-buffer/
[14] MySQL. Available: https://www.mysql.com/
[15] SQLite. Available: https://www.sqlite.org/
[16] MongoDB. Available: https://www.mongodb.com/
[17] LevelDB. Available: http://leveldb.org/
[18] HBase. Available: https://hbase.apache.org/
[19] Percona Server. Available: https://www.percona.com/software/mysql-database/percona-server
[20] MariaDB vs MySQL. Available: https://mariadb.com/kb/en/mariadb/mariadb-vs-mysql-compatibility/
[21] XtraDB. Available: https://mariadb.com/kb/en/mariadb/about-xtradb/
[22] Aria. Available: https://mariadb.com/kb/en/mariadb/aria-storage-engine/
[23] Memory storage engine. Available: https://mariadb.com/kb/en/mariadb/memory-storage-engine/
[24] Secondary Index of InnoDB. Available: https://dev.mysql.com/doc/refman/5.7/en/innodb-index-types.html
[25] Undo log of InnoDB. Available: https://mariadb.com/kb/en/mariadb/undo-log/
[26] InnoDB空間請求設定. Available: https://mariadb.com/kb/en/mariadb/xtradbinnodb-server-system-variables/#innodb_data_file_path
[27] MariaDB結構示意圖. Available: https://www.slideshare.net/bytebot/mariadb-indepth-hands-on-training-in-seoul
[28] W. Naheman and J. Wei, “Review of NoSQL databases and performance testing on HBase,” in Proceedings 2013 International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC), 2013, pp. 2304–2309.
[29] J. Li, Z. Xu, Y. Jiang, and R. Zhang, “The overview of big data storage and management,” in 2014 IEEE 13th International Conference on Cognitive Informatics and Cognitive Computing, 2014, pp. 510–513.
[30] R. Nair, “Evolution of Memory Architecture,” Proceedings of the IEEE, vol. 103, no. 8, pp. 133
[31] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop Distributed File System,” in 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 2010, pp. 1–10.