成功大學博碩士論文系統

簡易檢索 / 詳目顯示

回結果列表

研究生：	楊祐昇 Yang, You-Sheng
論文名稱：	利用 GPU 加速大型基因體序列在 Smith-Waterman 加入仿射間隙模型演算法上的比對 Accelerating Large-Scale DNA Sequence Alignment Using the Smith-Waterman with affine gap Algorithm on GPU
指導教授：	賀保羅 Horton, Paul
學位類別：	碩士 Master
系所名稱：	電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering
論文出版年：	2025
畢業學年度：	113
語文別：	英文
論文頁數：	47
中文關鍵詞：	大型基因體序列比對、GPU加速、帶有仿射間隙懲罰的區域比對、Smith-Waterman演算法
外文關鍵詞：	Large-Scale Genomic Sequence Alignment, GPU acceleration, local alignment with affine gap, Smith-Waterman Algorithm, CUDA
相關次數：	點閱：84 下載：0
分享至:	分享至facebook 分享至twitter

查詢本校圖書館目錄查詢臺灣博碩士論文知識加值系統勘誤回報

基因序列比對在生物資訊學與分子生物學中扮演非常重要的角色，它可以被用在基因功能預測、物種演化分析、疾病研究等各種分析上，然而這些方法通常具有平方級的時間複雜度，導致在面對大規模基因體資料時，所需的計算資源及時間成本急遽增加，成為分析上的一大瓶頸。
本篇論文研究的主要目標，旨在針對人類粒線體(mtDNA)與核基因體(nDNA)序列的局部比對問題，設計並實作一套高效能的加速方法，我們基於 Compute Unified Device Architecture (CUDA) 平台，根據 Smith-Waterman 這個演算法並加入仿射間隙懲罰（affine gap penalty）模型，透過優化 Sandes 等人於2013年提出的 GPU 優化策略 (“CUDAlign: Using GPU to Accelerate the Comparison of Megabase Genomic Sequences＂. Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming) 在 GPU 記憶體空間分配，將 Smith-Waterman 加入仿射間隙模型演算法中的矩陣壓縮成一維陣列，加速 GPU 在計算時存取記憶體速度同時因為減少矩陣佔用空間因此也增加 GPU 單次可計算的範圍，除此之外我們也修改其演算法，將原本演算法限制其 ?ℎ????? ??????? 必須被序列整除修改成使用者可以自由設定 ?ℎ????? ??????? ，如此一來可以排除序列質因數分解時出現較大的質數，導致平行度受限的問題，同時使用者也可以透過彈性的設定找出所使用的 GPU 最佳配置，進一步提升了局部比對的整體效率與可擴展性，我們透過實驗找出在不同的 GPU 上最適合的 threads 數量，實作出適用於粒線體與核基因體等此類大型基因序列的比對工具。
實驗結果證實該工具不僅實現了顯著的加速效果，並提供視覺化的圖表分析，讓使用者可以根據自身需求進行分析，本研究成果已開放於 GitHub 平台，使用者可透過以下連結取得並自由下載使用本工具：https://github.com/YYS13/LargeScaleSW

Sequence alignment plays a critical role in bioinformatics and molecular biology. It is widely used in various analyses such as gene function prediction, evolutionary studies, and disease research. However, these methods generally exhibit quadratic time complexity, leading to a rapid increase in computational resources and time costs when handling large-scale genomic data, thus becoming a major bottleneck in biological data analysis.
The primary goal of this study is to design and implement a high-performance acceleration scheme for the local alignment of human mitochondrial DNA and nuclear genome (nDNA) sequences. Leveraging the Compute Unified Device Architecture (CUDA) platform, we build on the Smith–Waterman algorithm with an affine gap-penalty model. Furthermore, we optimize GPU memory allocation by incorporating the strategy proposed by Sandes et al. in 2013 ("CUDAlign: Using GPU to Accelerate the Comparison of Megabase Genomic Sequences", Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming) . In our implementation, the dynamic programming matrices used in the Smith–Waterman algorithm with affine gap penalty are compressed into one-dimensional arrays, which both speeds up memory access during computation and enlarges the sequence window that fits in GPU memory. In addition, we relax the original algorithm’s constraint that threadsPerBlock must exactly divide the sequence length: users can now freely specify threadsPerBlock, avoiding parallelism limits caused by large prime factors in the sequence length. This flexibility allows users to find the best configuration for their specific GPU, further improving overall efficiency and scalability. Through extensive experiments, we identify the optimal thread counts on different GPUs and deliver a tool tailored to large sequences such as mtDNA and nDNA.
Experimental results demonstrate that the developed tool achieves significant speedup and provides visualization charts, enabling users to perform customized analyses according to their specific needs. The tool has been made publicly available on the GitHub platform and can be freely downloaded and used via the following link: https://github.com/YYS13/LargeScaleSW

中文摘要 i
Abstract iii
誌謝 v
Contents vi
List of Tables viii
List of Figures ix
Nomenclature x
Introduction 1
1 Background and Motivation 1
2 Smith-Waterman Algorithm 3
3 CUDA architecture 6
Related Work 8
Methods 10
1 Anti-Diagonal Strategy and Data Dependency 10
2 Coalesced Memory Access 11
3 Cell Delegation 13
3.1 Data Harzard in Cell Delegation 13
4 Sliding Window 15
5 Workflow 16
5.1 Pseudo Code(Host) 16
5.2 Pseudo Code(Device) 19
Results 21
1 Data 21
2 Experiment Setup 21
3 Benchmark 22
4 Configuration setting 22
4.1 Speedup 24
4.2 Segments 25
Discussion 30
1 Future Work 31
Conclusion 33
Bibliography 34
                                    

[1] Fang Zheng et al. “Accelerating Biological Sequence Alignment Algorithm on GPU with CUDA”. 2011 International Conference on Computational and Information Sciences (2011), pp. 1112–1115.
[2] H. Satam et al. “Next-Generation Sequencing Technology: Current Trends and Advancements”. Biology (2023).
[3] Ian Buck et al. “NVIDIA CUDA Software and GPU Parallel Computing Architecture”. Proceedings of the 6th ACM Conference on Computing Frontiers (2009).
[4] John Nickolls et al. “Scalable Parallel Programming with CUDA”. ACM Queue (2008).
[5] Kailash W. Kalare et al. “Parallelization of Global Sequence Alignment on Graphics Processing Unit” (2020), pp. 1–5.
[6] Nauman Ahmed et al. “GPU Accelerated API for Alignment of Genomics Sequencing Data”. IEEE (2017), pp. 1–8.
[7] Paul Horton et al. “Mammalian NUMT insertion is nonrandom”. Nucleic Acids Research 40 (2012), pp. 9073–9088.
[8] Srivalsa Bhaskaran et al. “A Review of Next Generation Sequencing Methods and its Applications in Laboratory Diagnosis”. Journal of Pure and Applied Microbiology (2022), pp. 825–833.
[9] Stephen F. Altschul et al. “Optimal Sequence Alignment Using Affine Gap Costs”. Bulletin of Mathematical Biology (1986), pp. 603–616
[10] T. R. P. Siriwardena et al. “Accelerating Global Sequence Alignment using CUDA Compatible Multi-core GPU” (2010), pp. 250–255. doi: 10.1109/ICIAFS.2010. 5715652.
[11] Temple F. Smith et al. “Identification of Common Molecular Subsequences”. Journal of Molecular Biology (1981), pp. 195–197.
[12] W. Wei et al. “Nuclear-embedded mitochondrial DNA sequences in 66,083 human genomes”. Nature 611 (2022), pp. 105–114.
[13] Edans Flavius de O. Sandes et al. “CUDAlign: Using GPU to Accelerate the Comparison of Megabase Genomic Sequences”. Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2010), pp. 137–146.

校外：立即公開

簡易檢索 / 詳目顯示

相關論文