簡易檢索 / 詳目顯示

研究生: 林子瑜
Lin, Tzu-Yu
論文名稱: 應用FPGA加速Minimap2第三代基因序列比對演算法之研究
A Study on Accelerating the Minimap2 Third-Generation Genomic Sequence Alignment Algorithm Using FPGA
指導教授: 黃吉川
Hwang, Chi-Chuan
學位類別: 碩士
Master
系所名稱: 工學院 - 工程科學系
Department of Engineering Science
論文出版年: 2024
畢業學年度: 112
語文別: 中文
論文頁數: 93
中文關鍵詞: 基因定序動態規劃硬體加速FPGA
外文關鍵詞: Genetic sequencing, Dynamic programming, Hardware acceleration, FPGA
相關次數: 點閱:87下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • DNA 定序的過程通常需要將序列片段和參考基因組做比對,本論文採用目前最多人使用的Minimap2第三代基因序列比對軟體,它使用的演算法包含三個步驟:蒐集種子、計算連結、計算比對。蒐集種子是找出在序列片段和參考序列上出現的相同鹼基序列片段,這部分使用了Minimizer演算法,並利用雜湊表找出相同的片段。計算串鏈則是把前面所找出的鹼基片段依據它們的在序列上的位置找出距離相近的,並串連成數條種子鏈並進行評分,這部分使用的是動態規劃演算法實現。計算比對是比對前面相連的鹼基片段中的空白片段,這部分採取的比對演算法為動態規劃。本文基於Minimap2計算串鏈的演算法設計了一個FPGA加速平台,首先通過調整串鏈演算法的計算順序後減少硬體的最長路徑來提升硬體工作頻率,並且設計多個平行運算單元以及pipeline 的結構,達到了 50 倍的平行化。本文的設計實現在Xilinx Alveo U250 FPGA上,加速器的部分運行125MHz的頻率上,並通過PCIe Gen3 x16介面與Host端相連。在此設計下,本文以PacBio HG002樣本進行測試,整體的比對流程約耗時103分鐘,與原本的軟體在CPU上以28線程執行,耗時約215分鐘相比,提升了約2.13倍的效率。本研究的結果表明,通過FPGA的平行計算能力以及靈活硬體架構設計,可以顯著提升基因長序列比對的速度和效率。此硬體加速方法在基因定序和病理分析中的應用,展示了其在未來醫學領域的巨大潛力。

    DNA sequencing often involves aligning sequence fragments with a reference genome. This paper uses Minimap2, a widely used third-generation genome alignment software, which employs three steps: seed collection, chaining, and alignment. Seed collection identifies identical base fragments using the Minimizer algorithm and a hash table. Chaining links nearby fragments into chains using dynamic programming, and alignment fills gaps between these fragments.
    The paper designs an FPGA acceleration platform for Minimap2's chaining algorithm. By optimizing the computation order, the hardware's critical path is reduced, enhancing frequency. Multiple parallel computing units and a pipeline structure achieve 50-fold parallelization. Implemented on a Xilinx Alveo U250 FPGA, the accelerator runs at 125MHz, connected via PCIe Gen3 x16. Testing with the PacBio HG002 sample, the process takes about 103 minutes, versus 215 minutes on a 28-thread CPU, improving efficiency by 2.13 times.
    Results show that FPGA's parallel computing and flexible design significantly boost genome alignment speed and efficiency, indicating great potential for gene sequencing and pathological analysis in future medical applications.

    摘要 i Extended Abstract ii 誌謝 viii 目錄 ix 表目錄 xii 圖目錄 xiii 第一章 緒論 1 1.1 研究背景 1 1.2 國內外研究現狀 4 1.2.1 BWA-MEM的加速研究現狀 5 1.2.2 Minimap2的加速研究現狀 6 1.3 研究動機 7 1.4 本文架構 8 第二章 相關概念與方法 9 2.1 基因定序技術介紹 9 2.2 序列比對相關概念介紹 10 2.2.1 遺傳與變異 10 2.2.2 序列比對流程 12 2.3 Minimap2比對流程 13 2.4 現場可程式化邏輯閘陣列 15 2.5 FPGA設計流程 17 2.6 本章小結 20 第三章 研究主軸 21 3.1 Minimap2演算法分析 21 3.1.1 種子生成演算法 21 3.1.2 串鏈演算法 24 3.1.3 拓展比對演算法 26 3.2 Minimap2效能分析 30 3.2.1 函數執行流程分析 30 3.2.2 函數耗時分析 31 3.3 AXI總線協議 32 3.3.1 AXI簡介 32 3.3.2 AXI通道定義 34 3.3.3 AXI握手機制 34 3.3.4 AXI讀取傳輸事務 36 3.3.5 AXI寫入傳輸事務 39 3.4 PCIe協定 42 3.4.1 PCIe協定版本 42 3.4.2 PCIe架構 43 第四章 硬體架構實現 45 4.1 優化串鏈演算法之計算順序 45 4.2 串鏈演算法硬體加速比對流程 47 4.3 輸入文件前處理 48 4.4 系統架構 49 4.5 加速器硬體架構設計與實現 50 4.5.1 Top_minimap2_wrapper 51 4.5.2 Controller 51 4.5.3 Decoder 54 4.5.4 Requester 56 4.5.5 PE Array 59 4.5.6 Output Scheduler 61 4.6 FPGA實現流程 62 4.6.1 功能及時序模擬 62 4.6.2 電路合成與實現 63 4.6.3 FPGA燒錄與測試 63 第五章 結果與討論 65 第六章 未來展望 69 參考文獻 71

    [1] Gaj T, Sirk SJ, Shui SL, Liu J. Genome-Editing Technologies: Principles and Applications. Cold Spring Harbor Perspective in biology. 2016 Dec.
    [2] Watson, James D. and Francis Harry Compton Crick. Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid. Nature 171 (1953): 737-738.
    [3] Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A. 1977 Dec.
    [4] DNA sequencing costs: data. 2021. National Human Genome Research Institute. Retrieved 1 Mar 2021. from https://www.genome.gov/about-genomics/factsheets/Sequencing-Human-Genome-cost.
    [5] Reis-Filho JS. Next-generation sequencing. Breast Cancer Res. 2009;11 Suppl 3(Suppl 3): S12.
    [6] Stephens Z D, Lee S Y, Faghri F, et al. Big Data: Astronomical or Genomical? [J]. PLoS Biol, 2015, 13(7): e1002195-e.
    [7] Li H, Homer N. A survey of sequence alignment algorithms for next-generation sequencing [J]. Brief Bioinform, 2010, 11(5): 473-83.
    [8] Li H, Handsaker B, Wysoker A, et al. The sequence alignment/map format and SAMtools [J]. Bioinformatics, 2009, 25(16): 2078-9.
    [9] Kleiman S L. The picard scheme [J]. arXiv preprint math/0504020, 2005.
    [10] Bauer D. Variant calling comparison CASAVA1. 8 and GATK [J]. Nature precedings, 2011.
    [11] Houtgast E J, Sima V-M, Bertels K, et al. Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths [J]. Computational biology and chemistry, 2018, 75(54-64). 71
    [12] Fonseca N A, Rung J, Brazma A, et al. Tools for mapping high-throughput sequencing data [J]. Bioinformatics, 2012, 28(24): 3169-77.
    [13] Li H, Ruan J, Durbin R. Mapping short DNA sequencing reads and calling variants using mapping quality scores [J]. Genome Res, 2008, 18(11): 1851-8.
    [14] Li R, Yu C, Li Y, et al. SOAP2: an improved ultrafast tool for short read alignment [J]. Bioinformatics, 2009, 25(15): 1966-7.
    [15] Langmead B, Salzberg S L. Fast gapped-read alignment with Bowtie 2 [J]. Nat Methods, 2012, 9(4): 357-9.
    [16] Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM [J]. ArXiv, 2013, 1303.
    [17] Li H, Durbin R. Fast and accurate long-read alignment with Burrows–Wheeler transform [J]. Bioinformatics, 2010, 26(5): 589-95.
    [18] Li H. Minimap2: pairwise alignment for nucleotide sequences [J]. Bioinformatics, 2018, 34(18): 3094-100.
    [19] Burrows M, Wheeler D. A block-sorting lossless data compression algorithm; proceedings of the Digital SRC Research Report, F, 1994 [C].
    [20] Manber U, Myers G. Suffix arrays: a new method for on-line string searches [J]. siam Journal on Computing, 1993, 22(5): 935-48.
    [21] Li H. Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly [J]. Bioinformatics, 2012, 28(14): 1838-44.
    [22] Vasimuddin M, Misra S, Li H, et al. Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems; proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), F 20-24 May 2019, 2019 [C].
    [23] Chang M F, Chen Y, Cong J, et al. The SMEM Seeding Acceleration for DNA Sequence Alignment; proceedings of the 2016 IEEE 24th Annual International 72 Symposium on Field-Programmable Custom Computing Machines (FCCM), F 1-3 May 2016, 2016 [C].
    [24] Smith T F, Waterman M S. Identification of common molecular subsequences [J]. J Mol Biol, 1981, 147(1): 195-7.
    [25] Myers E W, Miller W. Optimal alignments in linear space [J]. Bioinformatics, 1988, 4(1): 11-7.
    [26] Wozniak A. Using video-oriented instructions to speed up sequence comparison [J]. Bioinformatics, 1997, 13(2): 145-50.
    [27] Rognes T, Seeberg E. Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors [J]. Bioinformatics, 2000, 16(8): 699-706.
    [28] Farrar M. Striped Smith–Waterman speeds database searches six times over other SIMD implementations [J]. Bioinformatics, 2006, 23(2): 156-61.
    [29] Houtgast E J, Sima V-M, Bertels K, et al. An FPGA-based systolic array to accelerate the BWA-MEM genomic mapping algorithm; proceedings of the 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), F, 2015 [C]. IEEE.
    [30] Kung H. Systolic array [M]. Encyclopedia of computer science. 2003: 1741-3.
    [31] Ahmed N, Sima V, Houtgast E, et al. Heterogeneous Hardware/Software Acceleration of the BWA-MEM DNA Alignment Algorithm [M]. 2015.
    [32] Houtgast E J, Sima V-M, Bertels K, et al. GPU-accelerated BWA-MEM genomic mapping algorithm using adaptive load balancing; proceedings of the International Conference on Architecture of Computing Systems, F, 2016 [C]. Springer.
    [33] Houtgast E J, Sima V, Marchiori G, et al. Power-efficiency analysis of accelerated BWA-MEM implementations on heterogeneous computing platforms; proceedings of 73 the 2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig), F 30 Nov.-2 Dec. 2016, 2016 [C].
    [34] Feng Z, Qiu S, Wang L, et al. Accelerating long read alignment on three processors; proceedings of the Proceedings of the 48th International Conference on Parallel Processing, F, 2019 [C].
    [35] Guo L, Lau J, Ruan Z, et al. Hardware Acceleration of Long Read Pairwise Overlapping in Genome Sequencing: A Race Between FPGA and GPU [M]. 2019.
    [36] Bartlett J M, Stirling D. A short history of the polymerase chain reaction [M]. PCR protocols. Springer. 2003: 3-6.
    [37] Yanhu L, Lu W, Li Y. The principle and application of the single-molecule real-time sequencing technology [J]. Yi chuan= Hereditas, 2015, 37(3): 259-68.
    [38] Davis, Brigid M., Michael C. Chao, and Matthew K. Waldor. Entering the era of bacterial epigenomics with single molecule real time DNA sequencing. Current opinion in microbiology 16.2 (2013): 192-198.
    [39] Ardui, Simon, et al. Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics. Nucleic acids research 46.5 (2018): 2159-2168.
    [40] Healy K. Nanopore-based single-molecule DNA analysis [J]. 2007.
    [41] Branton, Daniel, et al. "The potential and challenges of nanopore sequencing." Nature biotechnology 26.10 (2008): 1146-1153.
    [42] Leggett, Richard M., and Matthew D. Clark. "A world of opportunities with nanopore sequencing." Journal of experimental botany 68.20 (2017): 5419- 5429.
    [43] Needleman S B, Wunsch C D. A general method applicable to the search for similarities in the amino acid sequence of two proteins [J]. Journal of molecular biology, 1970, 48(3): 443-53. 74
    [44] Li, Heng. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34.18 (2018): 3094-3100.
    [45] Li, Heng. New strategies to improve minimap2 alignment accuracy. Bioinformatics 37.23 (2021): 4572-4574.
    [46] Harris, Robert S., Monika Cechova, and Kateryna D. Makova. "Noise-cancelling repeat finder: uncovering tandem repeats in error-prone long-read sequencing data." Bioinformatics 35.22 (2019): 4809-4811.
    [47] Brown, Stephen D., et al. Field-programmable gate arrays. Vol. 180. Springer Science & Business Media, 1992.
    [48] Feist, Tom. "Vivado design suite." White Paper 5 (2012): 30.
    [49] Schleimer S, Wilkerson D S, Aiken A. Winnowing: local algorithms for document fingerprinting; proceedings of the Proceedings of the 2003 ACM SIGMOD international conference on Management of data, F, 2003 [C].
    [50] Li H. Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences [J]. Bioinformatics, 2016, 32(14): 2103-10.
    [51] Holtgrewe, Manuel. Mason: a read simulator for second generation sequencing data. (2010).
    [52] Abouelhoda, Mohamed Ibrahim, and Enno Ohlebusch. Chaining algorithms for multiple genome comparison. Journal of Discrete Algorithms 3.2-4 (2005): 321-341.
    [53] Gotoh O. Optimal sequence alignment allowing for long gaps [J]. Bull Math Biol, 1990, 52(3): 359-73.
    [54] Farrar M. Striped Smith-Waterman speeds database searches six times over other SIMD implementations [J]. Bioinformatics, 2007, 23(2): 156-61.
    [55] Suzuki H, Kasahara M. Introducing difference recurrence relations for faster semiglobal alignment of long sequences [J]. BMC Bioinformatics, 2018, 19(Suppl 1): 45. 75
    [56] Lucero, James. Simulating High Performance Video Systems with Bus Functional Models.
    [57] Pradeep, S., and C. Laxmi. Design and verification environment for AMBA AXI protocol for SoC integration. Int. J. Res. Eng. Technol 3 (2014): 338-343.
    [58] Gaikwad, Nikhil, and Vijay N. Patil. Verification of AMBA AXI on-chip communication protocol. 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). IEEE, 2018.
    [59] Feist, Tom. Vivado design suite. White Paper 5 (2012): 30.
    [60] Chen, Chien-Hung, Jiun-Cheng Ju, and Jer Huang. A synthesizable AXI protocol checker for SoC integration. 2010 International SoC Design Conference. IEEE, 2010.
    [61] Tidala, Neethika. "High performance network on chip using AXI4 protocol interface on an FPGA." 2018 second international conference on electronics, communication and aerospace technology (ICECA). IEEE, 2018.
    [62] Noami, Ahmed, B. Pradeep Kumar, and P. Chandrasekhar. High performance AXI4 interface protocol for multi-core memory controller on SoC. Data Engineering and Communication Technology. Springer, Singapore, 2021. 131- 140.
    [63] Lucero, James. Simulating High Performance Video Systems with Bus Functional Models.
    [64] Zazo, Jose Fernando, et al. A PCIe DMA engine to support the virtualization of 40 Gbps FPGA-accelerated network appliances. 2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig). IEEE, 2015.
    [65] Lee, Jin. Implementation and Performance Evaluation of PCI express on Xilinx FPGA. Journal of the Korea Institute of Information and Communication Engineering 22.12 (2018): 1667-1674. 76
    [66] Xilinx, Xilinx PCI. Express DMA Drivers and Software Guide; Xilinx Answer 65444. (2016).
    [67] Zabołotny, Wojciech M. DMA implementations for FPGA-based data acquisition systems. Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments 2017. Vol. 10445. SPIE, 2017.
    [68] Liu, Henry. Pipeline engineering. CRC Press, 2017.
    [69] Hon, T., Mars, K., Young, G. et al. Highly accurate long-read HiFi sequencing data for five complex genomes. Sci Data 7, 399 (2020).
    [70] Wenger AM, Peluso P, Rowell WJ, Chang PC, Hall RJ, Concepcion GT, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019, 3710:1155-1162.
    [71] Dandan Lang, Shilai Zhang, Pingping Ren, Fan Liang, Zongyi Sun, Guanliang Meng, Yuntao Tan, Xiaokang Li, Qihua Lai, Lingling Han, Depeng Wang, Fengyi Hu, Wen Wang, Shanlin Liu, Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore, GigaScience, Volume 9, Issue 12, December 2020, giaa123.

    無法下載圖示 校內:2029-07-29公開
    校外:2029-07-29公開
    電子論文尚未授權公開,紙本請查館藏目錄
    QR CODE