| Graduate student: | 王尊緯 Wang, Tsun-Wei |
|---|---|
| Thesis title: | WebGPU for High-Performance Bioinformatics Computing: Browser-Based Optimization of the Pair-HMM Forward Algorithm |
| Advisor: | 賀保羅 Horton, Paul |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2025 |
| Academic year of graduation: | 113 (ROC calendar) |
| Language: | English |
| Number of pages: | 58 |
| Keywords (translated from Chinese): | Browser GPU computing, Pair Hidden Markov Model, Bioinformatics acceleration |
| Keywords (English): | WebGPU, Pair-Hidden Markov Model, Bioinformatics Acceleration |
As GPU acceleration gains traction in bioinformatics (Banerjee 2017, Liu 2021), conventional CUDA and OpenCL workflows still require vendor-specific driver installation and remain tied to particular hardware, limiting online teaching and front-end clinical analysis. Standardized in 2024, WebGPU unifies Vulkan, Direct3D 12 and Metal under a single JavaScript API (W3C 2024), offering three key advantages: no installation, cross-hardware portability and on-device data residency. Using the compute-intensive Pair-Hidden Markov Model Forward algorithm (Pair-HMM Forward) (Durbin 1998) as a case study, we assess the performance and feasibility of this emerging framework.
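For readers unfamiliar with the case-study algorithm, the following is a minimal CPU sketch of the textbook three-state Pair-HMM Forward recurrence from Durbin et al. (1998). It is not the thesis implementation: the interface `PairHmmParams`, the function `pairHmmForward`, and the simple match/mismatch emission model are assumptions made for illustration, and the thesis (which builds on Chou 2024) may parameterize the model differently.

```typescript
// Textbook three-state Pair-HMM: match (M), insert in x (X), insert in y (Y).
// All parameter names and the match/mismatch emission model are illustrative.
interface PairHmmParams {
  delta: number;     // gap-open probability (M -> X or M -> Y)
  epsilon: number;   // gap-extend probability (X -> X, Y -> Y)
  tau: number;       // probability of moving to the end state
  pMatch: number;    // emission probability of an identical pair in M
  pMismatch: number; // emission probability of a differing pair in M
  q: number;         // emission probability of a single symbol in X or Y
}

// Returns P(x, y | model), summed over all alignments (no log scaling).
function pairHmmForward(x: string, y: string, p: PairHmmParams): number {
  const n = x.length;
  const m = y.length;
  const makeTable = () =>
    Array.from({ length: n + 1 }, () => new Float64Array(m + 1));
  const fM = makeTable();
  const fX = makeTable();
  const fY = makeTable();
  fM[0][0] = 1; // start in M before any symbol is emitted

  const stayInMatch = 1 - 2 * p.delta - p.tau; // M -> M
  const gapToMatch = 1 - p.epsilon - p.tau;    // X -> M and Y -> M

  for (let i = 0; i <= n; i++) {
    for (let j = 0; j <= m; j++) {
      if (i === 0 && j === 0) continue;
      if (i > 0 && j > 0) {
        const emit = x[i - 1] === y[j - 1] ? p.pMatch : p.pMismatch;
        fM[i][j] = emit * (stayInMatch * fM[i - 1][j - 1] +
          gapToMatch * (fX[i - 1][j - 1] + fY[i - 1][j - 1]));
      }
      if (i > 0) {
        fX[i][j] = p.q * (p.delta * fM[i - 1][j] + p.epsilon * fX[i - 1][j]);
      }
      if (j > 0) {
        fY[i][j] = p.q * (p.delta * fM[i][j - 1] + p.epsilon * fY[i][j - 1]);
      }
    }
  }
  // Termination: any state may move to the end state with probability tau.
  return p.tau * (fM[n][m] + fX[n][m] + fY[n][m]);
}
```

Each cell (i, j) reads only cells (i-1, j-1), (i-1, j) and (i, j-1), so all cells on one anti-diagonal i + j = d are independent of one another, which is what makes the recurrence amenable to GPU parallelization. Raw probabilities underflow for long sequences, so practical implementations work in log space or rescale the tables.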
Building on the open-source C++/CUDA implementation by Chou Yu-Chen (Chou 2024), we first develop a WebGPU baseline. We then mitigate its two principal bottlenecks, frequent round-trips between CPU and GPU and costly BindGroup reconstruction, by introducing (i) batched submission of all work in a single CommandBuffer and (ii) Dynamic Uniform Offsets, yielding an optimized variant termed WebGPU-Optimized.
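The two optimizations can be illustrated with a short sketch. It assumes, hypothetically, one compute dispatch per anti-diagonal of the dynamic-programming table, with each dispatch reading its own parameter slot from one shared uniform buffer. The names `GpuDeviceLike`, `uniformSlotStride` and `encodeAllDiagonals` are invented for this example and are not taken from the thesis code; only the WebGPU calls themselves (`createCommandEncoder`, `beginComputePass`, `setBindGroup` with dynamic offsets, `dispatchWorkgroups`, `queue.submit`) are part of the real API.

```typescript
// Minimal structural types covering only the WebGPU calls used below.
// In a browser project the real types (GPUDevice, GPUComputePassEncoder, ...)
// would come from the @webgpu/types package instead.
interface GpuPassLike {
  setPipeline(pipeline: unknown): void;
  setBindGroup(index: number, bindGroup: unknown, dynamicOffsets?: number[]): void;
  dispatchWorkgroups(x: number): void;
  end(): void;
}
interface GpuEncoderLike {
  beginComputePass(): GpuPassLike;
  finish(): unknown;
}
interface GpuDeviceLike {
  limits: { minUniformBufferOffsetAlignment: number };
  createCommandEncoder(): GpuEncoderLike;
  queue: { submit(commandBuffers: unknown[]): void };
}

// Dynamic offsets must be multiples of minUniformBufferOffsetAlignment,
// so each step's parameter slot is padded up to that alignment.
function uniformSlotStride(bytesPerStep: number, alignment: number): number {
  return Math.ceil(bytesPerStep / alignment) * alignment;
}

// Records every dispatch into one command buffer and submits it once.
// Assumes the uniform buffer was filled beforehand with one parameter slot
// per step, and that the bind group layout entry for the uniform buffer
// was created with hasDynamicOffset: true.
function encodeAllDiagonals(
  device: GpuDeviceLike,
  pipeline: unknown,
  bindGroup: unknown,
  workgroupsPerStep: number[],
  bytesPerStep: number,
): number {
  const stride = uniformSlotStride(
    bytesPerStep, device.limits.minUniformBufferOffsetAlignment);
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  workgroupsPerStep.forEach((workgroups, step) => {
    // Same bind group every time; only the byte offset changes.
    pass.setBindGroup(0, bindGroup, [step * stride]);
    pass.dispatchWorkgroups(workgroups);
  });
  pass.end();
  device.queue.submit([encoder.finish()]); // single CPU-to-GPU submission
  return stride;
}
```

Compared with creating a new BindGroup and calling `queue.submit` once per step, this pattern reuses one BindGroup and crosses the CPU/GPU boundary once, which is the kind of overhead the two optimizations target.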
Benchmarks on three devices (NVIDIA RTX 2070 Super, Apple M1 and Intel UHD 620) across sequence lengths from 10^2 to 10^5 show that the optimized version achieves speed-ups exceeding 100× in the best case and reaches over 84% of CUDA's throughput, while keeping the relative log-likelihood error below 10^-5 on all devices. Even without an NVIDIA GPU, our WebGPU implementation outperforms single-threaded C++ by one to two orders of magnitude.
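The abstract does not spell out how the relative log-likelihood error is computed. A common definition, assumed here, is the absolute difference between the candidate and reference log-likelihoods divided by the magnitude of the reference. The function name `relativeLogLikelihoodError` and the sample values are hypothetical.

```typescript
// Assumed definition: |candidate - reference| / |reference|.
// "reference" would be a trusted log-likelihood (for example from a
// double-precision CPU run); "candidate" is the value under test.
function relativeLogLikelihoodError(reference: number, candidate: number): number {
  if (reference === 0) {
    throw new Error("reference log-likelihood must be non-zero");
  }
  return Math.abs(candidate - reference) / Math.abs(reference);
}

// Hypothetical values, not measurements from the thesis.
const sampleError = relativeLogLikelihoodError(-1000, -1000.005);
console.log(sampleError < 1e-5); // true
```

WGSL has no 64-bit floating-point type, so a browser GPU kernel typically computes in f32; a check of this kind is one way to quantify how far such results drift from a double-precision reference.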
These findings demonstrate that pure JavaScript and WGSL can execute the Pair-HMM Forward algorithm within seconds in a web browser. We contribute two browser-specific optimization strategies and provide detailed cross-hardware performance measurements, laying the groundwork for web-native genomic analysis tools and advancing the democratization and real-time execution of bioinformatics workloads.
[1] Apple. Apple M1 Chip — Technical Overview. Apple Developer Documentation. 2020. URL: https://developer.apple.com/documentation/apple_silicon/apple_m1.
[2] Apple. Apple M2 Max Chip — Technical Overview. Apple Developer Documentation. 2023. URL: https://developer.apple.com/documentation/apple_silicon/apple_m2_max.
[3] S. S. Banerjee et al. “Hardware Acceleration of the Pair-HMM Algorithm for DNA Variant Calling”. Proc. 27th FPL. 2017, pp. 165–172. DOI: 10.23919/FPL.2017.8056826.
[4] Y.-C. Chou. Pair-HMM Forward: Reference & GPU-Accelerated Implementations – A GPU-Based Approach to Accelerate the Pair Hidden Markov Model Forward Algorithm for DNA Sequence Profile Alignment. [GitHub repository]. 2024. URL: https://github.com/yuchen0620/ChouYuchen-master-thesis.
[5] R. Durbin et al. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
[6] P. Ferragina and G. Manzini. “Opportunistic Data Structures with Applications”. Proc. 41st IEEE FOCS. 2000, pp. 390–398. DOI: 10.1109/SFCS.2000.892127.
[7] P. Ghosh et al. “Web3DMol: Interactive Protein Structure Visualization Based on WebGL”. Bioinformatics 34.13 (2018), pp. 2275–2277. DOI: 10.1093/bioinformatics/bty534.
[8] Google Chrome Developers. WebGPU Now Available in Chrome. Chromium Blog. 2024. URL: https://blog.chromium.org/2024/05/webgpu-now-available.html.
[9] Google Chrome Team. Chrome’s 2024 Recap for Devs: Re-imagining the Web with AI. Chrome for Developers Blog. 2024. URL: https://developer.chrome.com/blog/chrome-2024-recap.
[10] Illumina. NovaSeq X Series Reagent Kits — Specifications. 2024. URL: https://www.illumina.com/systems/sequencing-platforms/novaseq-x-plus/specifications.html.
[11] Intel. Intel UHD Graphics 620 — Product Specifications. Intel ARK. 2018. URL: https://ark.intel.com/content/www/us/en/ark/products/126789.
[12] B. Jones. Toji.dev Blog Series: WebGPU Best Practices. 2023. URL: https://toji.dev/webgpu-best-practices/.
[13] A. Klöckner et al. “PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation”. Parallel Computing 38.3 (2012), pp. 157–174. DOI: 10.1016/j.parco.2011.09.001.
[14] K. Krampis, T. Booth, B. Chapman, et al. “Cloud BioLinux: Pre-configured and On-Demand Bioinformatics Computing for the Genomics Community”. BMC Bioinformatics 13 (2012), p. 42. DOI: 10.1186/1471-2105-13-42.
[15] B. Langmead et al. “Ultrafast and Memory-Efficient Alignment of Short DNA Sequences to the Human Genome”. Genome Biology 10.3 (2009), R25. DOI: 10.1186/gb-2009-10-3-r25.
[16] H. Li et al. “The Sequence Alignment/Map Format and SAMtools”. Bioinformatics 25.16 (2009), pp. 2078–2079. DOI: 10.1093/bioinformatics/btp352.
[17] H. Li and R. Durbin. “Fast and Accurate Long-Read Alignment with Burrows–Wheeler Transform”. Bioinformatics 26.5 (2010), pp. 589–595. DOI: 10.1093/bioinformatics/btq698.
[18] Y. Liu, S. Schrinner, et al. “GPU Acceleration in Genomics: A Comprehensive Survey”. Briefings in Bioinformatics 22.5 (2021), bbab042. DOI: 10.1093/bib/bbab042.
[19] Y. Liu, A. Wirawan, and B. Schmidt. “CUDASW++ 3.0: Accelerating Smith–Waterman Protein Database Search by Coupling CPU and GPU SIMD Instructions”. BMC Bioinformatics 14 (2013), p. 117. DOI: 10.1186/1471-2105-14-117.
[20] E. R. Mardis. “DNA Sequencing Technologies: 2006–2016”. Nature Protocols 12.2 (2017), pp. 213–218. DOI: 10.1038/nprot.2016.182.
[21] A. McKenna et al. “The Genome Analysis Toolkit: A MapReduce Framework for Analyzing Next-Generation DNA Sequencing Data”. Genome Research 20.9 (2010), pp. 1297–1303. DOI: 10.1101/gr.107524.110.
[22] MDN Web Docs. WebAssembly SIMD. 2023. URL: https://developer.mozilla.org/en-US/docs/WebAssembly/SIMD.
[23] MDN Web Docs. WebGPU API. 2025. URL: https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API.
[24] NVIDIA. GeForce RTX 2070 SUPER Founders Edition Specifications. Product brief. 2019. URL: https://www.nvidia.com/en-us/geforce/graphics-cards/rtx-2070-super/specs.
[25] NVIDIA Corporation. CUDA C++ Programming Guide v12.4. 2023. URL: https://docs.nvidia.com/cuda/cuda-c-programming-guide.
[26] B. Schmidt et al. “gpuPairHMM: High-Speed Pair-HMM Forward Algorithm for DNA Variant Calling on GPUs”. arXiv preprint arXiv:2411.11547 (2024). URL: https://arxiv.org/abs/2411.11547.
[27] J. E. Stone, D. Gohara, and G. Shi. “OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems”. Computing in Science & Engineering 12.3 (2010), pp. 66–73. DOI: 10.1109/MCSE.2010.69.
[28] TensorFlow.js Team. WebGPU Backend for TensorFlow.js. 2024. URL: https://www.tensorflow.org/js/guide/webgpu.
[29] M. Vasimuddin et al. “Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems”. Proceedings of IEEE IPDPS. 2019, pp. 314–324. DOI: 10.1109/IPDPS.2019.00041.
[30] W3C. WebGPU Specification. W3C Recommendation. 2024. URL: https://www.w3.org/TR/webgpu/.