| Graduate student: | 林怡君 Lin, Yi-Chun |
|---|---|
| Thesis title: | 次世代基因定序之品質 Quality of Base Calling for Next Generation Sequence |
| Advisor: | 詹世煌 Chan, Shih-Huang |
| Degree: | Master |
| Department: | College of Management - Department of Statistics |
| Year of publication: | 2015 |
| Academic year of graduation: | 103 |
| Language: | English |
| Number of pages: | 27 |
| Chinese keywords: | 次世代基因定序 、鹼基品質 、特徵臉 、主成分分數 |
| English keywords: | Next Generation Sequencing, base quality, eigenfaces, principal component scores |
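DNA is composed of four different units, adenine (A), thymine (T), cytosine (C), and guanine (G), arranged in varying orders. As the technology has improved, sequencing instruments have become increasingly precise. Compared with traditional sequencing methods, Next Generation Sequencing (NGS) rapidly integrates cluster generation, sequencing, and the alignment and assembly of complete gene sequences, while greatly reducing the time and cost required. Sequencing quality plays an important role in this process.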
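Illumina uses the first four cycles to generate a template, and the base positions in subsequent cycles are sequenced against this template. Previous studies have shown that base positions are not fixed after the fourth cycle. 李佩芳 (Li, 2012) used a single-point base clustering method to demonstrate base shift; 邵筠芬 (Shao, 2013) used blank regions where no bases are present, together with a circular coding concept, to investigate the shift phenomenon; 林盈樺 (Lin, 2014) instead used the positions where bases are present, extended the same circular coding concept, and added changes in length and angle to determine the direction of base drift. These studies, however, focused on shift within local regions of a cycle. This study uses the eigenface method to quickly capture changes in bases and their positions across cycles, and further examines the relationship between cycles and base shift.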
DNA is made up of four different bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Next Generation Sequencing is a technique that sequences DNA much more quickly and cheaply than the previously used Sanger sequencing. However, the quality of sequencing, although it plays an important role in determining the DNA sequence, has not received much attention academically or practically. Illumina, one of the well-known sequencing companies, claims that the positions of bases remain the same in every cycle after the fourth.
However, several studies have shown that the positions of called bases are not fixed; see Li (2012), Shao (2013), and Lin (2014). These authors essentially used a specific region to show that a shift in base position does exist. In this thesis, we apply a machine learning technique called eigenface recognition, together with principal component scores, to represent the overall behavior of cycles, and we use the coefficients of the eigenfaces to find the relationship between shift and cycles.
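To make the eigenface idea concrete, the following is a minimal sketch in Python with NumPy, following the general approach of Turk and Pentland (1991). It is not the thesis's implementation: the function name `eigenface_scores`, the image size, and the synthetic data (a single bright spot drifting one pixel per cycle) are all assumptions made for illustration. Each cycle's image is flattened into a vector, the mean image is subtracted, and the leading right singular vectors of the centered data serve as eigenfaces; projecting each cycle onto them gives its principal component scores.

```python
import numpy as np


def eigenface_scores(images, n_components=2):
    """Compute eigenfaces and principal component scores.

    images: array of shape (n_cycles, height, width), one image per cycle.
    Returns the mean image (flattened), the eigenfaces (one per row),
    the per-cycle scores, and the share of variance each component explains.
    """
    n_cycles = images.shape[0]
    # Flatten every cycle image into one row vector.
    X = images.reshape(n_cycles, -1).astype(float)
    mean_image = X.mean(axis=0)
    centered = X - mean_image
    # Thin SVD of the centered data: rows of vt are orthonormal directions
    # in pixel space, ordered by decreasing singular value.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:n_components]
    # Scores are the coordinates of each cycle along each eigenface.
    scores = centered @ eigenfaces.T
    explained = (s ** 2) / np.sum(s ** 2)
    return mean_image, eigenfaces, scores, explained[:n_components]


# Hypothetical data: a bright spot that moves one pixel per cycle, plus noise.
rng = np.random.default_rng(0)
n_cycles, height, width = 8, 16, 16
images = np.zeros((n_cycles, height, width))
for c in range(n_cycles):
    images[c, 8, 4 + c] = 1.0
images += 0.01 * rng.standard_normal(images.shape)

mean_image, eigenfaces, scores, explained = eigenface_scores(images)
print(scores.shape)  # one row of scores per cycle
```

Under these assumptions, plotting each column of `scores` against the cycle index is one way to look for a systematic trend across cycles, which is the kind of relationship between shift and cycles that the abstract describes.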
[1] Chen, J. A. (2014). "Evaluating the Quality of Base Calling for Next Generation Sequence", Master Thesis, Department of Statistics, National Cheng-Kung University.
[2] Illumina (2013). "MiSeq® System User Guide", San Diego, California 92122 U.S.A.
[3] Li, P. F. (2012). "Base Calling of Read Sequencing for Next Generation Sequencing (NGS)", Master Thesis, Department of Statistics, National Cheng-Kung University.
[4] Lin, Y. H. (2014). "The Shift Phenomenon of Bases for Next Generation Sequence and its Effect", Master Thesis, Department of Statistics, National Cheng-Kung University.
[5] Sanger, F., Nicklen, S. and Coulson, A. R. (1977). "DNA sequencing with chain-terminating inhibitors", Proceedings of the National Academy of Sciences of the USA 74(12):5463-5467.
[6] Shao, Y. F. (2013). "Pattern Recognition for Next Generation Sequence", Master Thesis, Department of Statistics, National Cheng-Kung University.
[7] Turk, M. and Pentland, A. (1991). "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, Vol. 3, pp. 71–86.