| Graduate student: | 徐子傑 Hsu, Zih-Jie |
|---|---|
| Thesis title: | 尋找基因複製數變異的方法之研究 A Study on Finding the Copy Number Variation of Gene |
| Advisor: | 馬瀰嘉 Ma, Mi-Chia |
| Degree: | Master |
| Department: | College of Management, Department of Statistics |
| Year of publication: | 2014 |
| Academic year of graduation: | 102 |
| Language: | English |
| Number of pages: | 62 |
| Keywords (translated from Chinese): | copy number variation, exon sequence, gene sequence alignment |
| Keywords (English): | copy number variation, exon sequences, gene sequence comparison |
With the development of DNA microarray technology, researchers have found that the human genome contains a large number of polymorphic DNA segments longer than 1 kb but shorter than 3 Mb, including insertions, deletions, and duplications of segments. This type of polymorphism is called copy number variation (CNV). CNVs occur far more frequently than chromosomal structural variations, and the total number of nucleotides they cover across the genome greatly exceeds that covered by single nucleotide polymorphisms (SNPs). Some studies suggest that CNVs may be related to gene expression and regulation in specific genomic regions, or may influence molecular evolution and phylogenetic development within the genome.
The data in this study are read counts from gene sequence alignment, provided by Professor H. Sunny Sun of the Institute of Molecular Medicine and Bioinformatics Center, National Cheng Kung University Medical College. The subjects are exome sequence samples from the general population of Taiwan: 12 subjects from the Han Chinese Cell and Genome Bank in Taiwan, each with about 190,000 exon read counts. This study proposes simple analysis methods for identifying copy number variation and uses statistical simulation to compare them with existing methods, in order to assess the advantages and disadvantages of each approach.
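The abstract does not spell out the proposed methods, so the following is only a generic sketch of how per-exon read counts from several subjects might be screened for copy number changes. It is not the thesis's method. The function names, the counts-per-million normalization, the cross-sample median reference, and the `threshold` value of 0.5 are all assumptions made for illustration.

```python
from math import log2
from statistics import median


def normalize_by_library_size(counts):
    """Scale one sample's per-exon read counts to counts per million reads."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def flag_cnv_candidates(samples, threshold=0.5):
    """Flag exons whose normalized read count deviates from the cross-sample median.

    samples: dict mapping a sample id to a list of per-exon read counts,
             with exons in the same order for every sample.
    threshold: hypothetical cutoff on |log2 ratio| (an assumption, not a
               value taken from the thesis).
    Returns a list of (sample_id, exon_index, log2_ratio) tuples.
    """
    normalized = {sid: normalize_by_library_size(c) for sid, c in samples.items()}
    n_exons = len(next(iter(normalized.values())))
    calls = []
    for j in range(n_exons):
        # Use the median across samples as the reference depth for this exon.
        reference = median(norm[j] for norm in normalized.values())
        if reference == 0:
            continue  # no usable reference depth for this exon
        for sid, norm in normalized.items():
            if norm[j] == 0:
                ratio = float("-inf")  # no reads at all: candidate deletion
            else:
                ratio = log2(norm[j] / reference)
            if abs(ratio) >= threshold:
                calls.append((sid, j, ratio))
    return calls
```

A positive log2 ratio suggests a possible gain and a negative one a possible loss. With only a handful of exons the library-size normalization is itself distorted by the variant, so a sketch like this is meaningful mainly at exome scale, where each sample contributes a large number of exon counts.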