| Graduate student: | 王志清 Wang, Chih-Ching |
|---|---|
| Thesis title: | 利用核心密度估計預測具物種獨特性之原生微型核糖核酸 Prediction of Species-specific MicroRNA Precursors using Kernel Density Estimation |
| Advisor: | 張天豪 Chang, Tien-Hao (Darby) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2008 |
| Academic year of graduation: | 96 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 42 |
| Chinese keywords: | 微型核糖核酸, 物種獨特性, 核心密度估計 |
| English keywords: | microRNA, species-specific, kernel density estimation |
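MicroRNAs (miRNAs) are very short non-coding RNAs, about 21 to 23 nucleotides long, that have a marked influence on gene regulation and on the development of organisms. In recent years many studies have been devoted to discovering unknown microRNA precursors (pre-miRNAs). Among them, ab initio prediction methods have received increasing attention: because they do not rely on sequence alignment information, they are better able than traditional comparative approaches to discover unknown species-specific pre-miRNAs.
This thesis proposes a new ab initio method for pre-miRNA prediction, miR-KDE. Following previous studies, miR-KDE first encodes an RNA sequence as a feature vector and then uses a classifier to decide whether the sequence contains a microRNA. The thesis adopts a relaxed variable kernel density estimation (RVKDE) classifier which, compared with the support vector machine (SVM) used by most ab initio methods, places more emphasis on the local information in the data. To evaluate how well miR-KDE predicts unknown pre-miRNAs, we used human pre-miRNAs to predict the pre-miRNAs of 39 other species. The results show that miR-KDE predicts pre-miRNAs effectively (overall accuracy of 94.7%) and that, because the RVKDE classifier emphasizes local information, it is better suited than other ab initio methods to predicting unknown species-specific pre-miRNAs.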
MicroRNAs (miRNAs) are short non-coding RNAs (~21–23 nucleotides) that participate in posttranscriptional regulation of gene expression. Many efforts have been made over the years to discover miRNA precursors (pre-miRNAs). Recently, ab initio approaches have received more attention than comparative approaches, because they do not depend on sequence alignment and can therefore discover species-specific pre-miRNAs.
This study proposes a novel ab initio method, miR-KDE, for pre-miRNA prediction. MiR-KDE follows the practice of previous ab initio approaches and encodes RNA molecules into feature vectors, which are then passed to a classifier for pre-miRNA prediction. MiR-KDE adopts the relaxed variable kernel density estimation (RVKDE) classifier. Compared with the widely used support vector machine (SVM), the RVKDE classifier exploits more local information from the training dataset. To evaluate miR-KDE, an experiment was conducted in which human pre-miRNAs were used to predict pre-miRNAs from 39 other species. The experimental results show that miR-KDE yields an overall accuracy of 94.7% and has advantages over the two ab initio approaches it was compared with in predicting species-specific pre-miRNAs.
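The classification step described above (feature vectors scored by a kernel density classifier that emphasizes local information) can be illustrated with a minimal sketch. This is not the RVKDE algorithm from the thesis. It is a simplified stand-in that assumes Gaussian kernels whose bandwidth at each training point is set from the distance to its k-th nearest neighbour in the same class. The function names (`knn_bandwidths`, `class_density`, `fit`, `predict`), the parameters `k` and `beta`, and the toy 2-D data are all hypothetical and are not actual RNA feature encodings.

```python
import numpy as np

def knn_bandwidths(points, k, beta=1.0):
    """Per-point bandwidth: beta times the distance to the k-th nearest
    neighbour within the same class (an assumed, simplified rule)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    dists.sort(axis=1)               # column 0 is the distance to itself (0)
    k = min(k, len(points) - 1)      # guard against very small classes
    return beta * np.maximum(dists[:, k], 1e-12)

def class_density(query, points, sigmas):
    """Average of Gaussian kernels centred on the training points,
    each with its own bandwidth."""
    dim = points.shape[1]
    sq_dist = ((query[None, :] - points) ** 2).sum(axis=1)
    norm = (2.0 * np.pi) ** (dim / 2.0) * sigmas ** dim
    return np.mean(np.exp(-sq_dist / (2.0 * sigmas ** 2)) / norm)

def fit(X, y, k=2):
    """Store, per class, the training points, their bandwidths, and the prior."""
    model = {}
    for label in np.unique(y):
        pts = X[y == label]
        model[label] = (pts, knn_bandwidths(pts, k), len(pts) / len(X))
    return model

def predict(model, query):
    """Return the class with the largest prior-weighted density at `query`."""
    scores = {
        label: prior * class_density(query, pts, sigmas)
        for label, (pts, sigmas, prior) in model.items()
    }
    return max(scores, key=scores.get)

# Toy data: class 0 stands in for one kind of sequence, class 1 for another.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
model = fit(X, y, k=2)
print(predict(model, np.array([0.5, 0.5])))
print(predict(model, np.array([5.5, 5.5])))
```

Because each kernel's width adapts to how densely its own neighbourhood is populated, the decision at a query point is dominated by nearby training samples, which is the sense of "local information" used in the abstract.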
On campus: publicly available from 2018-08-26