Author: 王志清
Wang, Chih-Ching
Thesis title: 利用核心密度估計預測具物種獨特性之原生微型核糖核酸
Prediction of Species-specific MicroRNA Precursors using Kernel Density Estimation
Advisor: 張天豪
Chang, Tien-Hao (Darby)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar, 2007-2008)
Language: Chinese
Number of pages: 42
Keywords (Chinese): 微型核糖核酸, 物種獨特性, 核心密度估計
Keywords (English): microRNA, species-specific, kernel density estimation
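  • MicroRNA (miRNA) is a very short non-coding RNA, about 21 to 23 nucleotides long, that has a marked influence on gene regulation and on the development of organisms. In recent years many studies have been devoted to discovering unknown microRNA precursors (pre-miRNAs). Among them, ab initio prediction of pre-miRNAs has drawn increasing attention: because ab initio methods do not rely on sequence-alignment information, they are better able than traditional comparative approaches to discover unknown species-specific pre-miRNAs.
    This thesis proposes a new ab initio method for pre-miRNA prediction, miR-KDE. Following earlier work, miR-KDE first encodes an RNA sequence as a feature vector and then uses a classifier to decide whether the sequence contains a microRNA. The thesis adopts a relaxed variable kernel density estimation (RVKDE) classifier, which places more emphasis on the local information in the data than the support vector machine (SVM) used by most ab initio methods. To evaluate how well miR-KDE predicts unknown pre-miRNAs, human pre-miRNAs are used to predict the pre-miRNAs of 39 other species. The experimental results show that miR-KDE predicts pre-miRNAs effectively (overall accuracy of 94.7%) and, because the RVKDE classifier emphasizes local information, is better suited than other ab initio methods to predicting unknown species-specific pre-miRNAs.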

    MicroRNAs (miRNAs) are short non-coding RNAs (~21–23 nucleotides) that participate in the posttranscriptional regulation of gene expression. Many efforts have been made over the years to discover miRNA precursors (pre-miRNAs). Recently, ab initio approaches have received more attention than comparative approaches, because ab initio approaches do not depend on sequence alignment and can therefore discover species-specific pre-miRNAs.
    This study proposes a novel ab initio method, miR-KDE, for pre-miRNA prediction. MiR-KDE follows the practice of previous ab initio approaches and encodes RNA molecules into feature vectors, which are then passed to a classifier for pre-miRNA prediction. MiR-KDE adopts the relaxed variable kernel density estimation (RVKDE) classifier. Compared with the widely used support vector machine (SVM), the RVKDE classifier exploits more local information from the training dataset. To evaluate miR-KDE, an experiment is conducted in which human pre-miRNAs are used to predict pre-miRNAs from 39 other species. The experimental results show that miR-KDE yields a good overall accuracy of 94.7% and has advantages over the two ab initio approaches it is compared with when predicting species-specific pre-miRNAs.
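
    The two-step pipeline described in the abstract (encode a sequence as a feature vector, then classify it with a kernel density estimate that adapts to local data) can be illustrated with a minimal sketch. This is not the RVKDE algorithm or the miR-KDE feature set from the thesis: the dinucleotide-frequency encoding, the k-nearest-neighbor bandwidth rule, and all function names below are assumptions chosen only for illustration.

```python
import math


def dinucleotide_features(seq):
    """Encode an RNA sequence as 16 dinucleotide frequencies.

    Hypothetical stand-in for a feature set; the thesis uses its own features.
    """
    alphabet = "ACGU"
    pairs = [a + b for a in alphabet for b in alphabet]
    counts = dict.fromkeys(pairs, 0)
    seq = seq.upper().replace("T", "U")
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing unknown symbols
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in pairs]


def knn_bandwidths(points, k):
    """Per-sample bandwidth: distance to the k-th nearest other sample.

    Assumed rule: dense regions get narrow kernels, sparse regions wide ones,
    so the density estimate follows local structure in the training data.
    """
    widths = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kth = dists[min(k, len(dists)) - 1] if dists else 1.0
        widths.append(max(kth, 1e-6))  # guard against zero bandwidth
    return widths


def class_density(x, points, widths):
    """Average of Gaussian kernels, each centered on a sample with its own width."""
    dim = len(x)
    total = 0.0
    for p, h in zip(points, widths):
        norm = (math.sqrt(2 * math.pi) * h) ** dim
        total += math.exp(-math.dist(x, p) ** 2 / (2 * h * h)) / norm
    return total / len(points)


def classify(x, training, k=2):
    """Return the label whose prior-weighted density at x is largest.

    training: dict mapping a label to a list of feature vectors.
    """
    n = sum(len(points) for points in training.values())
    best_label, best_score = None, -1.0
    for label, points in training.items():
        widths = knn_bandwidths(points, k)
        score = len(points) / n * class_density(x, points, widths)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

    In use, each training sequence would be passed through `dinucleotide_features` and grouped by label (for example real versus pseudo precursors) before calling `classify`. Because every kernel width is set from nearby samples only, a small isolated cluster of training points still produces its own density peak, which is the kind of local behavior the abstract contrasts with an SVM.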

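    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Related Work
      2.1 MicroRNA
        2.1.1 Characteristics of microRNAs
        2.1.2 MicroRNA nomenclature
        2.1.3 Mechanism of action of microRNAs
      2.2 Methods for predicting microRNAs
      2.3 Introduction to classifiers
        2.3.1 Support vector machine
        2.3.2 Variable kernel density estimation classifier
    Chapter 3 Datasets and Methods
      3.1 Objectives
      3.2 Datasets
      3.3 Feature set
        3.3.1 Feature extraction from the RNA primary sequence
        3.3.2 Measures of the primary sequence folded into secondary structure
        3.3.3 Normalization of the secondary-structure folding measures
        3.3.4 Stem-loop structure feature extraction
      3.4 Classification tools
    Chapter 4 Experimental Results and Discussion
      4.1 Evaluation criteria
      4.2 Prediction results for pre-miRNAs of the same species
      4.3 Prediction results for pre-miRNAs of different species
      4.4 Effects of different classifiers and feature sets
      4.5 Decision boundaries of the support vector machine and the variable kernel density estimation classifier
    Chapter 5 Conclusions
      5.1 Conclusions
      5.2 Future work
    References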


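    Full-text availability: on campus, open from 2018-08-26
    Off campus, open from 2028-08-26
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.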