Graduate student: Chen, Wei-Chun (陳威淳)
Thesis title: Predictions of piRNA-mRNA targeting relationships in C. elegans using deep learning
Advisor: Wu, Wei-Sheng (吳謂勝)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 66
Keywords (Chinese): piRNA, mRNA, target prediction, deep learning
Keywords (English): piRNA, mRNA, target relationships, deep learning
  • piRNAs are a class of small non-coding RNAs that silence foreign genes by targeting mRNA sequences and acting through the RNAi pathway, which is why they are called the guardians of the genome. Until now, biologists studying piRNA-mRNA targeting relationships have had to identify them by experimental trial and error, which costs a great deal of time and labor. Some current work therefore applies machine learning: features are extracted from the piRNA and the mRNA, and target genes are predicted with an SVM. This kind of feature selection is limited by human understanding of the underlying biological mechanism, so it is hard to decide which features are reliable, and poor choices lead to misclassification. With the recent development of deep learning and its wide use in bioinformatics, a model can instead take raw biological sequences as input and learn the important features on its own. This study therefore proposes a deep learning model based on a residual network that, without any feature selection, takes the sequences at piRNA-mRNA binding sites as input and lets the network learn the targeting rules. The study is also extended to whole-mRNA prediction: the proposed network model is combined with suitable filters to classify whether a piRNA and an mRNA bind. On average, accuracy reaches 80% at the site level and 77% at the gene level.
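The abstract describes two stages: a site-level classifier that reads raw piRNA and binding-site sequences, and a gene-level decision over a whole mRNA. The sketch below illustrates that flow under stated assumptions and is not the thesis's implementation. The one-hot layout, the sliding-window scan, the "best window above a threshold" rule, and every function name are hypothetical. In particular, `toy_site_scorer` is only a complementarity count standing in for the trained residual network, which the record does not describe in detail.

```python
from typing import Callable, List, Tuple

ALPHABET = "ACGU"
# Watson-Crick partners only; wobble pairs are ignored in this toy example.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def one_hot(seq: str) -> List[List[int]]:
    """Encode an RNA string as 4-wide one-hot rows in A, C, G, U order.

    T is read as U; any other unknown base becomes an all-zero row.
    """
    cleaned = seq.upper().replace("T", "U")
    return [[1 if base == b else 0 for b in ALPHABET] for base in cleaned]


def encode_pair(pirna: str, site: str) -> List[List[int]]:
    """Concatenate piRNA and site encodings per position (length x 8).

    Assumed input layout for a sequence classifier, not the thesis's.
    """
    if len(pirna) != len(site):
        raise ValueError("piRNA and candidate site must have equal length")
    return [p + s for p, s in zip(one_hot(pirna), one_hot(site))]


def candidate_sites(mrna: str, width: int) -> List[Tuple[int, str]]:
    """List every (start, window) of the given width along the mRNA."""
    return [(i, mrna[i:i + width]) for i in range(len(mrna) - width + 1)]


def toy_site_scorer(pirna: str, site: str) -> float:
    """Stand-in for a trained site-level model: fraction of paired bases.

    The site is reversed because the two strands pair antiparallel.
    """
    matches = sum(1 for p, s in zip(pirna, reversed(site)) if PAIR.get(p) == s)
    return matches / len(pirna)


def gene_level_call(
    pirna: str,
    mrna: str,
    scorer: Callable[[str, str], float] = toy_site_scorer,
    threshold: float = 0.8,
) -> Tuple[bool, int, float]:
    """Call the gene a target if its best-scoring window passes the threshold."""
    best_start, best_score = -1, 0.0
    for start, site in candidate_sites(mrna, len(pirna)):
        score = scorer(pirna, site)
        if score > best_score:
            best_start, best_score = start, score
    return best_score >= threshold, best_start, best_score
```

In a real pipeline the scorer would be a trained network consuming the output of `encode_pair`, and the gene-level step would apply whatever filters the thesis selects (its table of contents mentions binding energy) rather than a fixed threshold on a single window.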

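    Abstract (Chinese)
    Extended Abstract (English)
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Research Background and Motivation
      1.1 The model organism: the nematode
      1.2 Introduction to nematode piRNAs
        1.2.1 Biogenesis of nematode piRNAs
        1.2.2 Regulatory mechanism of piRNAs
        1.2.3 Targeting rules of piRNAs
      1.3 Machine-learning-based target prediction
      1.4 Research motivation
    Chapter 2 Deep Neural Networks
      2.1 Convolutional neural networks
      2.2 Neural network optimization algorithms
        2.2.1 Weight regularization
        2.2.2 Batch normalization
        2.2.3 Optimization algorithms
      2.3 Deep residual networks
    Chapter 3 piRNA Target Prediction Framework and Workflow
      3.1 Datasets
        3.1.1 Preparation of the positive dataset
        3.1.2 Preparation of the negative dataset
      3.2 Data encoding
      3.3 Model selection and parameter tuning
      3.4 Training and testing procedure
      3.5 Visualization procedure
    Chapter 4 Experimental Setup and Results
      4.1 Experimental environment
      4.2 Evaluation metrics and results
      4.3 Experiment 1: Comparison of network architectures
      4.4 Experiment 2: Comparison of model hyperparameters
        4.4.1 Comparison of convolution kernel sizes
        4.4.2 Comparison of pooling layer sizes
        4.4.3 Comparison of learning-rate adjustments
        4.4.4 Comparison of binding-site lengths
        4.4.5 Effect of the number of piRNA targets on the model
      4.5 Experiment 3: Discussion of visualization results
      4.6 Experiment 4: piRNA gene-level testing
        4.6.1 Gene-level classification workflow
        4.6.2 Choice of binding energy and test results
        4.6.3 Effect of each filter on performance
        4.6.4 Visualizing the gene-level classification process
      4.7 Comparison with other piRNA classifiers
      4.8 Discussion of piRNA case examples
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future work
    References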

    Full text availability: on campus, public from 2024-07-24; off campus, public from 2024-07-24.