Graduate student: | 林子文 Lin, Tzu-wen |
---|---|
Thesis title: | 利用蛋白質所包含之調控特性來預測蛋白質間交互作用 Predicting protein-protein interactions based on the regulatory characteristic of the gene sequences of the protein pairs |
Advisor: | 張天豪 Chang, Tien-Hao |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2014 |
Academic year of graduation: | 102 (ROC calendar) |
Language: | Chinese |
Pages: | 44 |
Chinese keywords: | 系統發育譜, 機器學習, 蛋白質交互作用 |
English keywords: | Phylogenetic profile, Machine learning, Protein-protein interaction |
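Protein-protein interactions (PPIs) play an important role in the functions that organisms express, and identifying them helps explain how biological systems respond at the molecular level. To date, many intrinsic properties of proteins (including sequence, structure, and function) have been used to predict PPIs. However, no study has directly examined whether regulatory characteristics (for example, the transcription factors that regulate a protein's gene) affect PPIs. This study analyzes whether the regulatory characteristics of genes influence PPIs and builds a prediction model based on those characteristics.
This study carried out a comprehensive series of tests on regulatory characteristics. It collected eight kinds of regulatory characteristics and encoded them as 12 feature vectors: DNA bendability, gene distance, gene size, nucleosome occupancy, TATA box, transcription factor binding evidence, transcription factor knockout evidence, and transcription factor binding site similarity. The experimental results show that gene distance is significantly useful for predicting whether a protein pair interacts, and that on Saccharomyces cerevisiae this method predicts better than other predictors.
This is the first study to examine the influence of regulatory characteristics on PPIs, and it confirms that regulatory characteristics should be included among the features rather than ignored. A feature model that includes regulatory characteristics can help researchers find unknown molecular mechanisms. Finally, this study gives future work a new research direction toward regulatory characteristics.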
Protein-protein interactions (PPIs) are essential to diverse biological processes. Elucidating these PPIs helps us understand the mechanisms of biological systems at the molecular level. Various intrinsic protein features have been studied to predict PPIs. However, no studies have analyzed the regulatory features shared between two interacting proteins. This study asks whether regulatory features still influence PPIs across the gap from gene to protein, and builds a regulatory feature-based model to predict PPIs.
This study conducted a comprehensive analysis of regulatory features. It collected eight kinds of transcriptional characteristics and encoded them as 12 transcriptional features: DNA bendability, gene size, gene distance, nucleosome occupancy, TATA box information, transcription factor (TF) binding and knockout information, and eight regulatory similarities based on TF binding site (TFBS) data. The experimental results show that gene distance improved prediction performance and indicate that these regulatory features do influence PPI prediction across the gap from gene to protein. On Saccharomyces cerevisiae, our method predicts PPIs better than other methods.
This work is the first study to examine regulatory features in predicting PPIs, and the results suggest that this category of features must be considered in the future. The proposed encoding of regulatory characteristics has been shown to be capable of identifying whether two proteins interact. The constructed prediction model can help discover unknown molecular mechanisms of specific regulatory functions. Finally, this study encourages subsequent work on related research topics to consider regulatory features.
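The abstract describes encoding regulatory characteristics of a gene pair as features, but does not give the encoding itself. The following is a minimal sketch under stated assumptions, not the thesis's actual method: the `Gene` record, the gap-based definition of gene distance, the `-1` sentinel for genes on different chromosomes, and the Jaccard similarity over regulating TFs are all hypothetical choices made for illustration, and the gene names and coordinates are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical record of the regulatory data for one gene."""
    name: str
    chromosome: str
    start: int             # first base of the gene (assumed coordinate)
    end: int               # last base of the gene (assumed coordinate)
    regulators: frozenset  # names of TFs assumed to regulate this gene


def gene_distance(a: Gene, b: Gene, unlinked: int = -1) -> int:
    """Gap in base pairs between two genes on the same chromosome.

    Overlapping genes get 0; genes on different chromosomes get the
    `unlinked` sentinel (an assumption, not taken from the thesis).
    """
    if a.chromosome != b.chromosome:
        return unlinked
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def gene_size(g: Gene) -> int:
    """Gene length in base pairs, taken here as end minus start."""
    return g.end - g.start


def regulator_similarity(a: Gene, b: Gene) -> float:
    """Jaccard similarity of the two genes' TF sets (illustrative choice)."""
    union = a.regulators | b.regulators
    if not union:
        return 0.0
    return len(a.regulators & b.regulators) / len(union)


def encode_pair(a: Gene, b: Gene) -> list:
    """Feature vector for one protein pair, built from its two genes."""
    return [
        gene_distance(a, b),
        gene_size(a),
        gene_size(b),
        regulator_similarity(a, b),
    ]


# Made-up example genes; names and coordinates are not real data.
gene_a = Gene("geneA", "IV", 100, 200, frozenset({"TF1", "TF2"}))
gene_b = Gene("geneB", "IV", 500, 900, frozenset({"TF2", "TF3"}))
print(encode_pair(gene_a, gene_b))
```

Vectors of this kind, one per protein pair, could then be passed to any standard classifier together with interaction labels.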