Author: Wei, Ming-Liang (魏旻良)
Thesis title: Studying the linkage among co-expression, co-regulation, co-function, co-protein interaction and sequence similarity
Thesis title (Chinese): 基因共同表現、共同調控、功能相似、蛋白質作用與序列相似度間之關聯研究
Advisor: Wu, Wei-Sheng (吳謂勝)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2014
Academic year of graduation: 102
Language: English
Number of pages: 37
Keywords (Chinese): 共同表現、共同調控、功能相似度、蛋白質作用相似度、序列相似度
Keywords (English): co-expression, co-regulation, functional similarity, protein-protein interaction similarity, sequence similarity
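  • Gene similarity helps in predicting various biological mechanisms. With advances in biotechnology, many kinds of experimental data have become available for studying relations between genes, such as sequence similarity, functional similarity, co-expression, protein-protein interaction similarity, and co-regulation. These diverse biological features represent different types of data, and they influence one another.
    Statistical quantification helps to identify the linkages among these features, and it also helps to substitute suitable features for a feature of interest whose data are incomplete. In previous studies, some of these linkages have been used to validate or replace a feature with insufficient biological data by means of other features. However, the linkages among the different features have been mentioned only sporadically and have not been analyzed systematically. This study provides a complete analysis of the linkage between every pair of features among gene co-expression, co-regulation, functional similarity, protein interaction, and sequence similarity. These five types of biological data were chosen because they are available for a wide range of species and are widely used. The linkages between features are analyzed with mean-value curves and then classified into three kinds of relation: mappable, distinguishable, and weakly related. Based on the identified linkages, suitable predictors for each feature are presented. Finally, the linkages among the different data types are analyzed more globally, which reveals some biological mechanisms.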

    Gene similarity is helpful for predicting various biological mechanisms. With advances in biotechnology, various types of biological data, such as sequence similarity, functional similarity, co-expression, co-protein interaction, and co-regulation, are available for studying gene relations. These different types of biological data are described by different types of features, and the features influence one another.
    Statistically quantifying the linkage among different types of biological data helps to discover the linkages between features and to select a proper feature as a substitute for a desired biological feature whose data are inadequate.
    In previous works, the linkage between different types of biological data has been applied implicitly to validate or quantify unknown biological data using other available data. However, these works mention the linkages only in passing, without a systematic analysis.
    The present work gives a comprehensive study of the linkage among co-expression, co-regulation, co-function, protein-protein interaction similarity, and sequence similarity. The five types of biological data were selected because (i) they are widely available in many species (availability) and (ii) there are existing gene/protein relation measures based on them (popularity).
    The linkages between features are analyzed with mean-value curves and further classified as entirely implicated, partially implicated, or obscure. Based on the identified linkages, proper predictors for each type of data are presented. Finally, the linkages among features are analyzed globally, and some biological mechanisms are revealed from these globally analyzed relations between types of data.
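
The abstract does not define the mean-value curve, so the following is only a minimal sketch of one plausible reading: gene pairs are binned by their score on a predictor feature, and the target feature is averaged within each bin. The function name mean_value_curve, the equal-width binning, and the n_bins parameter are assumptions made for illustration, not the thesis's actual procedure.

```python
from statistics import mean


def mean_value_curve(predictor, target, n_bins=5):
    """Average the target score within equal-width bins of the predictor score.

    predictor, target: equal-length sequences of similarity scores, one entry
    per gene pair (hypothetical inputs, not the thesis's data format).
    Returns a list of (bin_center, mean_target) tuples; empty bins are skipped.
    """
    if len(predictor) != len(target):
        raise ValueError("predictor and target must have the same length")
    if not predictor:
        return []

    lo, hi = min(predictor), max(predictor)
    # Fall back to a width of 1.0 when all predictor scores are identical.
    width = (hi - lo) / n_bins or 1.0

    bins = [[] for _ in range(n_bins)]
    for p, t in zip(predictor, target):
        # Clamp so the maximum predictor score lands in the last bin.
        idx = min(int((p - lo) / width), n_bins - 1)
        bins[idx].append(t)

    curve = []
    for i, values in enumerate(bins):
        if values:
            center = lo + (i + 0.5) * width
            curve.append((center, mean(values)))
    return curve
```

Under this reading, a curve that rises steadily across bins would suggest that the predictor feature carries information about the target feature, while a flat curve would suggest a weak or obscure linkage.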

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    List of Tables
    List of Figures
    List of Abbreviations
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Biological background
        1.2.1 Gene regulation
        1.2.2 Gene expression
        1.2.3 Protein sequence, interaction, and functional annotation
    Chapter 2 Method
      2.1 Workflow
      2.2 Similarity scores
        2.2.1 Sequence similarity
        2.2.2 Co-function
        2.2.3 Protein-protein interaction similarity
        2.2.4 Co-expression
        2.2.5 Co-regulation
      2.3 Mean-value curve
      2.4 Identifying the relations
    Chapter 3 Result
      3.1 Sequence similarity as an alternative
      3.2 Functional similarity as an alternative
      3.3 PPI similarity as an alternative
      3.4 Co-expression as an alternative
      3.5 Co-regulation as an alternative
    Chapter 4 Discussion
      4.1 Selecting proper predictor to depict target feature
      4.2 Global scope of linkages among features
    Chapter 5 Conclusion
    References

    Full-text download: on campus, available immediately; off campus, available immediately