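| Graduate student: | 蔣明村 Chiang, Ming-tsun |
|---|---|
| Thesis title: | 使用自動化樣板建立的蛋白質交互作用驗證提供系統 Evidence Providing System for Protein-Protein Interactions by Automatic Constructed Templates |
| Advisor: | 蔣榮先 Chiang, Jung-hsien |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2007 |
| Academic year of graduation: | 95 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 56 |
| Keywords (translated from Chinese): | protein-protein interaction, information extraction, template |
| Keywords (English): | template, extraction, PPI |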
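PubMed medical literature is an important source of information on protein-protein interactions. Reading the literature reveals the current state of biomedical research on protein-protein interactions and, in turn, leads to a better understanding of biological pathways and protein functions. Reading documents, however, takes a great deal of time, so using computers to help find protein-protein interactions is the more efficient approach. Among the many information extraction techniques, template models have been studied for years and rest on a solid foundation; they are reliable and relatively close to the way humans understand text. In this thesis, we analyze protein-protein interactions with a template model and propose a method that incorporates a decision system to judge whether a protein-protein interaction relationship is present and to identify protein-protein interactions.
Unlike typical systems, which define template rules manually, this study builds the extraction templates for protein-protein interactions automatically and proposes several attributes to compensate for the shortcomings of templates. Experiments show that this greatly helps in finding protein evidence sentences.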
The main source of information on protein-protein interactions is PubMed articles. By studying the interactions reported in previous research, scholars around the world can follow the current state of progress and use it to extend knowledge of pathways and protein functions. However, surveying documents is time-consuming and exhausting. It is more efficient to use information extraction technology to filter out superfluous information. To develop a robust and reliable system for mining protein-protein interactions, we need patterns that recognize natural language. In this thesis, we propose an evidence-providing system for protein-protein interactions that is based on patterns and integrates a decision model to extract PPI sentences from the scientific literature.
Unlike other systems that depend on manual rules, we present a machine learning approach that constructs patterns automatically, together with several attributes that complement the pattern model. We also demonstrate that our system provides protein interaction sentences effectively.
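
To make the idea of pattern-based sentence extraction concrete, the following is a minimal sketch in Python, not the system described in the thesis. It assumes protein names are already known, masks them with a `PROTEIN` placeholder, and checks each sentence against a few templates. The templates, the function names `mask_proteins` and `is_evidence_sentence`, and the example sentences are all hypothetical; the thesis constructs its templates automatically and adds a decision model, neither of which is reproduced here.

```python
import re

# Hypothetical hand-written templates, used only for illustration.
# The thesis builds its templates automatically; these stand in for them.
TEMPLATES = [
    r"PROTEIN (?:interacts|associates|binds) (?:with|to) PROTEIN",
    r"PROTEIN (?:phosphorylates|activates|inhibits) PROTEIN",
    r"(?:interaction|binding) (?:of|between) PROTEIN (?:with|and) PROTEIN",
]


def mask_proteins(sentence, protein_names):
    """Replace each known protein name with the PROTEIN placeholder."""
    masked = sentence
    # Replace longer names first so a short name nested in a long one
    # does not split the longer mention.
    for name in sorted(protein_names, key=len, reverse=True):
        masked = re.sub(r"\b" + re.escape(name) + r"\b", "PROTEIN", masked)
    return masked


def is_evidence_sentence(sentence, protein_names):
    """Return True if the masked sentence matches at least one template."""
    masked = mask_proteins(sentence, protein_names)
    return any(re.search(template, masked) for template in TEMPLATES)


proteins = {"RAD51", "BRCA2"}
positive = "We show that RAD51 interacts with BRCA2 in vivo."
negative = "RAD51 and BRCA2 were both expressed in the cell line."
print(is_evidence_sentence(positive, proteins))
print(is_evidence_sentence(negative, proteins))
```

The second sentence mentions both proteins but matches no template, which is the case where co-occurrence alone would over-report interactions and a pattern model is expected to help.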