
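Graduate student: 王文宏 (Wang, Wen-Hung)
Thesis title: Discovering Similar Genes According to Gene Ontology Based Semantic Similarity Measure
Advisor: 蔣榮先 (Chiang, Jung-Hsien)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: Chinese
Number of pages: 53
Keywords (translated from Chinese): Gene Ontology, semantic similarity, nonlinear function, quantification
Keywords (English): Gene Ontology, Semantic Similarity, Quantification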
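      Sequence alignment is a common way to compare two genes. Many applications, however, need to compare two genes by their biological roles instead. For most genes, the biological functions are annotated with specialized biological terms, and the Gene Ontology defines the hierarchical relationships among those terms. We therefore study the hierarchical structure of the Gene Ontology and propose a semantic similarity measure between two biological terms, together with a gene feature quantification strategy that computes how close two genes are in function, thereby quantifying the functional annotations of the two genes.

      We also provide a visual interface that presents the functionally similar genes once the computation is complete and shows, in quantitative form, how close two genes are in function. Finally, we use real biological pathways, together with a search for evidence of interactions in the literature, to discover unknown biological pathways.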

      Many bioinformatics data resources hold data not only in the form of sequences but also as annotation. In the majority of cases, annotation is written in scientific natural language, which suits human readers but is not particularly useful for machine processing. Ontologies offer a mechanism for representing knowledge in a form that machines can process.

      In this thesis we investigate the hierarchy of the Gene Ontology and propose a semantic similarity measure for quantifying the annotations (Gene Ontology terms) of two genes. Finally, based on this quantification procedure and on gene-gene interaction evidence found in the biomedical literature, we try to discover candidate genes for an unknown pathway starting from a well-known pathway.

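    Chapter 1 Introduction
      1.1 Preface
      1.2 Research Motivation
      1.3 Approach
      1.4 System Overview
    Chapter 2 Literature Review and Related Work
      2.1 What Is Bioinformatics
        2.1.1 Origins of Bioinformatics
        2.1.2 Application Areas of Bioinformatics
        2.1.3 Directions of Development in Bioinformatics
      2.2 Related Work on Semantic Similarity
      2.3 Related Work on the Hierarchical Structure of the Gene Ontology
    Chapter 3 Semantic Similarity Measure and Gene Feature Quantification Strategy
      3.1 Methodology Overview
      3.2 Visualizing Gene Relationships
      3.3 System Architecture Diagram
        3.3.1 Preprocessing of the Gene Ontology and the LocusLink Database
        3.3.2 Data Filtering
      3.4 Semantic Similarity Measurement Module
        3.4.1 Properties of the Transfer Function
        3.4.2 Quantification Based on Shortest Distance
        3.4.3 Quantification Based on Depth
        3.4.4 Quantification Based on Shortest Distance and Depth
        3.4.5 Positive and Negative Relationships between GO Terms
      3.5 Gene Feature Quantification Strategy
      3.6 Selection of Candidate Genes
    Chapter 4 Experimental Design and Analysis of Results
      4.1 Datasets
        4.1.1 Homologous Genes
        4.1.2 MAP Kinase Pathway
        4.1.3 RON Pathway
        4.1.4 Lutheran Pathway
        4.1.5 Microarray Data
      4.2 Experiments and Analysis of Results
        4.2.1 Tuning System Parameters and Choosing the Threshold for Filtering GO Terms
        4.2.2 Candidate Genes for Pathways
        4.2.3 Analysis of Microarray Data
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References
    Appendix A Evidence Codes for Gene Function Annotation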
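    The outline above lists quantification by shortest distance, by depth, and by both combined, followed by a gene feature quantification strategy. This record does not give the thesis's actual formulas, so the Python sketch below only illustrates the general idea under stated assumptions. It scores two terms with exp(-alpha * l) * tanh(beta * h), a nonlinear form known from the WordNet similarity literature, where l is the shortest path length through a common ancestor and h is the depth of the deepest common ancestor. The toy hierarchy, the term names, the parameter values, and the best-match averaging used for genes are all hypothetical choices made for illustration, not the author's method.

```python
import math
from collections import deque

# Toy is-a hierarchy (child -> parents). Term names are invented for
# illustration and are not real Gene Ontology identifiers.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
}


def ancestors_with_distance(term):
    """Map each ancestor of `term` (including itself) to the minimum
    number of upward edges needed to reach it (breadth-first search)."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        node = queue.popleft()
        for parent in PARENTS[node]:
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    return dist


def depth(term):
    """Minimum number of edges from `term` up to the single toy root."""
    return ancestors_with_distance(term)["root"]


def term_similarity(a, b, alpha=0.2, beta=0.6):
    """Assumed measure: exp(-alpha * l) * tanh(beta * h).

    l: shortest path between a and b that passes through a common ancestor.
    h: depth of the deepest common ancestor.
    alpha and beta are illustrative values, not tuned parameters.
    """
    up_a = ancestors_with_distance(a)
    up_b = ancestors_with_distance(b)
    common = set(up_a) & set(up_b)
    if not common:
        return 0.0
    length = min(up_a[c] + up_b[c] for c in common)
    h = max(depth(c) for c in common)
    return math.exp(-alpha * length) * math.tanh(beta * h)


def gene_similarity(terms_a, terms_b):
    """Assumed aggregation: average, over both genes, of each annotation
    term's best match among the other gene's annotation terms."""
    if not terms_a or not terms_b:
        return 0.0
    best_a = [max(term_similarity(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_similarity(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


if __name__ == "__main__":
    print(term_similarity("dna_binding", "rna_binding"))
    print(term_similarity("dna_binding", "kinase"))
    print(gene_similarity(["dna_binding", "kinase"], ["rna_binding"]))
```

    In this toy hierarchy, two sibling terms under a shared non-root parent score higher than two terms whose only common ancestor is the root, because the tanh factor is zero at depth 0. Under this particular form a term compared with itself scores tanh(beta * h) rather than 1, so shallow terms are treated as less informative.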


    On-campus access: public from 2005-07-29
    Off-campus access: public from 2005-07-29