
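Graduate student: Chin, Chung-Liang (金忠良)
Thesis title: GeneLibrarian: An Integrated Gene Ontology Clustering and Biomedical Text Mining System (以基因本體論為基礎之基因功能聚類與生醫文件探勘系統)
Advisor: Chiang, Jung-Hsien (蔣榮先)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 54
Chinese keywords: gene ontology (基因本體論), gene function clustering (基因功能聚類), text mining (文件探勘)
English keywords: gene clustering, gene ontology, text mining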
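      Gene sequence alignment and microarrays are both commonly used to compare genes. Such tests identify genes with similar functions or similar expression, giving users information for further analysis. Many genes share a similar structure or similar microarray expression because they have similar functions, and the functions established through research are often annotated with the terms defined in the Gene Ontology, which arranges these terms in a hierarchical structure. This thesis treats that hierarchy from the perspective of biological taxonomy and proposes a similarity measure that computes the relationships among different terms, which is then used to analyze and cluster genes.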

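      The thesis also revises the biomedical text mining system to address changes in the format of online medical documents and problems discovered during use, so that the system can continue to operate normally, is less likely to present erroneous information, and automatically extracts the important information in documents for presentation to users.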

      Sequence alignment and microarrays are two methods often used to identify genes that are similar either in function or in expression for further analysis. Studies have shown that genes with similar functions have similar structures or microarray expression profiles. Many of these functions are annotated with the vocabulary (so-called GO terms) defined in the Gene Ontology, in which terms are arranged in a polyhierarchical manner. In this thesis, we propose a strategy for measuring the similarity among different GO terms by applying the notion of phylogenetics to the GO structure. This similarity measure is then applied to gene clustering.
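The record does not give the thesis's actual similarity formula, so the following is only a minimal sketch of the general idea: score GO terms by how much of the hierarchy they share, then cluster genes by their annotations. It substitutes a plain Jaccard overlap of ancestor sets for the thesis's phylogenetics-inspired measure, and uses average-linkage hierarchical agglomerative clustering. The term names, gene names, `PARENTS` table, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy hierarchy: term -> list of parent terms. It is a DAG,
# because a term in a polyhierarchy may have more than one parent.
PARENTS = {
    "root": [],
    "cell_cycle": ["root"],
    "metabolism": ["root"],
    "mitosis": ["cell_cycle"],
    "dna_replication": ["cell_cycle", "metabolism"],
    "glycolysis": ["metabolism"],
}


def ancestors(term):
    """Return the term itself plus every term reachable through its parents."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of ancestor sets (assumed stand-in measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def gene_similarity(terms_a, terms_b):
    """Average best-match term similarity, taken in both directions."""
    best_ab = [max(term_similarity(a, b) for b in terms_b) for a in terms_a]
    best_ba = [max(term_similarity(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_ab) + sum(best_ba)) / (len(best_ab) + len(best_ba))


def hac(annotations, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters and stops once the
    best available similarity falls below `threshold`.
    """
    clusters = [[gene] for gene in sorted(annotations)]

    def linkage(c1, c2):
        sims = [gene_similarity(annotations[g], annotations[h])
                for g in c1 for h in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical gene annotations: gene -> list of annotated terms.
GENES = {
    "geneA": ["mitosis"],
    "geneB": ["mitosis", "dna_replication"],
    "geneC": ["glycolysis"],
    "geneD": ["glycolysis", "metabolism"],
}

clusters = hac(GENES, threshold=0.5)
print(clusters)
```

With this toy data the two cell-cycle genes end up in one cluster and the two metabolism genes in another. A real measure would likely also weight terms by depth or frequency, which this sketch leaves out.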

     Meanwhile, the existing text mining system for biomedical documents automatically retrieves important information from the biomedical literature and displays it to users. We have revised the system to cope with changes in document format and to resolve problems identified during use, so that the probability of presenting erroneous information is reduced.

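    Chapter 1 Introduction
      1.1 Preface
      1.2 Research Motivation
      1.3 Approach
      1.4 Thesis Organization
    Chapter 2 Related Work
      2.1 Bioinformatics
        2.1.1 PubMed
        2.1.2 Entrez Gene
        2.1.3 Gene Ontology
      2.2 Gene Analysis
        2.2.1 BLAST
        2.2.2 Microarray
        2.2.3 GO Annotation
      2.3 Biomedical Text Mining Systems
    Chapter 3 Gene Function Clustering and Biomedical Text Mining System
      3.1 Gene Ontology-Based Gene Function Clustering
        3.1.1 System Overview
        3.1.2 Phylogenetic Relationships among GO Terms
        3.1.3 GO Term Encoding
        3.1.4 Converting Gene Annotations into Sequences
        3.1.5 GO Term Weights
        3.1.6 Sequence Similarity Computation
        3.1.7 HAC Clustering Algorithm
        3.1.8 Clustering System Interface and Results
      3.2 Biomedical Text Mining System
        3.2.1 System Overview
        3.2.2 System Modifications
        3.2.3 System Results
      3.3 Searching for Documents Related to Gene Groups
        3.3.1 System Workflow
        3.3.2 System Description and Results
    Chapter 4 Experimental Design and Analysis of Results
      4.1 Datasets and Processing
        4.1.1 Cell Cycle Pathway Genes
        4.1.2 Yeast Mitosis-Related Genes
      4.2 Experimental Results and Analysis
        4.2.1 Clustering Results for Cell Cycle Genes
        4.2.2 Clustering Results for Yeast Mitosis-Related Genes
      4.3 Comparison with Other Systems
    5.1 Conclusion
    5.2 Future Work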


    Full-text availability: on campus, public from 2006-08-09; off campus, public from 2006-08-09