Graduate student: | 張哲斌 Zhang, Zhe-Bin |
---|---|
Thesis title: | OrchidBase 6.0: updated orchid genomes and transcriptomes and development of tools for gene expression, protein domains, and regulation |
Advisor: | 吳謂勝 Wu, Wei-Sheng |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2021 |
Academic year of graduation: | 109 |
Language: | Chinese |
Number of pages: | 100 |
Keywords: | OrchidBase, Pfam, InterProScan, gene expression, gene regulation |
The Orchidaceae are the second-largest family of angiosperms and among the most diverse and widely distributed plants, and Phalaenopsis has high economic value in Taiwan. It is therefore important to study gene regulation in orchids and to identify orchid protein families. The fifth version of OrchidBase (OrchidBase 5.0) currently contains the whole-genome sequences of five orchid species, which biologists can query for sequencing data. In addition to adding two Platanthera species to OrchidBase, this study developed three website functions: Enhanced sequence search, Domain Search, and Promoter Analysis.

The first function, Enhanced sequence search, extends the original BLAST function of OrchidBase. After a BLAST analysis, it generates gene expression charts, including a heat map, clustering, expression patterns, and principal component analysis (PCA). Visualizing the numerical expression data helps users quickly grasp the scientific meaning behind the orchid gene expression data, and lets biologists predict, with reasonable confidence, the number of genes of a protein family in orchids and their phylogenetic relationships. The second function, Domain Search, predicts homologous genes from the perspective of protein domains, which compensates for cases where Enhanced sequence search finds too few homologous genes. The third function, Promoter Analysis, lets biologists look up the transcription factors that may bind to orchid gene promoters, giving them a better understanding of the upstream regulatory mechanisms of genes of interest.
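The kind of expression visualization described above (row-standardized values for a heat map, and PCA of samples) can be illustrated with a minimal sketch. This is not the thesis implementation: the gene IDs, tissue names, and expression values are invented for illustration, the helper names `zscore_rows` and `pca_samples` are hypothetical, and PCA is computed here with a plain NumPy singular value decomposition.

```python
import numpy as np

# Hypothetical gene IDs and tissues; the values are made-up expression levels.
genes = ["gene_A", "gene_B", "gene_C", "gene_D"]
tissues = ["root", "leaf", "flower", "seed"]
raw = np.array([
    [12.0, 30.5, 250.0, 8.0],
    [0.5, 1.2, 90.0, 0.3],
    [55.0, 60.0, 52.0, 58.0],
    [100.0, 5.0, 2.0, 140.0],
])

# A log2(x + 1) transform is a common way to compress the dynamic range.
expr = np.log2(raw + 1.0)


def zscore_rows(matrix):
    """Standardize each gene (row) to mean 0 and unit variance for a heat map."""
    mean = matrix.mean(axis=1, keepdims=True)
    std = matrix.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # leave constant genes at 0 instead of dividing by zero
    return (matrix - mean) / std


def pca_samples(matrix, n_components=2):
    """Project samples (columns) onto the top principal components.

    matrix has shape genes x samples. Returns the sample scores and the
    fraction of variance explained by each returned component.
    """
    x = matrix.T                            # samples x genes
    x = x - x.mean(axis=0, keepdims=True)   # center each gene across samples
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variance = s ** 2
    ratio = variance[:n_components] / variance.sum()
    return scores, ratio


z = zscore_rows(expr)            # values one would color in a heat map
scores, ratio = pca_samples(expr)

for tissue, (pc1, pc2) in zip(tissues, scores):
    print(f"{tissue}: PC1={pc1:.2f}, PC2={pc2:.2f}")
print("explained variance ratio:", np.round(ratio, 3))
```

Standardizing per gene keeps highly expressed genes from dominating the heat map colors, while the PCA step centers each gene across samples so that the components reflect differences between tissues rather than overall expression level.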