Author: Chen, Jing-Yi (陳靜誼)
Thesis title: Anti-coronavirus Peptide Prediction Using Contrastive Pretraining (使用對比預訓練預測抗冠狀病毒肽)
Advisor: Chang, Tien-Hao (張天豪)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Number of pages: 42
Keywords (Chinese): 對比學習, 深度學習, 抗冠狀病毒肽
Keywords (English): Contrastive Learning, Deep Learning, Anti-coronavirus Peptide
  • In recent years, the COVID-19 pandemic has significantly affected life worldwide, creating an urgent need for effective treatments. Although various COVID-19 vaccines and therapeutics have been introduced since late 2020, vaccination and existing drugs still carry risks for certain patients, so safer treatment approaches are still needed.

    Peptide-based therapies have attracted wide attention in recent years because of their low toxicity and high target selectivity. Many antiviral peptides (AVPs) have been shown to have antiviral activity and are suitable for preventing or treating viral diseases. AVPs that act against coronaviruses are known as anti-coronavirus peptides (ACVPs) and are regarded as strong drug candidates for treating coronavirus infections. However, validating anti-coronavirus activity through traditional biological experiments is time-consuming and costly. To address this, many studies have applied traditional machine learning or deep learning methods to predict peptides that may have anti-coronavirus activity.

    This study proposes a model for predicting ACVPs. The model uses contrastive learning to encode the features of protein sequences, and the encoded sequences are then fed into a random forest model for prediction. Experiments in this study show that the proposed model performs well on the iACVP dataset, achieving an area under the curve (AUC) of 0.936.
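
    The AUC quoted in the abstract is the area under the ROC curve, which can be read as the probability that a randomly chosen positive peptide receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that definition only, not the evaluation code used in the thesis; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    example is scored higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ACVP, 0 = non-ACVP; scores are classifier outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

    The pairwise form is quadratic in the number of examples, so it suits small illustrations; for larger datasets a rank-based implementation from an established library would normally be preferred.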

    Chapter 1 Introduction
    Chapter 2 Related Work
      2.1 Anti-coronavirus Peptides
      2.2 Anti-coronavirus Peptide Prediction Studies
        2.2.1 PreAntiCoV
        2.2.2 AVPIden
        2.2.3 ENNAVIA
        2.2.4 iACVP
      2.3 Deep Learning
        2.3.1 Convolutional Neural Network (CNN)
        2.3.2 Contrastive Learning
        2.3.3 Word2Vec
        2.3.4 ProtBert
      2.4 Tree-based Models
        2.4.1 Random Forest
    Chapter 3 Methods
      3.1 Data Preprocessing
      3.2 Data Encoding
      3.3 Model Training and Validation Procedure
      3.4 Model Architecture
        3.4.1 Pentapeptide-oriented Pretrained Model
        3.4.2 Downstream Classifier
    Chapter 4 Results
      4.1 Datasets
      4.2 Performance Evaluation Metrics
      4.3 Comparison with Other Methods
      4.4 Ablation Study: Comparison of the Proposed Method with iACVP
        4.4.1 Analysis of Word2Vec in iACVP
        4.4.2 Comparison by Model Architecture and Training Method
      4.5 Ablation Study: Comparison of the Proposed Method with ProtBert
        4.5.1 Analysis of ProtBert
        4.5.2 Comparison by Model Architecture and Training Method
    Chapter 5 Conclusion
      5.1 Conclusion
      5.2 Future Work
    References

    Full-text availability: on campus, open immediately; off campus, open immediately.