Graduate student: | 陳靜誼 Chen, Jing-Yi |
---|---|
Thesis title: | 使用對比預訓練預測抗冠狀病毒肽 Anti-coronavirus Peptide Prediction Using Contrastive Pretraining |
Advisor: | 張天豪 Chang, Tien-Hao |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2023 |
Academic year of graduation: | 111 |
Language: | Chinese |
Pages: | 42 |
Keywords: | Contrastive Learning, Deep Learning, Anti-coronavirus Peptide |
In recent years, the COVID-19 pandemic has had a major impact on life worldwide, creating an urgent need for effective treatments. Although multiple COVID-19 vaccines and therapeutics have been introduced since late 2020, vaccination and existing drugs still carry risks for some patients, so safer treatment options are still needed.
Peptide-based therapies have attracted wide attention in recent years because their low toxicity and high target selectivity make them a promising therapeutic route. Many antiviral peptides (AVPs) have been shown to have antiviral activity and can be used to prevent or treat viral diseases. AVPs that act against coronaviruses are called anti-coronavirus peptides (ACVPs) and are regarded as strong drug candidates for treating coronavirus infections. However, validating anti-coronavirus activity through traditional biological experiments is time-consuming and costly. To address this, many studies have applied traditional machine learning or deep learning methods to predict peptides that may have anti-coronavirus activity.
This study proposes a model for predicting ACVPs. The model uses contrastive learning to encode the features of protein sequences, and the encoded sequences are then fed into a random forest model for prediction. In the experiments conducted in this study, the proposed model performs well on the iACVP dataset, achieving an area under the curve (AUC) of 0.936.
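As a rough illustration of two ideas named in the abstract, the sketch below shows a contrastive loss over paired sequence embeddings and the AUC metric used to report performance. The abstract does not state which contrastive objective the thesis uses, so the SimCLR-style NT-Xent loss here is an assumption, and the function names `nt_xent_loss` and `roc_auc`, the temperature value, and the toy inputs are all hypothetical. The random forest stage is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent contrastive loss (assumed objective, not confirmed by the thesis).

    views_a[i] and views_b[i] are embeddings of two views of the same
    sequence; every other embedding in the batch acts as a negative.
    """
    z = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the matching view
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp over all candidates except i itself.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In a pipeline like the one described, a loss of this kind would be minimized while training the encoder, and `roc_auc` would be applied to the classifier's predicted probabilities on held-out peptides.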