Graduate student: | 鮑弘仁 Pao, Hung-Jen |
---|---|
Thesis title: | 使用基於Transformer的預訓練模型實現加密流量分類 Encrypted Traffic Classification Using Transformer-Based Pre-Trained Models |
Advisor: | 張燕光 Chang, Yeim-Kuan |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Publication year: | 2024 |
Academic year of graduation: | 112 |
Language: | English |
Pages: | 103 |
Chinese keywords: | 深度學習、Transformer、加密流量分類、預訓練模型、遷移訓練 |
English keywords: | Deep learning, Transformer neural networks, Encrypted traffic classification, Pre-trained models, Transfer training |
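With the widespread use of encryption technologies such as VPNs, encrypted packets now account for a significant share of network traffic. This makes network traffic identification both more important and more complex, particularly in applications such as Quality of Service (QoS) and Intrusion Detection Systems (IDS). Traditional traffic classification methods, such as those based on port numbers and deep packet inspection (DPI), are inefficient against encrypted traffic because encryption hides packet contents. Machine learning methods rely on manual feature extraction, which is time-consuming and generalizes poorly, while modern deep learning methods can extract features automatically but perform poorly on small data samples and are computationally expensive.
To address these problems, we propose three innovative model application strategies, named BERT-Packet, BERT-Flow, and ViT-Flow. These methods draw on the success of Transformers in natural language processing and computer vision and use pre-trained models with transfer learning to classify encrypted traffic. BERT-Packet converts packet contents into byte pairs and fine-tunes a pre-trained BERT model to achieve efficient traffic classification, using a warmup strategy to improve model stability. BERT-Flow processes multiple packets at once, using a pre-trained BERT model with added adapter layers to capture the relationships among packets. ViT-Flow converts the data of multiple packets into images and fine-tunes a pre-trained Vision Transformer model to achieve high-accuracy traffic classification. In addition, we propose data augmentation methods for the image inputs to address data scarcity.
Experimental results show that BERT-Packet and ViT-Flow deliver superior performance and accuracy on standard datasets. We also confirm the effectiveness of the data augmentation methods in improving model generalization. By using transfer learning and pre-trained models, we significantly reduce the computational cost of training and inference.
This thesis shows that, with well-designed data preprocessing, off-the-shelf pre-trained models can be successfully applied to the networking domain after transfer training, achieving equal or better accuracy while greatly reducing computational cost. These results demonstrate the potential of pre-trained models in encrypted traffic classification and provide a valuable technical reference for the networking field.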
The widespread use of encryption technologies such as VPNs means that encrypted packets now make up a significant portion of network traffic. This makes network traffic identification both more important and more complex, and traffic classification has become a critical task in network management, especially in applications such as Quality of Service (QoS) and Intrusion Detection Systems (IDS). Traditional traffic classification methods are inefficient when dealing with encrypted traffic because encryption hides the packet contents. Machine learning methods rely on manual feature extraction, which is time-consuming and generalizes poorly, while modern deep learning methods perform poorly on small data samples and have high computational costs.
To address these issues, we propose three strategies for applying pre-trained models, named BERT-Packet, BERT-Flow, and ViT-Flow. These methods build on the success of Transformers in Natural Language Processing and Computer Vision and use transfer learning with pre-trained models to classify encrypted traffic. The BERT-Packet method converts packet contents into byte pairs and fine-tunes a pre-trained BERT model to achieve efficient traffic classification, employing a warmup strategy to enhance model stability. The BERT-Flow method processes multiple packets at once, using a pre-trained BERT model with additional adapter layers to capture the relationships between packets. The ViT-Flow method transforms the data of multiple packets into images and fine-tunes a pre-trained Vision Transformer model to achieve high accuracy in traffic classification. Additionally, we introduce data augmentation methods for image inputs to address data scarcity.
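As a rough illustration of the two input representations described above, the following Python sketch turns one packet into byte-pair tokens and packs several packets into a square grayscale grid. The abstract does not specify the exact scheme, so everything here is an assumption made for illustration: the function names `byte_pair_tokens` and `flow_to_image` are hypothetical, the overlapping stride-1 pairing, the 128-token limit, the 16x16 grid, and the zero padding are placeholder choices, and the thesis's actual preprocessing may differ.

```python
from typing import List


def byte_pair_tokens(payload: bytes, max_tokens: int = 128) -> List[str]:
    """Turn raw packet bytes into overlapping two-byte hex tokens.

    Assumption: pairs overlap with stride 1 (bytes 0-1, 1-2, ...), and the
    sequence is truncated to max_tokens so it fits a fixed-length model input.
    """
    hex_bytes = [f"{b:02x}" for b in payload]
    pairs = [hex_bytes[i] + hex_bytes[i + 1] for i in range(len(hex_bytes) - 1)]
    return pairs[:max_tokens]


def flow_to_image(packets: List[bytes], side: int = 16) -> List[List[int]]:
    """Pack the leading bytes of several packets into a side x side grid.

    Assumption: every packet gets an equal share of the pixels, short packets
    are zero-padded, and each byte becomes one grayscale pixel (0-255).
    """
    total = side * side
    per_packet = total // max(len(packets), 1)
    pixels: List[int] = []
    for pkt in packets:
        chunk = list(pkt[:per_packet])
        chunk += [0] * (per_packet - len(chunk))  # zero-pad short packets
        pixels.extend(chunk)
    pixels = (pixels + [0] * total)[:total]  # fill any remaining pixels
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    print(byte_pair_tokens(bytes([0x16, 0x03, 0x01, 0x02])))
    # ['1603', '0301', '0102']
    print(flow_to_image([b"\x01\x02", b"\xff"], side=2))
    # [[1, 2], [255, 0]]
```

In a pipeline of this kind, the token strings would typically be mapped to vocabulary IDs before being passed to a BERT-style model, and the pixel grid would be resized to the input resolution the Vision Transformer checkpoint expects.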
Experimental results show that BERT-Packet, BERT-Flow, and ViT-Flow demonstrate superior performance and accuracy on standard datasets. Additionally, we confirmed the effectiveness of data augmentation methods in enhancing model generalization capabilities. By using transfer learning and pre-trained models, we significantly reduce the computational cost of training and inference.
This thesis demonstrates that by designing effective data preprocessing methods and successfully applying existing pre-trained models for transfer learning in the network domain, we can achieve comparable or better accuracy while significantly reducing computational costs. These findings highlight the potential of pre-trained models in encrypted traffic classification and provide valuable technical insights for future research and applications in network security.