
Graduate Student: Yang, Yu-En (楊宇恩)
Thesis Title: Optimizing Convolutional Transformer Architectures for Network Intrusion Detection: A Reinforcement Learning Based Neural Architecture Search
Advisor: Lin, Hui-Tang (林輝堂)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2025
Academic Year of Graduation: 113 (ROC calendar; 2024-2025)
Language: English
Number of Pages: 73
Keywords: Neural Architecture Search, Intrusion Detection, Reinforcement Learning, Deep Learning, Transformer, Thompson Sampling
  • As threats in Internet of Things (IoT) environments continue to escalate, intrusion detection systems increasingly require deep learning models that are both highly accurate and efficient. In recent years, Neural Architecture Search (NAS) has emerged as a way to automate neural network design, replacing the traditional process of manually stacking and tuning models. However, applications of NAS to IoT anomaly detection remain immature and face two major challenges. First, the design of architectural components in many studies lacks systematic planning: if components are functionally redundant or lack discriminative power, both search effectiveness and model potential are severely limited. Second, the evaluation stage of NAS often relies on a single training run and overlooks the inherent variability of model performance. This can lead the controller to over- or underestimate an architecture's quality, which biases the search.
    To address these issues, this thesis proposes a reinforcement learning based modular architecture search framework tailored to IoT intrusion detection. We first construct a component library that covers a wide range of feature processing capabilities, incorporating both Convolutional Neural Network (CNN) and Transformer building blocks, so that feature expansion, feature selection, and feature fusion are all supported. A unified encoding scheme is also designed to improve the compatibility and composability of these architectural elements. Furthermore, a performance distribution modeling mechanism is integrated into the reinforcement learning process. By recording the results of each architecture's repeated training runs and applying Thompson Sampling, the framework estimates the underlying potential of an architecture, thereby stabilizing reward computation and improving the controller's decisions. Evaluated on multiple real-world IoT attack datasets, the proposed scheme outperforms existing approaches in the literature in both accuracy and computational resource usage.
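    The performance distribution idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's CTSCB method, whose details are not given in this record; it is a generic Gaussian Thompson Sampling estimate under assumed values. The class name `PerformanceTracker`, the architecture code `"conv3-attn2"`, the prior (`prior_mean`, `prior_var`), and the observation noise (`noise_var`) are all hypothetical choices made for illustration.

```python
import math
import random
from collections import defaultdict


class PerformanceTracker:
    """Hypothetical helper: keeps each architecture's accuracy history and
    draws a reward from a Gaussian posterior over its mean accuracy."""

    def __init__(self, prior_mean=0.5, prior_var=0.25, noise_var=0.01, seed=None):
        # All three distribution parameters are assumptions, not thesis values.
        self.prior_mean = prior_mean
        self.prior_var = prior_var
        self.noise_var = noise_var
        self.history = defaultdict(list)
        self.rng = random.Random(seed)

    def record(self, arch_code, accuracy):
        """Store the accuracy of one training run of the given architecture."""
        self.history[arch_code].append(accuracy)

    def posterior(self, arch_code):
        """Normal-normal conjugate update with known observation noise.
        Returns (posterior mean, posterior variance) of the mean accuracy."""
        scores = self.history.get(arch_code, [])
        precision = 1.0 / self.prior_var + len(scores) / self.noise_var
        mean = (self.prior_mean / self.prior_var
                + sum(scores) / self.noise_var) / precision
        return mean, 1.0 / precision

    def sample_reward(self, arch_code):
        """Thompson Sampling step: draw one plausible mean accuracy."""
        mean, var = self.posterior(arch_code)
        return self.rng.gauss(mean, math.sqrt(var))


# Usage: three noisy training runs of the same (hypothetical) architecture.
tracker = PerformanceTracker(seed=0)
for acc in (0.91, 0.93, 0.92):
    tracker.record("conv3-attn2", acc)

mean, var = tracker.posterior("conv3-attn2")
reward = tracker.sample_reward("conv3-attn2")
```

    In this sketch the posterior variance shrinks as more runs of the same architecture are recorded, so the sampled reward fluctuates less for well-evaluated architectures than for ones trained only once. That is one way a controller's reward could be made less sensitive to a single lucky or unlucky run.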

    Table of Contents

    Abstract (Chinese)
    Abstract
    Acknowledgments
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Overview
      1.2 Internet of Things
      1.3 Deep Learning
      1.4 Neural Architecture Search
      1.5 Motivation
      1.6 Objective
      1.7 Thesis Outline
    Chapter 2 Background and Related Work
      2.1 Background
        2.1.1 Dataset
        2.1.2 Machine Learning-based Intrusion Detection in IoT Environments
      2.2 Related work
        2.2.1 ConvTransformer
        2.2.2 NAS for Network Intrusion Detection
    Chapter 3 Proposed Scheme
      3.1 System Architecture
      3.2 NAS for ConvTransformer
        3.2.1 NAS for ConvTransformer
        3.2.2 Block-Base Design
        3.2.3 Transformer
        3.2.4 Encode Rule
      3.3 Continuous Thompson Sampling with Confidence Bound (CTSCB)
        3.3.1 Thompson Sampling
        3.3.2 Decision Optimization
      3.4 Entropy and Multi-Objective Regularization
    Chapter 4 Performance Evaluation
      4.1 Experiment Environment
      4.2 Experiment Setting
        4.2.1 Experiment Design
        4.2.2 Performance Indices
      4.3 Performance Analysis
    Chapter 5 Conclusion
    Bibliography


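    Full-text availability: on campus, public from 2030-08-22; off campus, public from 2030-08-22.
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.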