
Graduate student: 梁誌祐 (Liang, Chih-Yu)
Thesis title: 一多尺度HOG-SVM即時人臉偵測系統及其硬體實現
A Real-Time Multiscale HOG-SVM Face Detection System and Its Hardware Implementation
Advisors: 陳進興 (Chen, Chin-Hsing)
張志文 (Chang, Wenson)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2025
Academic year of graduation: 113 (ROC calendar)
Language: English
Number of pages: 72
Chinese keywords: 現場可規劃邏輯電路 (FPGA), 多尺度, 即時, HOG, 線性SVM
English keywords: Field Programmable Gate Array (FPGA), multi-scale, real-time, HOG, linear SVM
    With the rapid development of artificial intelligence and image processing technologies, face detection has been widely applied in fields such as camera autofocus, real-time face tracking in video, and the classification and annotation of image datasets, demonstrating significant practical value and application potential.
    This thesis focuses on the design of a real-time, low-cost, multi-scale face detection system implemented on a DE2-115 field-programmable gate array (FPGA) board. A TRDB-D5M camera captures raw image data, which is converted into RGB and grayscale formats and stored in SDRAM. The grayscale image is fed into the detection module, where features are extracted using the Histograms of Oriented Gradients (HOG) algorithm. The features of the entire 640×480 image are divided into multiple 64×64 and 128×128 blocks, and each block is classified by a Support Vector Machine (SVM), pre-trained in software, to determine whether it contains a face. Detected faces are marked with bounding boxes on the next frame, achieving real-time face detection at 60 frames per second with a resolution of 640×480.
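The pipeline described above (gradients, orientation histograms, a linear SVM score per window) can be illustrated with a minimal software sketch. This is not the thesis implementation: the cell size, bin count, window strides, and the single global L2 normalization are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import math

CELL = 8   # assumed cell size in pixels (not stated in the abstract)
BINS = 9   # assumed number of unsigned orientation bins over 0-180 degrees


def hog_descriptor(gray, cell=CELL, bins=BINS):
    """Simplified HOG for one window given as a list of rows of intensities.

    Standard HOG normalizes over overlapping blocks of cells; a single
    global L2 normalization is used here only to keep the sketch short.
    """
    h, w = len(gray), len(gray[0])
    cells_y, cells_x = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    bin_width = 180.0 / bins
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Central differences with clamped borders.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = int(angle // bin_width) % bins
            hist[y // cell][x // cell][b] += magnitude
    flat = [v for row in hist for c in row for v in c]
    norm = math.sqrt(sum(v * v for v in flat)) + 1e-6
    return [v / norm for v in flat]


def svm_score(features, weights, bias):
    """Linear SVM decision value w.x + b; a window counts as a face above a threshold."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def sliding_windows(width, height, size, stride):
    """Yield the top-left corner (x, y) of every size x size window that fits."""
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            yield x, y


# Hypothetical strides: 8 px for 64x64 windows, 16 px for 128x128 windows.
count_64 = sum(1 for _ in sliding_windows(640, 480, 64, 8))
count_128 = sum(1 for _ in sliding_windows(640, 480, 128, 16))
print(count_64, count_128)  # prints "3869 759"
```

Under these assumptions a 64×64 window produces 8 × 8 cells × 9 bins = 576 features, so each classification is a 576-term dot product. The window counts give a rough sense of how many such dot products a 640×480 frame requires at each scale.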
    To evaluate system performance, confusion matrices are used to measure face detection accuracy for different fixed detection window sizes, which guides the tuning of classifier parameters and threshold settings. The False Positives Per Image (FPPI) metric is also adopted to simulate real-world camera input and to analyze the trade-off between the false detection rate and the detection rate. Finally, the system is tested under various conditions, including head tilting, rotation, distance variations, and random movements. The detection results, displayed with bounding boxes on the FPGA output screen, verify the system’s real-time detection capability and robustness across diverse scenarios.
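The two evaluation measures named above can be sketched as follows. The counts and scores are made-up toy values, not results from the thesis, and the function names are hypothetical; the detection rate here is computed per labeled window, which is one possible convention.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def fppi_curve(scores, labels, n_images, thresholds):
    """For each threshold, return (threshold, FPPI, detection rate).

    scores: SVM decision values of the evaluated windows
    labels: 1 if the window really contains a face, else 0
    n_images: number of images the windows were taken from
    """
    positives = sum(labels)
    curve = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and y == 0)
        detection_rate = tp / positives if positives else 0.0
        curve.append((t, fp / n_images, detection_rate))
    return curve


# Toy data (assumed): five windows drawn from two images.
scores = [0.9, 0.8, 0.3, -0.2, -0.5]
labels = [1, 0, 1, 0, 0]
curve = fppi_curve(scores, labels, n_images=2, thresholds=[-1.0, 0.0, 0.5])
for threshold, fppi, rate in curve:
    print(threshold, fppi, rate)
```

Raising the threshold lowers FPPI but eventually lowers the detection rate as well, which is the trade-off the threshold tuning has to balance.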

    Chinese Abstract
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Field Programmable Gate Array
      1.3 Histograms of Oriented Gradients
      1.4 Support Vector Machine
      1.5 Contribution of thesis
      1.6 Thesis Outline
    Chapter 2 Background Knowledge
      2.1 Feature Extraction
        2.1.1 Haar-like Feature
        2.1.2 DNN-based Feature
        2.1.3 Histograms of Oriented Gradients
      2.2 Classifier
        2.2.1 Naive Bayes
        2.2.2 Fully Connected Neural Networks
        2.2.3 Support Vector Machine
      2.3 HOG/SVM
        2.3.1 Gradient Computation
        2.3.2 Orientation Bin Assignment
        2.3.3 Block Normalization
      2.4 Support Vector Machine
    Chapter 3 Software Design and Implementation
      3.1 Software Development Environment and Tools
      3.2 Dataset Preparation
      3.3 HOG Feature Design
        3.3.1 Gradient Computation
        3.3.2 Orientation Bin Assignment
        3.3.3 Block Normalization
      3.4 SVM Classifier Training
      3.5 Inference Model
    Chapter 4 Hardware Design and Implementation
      4.1 Hardware setup
        4.1.1 Development Board
        4.1.2 Camera
      4.2 System Architecture
      4.3 HOG/SVM pipeline design
        4.3.1 Grayscale
        4.3.2 Image pyramid generation
        4.3.3 Gradient Computation
        4.3.4 Orientation Bin Assignment
        4.3.5 Block Normalization
        4.3.6 SVM Classification
        4.3.7 Other details
      4.4 Result Display
        4.4.1 Bounding box
        4.4.2 VGA controller
    Chapter 5 Experimental Results
      5.1 Synthesis results
      5.2 Accuracy
        5.2.1 Confusion Matrix
        5.2.2 False Positives Per Image
      5.3 Visual Examples of Detection
    Chapter 6 Conclusion and Future Work
      6.1 Conclusion
      6.2 Future Work
        6.2.1 Enhancing Multi-Scale Detection Capability
        6.2.2 Hardware/Software Co-Processing Mechanism
        6.2.3 Improving Model Generalization
    References


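    Full-text availability: on campus, public from 2030-08-22; off campus, public from 2030-08-22.
    The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.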