
Author: 張哲榮 (Zhang, Zhe-Rong)
Thesis title: 以FPGA實現基於HOG多尺度之影像中行人偵測 (FPGA Implementation of HOG-based Multi-scale Pedestrian Detection)
Advisor: 王明習 (Wang, Ming-Shi)
Degree: Master
Department: College of Engineering, Department of Engineering Science
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar, 2016-2017)
Language: Chinese
Number of pages: 72
Keywords: Multi-scale, Pedestrian detection, Histograms of oriented gradients (HOG), Linear support vector machine (SVM), Field programmable gate array (FPGA)
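  • Object detection is needed in many fields, including surveillance systems, Advanced Driver Assistance Systems (ADAS), Intelligent Transport Systems (ITS), robots, quadcopters, and portable electronics. In these applications pedestrian detection is an important problem, because human error can injure people directly or indirectly. The distance between a person and the camera varies, and people differ in height, so pedestrians appear at different sizes in an image. Detecting all of them requires a pedestrian detection system that supports multi-scale detection.
    In this thesis we propose a hardware method for multi-scale pedestrian detection. The method has three steps. First, the color image is converted to grayscale and then downsampled, giving grayscale images of three different sizes. Second, HOG (histograms of oriented gradients) features for different detection window sizes are extracted from the three grayscale images. Finally, a linear support vector machine (SVM) classifies the features of each detection window size. Because the computation is too heavy for software to reach real-time detection, the method is implemented as an FPGA (field programmable gate array) circuit design, which achieves real-time multi-scale pedestrian detection in hardware.
    The experimental results show that the system uses 94,374 logic elements (LEs), about 82% of the resources of the Terasic DE2-115 development board. Its average accuracy is about 97%, and it can process 60 frames per second at 640x480 resolution.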

    Pedestrian detection is needed for many vision applications, including surveillance, Advanced Driver Assistance Systems (ADAS), Intelligent Transport Systems (ITS), drones, and robotics. Pedestrians appear at different sizes in an image because their distance from the camera and their height both vary. Detecting all of them requires a multi-scale detector. In this study, a multi-scale pedestrian detector based on histograms of oriented gradients (HOG) is designed and implemented on a field programmable gate array (FPGA). The processing has three stages. First, the input color image is converted to grayscale, and the grayscale image is then downsampled by 2 and by 4. Second, HOG features are extracted from the three grayscale images using different window sizes. Finally, a linear support vector machine (SVM) classifies the extracted features for each window size.
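The three stages can be sketched in software as follows. This is a minimal NumPy sketch under stated assumptions, not the thesis's hardware design: the BT.601 grayscale weights, plain decimation for downsampling, 8x8-pixel cells, 9 unsigned orientation bins, a single hypothetical 32x64-pixel window reused at every scale, one L2 normalization per window, and the random SVM weights are all chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def to_gray(rgb):
    # Assumption: ITU-R BT.601 luma weights; the abstract gives no formula.
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def downsample(gray, factor):
    # Assumption: plain decimation (keep every factor-th pixel), no filtering.
    return gray[::factor, ::factor]

def cell_histograms(gray, cell=8, bins=9):
    # Centered differences [-1, 0, 1]; image borders keep a zero gradient.
    g = gray.astype(np.float64)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:-1] = g[:, 2:] - g[:, :-2]
    gy[1:-1, :] = g[2:, :] - g[:-2, :]
    mag = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, hard-assigned to one bin.
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    rows, cols = g.shape[0] // cell, g.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Magnitude-weighted vote count per orientation bin.
            hist[r, c] = np.bincount(idx[sl].ravel(), weights=mag[sl].ravel(),
                                     minlength=bins)
    return hist

def window_scores(hist, weights, bias, win_cells=(8, 4)):
    # Slide a window one cell at a time and apply the linear SVM: w . x + b.
    wh, ww = win_cells
    scores = []
    for r in range(hist.shape[0] - wh + 1):
        for c in range(hist.shape[1] - ww + 1):
            x = hist[r:r + wh, c:c + ww].ravel()
            x = x / (np.linalg.norm(x) + 1e-6)  # simplified normalization
            scores.append((r, c, float(x @ weights + bias)))
    return scores

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(480, 640, 3)).astype(np.float64)
gray = to_gray(frame)
pyramid = {f: downsample(gray, f) for f in (1, 2, 4)}  # three image sizes

w = rng.standard_normal(8 * 4 * 9)  # hypothetical, untrained SVM weights
bias = 0.0
for factor, img in pyramid.items():
    scores = window_scores(cell_histograms(img), w, bias)
    print(factor, img.shape, len(scores))
```

Reusing one window on the images downsampled by 2 and by 4 corresponds to 64x128 and 128x256 pixel windows in the original frame. The abstract states that different window sizes are used, so the per-scale window dimensions here are a simplification.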
    The experimental results show that the system uses 94,374 logic elements (LEs), about 82% of the total LEs on the Terasic DE2-115 development board. The detection accuracy is about 97% on average, and the processing speed reaches 60 fps (frames per second) at 640x480 resolution.
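The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the DE2-115's Cyclone IV E EP4CE115 device provides 114,480 logic elements; that figure comes from the device documentation, not from this abstract. It also counts active pixels only and ignores blanking intervals.

```python
WIDTH, HEIGHT, FPS = 640, 480, 60   # from the abstract
LES_USED = 94_374                   # from the abstract
LES_TOTAL = 114_480                 # assumption: EP4CE115 logic element count

# Active pixels the pipeline must consume every second.
pixels_per_second = WIDTH * HEIGHT * FPS

# Fraction of the device's logic elements occupied by the design.
utilization = LES_USED / LES_TOTAL

print(f"{pixels_per_second:,} active pixels per second")
print(f"{utilization:.1%} of logic elements used")
```

At one pixel per clock cycle, this implies a pipeline clock of at least about 18.4 MHz for the active pixels alone, and the utilization agrees with the roughly 82% stated in the abstract.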

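    Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Thesis Organization
    Chapter 2 Related Work
        2.1 Basic Concepts of FPGAs
        2.2 Line Buffers
        2.3 Feature Extraction and Detection
            2.3.1 Feature Extraction
            2.3.2 Histograms of Oriented Gradients
            2.3.3 Detection
    Chapter 3 Methodology
        3.1 Pedestrian Recognition Method
            3.1.1 Software Implementation
            3.1.2 SVM Training
        3.2 Pedestrian Detection Hardware Architecture
            3.2.1 Pedestrian Detection Module Architecture
            3.2.2 Module Implementation
        3.3 Overall System Architecture
    Chapter 4 Experimental Results and Discussion
        4.1 Experimental Environment
        4.2 Experimental Results and Data
            4.2.1 Accuracy Comparison of the Standard Algorithm and the Approximation
            4.2.2 FPGA Hardware Resource Usage and Operating Frequency
            4.2.3 Comparison of Software and Hardware Performance
            4.2.4 Experimental Results
    Chapter 5 Conclusion and Future Work
    References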


    Full-text access: on campus, publicly available immediately; off campus, publicly available immediately