
Graduate student: Kuo, Hsiang-Te (郭祥德)
Thesis title: Victim detection in aerial images using deep learning (用深度學習辨識空拍救災影像以協尋受難者)
Advisor: Hou, Ting-Wei (侯廷偉)
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: Chinese
Number of pages: 30
Keywords: Neural network, image recognition, drone
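  • This thesis extracts still frames from footage shot by drones at disaster sites and applies a deep-learning image recognition model to identify victims awaiting rescue. Unlike earlier human-detection studies, whose images are usually taken at eye level, this work adds recognition of people seen from above (bird's-eye view), including victims who may show only part of their limbs in the aerial image. The recognition model is Mask-RCNN with a ResNet101 deep neural network.
    To build the dataset for training the network, photos were taken from the Stanford drone dataset and from screenshots of a realistic city game, and additional photos were shot with a drone by the authors, for a total of 857 photos. Of these, 70% were randomly selected as the training set, 15% as the validation set, and the remaining 15% as the test set. The Mask-RCNN model, trained on a desktop computer, was then evaluated on the test set. Recognition takes 4.52 seconds per image, with an accuracy of 75.54%.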

    This thesis aims to find people in images taken by an aerial camera over a disaster area, in order to assist relief personnel during rescue operations. Most current research on recognizing (or classifying) people in pictures uses pictures taken from the front view; little addresses the top view. This research applies deep learning techniques to that goal. After trying different models, the final model was Mask-RCNN with a ResNet101 deep neural network.
    To train the neural network model, a total of 857 photos were collected from different sources: pictures taken by an aerial camera (taken by our lab members), the Stanford Drone Dataset, and screenshots of a realistic city game (captured from the top view). The pictures were randomly split: 70% for the training set, 15% for the validation set, and the other 15% for the test set. Finally, the Mask-RCNN neural network model was trained and tested on a desktop computer. After tuning various parameters, the final model's precision is 75.54% and its speed is 4.52 seconds per image on a desktop computer.
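
    The random 70% / 15% / 15% split described in the abstract could be sketched roughly as follows. This is only an illustration, not the thesis's actual procedure: the function name split_dataset, the fixed seed, the placeholder file names, and the rounding rule (fractions rounded down, remainder assigned to the test set) are all assumptions, so the exact per-set counts used in the thesis may differ.

```python
import random


def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle items and split them into train / validation / test lists.

    Assumption: fractional counts are rounded down, and whatever is left
    after the train and validation sets goes to the test set.
    """
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 857 collected photos.
photos = [f"photo_{i:04d}.jpg" for i in range(857)]
train, val, test = split_dataset(photos)
print(len(train), len(val), len(test))  # prints: 599 128 130
```

    Under these assumptions every photo lands in exactly one of the three sets, and the test set absorbs the rounding remainder.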

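    Abstract (Chinese)
    Extended Abstract
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Motivation and Objectives
      1.3 Research Contributions
      1.4 Thesis Organization
    Chapter 2. Literature Review
      2-1 Convolutional Neural Networks
      2-2 Pedestrian Detection
      2-3 Mask R-CNN
      2-4 Related Work
    Chapter 3. Experimental Method
      3-1 Dataset
      3-2 Training Method
    Chapter 4. Implementation Results and Discussion
      4-1 Experimental Results
      4-2 Discussion of Results
    Chapter 5. Conclusion and Future Work
      5-1 Conclusion
      5-2 Future Work
    References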

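    Full-text availability: on campus, public from 2025-08-01; off campus, not public.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.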