| Author: | Kuo, Hsiang-Te (郭祥德) |
|---|---|
| Thesis title: | Victim detection in aerial images using deep learning (用深度學習辨識空拍救災影像以協尋受難者) |
| Advisor: | Hou, Ting-Wei (侯廷偉) |
| Degree: | Master |
| Department: | College of Engineering - Department of Engineering Science |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 |
| Language: | Chinese |
| Pages: | 30 |
| Keywords (Chinese): | 神經網路, 影像辨識, 空拍機 |
| Keywords (English): | Neural network, image recognition, drone |
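This thesis takes images captured by drones at disaster sites, extracts still frames from them, and applies a deep learning image recognition model to identify victims awaiting rescue. Unlike previous human detection research, where images are usually taken at eye level, this study adds recognition of people seen from above (bird's-eye view), including victims who may show only part of their limbs in the aerial images. The recognition model is a deep neural network combining Mask-RCNN with ResNet101.
The dataset for training the network model was drawn from the Stanford drone dataset, footage from a realistic city game, and photos we took ourselves with a drone, for a total of 857 photos. Of these, 70% were randomly selected as the training set, 15% as the validation set, and the other 15% as the test set. Finally, the Mask-RCNN model trained on a desktop computer was evaluated on the test set. Recognition takes 4.52 seconds per image, with a precision of 75.54%.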
This thesis aims to find people in images taken by an aerial camera over disaster areas, in order to assist disaster relief personnel during rescue operations. Most existing research on recognizing (or classifying) people in images uses pictures taken from a frontal view; few studies address the top view. This research applies deep learning techniques to that goal. After trying different models, the final model was Mask-RCNN combined with a ResNet101 deep neural network.
To train the neural network model, a total of 857 photos were collected from different sources: pictures taken with an aerial camera by our lab members, the Stanford Drone Dataset, and top-view screenshots from a realistic city game. The pictures were randomly divided: 70% for the training set, 15% for the validation set, and the remaining 15% for the test set. Finally, the Mask-RCNN neural network model was trained and tested on a desktop computer. After parameter tuning, the final model's precision is 75.54% and its recognition speed is 4.52 seconds per image on a desktop computer.
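The random 70/15/15 split described above can be illustrated with a minimal Python sketch. The abstract does not say how the split was implemented, so the function name `split_dataset`, the fixed seed, the truncating rounding, and the placeholder file names are all assumptions made for illustration only.

```python
import random


def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle items and split them into train/validation/test lists.

    Assumption: the fractions are applied with truncation and the test
    set receives whatever remains; the thesis may have rounded differently.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 857 collected photos.
photos = [f"img_{i:04d}.jpg" for i in range(857)]
train, val, test = split_dataset(photos)
print(len(train), len(val), len(test))
```

Under these assumptions the 857 photos divide into 599 training, 128 validation, and 130 test images. Because 857 is not evenly divisible by the stated percentages, the exact counts depend on the rounding rule chosen.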
On-campus access: public from 2025-08-01