
Graduate Student: Yang, Jing-Lune (楊景倫)
Thesis Title: Automatic Orchid Bottle Seedling Image Feature Extraction and Measurement Based on Deep Mask Regions Convolutional Neural Networks (基於深度遮罩式區域卷積神經網路之自動化蘭花瓶苗影像表徵萃取與計算)
Advisor: Wang, Jeen-Shing (王振興)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2019
Graduation Academic Year: 107
Language: Chinese
Number of Pages: 69
Keywords: Orchid Bottle Seedling, Deep Learning, Mask Regions Convolutional Neural Network (Mask R-CNN), Feature Extraction and Measurement, Image Detection, Precise Cultivation
    This thesis aims to develop a Mask R-CNN-based image recognition algorithm and an automatic feature measurement algorithm for orchid bottle seedlings, using artificial intelligence techniques to extract the growth features of the seedlings and thereby achieve the goal of precise cultivation. Orchid bottle seedling images were first collected at an orchid plantation from different shooting angles to serve as the training and testing datasets for the recognition model; the images were then augmented by distortion, and finally annotated to produce label images of the seedling parts as the gold standard for training. The proposed Mask R-CNN-based recognition algorithm combines residual networks (Residual Network, ResNet) of different depths (ResNet-26, ResNet-41, ResNet-50, ResNet-101, and ResNet-152) with either a fully convolutional network (Fully Convolutional Network, FCN) or a U-shaped network (U-Network, UNet), yielding ten Mask R-CNN models as the main feature-extraction tools. These models automatically and effectively recognize the seedling parts (leaves, roots, the white and green parts of root tips, and the yellow and green parts of withered leaves). The experimental results show that ResNet-101-UNet outperforms the other models, achieving an average overall AP of 77.89% with a training time of 199 ms per image. In addition, the proposed feature measurement algorithm computes the length, width, count, and area of the seedling parts recognized by these models. The results show that the leaf area, which is easily affected by the shooting angle, has a relatively high average percentage error of 16.47±6.41%, whereas the root length has a lower average percentage error of 7.28±3.01%; the average errors of the overall feature values are all effectively suppressed. These results validate the feasibility of applying the proposed feature measurement algorithm to compute the feature values of orchid bottle seedling parts. It is hoped that the algorithms can be implemented on a sensing system in the future and deployed in actual fields to realize the precise cultivation of orchid bottle seedlings.
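    The distortion step mentioned above is not specified in detail in this record. A minimal sketch of one plausible augmentation, assuming OpenCV/NumPy and a random perspective warp (the function name, warp model, and jitter range are illustrative, not the thesis's actual settings), is shown below; the label mask is warped with nearest-neighbour interpolation so that label values stay intact.

        import cv2
        import numpy as np

        def warp_augment(image, mask, max_shift=0.05, seed=None):
            """Randomly warp an image and its label mask (illustrative sketch)."""
            rng = np.random.default_rng(seed)
            h, w = image.shape[:2]
            # Original corners and randomly jittered target corners.
            src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
            jitter = rng.uniform(-max_shift, max_shift, size=(4, 2)) * np.array([w, h])
            dst = (src + jitter).astype(np.float32)
            M = cv2.getPerspectiveTransform(src, dst)
            # Bilinear interpolation for the image, nearest neighbour for the mask.
            warped_image = cv2.warpPerspective(image, M, (w, h), flags=cv2.INTER_LINEAR)
            warped_mask = cv2.warpPerspective(mask, M, (w, h), flags=cv2.INTER_NEAREST)
            return warped_image, warped_mask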

    This thesis aims to develop automatic orchid bottle seedling image feature extraction and measurement algorithms based on mask regions convolutional neural networks (Mask R-CNN) for extracting the important growth features of orchid bottle seedlings to reach the goal of precise cultivation. To train and test the Mask R-CNN models, orchid bottle seedling images were captured from different view angles at an orchid plantation factory in southern Taiwan. The original images collected from the factory were first labeled for their outer contours, such as leaves and roots; these contours are called masks. The labeled images were then distorted to increase the diversity of the training and testing data. Finally, the images and their corresponding masks served as the gold standard for network training. A Mask R-CNN-based image detection algorithm was developed to extract the features of orchid bottle seedlings, including leaves, roots, green root tips, white root tips, yellow leaves, and green leaves, effectively and automatically. Ten different Mask R-CNN models were constructed for performance comparison: residual networks (ResNet) of different depths, namely ResNet-26, ResNet-41, ResNet-50, ResNet-101, and ResNet-152, each combined with either a fully convolutional network (FCN) or a U-network (UNet). The experimental results show that ResNet-101-UNet outperforms the other models, achieving an average precision (AP) of 77.89% for feature extraction with a training time of 199 ms per image. In addition to feature extraction, a feature measurement algorithm was developed to measure the features, such as the number of leaves and the length, width, and area of each leaf, from the orchid bottle seedling images detected by the Mask R-CNN models. The experimental results show that the average percentage error of the leaf area measurement is 16.47±6.41%, mainly due to shading or occlusion by other leaves and leaf curling, while the average percentage error of the root length measurement is 7.28±3.01%. The overall average errors of the feature measurements were satisfactory, validating the effectiveness of the proposed methods for feature extraction of orchid bottle seedlings. In the future, we hope these algorithms can be applied in the orchid plantation industry to reach the goal of precise cultivation of orchid bottle seedlings.
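    The ten Mask R-CNN variants above pair custom-depth ResNet backbones with FCN or UNet mask heads, combinations that are not available off the shelf. As a rough stand-in only, the following sketch runs one training step with torchvision's built-in maskrcnn_resnet50_fpn (assuming torchvision 0.13 or later); the class count of 7 (six seedling parts plus background), the dummy annotation, and the optimizer settings are illustrative assumptions, not the thesis's configuration.

        import torch
        from torchvision.models.detection import maskrcnn_resnet50_fpn

        # Stand-in backbone: torchvision only ships ResNet-50-FPN, so this merely
        # approximates the ResNet-101-UNet variant reported in the thesis.
        NUM_CLASSES = 7  # 6 seedling part classes + background (assumed mapping)
        model = maskrcnn_resnet50_fpn(weights=None, num_classes=NUM_CLASSES)

        optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
        model.train()

        # One dummy image with one dummy instance annotation (box, label, mask).
        images = [torch.rand(3, 512, 512)]
        masks = torch.zeros(1, 512, 512, dtype=torch.uint8)
        masks[0, 50:300, 80:200] = 1
        targets = [{
            "boxes": torch.tensor([[80.0, 50.0, 200.0, 300.0]]),  # x1, y1, x2, y2
            "labels": torch.tensor([1]),                          # e.g. "leaf"
            "masks": masks,
        }]

        # In training mode the model returns its loss terms (RPN, classification,
        # box regression, and mask losses); their sum is the total loss.
        loss_dict = model(images, targets)
        loss = sum(loss_dict.values())
        loss.backward()
        optimizer.step()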

    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1  Introduction
        1.1  Research Background and Motivation
        1.2  Literature Review
        1.3  Research Objectives
        1.4  Thesis Organization
    Chapter 2  Experimental Data Collection, Processing, and Cloud Computing Architecture
        2.1  Experimental Setup and Image Data Preprocessing
        2.2  Cloud DGX-1 Computing Architecture
    Chapter 3  Mask R-CNN-Based Orchid Bottle Seedling Image Recognition and Feature Measurement Algorithms
        3.1  Mask Regions Convolutional Neural Network (Mask R-CNN)
            3.1.1  Feature Extraction
            3.1.2  Feature Alignment
            3.1.3  Feature Regression
        3.2  Loss Function
        3.3  Training Parameter Settings
        3.4  Feature Measurement Algorithm for Target Parts of Orchid Bottle Seedling Images
            3.4.1  Count Calculation
            3.4.2  Area Calculation
            3.4.3  Length Calculation
            3.4.4  Width Calculation
    Chapter 4  Experimental Results and Discussion
        4.1  Evaluation Metrics
        4.2  Image Recognition Results and Discussion of the Mask R-CNN Models
            4.2.1  Comparison of the Confusion Matrices of Each Model
            4.2.2  Comparison of the Recognition Accuracy and Training Time of Each Model
            4.2.3  Visualization of the Recognition Results
            4.2.4  Discussion of the Recognition Results of Each Model
        4.3  Numerical Results and Discussion of the Feature Measurement Algorithm
            4.3.1  Count Calculation Results
            4.3.2  Area Calculation Results
            4.3.3  Length Calculation Results
            4.3.4  Width Calculation Results
            4.3.5  Comparison of Errors between Measured Feature Values and Ground-Truth Values
            4.3.6  Discussion of Errors between Measured Feature Values and Ground-Truth Values
    Chapter 5  Conclusions and Future Work
        5.1  Conclusions
        5.2  Future Work
    References
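    Sections 3.4.1 to 3.4.4 cover the count, area, length, and width calculations. For illustration only, the sketch below derives such values from a single predicted binary part mask, assuming OpenCV and scikit-image and a hypothetical mm_per_pixel calibration factor; the thesis's actual procedures, including its skeletonization and bounding-rectangle methods, are not reproduced here.

        import cv2
        import numpy as np
        from skimage.morphology import skeletonize

        def measure_part(binary_mask, mm_per_pixel=1.0):
            """Rough count/area/length/width estimates from a binary part mask."""
            mask = (binary_mask > 0).astype(np.uint8)
            # Count: number of connected components, minus the background label.
            num_labels, _ = cv2.connectedComponents(mask)
            count = num_labels - 1
            # Area: foreground pixel count scaled to physical units.
            area = float(mask.sum()) * mm_per_pixel ** 2
            # Length: skeleton pixel count as a proxy for medial-axis length,
            # suited to elongated parts such as roots.
            length = float(skeletonize(mask.astype(bool)).sum()) * mm_per_pixel
            # Width: short side of the minimum-area bounding rectangle of the
            # largest contour.
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            width = 0.0
            if contours:
                largest = max(contours, key=cv2.contourArea)
                (_, _), (rect_w, rect_h), _ = cv2.minAreaRect(largest)
                width = min(rect_w, rect_h) * mm_per_pixel
            return {"count": count, "area": area, "length": length, "width": width}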


    Date of public release: 2024-08-22