| Graduate Student | 林均 (Lin, Jiun) |
|---|---|
| Thesis Title | 應用於藥局機器人之裸藥藥盒偵測與處方箋辨識系統 (Drug Pills/Boxes Detection and Prescription Recognition System for Pharmacy Robot) |
| Advisor | 王駿發 (Wang, Jhing-Fa) |
| Degree | Master |
| Department | Department of Electrical Engineering, College of Electrical Engineering and Computer Science (電機資訊學院 - 電機工程學系) |
| Year of Publication | 2018 |
| Graduation Academic Year | 106 (ROC calendar) |
| Language | English |
| Pages | 46 |
| Keywords (Chinese) | 裸藥偵測、藥盒偵測、處方箋辨識、深度卷積神經網路、物件偵測、藥局機器人 |
| Keywords (English) | Drug Pills Detection, Drug Boxes Detection, Prescription Recognition, Deep Convolution Neural Network, Object Detection, Pharmacy Robot |
This work develops the visual capabilities of a pharmacy robot, proposing a drug pill and drug box detection system and a prescription recognition system based on deep convolutional neural networks. The pill detection system helps pharmacists identify multiple drug categories in an image, the box detection system lets customers conduct drug consultations, and the prescription system automatically recognizes the information on a prescription to reduce the time pharmacists spend entering it. The pill and box detection system is divided into two parts. The first part is the object detection system: a deep convolutional neural network extracts features from the image, the feature maps are built into a feature pyramid, and a bounding-box regression model and a classification model output the correct class and location of each object. The second part is the pill detection system, which takes the pill locations output by the object detection system and uses a pill-classification deep convolutional neural network to output the correct class. In the prescription recognition system, this thesis uses text detection to locate the text on a prescription, merges the text into blocks, detects and groups the National Health Insurance codes on the prescription, and establishes a general method to recognize information such as the drug, route of administration, usage, dosage, total quantity, and number of days. Finally, testing on the database proposed in this study, pill detection achieves an average top-1 accuracy of 79.4%, top-3 of 88.3%, and top-5 of 91.8%; box detection achieves an average accuracy of 93.5%; and prescription recognition achieves an average accuracy of 92.4%. Real-time tests further confirm that the proposed system can perform recognition in real time.
This thesis proposes a drug pill and drug box detection system and a prescription recognition system, developed as the visual system of a pharmacy robot. The drug pill detection system helps pharmacists identify multiple kinds of pills in an image, the drug box detection system allows customers to conduct drug consultations, and the prescription system automatically extracts the information on a prescription to reduce the time pharmacists spend entering it. The pill and box detection system is divided into two parts. The first part is the object detection system: a deep convolutional neural network extracts features and constructs a feature pyramid with stronger semantics, and a regression submodel and a classification submodel then output the correct category and position of each object. The second part is the pill detection pipeline, which takes the pill positions output by the object detection system and applies a deep convolutional neural network to output the pill type. In the prescription recognition system, this thesis uses text detection to find the information on the prescription, then detects the National Health Insurance codes and groups the information; the proposed system identifies information such as the medicine, route of administration, usage, dosage, quantity, and number of days on the prescription. On the drug pill database proposed in this study, pill detection reaches a top-1 accuracy of 79.4%, a top-3 accuracy of 88.3%, and a top-5 accuracy of 91.8%; drug box detection reaches 93.5% accuracy and prescription recognition reaches 92.4% accuracy.
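No code appears on this record page, so the following is a minimal sketch, not the thesis implementation, of the two-stage pill pipeline described above: an object detector built on a feature pyramid proposes pill boxes, each cropped region is re-classified by a dedicated pill classifier, and results are scored by top-k accuracy (the metric behind the 79.4% / 88.3% / 91.8% figures). The `detector` and `pill_classifier` callables are hypothetical stand-ins for the thesis models, assumed to return boxes with scores and a per-class probability vector, respectively.

```python
import numpy as np

def detect_and_classify_pills(image, detector, pill_classifier, k=5, score_thresh=0.5):
    """Two-stage pipeline: detect pill boxes, then classify each cropped pill region."""
    results = []
    # Stage 1: generic detector (feature pyramid + box-regression and classification heads).
    boxes, scores = detector(image)                # assumed: Nx4 boxes, N confidence scores
    for box, score in zip(boxes, scores):
        if score < score_thresh:
            continue
        x1, y1, x2, y2 = (int(v) for v in box)
        crop = image[y1:y2, x1:x2]                 # image is an HxWxC array
        # Stage 2: fine-grained pill classifier on the cropped region.
        probs = pill_classifier(crop)              # assumed: per-class probability vector
        topk = np.argsort(probs)[::-1][:k]         # indices of the k most likely pill classes
        results.append(((x1, y1, x2, y2), topk.tolist()))
    return results

def topk_accuracy(topk_predictions, labels, k):
    """Fraction of samples whose true label appears among the top-k predictions."""
    hits = sum(1 for topk, y in zip(topk_predictions, labels) if y in topk[:k])
    return hits / max(len(labels), 1)
```

Splitting detection from classification keeps the detector generic while a separate network handles the many fine-grained pill classes, matching the two-part design described in the abstract. For the prescription side, the abstract mentions grouping detected text by National Health Insurance (NHI) code; the sketch below assumes text detection already yields (text, y-coordinate) pairs and uses an illustrative, not official, code pattern.

```python
import re

# Illustrative pattern only; the real NHI drug-code format may differ.
NHI_CODE = re.compile(r"^[A-Z]{1,2}\d{8,9}$")

def group_by_nhi_code(text_lines):
    """Group detected prescription lines under the closest preceding NHI drug-code row."""
    groups, current = {}, None
    for text, y in sorted(text_lines, key=lambda item: item[1]):  # top-to-bottom order
        code = next((tok for tok in text.split() if NHI_CODE.match(tok)), None)
        if code:
            current = code
            groups.setdefault(current, [])
        elif current is not None:
            groups[current].append(text)  # route, usage, dosage, quantity, days, ...
    return groups
```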
On-campus access: full text available from 2023-08-31.