| Field | Value |
|---|---|
| Student | Chang, Po-Chun (張博鈞) |
| Thesis Title | AI-based High Precision Medical Object Detection for MR Glasses (應用於MR眼鏡之AI高精度醫療物件偵測) |
| Advisor | Yang, Jar-Ferr (楊家輝) |
| Degree | Master |
| Department | Institute of Computer & Communication Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication | 2023 |
| Graduation Academic Year | 111 |
| Language | English |
| Pages | 39 |
| Keywords | deep learning, MR glasses, medical applications, convolutional neural networks, object detection |
With the development of the metaverse, integrating the virtual world into real life has become a trend for creative applications. In recent years, mixed reality (MR) and virtual reality (VR) technologies have advanced significantly, and their applications can be seen in many fields. In this thesis, we propose a system that assists doctors in performing surgery in real time through MR glasses, providing surgical simulation and training. The system operates through the proposed medical object detection (MOD) network and consists of two main modules: the network module and the MR glasses module. In the network module, we introduce an object-aware network (OANet) to optimize the detection flow of CenterNet, and we propose a lightweight hourglass network together with the half cross adaptive fusion (HCAF) module to improve the detection of the scalpel tip, the most important object during surgery. In the MR glasses module, the detection results of the network are processed: the module can overlay a virtual liver onto the real liver in the physical world, display it through the MR glasses, and show alerts for the detected scalpel tip. Experimental results show that the proposed network detects the scalpel tip more accurately in real time and, combined with the MR glasses, can estimate the distance between the scalpel tip and the liver vessels.
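The abstract describes a proximity check between the detected scalpel tip and the liver vessels, with alerts shown on the MR glasses. The sketch below illustrates that step only, assuming the tip and vessel landmarks are already available in a common 3D coordinate frame; the function names, the point-list representation of vessels, and the 5 mm threshold are illustrative assumptions, not the thesis's actual implementation.

```python
# Hypothetical sketch of the alert step in the MR glasses module:
# given a detected scalpel-tip position and sampled vessel landmark
# points (same coordinate frame, units in mm), raise an alert when
# the tip comes within a safety threshold of any vessel point.
import math

def min_distance(tip, vessel_points):
    """Smallest Euclidean distance from the tip to any vessel point."""
    return min(math.dist(tip, p) for p in vessel_points)

def proximity_alert(tip, vessel_points, threshold_mm=5.0):
    """Return (alert, distance): alert is True when the tip is
    closer to a vessel than the threshold."""
    d = min_distance(tip, vessel_points)
    return d < threshold_mm, d

# Example: three sampled points along a vessel and one tip detection.
vessels = [(10.0, 0.0, 0.0), (12.0, 1.0, 0.0), (14.0, 2.5, 0.0)]
alert, d = proximity_alert((11.0, 0.5, 3.0), vessels)
```

In a real pipeline this check would run once per detection frame, and the MR glasses module would render the warning only while `alert` stays true, to avoid flicker.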
On-campus access: available from 2028-07-31.