| Graduate Student: | 蕭暉樺 Hsiao, Hui-Hua |
|---|---|
| Thesis Title: | 增強 YOLACT 框架以實現胸腔 X 光片中氣管導管與氣管隆突之高效檢測 Enhancing Yolact-based Framework for Efficient Endotracheal Tube and Carina Detection in Portable Supine Chest Radiograph |
| Advisor: | 陳奇業 Chen, Chi-Yeh |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Medical Informatics |
| Year of Publication: | 2023 |
| Graduation Academic Year: | 111 (ROC calendar) |
| Language: | English |
| Pages: | 43 |
| Keywords: | Endotracheal intubation, Instance segmentation, Feature enhanced module, Deep learning |
Tracheal intubation plays a crucial role in emergency medical care by assisting patients with respiratory distress. However, correct tube placement must be confirmed to minimize complications. One way to assess tube position is to measure the distance between the tube tip and the carina, the point where the trachea bifurcates. Portable X-ray machines are commonly used in intensive care units (ICUs) to obtain images for this purpose. Existing convolutional neural network (CNN) approaches to detecting tube position face several challenges, however: state-of-the-art methods rely on complex models and post-processing algorithms that reduce overall detection speed, and the portable X-ray machines used in ICUs are prone to occlusion and lower image quality, further compromising detection accuracy.

This thesis pursues two objectives: detecting the tube tip and carina more efficiently, and developing a method for filtering out low-quality images. The proposed architecture combines YOLACT with a large-kernel feature enhancement module (LSKFEM). Within LSKFEM, the LSK module uses large convolutional kernels to capture a wider field of view, while the feature enhancement module (FEM) strengthens feature extraction. The state-of-the-art post-processing algorithm is also simplified to achieve higher processing speed.

The method is evaluated on a dataset provided by National Cheng Kung University Hospital. It achieves 89.64% accuracy in classifying whether tube placement is suitable. The mean error of the predicted tip-to-carina distance is 5.172 ± 5.479 mm, and the mean errors of the predicted tube-tip and carina positions are 4.437 ± 4.729 mm and 4.192 ± 3.467 mm, respectively. The system runs at 5.3 frames per second (FPS), and its effectiveness is further verified on an external dataset.

In summary, this thesis presents an improved approach for efficient real-time detection of the tube tip and carina. It surpasses previous methods in accuracy while addressing challenges specific to the ICU setting: the combination of YOLACT and LSKFEM enhances feature extraction, and the simplified post-processing algorithm enables faster processing.
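The placement check described in the abstract reduces to converting predicted tip and carina pixel coordinates into a physical distance and comparing it against a safe range. A minimal sketch of that step, where the function names, the 0.2 mm pixel spacing, and the 30-70 mm "suitable" band are illustrative assumptions, not values taken from the thesis:

```python
import math

def tip_carina_distance_mm(tip_px, carina_px, pixel_spacing_mm):
    """Euclidean distance between the predicted tube tip and carina,
    converted from pixels to millimetres via the image's pixel spacing."""
    dx = (tip_px[0] - carina_px[0]) * pixel_spacing_mm
    dy = (tip_px[1] - carina_px[1]) * pixel_spacing_mm
    return math.hypot(dx, dy)

def placement_suitable(distance_mm, low_mm=30.0, high_mm=70.0):
    """Flag the placement as suitable when the tip sits within a safe
    distance band above the carina (bounds here are illustrative)."""
    return low_mm <= distance_mm <= high_mm

# Hypothetical predictions on a radiograph with 0.2 mm pixel spacing:
d = tip_carina_distance_mm((512, 400), (512, 640), 0.2)
print(round(d, 1))            # 48.0
print(placement_suitable(d))  # True
```

A real pipeline would read the spacing from the radiograph's DICOM metadata rather than hard-coding it, and the decision band would follow clinical guidance.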