Graduate student: | Chung, Wei-Chi (仲唯琪) |
---|---|
Thesis title: | Cross-Domain Acne Scar Classification Method based on Transfer Learning Approach (基於遷移學習之跨域痤瘡疤痕分類方法) |
Advisor: | Chen, Mu-Yen (陳牧言) |
Degree: | Master |
Department: | College of Engineering - Department of Engineering Science |
Year of publication: | 2024 |
Academic year of graduation: | 112 (ROC calendar) |
Language: | Chinese |
Pages: | 60 |
Keywords (Chinese): | 遷移學習, 影像分類, 目標檢測, 深度學習 |
Keywords (English): | Transfer learning, Image classification, Object detection, Deep learning |
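According to Global Burden of Disease (GBD) statistics, 9.4% of the world's population suffers from acne vulgaris, making it the eighth most prevalent disease worldwide. Because it most often affects adolescents and young adults aged 12 to 24, it is colloquially known as "youth pimples" (青春痘). More than 90% of teenagers are troubled by acne, dermatology appointments are hard to come by, and not everyone receives timely and adequate treatment. With the rapid growth of the Internet, many patients also upload photos of their skin condition to social media in search of a diagnosis. Patients eager for quick results who trust folk remedies and treat the affected area improperly may instead be left with acne scars.
Therefore, this study proposes a method that automatically extracts and classifies acne scar regions. Artificial intelligence detects the scar regions and predicts their category, helping patients assess their own condition and promptly adopt the correct treatment, which prevents the condition from worsening and helps them recover from acne scars sooner. The research framework has three stages. The first stage trains the object detection model YOLOv8 (You Only Look Once version 8) to extract the ROI (Region of Interest), framing the affected area in each image; this preserves scar features while removing objects that could interfere with training. It also ensures that every image contains the key features, so that more flexible strategies can be applied during the subsequent data augmentation. The second stage applies data augmentation to the extracted ROIs, using a variety of image transformations to give the model more effective data, and compares the effects of offline augmentation and online augmentation on the results. The final stage is transfer learning (TL): because the dataset used in this thesis has a limited number of samples, pre-trained models such as VGG16, ResNet50, SwinTransformer-Tiny, and ConvNeXt-Tiny are fine-tuned. Because the similarity between the ImageNet and ScarNet datasets is low, making this a cross-domain problem, this thesis adopts a cosine similarity classifier to improve classification accuracy. Experimental results show that the combination of online augmentation, ConvNeXt-Tiny, and a linear classifier achieves the highest accuracy of 75%.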
According to Global Burden of Disease (GBD) statistics, 9.4% of the world's population suffers from acne vulgaris, making it the eighth most prevalent disease globally. It most commonly affects adolescents and young adults aged 12 to 24, and more than 90% of teenagers experience acne. Because demand is high, dermatology appointments are often hard to obtain, and not everyone receives timely treatment. With the rapid development of the Internet, many patients turn to social media and upload photos of their skin condition to seek a diagnosis. However, the mix of accurate and inaccurate information on these platforms can lead patients to take inappropriate measures that worsen their condition.
This study proposes an automated pipeline that detects and classifies acne scar regions using deep learning. The goal is to help patients assess their own condition and take appropriate treatment promptly, preventing the condition from worsening and supporting faster recovery from acne scars. The research framework is divided into three stages. The first stage trains the object detection model YOLOv8 (You Only Look Once version 8) to extract the region of interest (ROI), framing the affected area in each image. This retains scar features and removes objects that may interfere with training, ensuring that every image contains the key features so that more flexible augmentation strategies can be applied afterwards.
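The ROI extraction step can be illustrated with a minimal sketch. It assumes the detector has already produced bounding boxes in pixel `(x1, y1, x2, y2)` form; the function name `crop_rois`, the `pad` parameter, and the example image and boxes are hypothetical and are not taken from the thesis.

```python
import numpy as np

def crop_rois(image, boxes, pad=0):
    """Crop each (x1, y1, x2, y2) box from an H x W x C image, clamped to the image bounds."""
    height, width = image.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Expand the box by `pad` pixels, then clamp to valid coordinates.
        left = max(int(x1) - pad, 0)
        top = max(int(y1) - pad, 0)
        right = min(int(x2) + pad, width)
        bottom = min(int(y2) + pad, height)
        if right > left and bottom > top:  # skip boxes that end up empty
            crops.append(image[top:bottom, left:right].copy())
    return crops

# Hypothetical example: a blank 100 x 120 RGB image and two made-up boxes.
image = np.zeros((100, 120, 3), dtype=np.uint8)
boxes = [(10, 20, 50, 60), (100, 80, 140, 120)]  # the second box runs past the border
rois = crop_rois(image, boxes, pad=5)
print([roi.shape for roi in rois])
```

Clamping keeps crops valid when a predicted box extends beyond the image border, which can plausibly happen for lesions near the edge of a photo.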
In the second stage, the extracted ROIs are augmented using a variety of image transformations, providing the model with more effective training data. The study compares the impact of offline and online augmentation on the results.
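The difference between the two augmentation modes can be sketched as follows. This is an illustrative assumption, not the thesis's actual transform set: the flips and 90-degree rotations, the function names, and the toy data are all hypothetical.

```python
import numpy as np

def random_augment(image, rng):
    """Apply a random horizontal flip and a random multiple-of-90-degree rotation."""
    if rng.random() < 0.5:
        image = np.fliplr(image)
    return np.rot90(image, k=int(rng.integers(0, 4)))

def offline_augment(images, copies, rng):
    """Offline: build an enlarged, fixed dataset once, before training starts."""
    augmented = list(images)
    for image in images:
        augmented.extend(random_augment(image, rng) for _ in range(copies))
    return augmented

def online_batches(images, epochs, rng):
    """Online: keep the dataset size and re-draw the transforms every epoch."""
    for _ in range(epochs):
        yield [random_augment(image, rng) for image in images]

rng = np.random.default_rng(0)
images = [np.arange(16).reshape(4, 4) + i for i in range(3)]  # toy square "images"

offline_set = offline_augment(images, copies=2, rng=rng)
print(len(offline_set))  # 3 originals + 3 * 2 augmented copies

for epoch, batch in enumerate(online_batches(images, epochs=2, rng=rng)):
    print(epoch, len(batch))  # same sample count each epoch, freshly transformed
```

Offline augmentation fixes the augmented samples in advance and enlarges the stored dataset, whereas online augmentation leaves the dataset size unchanged and exposes the model to different variants in each epoch.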
The final stage employs transfer learning because the ScarNet dataset contains a limited number of samples. Pre-trained models such as VGG16, ResNet50, SwinTransformer-Tiny, and ConvNeXt-Tiny are fine-tuned to improve feature extraction and classification accuracy. Because ImageNet and ScarNet are dissimilar, this is a cross-domain problem, and a cosine similarity classifier is adopted to address it. Experimental results show that combining online augmentation with the ConvNeXt-Tiny model and a linear classifier achieves the highest accuracy of 75%.
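To make the two classifier heads concrete, the sketch below contrasts a linear head with a cosine similarity head on toy features. It is a minimal illustration under stated assumptions: the `scale` factor, the `eps` term, and the example values are hypothetical, and the thesis's actual head design may differ.

```python
import numpy as np

def linear_logits(features, weights, bias):
    """Linear head: dot product between features and class weights, plus a bias."""
    return features @ weights.T + bias

def cosine_logits(features, weights, scale=10.0, eps=1e-8):
    """Cosine head: scaled cosine similarity between L2-normalized features and class weights."""
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + eps)
    w = weights / (np.linalg.norm(weights, axis=1, keepdims=True) + eps)
    return scale * (f @ w.T)

# Two toy feature vectors pointing in the same direction with different magnitudes.
features = np.array([[3.0, 4.0], [30.0, 40.0]])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])  # one weight vector per class
bias = np.zeros(2)

lin = linear_logits(features, weights, bias)
cos = cosine_logits(features, weights)
print(lin)  # logits grow with the feature magnitude
print(cos)  # both rows agree, since only the direction matters
```

Because the cosine head discards feature magnitude, its logits are bounded by `scale`, which is one commonly cited reason for trying it when features come from a backbone pre-trained on a different domain.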