| Graduate student: | 林星宇 Lin, Xing-Yu |
|---|---|
| Thesis title: | 基於資料毀損之資料增強方式及通道空間注意力機制應用於跨域少樣本學習 Data Augmentation Based on Data Corruption and Channel Spatial Attention Mechanism for Cross-Domain Few-Shot Learning |
| Advisor: | 楊竹星 Yang, Chu-Sing |
| Co-advisor: | 謝錫堃 Shieh, Ce-Kuen |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | Chinese |
| Number of pages: | 44 |
| Chinese keywords: | 跨域少樣本學習, 通道空間注意力機制, 圖像損毀 (cross-domain few-shot learning, channel-spatial attention mechanism, image corruption) |
| Foreign-language keywords: | cross-domain few-shot learning, attention mechanism, image corruption |
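In conventional supervised learning, adding a new class usually requires retraining the entire network, and as the amount of data grows, training time and memory usage keep rising. In recent years, few-shot classification methods have been proposed to address this problem: they let a model recognize classes it was not trained on, so the model does not need to be retrained and training cost is reduced. However, few-shot classification requires training on a dataset of the same type; when the model is used to recognize classes from a different dataset, performance falls short of expectations. Recent work has therefore proposed cross-domain few-shot learning, which aims to train a single network model that generalizes to most datasets. Training is usually divided into two stages: 1) a pre-training stage and 2) a meta-training stage. The pre-training stage trains a feature extractor through conventional supervised learning. The meta-training stage applies a few-shot classification method, using a meta-learning algorithm to train a classifier that classifies the features produced by the feature extractor. In this study, channel and spatial attention mechanisms are added to the feature extractor during pre-training to improve its feature extraction ability. During meta-training, samples produced by a Fourier-transform-based image corruption algorithm are added. The algorithm scales the phase and amplitude obtained from the Fourier transform, changing the texture and semantics of the image so that the classifier learns from more diverse images and generalizes better. Accuracy is compared against previous studies under the same settings, and the experiments show that the proposed method achieves better performance on most of the test sets.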
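The Fourier-based corruption described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function name `fourier_corrupt` and the scalar factors `amp_scale` and `phase_scale` are assumptions made for illustration, and the actual algorithm may choose its scaling factors differently (for example per frequency or at random).

```python
import numpy as np


def fourier_corrupt(image, amp_scale=1.0, phase_scale=1.0):
    """Rescale the amplitude and phase spectra of an image.

    image: float array of shape (H, W) or (H, W, C).
    amp_scale, phase_scale: hypothetical scalar factors (assumption).
    """
    image = np.asarray(image, dtype=np.float64)
    # 2-D FFT over the two spatial axes; channels are handled independently.
    spectrum = np.fft.fft2(image, axes=(0, 1))
    amplitude = np.abs(spectrum) * amp_scale
    phase = np.angle(spectrum) * phase_scale
    # Recombine the scaled amplitude and phase, then return to pixel space.
    corrupted = np.fft.ifft2(amplitude * np.exp(1j * phase), axes=(0, 1))
    # Scaling the phase can break conjugate symmetry, so keep the real part.
    return np.real(corrupted)
```

With both factors left at 1.0 the image is returned unchanged up to floating-point error, which makes a convenient sanity check. In this sketch, a corrupted copy of each training image would be generated with other factor values and added to the meta-training samples.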
For the cross-domain few-shot learning task, this study employs a two-stage training approach: 1) a pre-training stage and 2) a meta-training stage. During the pre-training stage, ResNet10 is used as the feature extractor, and channel and spatial attention mechanisms are incorporated into each residual block to improve feature extraction. In the meta-training stage, in addition to the original few-shot classification samples, samples generated by a Fourier-transform-based image corruption algorithm are used for training to improve the classifier's generalization capability. Experiments comparing the method with previous studies across several datasets show that it outperforms them on the majority of datasets.
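As a rough illustration of channel and spatial attention applied to a feature map, the sketch below follows the general pattern of the CBAM module (channel attention first, then spatial attention). The abstract does not state the exact module design, so the ordering, the function names, and the weights `w1`, `w2`, and `kernel` are assumptions. In a real network these weights would be learned and the module would sit inside each residual block.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C // r, C); w2: (C, C // r). Returns (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))  # global average pooling per channel
    mx = x.max(axis=(1, 2))    # global max pooling per channel

    def mlp(v):
        # Shared two-layer MLP with a ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k) with odd k. Returns (1, H, W)."""
    # Pool across channels, giving one average map and one max map.
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    # Plain "same" convolution written as explicit loops for clarity.
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None]


def channel_spatial_attention(x, w1, w2, kernel):
    """Reweight channels first, then spatial positions."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, kernel)
```

Both attention maps lie in (0, 1), so the module only rescales the input features and never changes their shape, which is what allows it to be dropped into an existing residual block.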