| Graduate student: | Shih, Chi-Ruei (施啟瑞) |
|---|---|
| Thesis title: | A Data Augmentation Method: Attentive Region Assignment Mix for Few-Shot Classification |
| Advisor: | Tsai, Pei-Hsuan (蔡佩璇) |
| Degree: | Master |
| Department: | Institute of Manufacturing Information and Systems, College of Electrical Engineering and Computer Science |
| Year of publication: | 2022 |
| Graduation academic year: | 110 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 31 |
| Keywords: | data augmentation, few-shot learning, image classification |
With the development of technology, building training models on terminal devices has become a trend. Because the samples such devices collect come from a single, limited scene, there is growing interest in how few-shot learning models make effective use of limited data and adapt quickly to new information. Although existing few-shot learning methods perform well at classifying the categories in the training set, these strategies do not generalize effectively to unseen new samples, because the model does not learn enough critical discriminative features for each sample.
To this end, this study proposes ARAMix, a new and simple data augmentation method that delivers competitive performance at almost no additional training cost. ARAMix is embedded in an intermediate hidden layer of the model and, guided by multi-scale attention maps generated from the learned spatial features, mixes samples to synthesize new ones and increase data diversity; this largely avoids introducing noise and improves learning efficiency. Notably, while improving accuracy, ARAMix requires neither a pretrained model nor continual feature extraction to obtain representative feature regions, and it can be applied to a variety of networks. This near-zero additional training cost makes it better suited than more complex methods to terminal devices with limited hardware.
In this study, ARAMix is added to an architecture that ranks among the existing state-of-the-art (SOTA) methods and is validated on three benchmark datasets. Experiments show that adding ARAMix improves accuracy on CifarFS, mini-ImageNet, and FC100, confirming the effectiveness of the proposed method.
Recently, adaptable and intelligent technologies have come into increasing use in many different fields. Building training models on terminal devices has become a growing trend, considering data privacy and communication efficiency. However, the amount of data produced by a single device in a particular scene is quite limited, and the samples are sparse. Accordingly, few-shot learning, in which a deep learning model makes effective use of limited data and rapidly adapts to new information, has attracted increasing interest. Although existing few-shot learning approaches do exceptionally well at classifying categories within the training dataset, these strategies do not generalize effectively to unseen new datasets, since the model does not learn enough critical discriminative features for each sample. Therefore, this research proposes ARAMix, a new and simple data augmentation method that offers competitive performance with minimal additional training cost. ARAMix is embedded in a hidden layer of the model and uses the learned spatial feature representations to generate new samples, increasing the diversity of the data. This makes the most of the limited data in few-shot classification and improves learning efficiency. Notably, while enhancing accuracy, ARAMix requires neither a pretrained model nor ongoing feature extraction to obtain discriminative feature regions, and it is applicable to various training networks; these characteristics make it better suited than complex methods to terminal devices with limited hardware performance. In this paper, ARAMix is added to an existing architecture for few-shot classification, and experiments show improved accuracy on the few-shot benchmark datasets mini-ImageNet, CifarFS, and FC100.
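The thesis does not include an implementation, but the abstract describes attention-guided, CutMix-style mixing applied to intermediate feature maps. Below is a minimal sketch in that spirit, assuming PyTorch; the function name `attentive_region_mix`, the channel-mean attention map, and the `grid`/`k` parameters are illustrative assumptions, not the thesis's actual method.

```python
import torch
import torch.nn.functional as F

def attentive_region_mix(features, labels, grid=4, k=4):
    """Hypothetical sketch of attention-guided feature-level mixing.

    features: (B, C, H, W) activations from an intermediate layer.
    labels:   (B,) integer class labels.
    Returns mixed features and (labels_a, labels_b, lam) for a
    CutMix-style interpolated loss.
    """
    B, C, H, W = features.shape
    perm = torch.randperm(B, device=features.device)      # partner samples

    # Cheap attention map: channel-wise mean, pooled to a coarse grid.
    attn = features.mean(dim=1, keepdim=True)             # (B, 1, H, W)
    attn = F.adaptive_avg_pool2d(attn, grid).flatten(1)   # (B, grid*grid)

    # Indices of the k most attentive cells of each partner sample.
    top = attn[perm].topk(k, dim=1).indices               # (B, k)

    # Binary mask marking those cells, upsampled back to (H, W).
    mask = torch.zeros(B, grid * grid, device=features.device)
    mask.scatter_(1, top, 1.0)
    mask = mask.view(B, 1, grid, grid)
    mask = F.interpolate(mask, size=(H, W), mode="nearest")

    # Paste the partner's attentive regions into each sample.
    mixed = features * (1 - mask) + features[perm] * mask
    lam = 1 - k / (grid * grid)                           # label mixing ratio
    return mixed, (labels, labels[perm], lam)
```

A training step would then forward `mixed` through the remaining layers and compute a CutMix-style interpolated loss, e.g. `lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b)`.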