
Author: 林旻弘 (Lin, Min-Hong)
Title: A Few-Shot Object Re-Identification Method with Feature Reparameterization and H&C Metric Learning (運用特徵重參數化與H&C度量學習之少樣本物件重識別方法)
Advisor: 郭耀煌 (Kuo, Yau-Hwang)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2020
Graduation academic year: 108 (2019-2020)
Language: English
Pages: 85
Keywords: Object Re-Identification, Few-Shot Learning, Metric Learning, Visual Recognition
    With the rapid development of artificial intelligence in recent years, many research institutes and companies have invested substantial resources in developing products and services for AI-related application areas such as unmanned stores, smart retail, and video surveillance, and a variety of advanced technologies have emerged, among which key computer vision techniques play a particularly important role. In many application scenarios, object re-identification is a crucial component, for example person tracking in unmanned stores and video surveillance for crime prevention. In recent years, many object re-identification methods trained on large-scale datasets have been proposed and achieve very good results on the re-identification task. However, when only a small amount of data is available, the performance of these methods drops considerably.
    In this thesis, we propose an object re-identification method based on few-shot learning, so that re-identification can still achieve good performance when the amount of data is limited. Our method improves few-shot performance by enhancing both the generalization ability and the discriminative ability of the model. In our few-shot re-identification method, a reparameterization approach is used to make the extracted feature vectors follow a Gaussian distribution, which gives the model the ability to reason about unseen data. We also propose a metric learning method that simultaneously considers hard samples and center points representing the entire set, which improves the discriminative ability of the model. In addition, data augmentation is applied to increase the diversity of the data, and several training tricks are adopted to further improve the performance of the re-identification model. Moreover, the proposed method can easily be added to other existing models and significantly improves their performance when data are limited.
    Finally, we provide experimental results to verify the effectiveness of the proposed method. Compared with previous methods, the proposed method achieves higher accuracy under limited data, and applying it to two state-of-the-art person re-identification models greatly improves their accuracy in the limited-data setting. The proposed method increases the feasibility of performing object re-identification in application scenarios where the amount of data is insufficient.

    With the rapid development of artificial intelligence (AI), many research institutes and companies have invested substantial resources in developing AI-enabled technologies for unmanned stores, smart retail, video surveillance, and related applications. In these applications, computer vision plays an important role. In many scenarios, object re-identification (Re-ID) is a crucial component, for example multi-person tracking in unmanned stores and video surveillance for crime prevention. In recent years, many object Re-ID methods that depend on large-scale Re-ID datasets have been proposed, and these methods already achieve good results on the object Re-ID task. However, their performance drops sharply when the amount of data is scarce.
    In this thesis, we propose a few-shot object Re-ID method that achieves good performance even with a limited amount of data. Our method improves performance in the few-shot case by enhancing both the generalization ability and the discriminative ability of the model. A reparameterization approach is used to make the extracted feature vectors conform to a Gaussian distribution, thereby improving the model's ability to infer on unseen data. We also propose the H&C metric learning method, which takes into account both hard samples and center points that represent the entire set, to improve the discriminative ability of the model. In addition, data augmentation methods are used to increase the diversity of the data, and several training tricks are added to improve the performance of the object Re-ID model. Moreover, the proposed method can be easily applied to other existing object Re-ID models and can greatly enhance their performance under limited data.
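    The abstract describes the two core components, the Gaussian reparameterization of features and the H&C metric learning objective, only at a high level. The following is a minimal PyTorch-style sketch of what such components could look like, assuming a VAE-style reparameterization head and a batch-hard triplet term combined with a batch-wise class-center term; the names (ReparamHead, hc_loss), the KL regularizer, the margin, and the loss weights are illustrative assumptions, not the thesis's actual design or code.

```python
# Illustrative sketch only: a Gaussian reparameterization head and a
# hard-sample + center ("H&C"-style) metric loss. All names, weights,
# and design details here are assumptions, not the thesis implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReparamHead(nn.Module):
    """Map backbone features to a Gaussian (mu, log-variance) and sample from it."""

    def __init__(self, in_dim: int, embed_dim: int):
        super().__init__()
        self.mu = nn.Linear(in_dim, embed_dim)
        self.logvar = nn.Linear(in_dim, embed_dim)

    def forward(self, x: torch.Tensor):
        mu, logvar = self.mu(x), self.logvar(x)
        if self.training:
            # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        else:
            z = mu  # deterministic embedding (the mean) at inference time
        # KL divergence to N(0, I) encourages the embeddings to stay Gaussian
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl


def hc_loss(embeddings: torch.Tensor, labels: torch.Tensor, margin: float = 0.3):
    """Combine a batch-hard triplet term with a batch-wise class-center term."""
    dist = torch.cdist(embeddings, embeddings)           # pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # same-identity mask

    # Hard samples: hardest positive and hardest negative for each anchor
    hardest_pos = (dist * same.float()).max(dim=1).values
    hardest_neg = dist.masked_fill(same, float("inf")).min(dim=1).values
    triplet = F.relu(hardest_pos - hardest_neg + margin).mean()

    # Center points: pull each embedding toward the mean of its class in the batch
    uniq, inv = labels.unique(return_inverse=True)
    centers = torch.stack([embeddings[labels == c].mean(dim=0) for c in uniq])
    center = ((embeddings - centers[inv]) ** 2).sum(dim=1).mean()

    return triplet + 0.5 * center  # 0.5 is an arbitrary illustrative weight
```

    In this sketch the sampled embedding z would feed hc_loss during training (together with any classification loss), while the mean mu is used directly for distance-based retrieval at inference time; how the thesis actually weights and combines these terms is not stated in the abstract.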
    Finally, experimental results are provided to verify the effectiveness of the proposed method. Compared with previous methods, the proposed method achieves higher accuracy under limited data, and applying our method to two existing person Re-ID models is shown to enhance their accuracy when only a small amount of data is available. The proposed method thus makes object Re-ID more suitable for application scenarios where data are scarce.

    CHAPTER 1 INTRODUCTION
      1.1 BACKGROUND
      1.2 PROBLEM DESCRIPTION
      1.3 MOTIVATION
      1.4 CONTRIBUTION
      1.5 ORGANIZATION
    CHAPTER 2 RELATED WORK
      2.1 OBJECT RE-IDENTIFICATION
      2.2 FEW-SHOT LEARNING
    CHAPTER 3 FEW-SHOT OBJECT RE-IDENTIFICATION
      3.1 RESEARCH PROCESS
      3.2 DATA AUGMENTATION AND TRAINING TRICKS
      3.3 FEATURE EXTRACTOR
      3.4 REPARAMETERIZATION APPROACH
      3.5 H&C METRIC LEARNING
      3.6 TOTAL LOSS AND INFERENCE STAGE
      3.7 APPLY OUR METHOD TO EXISTING MODELS
    CHAPTER 4 EXPERIMENTS AND DISCUSSION
      4.1 DATASET
      4.2 EXPERIMENTAL ENVIRONMENT
      4.3 EVALUATION METRICS
      4.4 EXPERIMENTAL RESULTS OF PERSON RE-ID
      4.5 EXPERIMENTAL RESULTS OF VEHICLE RE-ID
    CHAPTER 5 CONCLUSION
    CHAPTER 6 FUTURE WORK
    REFERENCES

