Graduate Student: 張瑜芳 (Chang, Yu-Fang)
Thesis Title: From Seen to Unseen: Generalized Zero-Shot Text Classification with Class-aware Data Augmentation and Loss Adaptation (從已知看未知:透過類別感知的資料擴增與損失修改之廣義零樣本文字分類)
Advisor: 高宏宇 (Kao, Hung-Yu)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Graduation Academic Year: 111 (ROC calendar)
Language: English
Number of Pages: 41
Keywords: Natural Language Processing, Generalized Zero-shot Learning, Objective Function, Meta-learning
Abstract:
As dialogue systems grow in importance, intent detection has become crucial. However, novel and diverse classes continually emerge, making it impractical in real-world applications to collect enough labeled samples and retrain over all classes. Zero-shot learning (ZSL) and few-shot learning suit such resource-limited situations. Traditional ZSL assumes that training and testing classes are disjoint, an assumption that contradicts real-world scenarios. We instead investigate generalized zero-shot learning (GZSL), in which test samples come from both seen and unseen classes. Despite extensive research on ZSL, its methods do not transfer directly to GZSL because the two settings diverge substantially: in GZSL, a pronounced bias towards seen classes severely harms the classification of unseen classes. Metric-based methods commonly used in ZSL show clear gains in GZSL, particularly within the Learn to Adapt (LTA) model [33]. Within this model, we observe a correlation between prototype similarity and overall classification performance. We therefore adapt the loss function and incorporate newly generated data, enabling the model to distinguish between seen and unseen classes during prototype learning. This alleviates the challenges posed by data scarcity and bias. We assess the impact of different generation settings and confirm that the enriched prototypes significantly improve performance on unseen classes, thereby boosting overall GZSL performance.
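The abstract describes the method only at a high level: prototype learning in which generated data stands in for unseen classes, plus a loss modified so that seen and unseen prototypes stay distinguishable. The sketch below is a minimal illustration of how those two ingredients could combine, not the thesis's actual formulation; the function names, the margin-based separation term, and the weight alpha are all illustrative assumptions. The harmonic-mean metric at the end is the standard GZSL evaluation measure of Xian et al. [28].

```python
# Minimal PyTorch sketch of class-aware prototype learning for GZSL.
# NOT the thesis's exact method: the separation term and all
# hyperparameters below are illustrative assumptions.
import torch
import torch.nn.functional as F

def class_prototypes(emb: torch.Tensor, labels: torch.Tensor, n_classes: int) -> torch.Tensor:
    """Mean embedding per class (prototypical networks, Snell et al. [22]).
    Assumes every class id in [0, n_classes) occurs at least once in `labels`."""
    return torch.stack([emb[labels == c].mean(dim=0) for c in range(n_classes)])

def gzsl_episode_loss(query_emb, query_labels, support_emb, support_labels,
                      n_classes, unseen_mask, margin=1.0, alpha=0.5):
    """Prototypical cross-entropy plus a hinge term that keeps seen and
    unseen prototypes at least `margin` apart.

    unseen_mask: bool tensor of shape (n_classes,); True marks classes whose
    support examples are generated (augmented) rather than real.
    """
    protos = class_prototypes(support_emb, support_labels, n_classes)
    # Classification term: softmax over negative squared Euclidean distances.
    dists = torch.cdist(query_emb, protos) ** 2            # (n_query, n_classes)
    ce = F.cross_entropy(-dists, query_labels)
    # Class-aware separation term: penalize seen/unseen prototype pairs that
    # sit closer than the margin, countering the bias towards seen classes.
    pair_d = torch.cdist(protos[~unseen_mask], protos[unseen_mask])
    sep = F.relu(margin - pair_d).mean()
    return ce + alpha * sep

def harmonic_mean(acc_seen: float, acc_unseen: float) -> float:
    """Standard GZSL metric H = 2 * As * Au / (As + Au) (Xian et al. [28])."""
    return 2 * acc_seen * acc_unseen / (acc_seen + acc_unseen + 1e-12)
```

In this sketch, the support examples for unseen classes would come from a text generator (for instance a pretrained language model, in the spirit of the generation-based augmentation work cited below, e.g. [1], [13], [18], [30]) and be encoded with the same sentence encoder as the real data.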
References:
[1] Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, and Naama Zwerdling. Do not have enough data? Deep learning to the rescue! In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 7383–7390, 2020.
[2] Yujia Bao, Menghua Wu, Shiyu Chang, and Regina Barzilay. Few-shot text classification with distributional signatures. arXiv preprint arXiv:1908.06039, 2019.
[3] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.
[4] Wei-Lun Chao, Soravit Changpinyo, Boqing Gong, and Fei Sha. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part II, pages 52–68. Springer, 2016.
[5] Jiaoyan Chen, Yuxia Geng, Zhuo Chen, Ian Horrocks, Jeff Z. Pan, and Huajun Chen. Knowledge-aware zero-shot learning: Survey and perspective. arXiv preprint arXiv:2103.00070, 2021.
[6] Long Chen, Hanwang Zhang, Jun Xiao, Wei Liu, and Shih-Fu Chang. Zero-shot visual recognition using semantics-preserving adversarial embedding networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1043–1052, 2018.
[7] Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut Lavril, et al. Snips voice platform: An embedded spoken language understanding system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190, 2018.
[8] Geli Fei and Bing Liu. Breaking the closed world assumption in text classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 506–514, 2016.
[9] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning, pages 1126–1135. PMLR, 2017.
[10] Chuanxing Geng, Sheng-Jun Huang, and Songcan Chen. Recent advances in open set recognition: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10):3614–3631, 2020.
[11] Zongyan Han, Zhenyong Fu, Shuo Chen, and Jian Yang. Contrastive embedding for generalized zero-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2371–2381, 2021.
[12] Charles T. Hemphill, John J. Godfrey, and George R. Doddington. The ATIS spoken language systems pilot corpus. In Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24–27, 1990, 1990.
[13] Varun Kumar, Ashutosh Choudhary, and Eunah Cho. Data augmentation using pre-trained transformer models. arXiv preprint arXiv:2003.02245, 2020.
[14] Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K. Kummerfeld, Kevin Leach, Michael A. Laurenzano, Lingjia Tang, et al. An evaluation dataset for intent classification and out-of-scope prediction. arXiv preprint arXiv:1909.02027, 2019.
[15] Yun Li, Zhe Liu, Lina Yao, and Xiaojun Chang. Attribute-modulated generative meta learning for zero-shot learning. IEEE Transactions on Multimedia, 2021.
[16] Han Liu, Xiaotong Zhang, Lu Fan, Xuandi Fu, Qimai Li, Xiao-Ming Wu, and Albert Y. S. Lam. Reconstructing capsule networks for zero-shot intent classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4799–4809, 2019.
[17] Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9):1–35, 2023.
[18] Yu Meng, Jiaxin Huang, Yu Zhang, and Jiawei Han. Generating training data with language models: Towards zero-shot language understanding. Advances in Neural Information Processing Systems, 35:462–477, 2022.
[19] Farhad Pourpanah, Moloud Abdar, Yuxuan Luo, Xinlei Zhou, Ran Wang, Chee Peng Lim, Xi-Zhao Wang, and Q. M. Jonathan Wu. A review of generalized zero-shot learning methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
[20] Timo Schick and Hinrich Schütze. Exploiting cloze questions for few shot text classification and natural language inference. arXiv preprint arXiv:2001.07676, 2020.
[21] A. B. Siddique, Fuad Jamour, Luxun Xu, and Vagelis Hristidis. Generalized zero-shot intent detection via commonsense knowledge. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1925–1929, 2021.
[22] Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems, 30, 2017.
[23] Richard Socher, Milind Ganjoo, Christopher D. Manning, and Andrew Ng. Zero-shot learning through cross-modal transfer. Advances in Neural Information Processing Systems, 26, 2013.
[24] Congzheng Song, Shanghang Zhang, Najmeh Sadoughi, Pengtao Xie, and Eric Xing. Generalized zero-shot text classification for ICD coding. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, pages 4018–4024, 2021.
[25] Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. Advances in Neural Information Processing Systems, 29, 2016.
[26] Wei Wang, Vincent W. Zheng, Han Yu, and Chunyan Miao. A survey of zero-shot learning: Settings, methods, and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 10(2):1–37, 2019.
[27] Zirui Wang, Adams Wei Yu, Orhan Firat, and Yuan Cao. Towards zero-label language learning. arXiv preprint arXiv:2109.09193, 2021.
[28] Yongqin Xian, Bernt Schiele, and Zeynep Akata. Zero-shot learning: The good, the bad and the ugly. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4582–4591, 2017.
[29] Guangfeng Yan, Lu Fan, Qimai Li, Han Liu, Xiaotong Zhang, Xiao-Ming Wu, and Albert Y. S. Lam. Unknown intent detection using Gaussian mixture model with an application to zero-shot intent classification. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1050–1060, 2020.
[30] Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, and Lingpeng Kong. ZeroGen: Efficient zero-shot learning via dataset generation. arXiv preprint arXiv:2202.07922, 2022.
[31] Hanlei Zhang, Hua Xu, and Ting-En Lin. Deep open intent classification with adaptive decision boundary. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 14374–14382, 2021.
[32] Wei-Nan Zhang, Zhigang Chen, Wanxiang Che, Guoping Hu, and Ting Liu. The first evaluation of Chinese human-computer dialogue technology. arXiv preprint arXiv:1709.10217, 2017.
[33] Yiwen Zhang, Caixia Yuan, Xiaojie Wang, Ziwei Bai, and Yongbin Liu. Learn to adapt for generalized zero-shot text classification. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 517–527, 2022.