
Author: 黃子寬 (Huang, Zi-Kuan)
Thesis Title: 在少樣本立場檢測任務中使用元學習演算法進行跨語言遷移
(Utilizing meta-learning for cross-lingual transfer in few-shot stance detection task)
Advisor: 高宏宇 (Kao, Hung-Yu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2020
Academic Year of Graduation: 108
Language: English
Number of Pages: 38
Chinese Keywords: 立場檢測、元學習、少樣本學習、跨語言學習 (stance detection, meta-learning, few-shot learning, cross-lingual learning)
English Keywords: Stance Detection, Meta-learning, Few-shot Learning, Multilingual
    With the spread of the mobile Internet and the rapid growth of social networks, people are now exposed to more and more information, and controversial information in particular tends to attract attention. Automatically extracting semantic information from natural-language text is an important research problem in many practical applications, so studying the stance expressed toward controversial information has become an important part of natural language processing research. Researchers have achieved strong results on many stance detection tasks, but in real applications such large amounts of training data are rarely available. This thesis therefore focuses on the problem of insufficient data in certain stance detection tasks.
    This thesis proposes a model based on a meta-learning algorithm that uses data from other tasks to improve training on tasks with insufficient data. During meta-training, it employs a memory network and a cross-lingual pre-trained model, so that the model can make use of stance detection tasks in different languages, from different sources, and with very different text lengths. The experiments also examine how different meta-learning models, optimization methods, and parameter settings affect the results. The main contributions of this thesis are: (1) integrating a meta-learning algorithm into stance detection to improve performance on stance detection tasks with insufficient data; (2) using a memory network and a cross-lingual pre-trained model during meta-training, which broadens the range of training data the model can draw on.

    With the popularity of the mobile Internet and the rapid development of social networks, we can now receive more and more information, of which controversial information is often the focus of attention. The automatic extraction of semantic information from natural-language text is an important research issue in many practical applications, so the stance expressed in controversial information is also an important subject of natural language processing research. Researchers have achieved excellent results on many stance detection tasks, but in practical applications it is difficult to obtain such large amounts of training data. This thesis therefore focuses on solving the problem of insufficient data in certain stance detection tasks.
    This thesis proposes a model based on a meta-learning algorithm that uses data from other tasks to enhance training on tasks with insufficient data. In the meta-training process, we use a memory network and a cross-lingual pre-trained model so that the model can make use of stance detection tasks in different languages and from different sources. The experiments also explore the effects of different meta-learning models, optimization methods, and parameter settings. The main contributions of this thesis are: (1) integrating the meta-learning algorithm into stance detection to improve performance on stance detection tasks with insufficient data; (2) using a cross-lingual pre-trained model and a memory-network architecture in the meta-training process, which increases the applicability of the training data.
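    As a rough, hypothetical sketch of the meta-training loop described above (not the thesis's actual implementation), the following assumes PyTorch, a toy linear stance classifier over pre-computed multilingual sentence embeddings, and random tensors standing in for real support/query sets. A first-order MAML-style episode could then look like this; all names (StanceClassifier, sample_task) and hyperparameters are illustrative assumptions, and the memory-network component is omitted.

    # Minimal first-order MAML-style sketch for few-shot stance detection (illustrative only).
    import copy
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    EMB_DIM, N_CLASSES = 768, 3          # e.g. favor / against / neutral
    INNER_LR, META_LR, INNER_STEPS = 1e-2, 1e-3, 5

    class StanceClassifier(nn.Module):
        """Hypothetical linear head on top of frozen cross-lingual sentence embeddings."""
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(EMB_DIM, N_CLASSES)

        def forward(self, x):
            return self.fc(x)

    def sample_task(batch=8):
        """Stand-in for sampling one stance detection task (support and query sets)."""
        sx, sy = torch.randn(batch, EMB_DIM), torch.randint(0, N_CLASSES, (batch,))
        qx, qy = torch.randn(batch, EMB_DIM), torch.randint(0, N_CLASSES, (batch,))
        return (sx, sy), (qx, qy)

    meta_model = StanceClassifier()
    meta_opt = torch.optim.Adam(meta_model.parameters(), lr=META_LR)

    for episode in range(100):                      # meta-training episodes
        (sx, sy), (qx, qy) = sample_task()
        learner = copy.deepcopy(meta_model)          # task-specific copy of the meta-parameters
        inner_opt = torch.optim.SGD(learner.parameters(), lr=INNER_LR)

        for _ in range(INNER_STEPS):                 # inner-loop adaptation on the support set
            inner_opt.zero_grad()
            F.cross_entropy(learner(sx), sy).backward()
            inner_opt.step()

        # First-order outer update: evaluate the adapted weights on the query set
        # and copy the resulting gradients back onto the meta-parameters.
        learner.zero_grad()
        F.cross_entropy(learner(qx), qy).backward()
        meta_opt.zero_grad()
        for meta_p, task_p in zip(meta_model.parameters(), learner.parameters()):
            meta_p.grad = task_p.grad.clone()
        meta_opt.step()

    In a real setup, sample_task would draw support and query examples from one of the stance detection datasets, and the linear head would sit on top of a cross-lingual encoder such as the pre-trained model the thesis uses.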

    Contents
    Chinese Abstract (中文摘要) I
    Abstract II
    Figure Listing V
    Table Listing VI
    1 Introduction 1
      1.1 Background 1
      1.2 Motivation 3
      1.3 Approach 5
    2 Related Work 8
      2.1 Stance detection tasks 8
      2.2 Early stance detection work 9
      2.3 Recent stance detection work 10
      2.4 BERT model 11
      2.5 Optimization-based meta-learning 13
      2.6 Model-agnostic meta-learning 15
    3 Methods 18
      3.1 Data preprocessing 18
      3.2 General framework 18
      3.3 Basic model description 20
        3.3.1 Input and output 20
        3.3.2 Input 21
        3.3.3 Generalization 22
        3.3.4 Output 22
        3.3.5 Response 23
        3.3.6 Usage of model-agnostic meta-learning 23
    4 Experiment 25
      4.1 Used datasets 25
      4.2 Results for meta-learning 27
        4.2.1 Results of meta-learning models with different data quantities 27
        4.2.2 Results for using different meta-training tasks 30
        4.2.3 Parameter settings 31
    5 Conclusion 34
    Reference 35

    Full-text availability: on campus from 2021-12-31; off campus from 2021-12-31.