| Graduate Student: | Hsu, Yuan-Chan (許源展) |
|---|---|
| Thesis Title: | A Two-Stream Framework for Few-Shot Named Entity Recognition Enhanced by Bridging Ontology and Entity Embeddings (透過橋接本體與實體嵌入以強化少樣本命名實體辨識之雙流架構) |
| Advisor: | Huang, Jen-Wei (黃仁暐) |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2026 |
| Academic Year of Graduation: | 114 (ROC calendar) |
| Language: | English |
| Number of Pages: | 47 |
| Keywords (Chinese): | 少樣本學習、命名實體識別、對比學習、提示學習、原型學習 |
| Keywords (English): | Few-shot learning, named entity recognition, contrastive learning, prompt learning, prototypical learning |
Few-shot Named Entity Recognition (NER) aims to identify entities from limited annotated examples, particularly in specialized domains. Existing methods have made progress along two complementary directions. On the one hand, approaches that reformulate NER as a Masked Language Modeling (MLM) task directly leverage pre-trained semantic knowledge; however, in low-shot regimes the limited examples provide insufficient guidance for effective fine-tuning, often leading to ambiguous decision boundaries. On the other hand, definition-guided approaches often employ contrastive learning to align entity representations with semantic prototypes. Yet these prototypes often remain tethered to idealized definitions, so entity embeddings converge only weakly toward such abstract prototypes.

To enhance generalization in low-resource NER settings, we propose a two-stream framework that synthesizes three key strengths: leveraging the semantic power of pre-trained language models, providing structural guidance for discriminative embeddings via contrastive learning, and employing a Dynamic Prototype Adaptation strategy to bridge ontology and entity embeddings. Experiments on multiple datasets demonstrate that our model consistently outperforms state-of-the-art models across different shot settings, with an F1 improvement of up to 11.86 points in the 5-shot setting.
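The two ingredients named above — contrastive alignment of entity embeddings to class prototypes, and an adaptation step that moves definition-derived prototypes toward observed entity embeddings — can be illustrated with a minimal sketch. The function names, the InfoNCE-style loss form, and the exponential-moving-average update rule are illustrative assumptions for exposition; the thesis's actual Dynamic Prototype Adaptation mechanism may differ.

```python
import numpy as np

def contrastive_loss(entity_emb, prototypes, label, temperature=0.1):
    """InfoNCE-style loss: pull an entity embedding toward the prototype
    of its gold class and push it away from the other class prototypes."""
    # Cosine similarities between the entity embedding and each prototype.
    e = entity_emb / np.linalg.norm(entity_emb)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = p @ e / temperature
    # Numerically stable softmax cross-entropy against the gold class.
    logits = logits - logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[label])

def adapt_prototype(prototype, entity_embs, momentum=0.9):
    """Hypothetical dynamic prototype adaptation via an exponential
    moving average: shift the definition-derived prototype toward the
    mean of the support-set entity embeddings for its class."""
    support_mean = entity_embs.mean(axis=0)
    return momentum * prototype + (1 - momentum) * support_mean
```

Under this sketch, static definition-derived prototypes supply the initial anchors, while the adaptation step lets them drift toward the empirical distribution of entity embeddings, which is one way to bridge the gap between idealized ontology definitions and observed mentions.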