Graduate student: 蘇志盛 Su, Jhih-Sheng
Thesis title: 焦點遮蔽注意力機制之生成式摘要 (Focus Masking Attention for Abstractive Summarization)
Advisor: 高宏宇 Kao, Hung-Yu
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar, 2018–2019)
Language: English
Pages: 39
Keywords: attention mechanism, natural language generation, abstractive summarization
Text summarization condenses an input text while preserving its semantics. This natural language processing technique can be applied in many scenarios; for example, summarizing daily news articles quickly extracts their key points and reduces the time readers spend on them. Summarization systems fall into two categories: extractive summarization, in which every word of the generated summary comes from the input text, and abstractive summarization, in which the model captures the critical information of the input text and produces the output sequence through natural language generation.

Recent research builds abstractive models with sequence-to-sequence learning under an encoder-decoder architecture. Although this architecture effectively learns the correspondence between the input and output text, it still leaves considerable room for improvement in content selection. Building on a sequence-to-sequence model, this thesis adds a content selector that filters the input text and injects the selection result into the model through a "focus masking attention" mechanism. The content selector combines a sequence tagging model with keyword extraction; focus masking attention assists the decoder by reweighting its copy attention over the input text, as sketched below.

The experimental results show that the proposed focus masking attention yields effective improvements under both word-level and sentence-level reweighting, and combining the two levels of reweighting improves performance further than applying either level alone.
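The reweighting step described above can be illustrated with a minimal sketch. This is not the thesis's actual implementation (which builds on a copy/pointer-generator summarization model); the function name, the binary mask encoding, and the `boost` factor are assumptions introduced here for illustration only.

```python
import torch

def focus_masked_copy_attention(copy_attn: torch.Tensor,
                                focus_mask: torch.Tensor,
                                boost: float = 2.0) -> torch.Tensor:
    """Reweight a copy-attention distribution with a binary focus mask (illustrative sketch).

    copy_attn:  (batch, src_len) copy attention over source tokens; each row sums to 1.
    focus_mask: (batch, src_len) 1.0 where the content selector keeps a token
                (word level); a sentence-level mask repeats one decision over
                every token of that sentence.
    boost:      hypothetical factor controlling how strongly selected tokens are favoured.
    """
    # Scale up the attention on selected tokens; leave unselected tokens unchanged.
    weights = 1.0 + (boost - 1.0) * focus_mask
    reweighted = copy_attn * weights
    # Renormalize so the result is still a probability distribution.
    return reweighted / reweighted.sum(dim=-1, keepdim=True)

# Toy usage: the selector keeps the second and third of four source tokens.
attn = torch.tensor([[0.1, 0.4, 0.3, 0.2]])
mask = torch.tensor([[0.0, 1.0, 1.0, 0.0]])
print(focus_masked_copy_attention(attn, mask))
```

Under the same assumptions, combining word-level and sentence-level selection would amount to applying both masks to the weights before the final renormalization.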