Author: Lin, Yu-Ying (林雨瑩)
Thesis title: EOS: Controllable Entity-Oriented Summarization (控制實體導向之摘要生成)
Advisor: Kao, Hung-Yu (高宏宇)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar, 2020–2021)
Language: English
Number of pages: 63
Keywords: Natural Language Processing, Text Summarization, Neural Network, Controlled Generation, Controlled Summarization
  • Although text summarization has long been a vital task in Natural Language Processing (NLP), controlled summarization is still a relatively new topic. Based on the purpose of the customization, we classify controlled summarization into five categories: length-constrained, aspect-oriented, entity-centric, remainder, and source-style. In this thesis, we propose a new task, Entity-Oriented Summarization (EOS), which combines the goals of aspect-oriented and entity-centric summarization and aims to produce a summary oriented toward a given entity.

    Current methods face three difficulties when applied to the EOS task: the lack of aspect- or entity-aware summary datasets, poorly learned word embeddings for infrequent entities, and input entities that are unrelated to the article's topic. For the first, previous work defines heuristic rules to generate matching training data, whereas our model architecture needs only generic summaries for training, which makes our method more flexible. For rare entities, we propose a relatedness calculation based on Pointwise Mutual Information (PMI). Finally, we adopt a training technique, weight annealing, so that the model still produces a generic summary when the desired entity is irrelevant to the article. Experimental results show that our model can generate entity-related summaries, and further analysis shows that the proposed solutions mitigate the problems of infrequent and unrelated entities.
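
    The abstract does not spell out the PMI-based relatedness measure, so the following is only a minimal sketch of standard Pointwise Mutual Information, PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), estimated from document-level co-occurrence counts. The function names `pmi_table` and `relatedness`, the choice of document-level co-occurrence, and the score of 0.0 for pairs that never co-occur are all assumptions made for illustration; they are not taken from the thesis.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Estimate PMI for every word pair that co-occurs in at least one document.

    `documents` is a list of token lists. Probabilities are estimated as the
    fraction of documents containing a word (or both words of a pair).
    This is plain PMI, not the thesis's own relatedness formula.
    """
    n_docs = len(documents)
    word_count = Counter()
    pair_count = Counter()
    for tokens in documents:
        vocab = sorted(set(tokens))          # count each word once per document
        word_count.update(vocab)
        pair_count.update(combinations(vocab, 2))  # pairs are stored in sorted order

    pmi = {}
    for (x, y), n_xy in pair_count.items():
        p_xy = n_xy / n_docs
        p_x = word_count[x] / n_docs
        p_y = word_count[y] / n_docs
        pmi[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return pmi


def relatedness(pmi, entity, word):
    """Look up the PMI of an (entity, word) pair, in either order.

    Assumption: pairs that never co-occur get 0.0 instead of minus infinity.
    """
    return pmi.get(tuple(sorted((entity, word))), 0.0)


if __name__ == "__main__":
    docs = [
        ["obama", "president", "speech"],
        ["obama", "president"],
        ["cat", "pet"],
        ["dog", "pet"],
    ]
    table = pmi_table(docs)
    print(relatedness(table, "obama", "president"))
    print(relatedness(table, "obama", "cat"))
```

    Because the score depends only on corpus counts, it can be computed for an entity that appears in just a few articles, which is presumably why a count-based measure is attractive when learned embeddings of infrequent entities are unreliable.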

    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    Chapter 1. Introduction
      1.1 Background
      1.2 Motivation
      1.3 Our work
    Chapter 2. Related Work
      2.1 Background of controlled summarization
        2.1.1 Text generation
        2.1.2 Controlled generation
        2.1.3 Summarization
      2.2 Aspect-oriented and entity-centric in controlled summarization
        2.2.1 Hierarchical methods
        2.2.2 Other methods
    Chapter 3. Methodology
      3.1 Overview of model structures
        3.1.1 Sequence-to-sequence attention model
        3.1.2 Pointer-generator network
        3.1.3 Entity attention
        3.1.4 Pointer-generator network with entity attention
      3.2 Weight annealing
    Chapter 4. Experimental Results
      4.1 Dataset
      4.2 Evaluation Metrics
        4.2.1 ROUGE
        4.2.2 BLEU
      4.3 Quantitative result
        4.3.1 Baseline models
        4.3.2 Performance comparison
        4.3.3 Human evaluation
      4.4 Qualitative result
    Chapter 5. Analysis
      5.1 Synthetic data experiment
      5.2 Model reusability with similar entities
      5.3 Analysis of unrelated entities
      5.4 Analysis of infrequent entities
      5.5 Comparison between attention scores
      5.6 Representative words
    Chapter 6. Conclusion
      6.1 Conclusion
      6.2 Future work
    References


    Full-text availability: on campus from 2022-07-29; off campus from 2022-07-29