Graduate student: 許逸翔 (HSU, I-HSIANG)
Thesis title: Multi-Documents Guidance Summary based on PREAFS-Tree-Structured Knowledge Graph (基於 PREAFS 的樹狀知識圖譜之多文件懶人包)
Advisor: 盧文祥 (Lu, Wen-Hsiang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar, 2022-2023)
Language: English
Pages: 47
Keywords: Multi-Documents Summarization (多文章摘要), Knowledge Graph (知識圖譜), Guidance Summary System (懶人包系統), Concept Pattern Design (語意概念設計), Concept Pattern Extraction (語意概念抽取)
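  • With the rapid growth of the Internet, the amount of information online increases daily, and people have become accustomed to using the Internet to acquire knowledge, including reading the news to follow current events. However, the large number of articles and the fragmented structure of the information place a burden on readers.
    To address these problems, many studies have applied extractive or generative summarization, but the resulting sentences cannot convey how an event begins, develops, turns, and concludes, so readers still need to read several news articles to cover the different aspects. This thesis therefore proposes a system that integrates summarization techniques with a tree-structured knowledge graph.
    In this thesis, we first use the time dimension to find currently popular issues and feed the related news into an open-source language tool, using syntactic parse trees to extract candidate summary sentences of the form "subject phrase - verb phrase - object phrase", supplemented with structural information from the people, times, and places in the news. Next, pre-designed concept structures are used to classify the candidate summary sentences into six aspects: "premise, reason, the event itself, impact, future direction, and practical strategy". Finally, the candidate sentences are chained into a complete summary and built into a knowledge graph, forming a guidance summary (懶人包). Through the interface the system provides, users can quickly clarify the whole course of an event and understand its content.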

    With the rapid development of the Internet, the amount of information available online has grown quickly, and people have gradually become accustomed to using the Internet to stay informed about current events. However, the large number of articles and their fragmented information structure place a burden on readers.
    To address this problem, many studies have focused on extractive or generative (abstractive) summarization. However, the generated summaries often fail to give a coherent account of how an event unfolds, so users must still read multiple news articles to gather the different aspects. This thesis therefore proposes a system that integrates summarization techniques with a tree-structured knowledge graph.
    We first use the temporal dimension to identify currently trending topics and feed the relevant news articles into an open-source language tool. From the syntactic parse trees we extract candidate summary sentences of the form "Subject phrase - Verb phrase - Object phrase", and we supplement this structural information with the people, times, and locations mentioned in the articles. We then use pre-designed concept patterns to classify the candidate summaries into six aspects: "Premise, Reason, Event, Affect, Future Direction, Practical Strategies". Finally, the candidate summaries are concatenated into a coherent guidance summary and built into a knowledge graph. Through the system's interface, users can quickly grasp the course of an event and understand its content.
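    The classification and concatenation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the cue phrases in `CONCEPT_PATTERNS`, the `Candidate` type, the function names, and the example sentences are all hypothetical, and the real system derives its candidates from syntactic parse trees and its concept patterns from a prepared knowledge database. The sketch only shows the general shape of the idea: subject-verb-object candidates are assigned to one of the six aspects, grouped under a topic node, and joined in aspect order.

```python
from dataclasses import dataclass

# The six aspects named in the abstract, in narrative order.
ASPECTS = ["Premise", "Reason", "Event", "Affect",
           "Future Direction", "Practical Strategies"]

# Hypothetical cue phrases; the thesis's actual concept patterns are not shown here.
CONCEPT_PATTERNS = {
    "Premise": ["since", "given that"],
    "Reason": ["because", "due to"],
    "Affect": ["resulted in", "led to"],
    "Future Direction": ["will", "plans to"],
    "Practical Strategies": ["should", "recommends"],
}


@dataclass(frozen=True)
class Candidate:
    """A candidate summary sentence in subject-verb-object form."""
    subject: str
    verb: str
    obj: str

    def text(self) -> str:
        return f"{self.subject} {self.verb} {self.obj}"


def classify(candidate: Candidate) -> str:
    """Return the first aspect with a matching cue phrase, else "Event"."""
    # Pad with spaces so cues match whole words only ("will" but not "willing").
    padded = f" {candidate.text().lower()} "
    for aspect, cues in CONCEPT_PATTERNS.items():
        if any(f" {cue} " in padded for cue in cues):
            return aspect
    return "Event"


def build_tree(topic: str, candidates: list[Candidate]) -> dict:
    """Build a two-level tree: topic -> aspect -> list of sentences."""
    tree = {topic: {aspect: [] for aspect in ASPECTS}}
    for candidate in candidates:
        tree[topic][classify(candidate)].append(candidate.text())
    return tree


def guidance_summary(tree: dict, topic: str) -> str:
    """Join the non-empty aspects in aspect order, one line per aspect."""
    lines = []
    for aspect in ASPECTS:
        sentences = tree[topic][aspect]
        if sentences:
            lines.append(f"{aspect}: " + "; ".join(sentences))
    return "\n".join(lines)


# Made-up example candidates, for illustration only.
candidates = [
    Candidate("The typhoon", "made", "landfall in the east"),
    Candidate("Schools", "closed", "due to heavy rain"),
    Candidate("The flooding", "led to", "power outages"),
    Candidate("The government", "plans to", "reinforce levees"),
]
tree = build_tree("Typhoon", candidates)
print(guidance_summary(tree, "Typhoon"))
```

    Keyword matching is used here only because it is easy to follow. Any classifier that maps a candidate sentence to one of the six aspects could replace `classify` without changing the tree construction or the concatenation step.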

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table Of Contents
    List Of Tables
    List Of Figures
    Chapter 1 Introduction
      1.1 Background
      1.2 Motivation
      1.3 Goal
      1.4 Method
      1.5 Contribution
    Chapter 2 Related Work
      2.1 Event Extraction
      2.2 Article Summarization
      2.3 Structural Analysis
    Chapter 3 Method
      3.1 Overview
      3.2 News Searching
        3.2.1 News Collection
        3.2.2 Issue Detection
        3.2.3 Trend Analysis
      3.3 Knowledge Database Preparing
        3.3.1 PREAFS concept Pattern Construction
        3.3.2 Synonym Extending
      3.4 Text Processing
        3.4.1 Event Extraction
        3.4.2 Phrases Combination
        3.4.3 Coreference
      3.5 PREAFS Tree Structure Projecting
        3.5.1 NER Converter
        3.5.2 Discourse Detecting
        3.5.3 PREAFS concept Pattern Capturing
      3.6 PREAFS Guidance Summary KG Construction
        3.6.1 PREAFS Summary Chain Connection
        3.6.2 PREAFS Summary Grouping
        3.6.3 Knowledge Graph Construction
    Chapter 4 Experiment
      4.1 Data
      4.2 Evaluation Metrics: Concept F1
      4.3 Influence of Article Quantity on Guidance Summarization
        4.3.1 Description
        4.3.2 Experiment Result
      4.4 Evaluating Summarization by Concept F1 (without PREAFS)
        4.4.1 Description
        4.4.2 Experiment Result
        4.4.3 Error Analysis
      4.5 Evaluating Guidance Summarization by Concept F1 (with PREAFS)
        4.5.1 Description
        4.5.2 Experiment Result
        4.5.3 Error Analysis
      4.6 Evaluating Guidance Summarization by Human
        4.6.1 Description
        4.6.2 Experiment Result
    Chapter 5 Conclusions and Future Works
      5.1 Conclusions
      5.2 Future Works
    References
    Appendix A. Experiment figures of 4.3
    Appendix B. Experiment tables of 4.5

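    Full-text availability: on campus, public from 2028-07-31; off campus, public from 2028-07-31.
    The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.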