Graduate student: Tang, Li-Ting (湯立婷)
Thesis title: MeSA: Medical Subject Attention for Abstractive Summarization on Biomedical Literature
Original Chinese title: MeSA: 基於醫學主題注意力機制之生醫論文生成式摘要系統
Advisor: Kao, Hung-Yu (高宏宇)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Medical Informatics
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 45
Keywords: Natural Language Processing, Text Summarization, Biomedical Literature, Neural Network, Medical Subject Heading
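  • With the rapidly growing number of published biomedical papers, an automatic summarization system for biomedical articles is needed. In the PubMed Central® (PMC) database, article abstracts are either structured (divided into sections) or unstructured, and our system generates each type with a different model. For structured abstracts, we train four separate models on four sections, "Background", "Methods", "Results", and "Conclusion", and merge their outputs into the final abstract. Each model can thus specialize in the features of its own section and generate a summary of the key points that better fits that section. For unstructured abstracts, we use a chapter encoder so that, when generating each word of the abstract, the model can decide which chapter of the source text that word's information comes from. Unstructured abstracts have a more flexible structure than structured ones.
    Most articles in the PMC database come with Medical Subject Headings (MeSH), which let researchers quickly grasp the key medical topics of an article. MeSH terms are drawn from a hierarchical medical vocabulary covering chemicals, diseases, tissues, and so on, in which moving down the hierarchy takes a term from a broad meaning to a more specific one. We therefore believe that feeding MeSH terms into the summarization model helps it capture graph embeddings of medical concepts. The model proposed in this thesis is the first abstractive summarization model to combine a neural network with the hierarchical concept structure of MeSH. In addition, it substantially improves ROUGE-2 and ROUGE-3 scores.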

    As the number of biomedical publications grows, an automatic summarization model is needed to produce abstracts that let readers obtain the essential information in an article. PMC contains two types of abstracts, structured and unstructured, and our automatic summarization system generates either type on demand. To generate a structured abstract, we build four models, one for each of the chapters "Introduction", "Method", "Result", and "Conclusion", and combine their outputs. Each model focuses on learning the characteristics of its own chapter and generates a distinct abstract section for it. To generate an unstructured abstract, we build a chapter-level encoder that extracts chapter information, so that the model can decide which chapter the word generated at the current time step draws its information from.
    Moreover, part of the literature in PMC is annotated with Medical Subject Headings (MeSH), which help people find papers on a specific topic efficiently. MeSH is a controlled vocabulary that helps readers capture the medical topics of an article. We therefore use MeSH terms to help the text summarization model acquire a graph embedding of the medical concepts in the article. Our proposed model is the first abstractive summarization model to combine a neural network with the MeSH concept graph. It also improves ROUGE-2 and ROUGE-3 scores significantly.
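
    The ROUGE-2 and ROUGE-3 scores mentioned above measure bigram and trigram overlap between a generated summary and a reference abstract. The following is a minimal sketch of that calculation, not the evaluation code used in the thesis: the function names `ngrams` and `rouge_n`, the lowercase whitespace tokenization, and the example sentences are all assumptions made for illustration, and the official ROUGE package applies its own preprocessing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n):
    """Illustrative ROUGE-N: clipped n-gram overlap between two texts.

    Tokenization here is plain lowercase whitespace splitting (an assumption).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is only credited as often as the reference has it.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not taken from the thesis data.
reference = "the drug reduced tissue injury"
candidate = "the drug reduced injury"
print(rouge_n(candidate, reference, 2))
print(rouge_n(candidate, reference, 3))
```

    Because higher-order n-grams only match when word order is preserved, ROUGE-2 and ROUGE-3 reward fluent multi-word phrases more than ROUGE-1 does.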

    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    Chapter 1. Introduction
        1.1 Background
        1.2 Motivation
        1.3 MeSA work
    Chapter 2. Related Work
        2.1 Biomedical word embedding
        2.2 Attention Mechanism with MeSH term
        2.3 Extractive summarization model
        2.4 Abstractive summarization model
    Chapter 3. Methodology
        3.1 Medical subject terms graph embedding
        3.2 Medical subject attention with chapter-level encoder
        3.3 Encoder
        3.4 Attention MeSH mechanism
        3.5 Decoder
        3.6 Decoding strategy
            3.6.1 Teacher forcing
            3.6.2 Beam search
    Chapter 4. Experimental Results
        4.1 Dataset
        4.2 Evaluation Metric
        4.3 Model parameters
        4.4 Performance comparison
        4.5 Analysis
    Chapter 5. Conclusion
        5.1 Conclusion
    References

