
Graduate Student: Wang, Chuan-Li (王筌立)
Thesis Title: Research on BERT and BART based Text Summary Generation Technology and Applications: A Case Study on Writing Skill Development (基於BERT與BART的文本摘要生成技術與應用研究:以寫作能力培養為例)
Advisor: Chen, Yuh-Min (陳裕民)
Co-advisor: Chu, Hui-Chuan (朱慧娟)
Degree: Master's
Department: Institute of Manufacturing Information and Systems, College of Electrical Engineering and Computer Science
Year of Publication: 2024
Academic Year of Graduation: 112 (2023-2024)
Language: Chinese
Pages: 97
Keywords: BERT, BART, collaborative learning, mind maps, article summarization, writing skill cultivation, digital writing skills
    Deep learning has flourished in the twenty-first century, and within natural language processing, text generation has become an important research direction. Although GPT-based pre-trained models show excellent text-generation ability, they demand substantial computing resources, are costly to run, and are not open source. Finding language models that run effectively in resource-constrained environments is therefore an important need.
    This study designs a "BERT and BART based article summarization method," develops the corresponding techniques, evaluates their performance, and compares them against a GPT-based pre-trained model. The GPT-based model achieved a ROUGE-L F1-score of 0.44, while the proposed method achieved 0.39, showing that the method can still produce article summaries of sufficiently high quality even with relatively scarce hardware resources and training data, and demonstrating its effectiveness.
    In addition, this study designs a "summary-based digital writing skill cultivation model" and develops a collaborative writing learning module. Experiments confirm that the model effectively improves learners' writing ability: participants' average score rose from 86.73 before the intervention to 89.00 after it.
    According to prior research, writing a summary by hand typically takes 30 minutes to an hour (Weintraub & Seffrin, 1985), whereas the text summarization technique developed here generates high-quality summaries immediately, markedly shortening the time needed to prepare reference summaries and confirming the technique's value in educational and practical applications.
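    ROUGE-L, the metric cited above, scores a candidate summary by the longest common subsequence (LCS) it shares with a reference summary. A minimal sketch of the computation (this uses plain F1 rather than Lin's recall-weighted F-beta, and the function names are illustrative, not from the thesis):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence, by classic dynamic programming."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 between two whitespace-tokenised strings (after Lin, 2004)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)   # fraction of candidate tokens in the LCS
    recall = lcs / len(ref)       # fraction of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

    For example, `rouge_l_f1("the cat sat on the mat", "the cat lay on the mat")` gives 5/6, since the two six-token sentences share a five-token subsequence. Chinese text would first need word segmentation before this token-level comparison applies.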

    In today's digital world, digital literacy is essential. Collaborative learning significantly enhances students' communication, motivation, information absorption, and overall learning satisfaction. Writing a summary before drafting an article clarifies thought, improves structure, and boosts coherence, substantially raising both writing skill and writing quality.
    This study uses deep learning to design and develop a "BERT and BART-based article summarization method" with two main steps: "keyword extraction" and "article summarization." It also introduces a "Digital Writing Skill Cultivation Model Based on Article Summarization," which comprises "drawing mind maps," "writing article summaries," and "writing articles." The aim is to improve learning outcomes through student collaboration.
    To evaluate the accuracy of the method, a series of experiments was conducted. The results indicate that, although there is room for improvement, the method can still produce high-quality summaries under limited hardware and data resources, demonstrating its effectiveness.
    Furthermore, a "digital reading and writing learning platform" was developed to verify the model's effectiveness in digital teaching. Experimental results show that the platform significantly improves students' writing skills, confirming the practical value of the proposed model.
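    The two-step pipeline described above (keyword extraction, then article summarization) can be illustrated with a toy frequency-based extractive stand-in. The actual method fine-tunes BERT for keyword extraction and BART for abstractive generation; everything below, including the stopword list and function names, is a simplified assumption for illustration only:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are", "for", "on", "with"}

def extract_keywords(text, k=5):
    """Step 1 stand-in: rank content words by raw frequency
    (the thesis uses a BERT-based keyword extraction model instead)."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(k)]

def summarize(text, keywords, n_sentences=2):
    """Step 2 stand-in: keep the sentences covering the most keywords
    (the thesis generates abstractive summaries with a fine-tuned BART model)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    scored = sorted(
        sentences,
        key=lambda s: sum(kw in s.lower() for kw in keywords),
        reverse=True,
    )
    picked = scored[:n_sentences]
    # Emit the selected sentences in their original order for readability.
    return " ".join(s for s in sentences if s in picked)
```

    The design mirrors the thesis's decomposition: keywords computed in step 1 steer which content step 2 retains, so improving the extractor directly improves the summary.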

    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1: Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Research Questions
      1.4 Research Topics and Methods
      1.5 Research Procedure
    Chapter 2: Literature Review
      2.1 Related Studies
        2.1.1 Keyword Extraction
        2.1.2 Article Summary Generation
      2.2 Related Technologies
        2.2.1 Natural Language Processing
        2.2.2 Attention and the Transformer
        2.2.3 BERT
        2.2.4 BART
        2.2.5 GPT
      2.3 Application Domains
        2.3.1 Digital Literacy
        2.3.2 Digital Writing
        2.3.3 Collaborative Learning
      2.4 Technical Evaluation
        2.4.1 N-gram
        2.4.2 ROUGE-N
    Chapter 3: Method and Model Design
      3.1 Summary Generation Method Design
      3.2 Keyword Extraction
        3.2.1 Keyword Extraction Model
      3.3 Summary Generation
        3.3.1 Summary Generation Model
    Chapter 4: Method Development and Evaluation
      4.1 Datasets
        4.1.1 Keyword Dataset
        4.1.2 Article Summary Dataset
      4.2 Implementation and Experimental Environment
        4.2.1 BERT Model
        4.2.2 BART Model
        4.2.3 GPT Model
      4.3 Experiment and Evaluation Procedure
        4.3.1 Overview of the GPT Model
        4.3.2 Deep Learning Model Evaluation Procedure
        4.3.3 Summary Generation Method Evaluation Procedure
      4.4 Evaluation Metrics
        4.4.1 ROUGE-N
        4.4.2 Human Evaluation Metrics
      4.5 Experiments and Analysis
        4.5.1 Experiment 1
        4.5.2 Experiment 2
        4.5.3 Experiment 3
        4.5.4 Experiment 4
    Chapter 5: Evaluation of Method Application Effectiveness
      5.1 Digital Collaborative Writing Learning Process Design
      5.2 Construction of the Digital Reading and Writing Learning Platform
        5.2.1 Platform Architecture
        5.2.2 Platform Construction
        5.2.3 Server Environment
      5.3 Method Application Validation and Analysis
        5.3.1 Experimental Design
        5.3.2 Metric Design
        5.3.3 Participants
        5.3.4 Experiment Execution
        5.3.5 Experimental Data Preparation
      5.4 Results and Analysis
        5.4.1 Analysis of Experimental Results
        5.4.2 Analysis of Student Feedback
        5.4.3 Experiment Summary
        5.4.4 Experiment Recommendations
    Chapter 6: Conclusions and Discussion
      6.1 Conclusions
      6.2 Recommendations and Future Work
    References
      Chinese References
      English References

    Chinese References
    王怡心. (2020). Flipping education amid the pandemic: Digital goes mainstream [in Chinese]. Accounting Research Monthly, (418), 12-14.
    朱慧娟. (2022). A study of adaptive digital collaborative reading and writing learning for students with mild learning impairments [in Chinese]. Final report, National Science and Technology Council research project.
    普皓群. (2021). Development of a deep-learning-based automatic mind-map generation method and technology: An application to cultivating digital reading and writing skills [in Chinese]. Master's thesis, Institute of Manufacturing Information and Systems, National Cheng Kung University.
    何子岳. (2022). Development of deep-learning-based article summary extraction technology: An application to cultivating hierarchical article summarization skills [in Chinese]. Master's thesis, Institute of Manufacturing Information and Systems, National Cheng Kung University.
    蘇嘉穎. (2006). Design and application of an article summarization strategy teaching system: The case of expository science texts [in Chinese].
    賴苑玲. (2019). Rethinking what digital literacy is [in Chinese]. Taichung City Education E-newsletter. Taichung: Education Bureau.
    Journal of Education Research (教育研究月刊). (2021). Higher Education Publishing.
    English References
    Yadav, D., Desai, J., & Yadav, A. K. (2022). Automatic Text Summarization Methods: A Comprehensive Review. arXiv preprint arXiv:2204.01849.
    Glickman, M., & Zhang, Y. (2024). AI and generative AI for research discovery and summarization.
    Treviso, M., Martins, A. F. T., et al. (2022). Efficient Methods for Natural Language Processing: A Survey.
    Zhang, C., Zhao, L., Zhao, M., & Wang, Y. (2021). Enhancing Keyphrase Extraction from Academic Articles with their Reference Information. arXiv preprint arXiv:2111.14106.
    Newburn, J. (2023). Keyword extraction in BERT-based models for reviewer system. University of Illinois Urbana-Champaign.
    UNESCO. (2018). Digital literacy. Retrieved from UNESCO Institute for Statistics.
    Rasyidnita, P., Najwa Amalia Putri, K., Syifa, D. E., & Arfian. (2024). The Influence of Digital Literacy on Reading Interests of Elementary School Students. Linguanusa: Social Humanities, Education and Linguistic, 2(1), 58–65.
    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.
    Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
    Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., & Zettlemoyer, L. (2020). BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7871-7880.
    Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI.
    Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners. OpenAI.
    Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D., Wu, J., Winter, C., ... & Amodei, D. (2020). Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165.
    Loewenstein, M., Slay, L. E., & Morton, T. (2021). Reimagining Writing Instruction during Pandemic Times: A First Grade Teacher's Journey Creating A Digital Writing Workshop. Texas Association for Literacy Education Yearbook, 8, 13-25.
    ISA: The International Studies Association. (2021). International Studies Association.
    Lin, C. Y. (2004, July). Rouge: A package for automatic evaluation of summaries. In Text summarization branches out (pp. 74-81).
    Jiang, D., Cao, S., & Yang, S. (2021, November). Abstractive summarization of long texts based on BERT and sequence-to-sequence model. In 2021 2nd International Conference on Information Science and Education (ICISE-IE) (pp. 460-466). IEEE.
    Kornhaber, D. (2000). Outlining.
    Ranganathan, J., & Abuka, G. (2022). Text summarization using transformer model. In 2022 Ninth International Conference on Social Networks Analysis, Management and Security (SNAMS). IEEE.
    Shiraly, K. (2023, August). BART text summarization vs. GPT-3 vs. BERT: An in-depth comparison.
    Weintraub, M., & Seffrin, R. (1985). The efficiency of abstracting: Factors and results. Information Processing & Management, 21(2), 95-104.
    Mani, I., & Maybury, M. T. (1999). Advances in automatic text summarization. MIT Press.
    OECD. (2019). PISA 2018 Results (Volume I): What Students Know and Can Do. OECD Publishing.
    Fraillon, J., Ainley, J., Schulz, W., Friedman, T., & Duckworth, D. (2020). Preparing for life in a digital world: IEA International Computer and Information Literacy Study 2018 International Report. Springer.
    Keller, J. M. (1987). Development and use of the ARCS model of instructional design. Journal of Instructional Development, 10(3), 2-10.
