
Author: Chen, I-Fan (陳怡帆)
Thesis Title: A Method of Generating Course Test Questions Based on Topic Model and Deep Learning (基於主題模型與深度學習之課程試題生成方法)
Advisor: Wang, Hei-Chia (王惠嘉)
Degree: Master
Department: Institute of Information Management, College of Management
Year of Publication: 2020
Graduating Academic Year: 108 (2019-2020)
Language: Chinese
Number of Pages: 64
Keywords: Self-learning, Test Question Generation, Deep Learning, Topic Model
    In the learning process, tests are an important means of measuring how well students understand course content and of reflecting their learning status; they are also a common way for teachers to evaluate teaching outcomes. However, producing test questions by hand is time-consuming and labor-intensive, and it is not easy for students to obtain course-related questions for practice. To address these problems, researchers in education and computer science have introduced question generation (QG) methods that automatically produce course test questions in educational settings, easing teachers' question-writing burden and supporting students' self-learning. Traditional automatic test question generation mostly relies on heuristic rule-based methods and proceeds in two steps: sentence selection and question construction.
    In recent years, deep-learning-based neural question generation (NQG) has emerged, and studies show that, compared with rule-based methods, NQG generates more fluent, flexible, and diverse questions. However, prior NQG research has focused on properties such as the correctness and fluency of the generated questions and has omitted the sentence selection step. Yet sentence selection should play an important role in test question generation: questions are only meaningful for learning if they are asked about sentences in the teaching material that are helpful for learning and worth asking about. This study therefore proposes an NQG method that generates test questions from teaching materials and covers both sentence selection and question construction. Unlike previous work, this study uses multi-source teaching materials to extract the key content of the teacher's lectures and proposes a deep-learning-based TE-QG (Topic-Embedding Question Generation) model. The model builds on a Seq2Seq architecture with an attention mechanism and a copying mechanism; in addition, so that the generated questions better fit the overall semantics of the textbook chapter, TE-QG incorporates topic features that capture the text's overall coherence, enabling the model to generate course questions that better match the chapter's topic.
    This study examines the effectiveness of the proposed method through six experiments. The results show that the TE-QG model obtains the highest scores on the automatic evaluation metrics compared with several previous question generation models, and human evaluation shows that the course questions generated by TE-QG achieve better fluency, clarity, and usefulness for learning. This confirms that the proposed method not only matches the teaching focus of the course, but also that adding topic-word features to the question generation model indeed improves its question generation performance.
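    The core architectural idea described above, conditioning the question generator on a chapter-level topic vector, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the thesis's actual implementation: the dimensions, the names, and the choice to concatenate the topic vector to every word embedding at the encoder input are all illustrative, and the full TE-QG model additionally uses attention and a pointer-generator copying mechanism on the decoder side.

```python
# Minimal sketch of a topic-conditioned Seq2Seq encoder (illustrative only).
import torch
import torch.nn as nn

class TopicEmbeddingEncoder(nn.Module):
    def __init__(self, vocab_size, word_dim=128, topic_dim=32, hidden_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Bidirectional GRU over [word embedding ; chapter topic vector] pairs.
        self.rnn = nn.GRU(word_dim + topic_dim, hidden_dim,
                          batch_first=True, bidirectional=True)

    def forward(self, token_ids, topic_vec):
        # token_ids: (batch, seq_len); topic_vec: (batch, topic_dim), e.g. the
        # LDA topic mixture of the chapter the selected sentence came from.
        embedded = self.word_emb(token_ids)
        topic = topic_vec.unsqueeze(1).expand(-1, embedded.size(1), -1)
        outputs, hidden = self.rnn(torch.cat([embedded, topic], dim=-1))
        return outputs, hidden  # attention memory and decoder initial state

# Toy usage with random data.
enc = TopicEmbeddingEncoder(vocab_size=5000)
tokens = torch.randint(0, 5000, (2, 12))        # two 12-token sentences
topics = torch.softmax(torch.randn(2, 32), -1)  # per-chapter topic mixtures
outputs, hidden = enc(tokens, topics)
```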

    Questions are useful for measuring students' mastery of course materials and for reflecting their learning process. However, generating questions manually is time-consuming, and it is not easy for students to obtain practice exercises. To address this issue, both computer and learning scientists have proposed automatic test question generation methods to ease instructors' burden and to help students in the self-learning process. Typically, automatic question generation relies on heuristic rule-based methods, and the process of generating questions comprises two steps: sentence selection and question construction.
    Recently, neural networks have been applied to question generation. Compared with rule-based approaches, neural question generation (NQG) produces more fluent and diverse questions. Prior work on NQG focused on the grammaticality of the generated questions and ignored the sentence selection step. However, sentence selection should play an important role in generating test questions: it identifies question-worthy sentences that carry information or knowledge worth asking about. Therefore, this paper proposes a method for generating course test questions from course materials that combines sentence selection and question construction. In contrast to past approaches, we use multi-source teaching materials to select question-worthy sentences and propose an NQG model called topic-embedding question generation (TE-QG). We base our model on an attention mechanism and the pointer-generator copying mechanism. Furthermore, to generate questions that better match each chapter's topic, the TE-QG model incorporates topic features, which are derived from the body of the chapter content, into the framework.
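    The pointer-generator copying mechanism named above can be sketched in a few lines. This is a generic illustration of the standard pointer-generator mixing step, under assumed shapes and names rather than the thesis's code: the final output distribution interpolates between generating a word from the vocabulary and copying a source token through the attention weights.

```python
# Illustrative pointer-generator mixing step (shapes and names assumed).
import torch

def pointer_generator_step(vocab_logits, attn_weights, src_ids, p_gen):
    # vocab_logits: (B, V) decoder scores over the vocabulary
    # attn_weights: (B, T) attention distribution over source tokens
    # src_ids:      (B, T) vocabulary ids of the source tokens
    # p_gen:        (B, 1) generation probability in [0, 1]
    p_vocab = torch.softmax(vocab_logits, dim=-1)
    final = p_gen * p_vocab
    # Add copy probability mass onto each source token's vocabulary slot.
    final = final.scatter_add(1, src_ids, (1.0 - p_gen) * attn_weights)
    return final  # (B, V); each row sums to 1

B, T, V = 2, 5, 100
dist = pointer_generator_step(torch.randn(B, V),
                              torch.softmax(torch.randn(B, T), -1),
                              torch.randint(0, V, (B, T)),
                              torch.sigmoid(torch.randn(B, 1)))
print(dist.sum(dim=1))  # ~1.0 per row
```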
    We conducted several experiments to examine the effectiveness of our method. The experimental results show that the sentence selection method can indeed select sentences that match the key points of the course and that our TE-QG model outperforms other NQG models. The human evaluation results also show that our method produces fluent questions that are useful for learning.
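    The sentence selection result reported above rests on a centroid-style ranking: sentences whose embeddings lie close to the aggregate embedding of the chapter are treated as question-worthy. The sketch below illustrates only that ranking step under assumed inputs; the thesis's full module also aligns slide content with textbook passages, and all names and toy vectors here are illustrative.

```python
# Illustrative centroid-based question-worthy sentence ranking.
import numpy as np

def embed(tokens, word_vectors):
    """Average the known word vectors of a token list."""
    vecs = [word_vectors[w] for w in tokens if w in word_vectors]
    dim = len(next(iter(word_vectors.values())))
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def select_question_worthy(sentences, word_vectors, top_k=2):
    """Rank tokenized sentences by cosine similarity to the chapter centroid."""
    centroid = embed([w for s in sentences for w in s], word_vectors)

    def cosine(a, b):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom else 0.0

    ranked = sorted(sentences,
                    key=lambda s: -cosine(embed(s, word_vectors), centroid))
    return ranked[:top_k]

# Toy usage with random 50-dimensional "word vectors".
rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=50)
         for w in "topic models infer latent topics from documents".split()}
sents = [["topic", "models"], ["latent", "topics"], ["from", "documents"]]
print(select_question_worthy(sents, vocab, top_k=1))
```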

    Chapter 1 Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Research Limitations
        1.4 Research Process
        1.5 Thesis Outline
    Chapter 2 Literature Review
        2.1 Question Generation
            2.1.1 Automatic Question Generation (AQG)
            2.1.2 Neural Question Generation (NQG)
        2.2 Topic Model
            2.2.1 LDA (Latent Dirichlet Allocation)
        2.3 Word Embedding
            2.3.1 Word2vec
            2.3.2 GloVe
        2.4 Deep Learning
            2.4.1 Recurrent Neural Network (RNN)
                2.4.1.1 LSTM (Long Short-Term Memory)
                2.4.1.2 GRU (Gated Recurrent Unit)
            2.4.2 Sequence-to-sequence
            2.4.3 Attention Mechanism
            2.4.4 Copying Mechanism
        2.5 Document Segmentation
            2.5.1 TextTiling
            2.5.2 TSF
        2.6 Summary
    Chapter 3 Research Method
        3.1 Research Framework
        3.2 Data Preprocessing Module
            3.2.1 Textbook Text Preprocessing
            3.2.2 Slide Preprocessing
        3.3 Content Matching Module
            3.3.1 Textbook Segmentation
            3.3.2 Matching Textbook Passages with Slide Content
        3.4 Sentence Selection Module
            3.4.1 Word Embedding Model
            3.4.2 Textbook Centroid Embedding
            3.4.3 Question-Worthy Sentence Selection
        3.5 Question Generation Module
            3.5.1 Word Embedding of Selected Sentences
            3.5.2 Encoder
                3.5.2.1 Feature Selection and Tagging
                3.5.2.2 Chapter Topic Vector
                3.5.2.3 Encoding
            3.5.3 Decoder
                3.5.3.1 Attention-based Decoding
                3.5.3.2 Pointer-generator Network
        3.6 Summary
    Chapter 4 System Implementation and Validation
        4.1 System Environment Setup
        4.2 Experimental Method
            4.2.1 Data Sources
            4.2.2 Experimental Design
            4.2.3 Evaluation Metrics
                4.2.3.1 Sentence Selection Stage
                4.2.3.2 Question Generation Stage
        4.3 Parameter Settings
            4.3.1 Parameter 1: Number of Latent Topics K Extracted by the Topic Model
            4.3.2 Parameter 2: Network Training Parameters of the Question Generation Module
        4.4 Experimental Results and Analysis
            4.4.1 Experiment 1: Comparison with Other Sentence Selection Methods
            4.4.2 Experiment 2: Effect of Part-of-Speech Filtering of Topic Words on Generation Results
            4.4.3 Experiment 3: Effect of Adding Topic Words to the Generation Model
            4.4.4 Experiment 4: Comparison with Other Question Generation Models on Automatic Evaluation Metrics
            4.4.5 Experiment 5: Human Evaluation of Questions Generated by the TE-QG Model
            4.4.6 Experiment 6: Human Evaluation of the Effect of Retraining the Word Embedding Model
    Chapter 5 Conclusions and Future Directions
        5.1 Research Results
        5.2 Future Research Directions
    References

    Full text publicly available: on campus from 2025-07-01; off campus from 2025-07-01.