| Author: | 陳怡帆 Chen, I-Fan |
|---|---|
| Thesis title: | 基於主題模型與深度學習之課程試題生成方法 (A Method of Generating Course Test Questions Based on Topic Model and Deep Learning) |
| Advisor: | 王惠嘉 Wang, Hei-Chia |
| Degree: | Master |
| Department: | College of Management, Institute of Information Management |
| Year of publication: | 2020 |
| Graduation academic year: | 108 |
| Language: | Chinese |
| Pages: | 64 |
| Chinese keywords: | 自主學習, 試題生成, 深度學習, 主題模型 |
| English keywords: | Self-learning, Test Question Generation, Deep Learning, Topic Model |
In the learning process, tests are an important way to measure students' understanding of course content and to reflect their learning progress, and they are also a common means for teachers to assess teaching outcomes. However, writing test questions by hand is time-consuming and laborious, and it is not easy for students to obtain course-related questions for practice. To address these problems, researchers in education and computer science have applied automatic Question Generation (QG) methods in educational settings to reduce teachers' workload and support students' self-learning. Traditional automatic question generation mostly relies on heuristic rules and proceeds in two steps: sentence selection and question construction.
In recent years, deep-learning-based Neural Question Generation (NQG) has emerged, and studies show that, compared with rule-based methods, NQG generates more fluent, flexible, and diverse questions. However, prior NQG research focused on the correctness and fluency of the generated questions and neglected the sentence-selection step. Yet sentence selection should play an important role in generating test questions: only sentences in the teaching material that are helpful for learning and worth asking about make the resulting questions meaningful for both testing and study. This thesis therefore proposes an NQG method that generates test questions from teaching materials and covers both sentence selection and question construction. Unlike previous work, this study extracts the key points of the instructor's lectures from multi-source teaching materials and proposes a deep-learning model, TE-QG (Topic-Embedding Question Generation). The method builds on a Seq2Seq model with an attention mechanism and a copying mechanism; in addition, to make the generated questions better match the overall semantics of the chapter, the TE-QG model incorporates topic features that capture the text as a whole, so that the generated questions better fit the chapter's topic.
Six experiments were conducted to examine the effectiveness of the proposed method. The results show that the TE-QG model achieves the highest scores on the automatic evaluation metrics among several existing question generation models, and human evaluation shows that the questions it generates perform better in fluency, clarity, and usefulness for learning. This confirms that the proposed question generation method matches the key teaching points of the course, and that adding topic-word features to the question generation model indeed improves its question generation performance.
Questions are useful for measuring students' mastery of course materials and for reflecting their learning progress. However, generating questions manually is time-consuming, and it is not easy for students to obtain practice exercises. To solve this issue, both computer and learning scientists have proposed automatic test question generation methods to ease instructors' burden and help students in the self-learning process. Typically, automatic question generation uses heuristic rule-based methods, and the process includes two steps: sentence selection and question construction.
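The thesis does not spell out its sentence-selection algorithm in this abstract. As a rough illustration of the idea only (hypothetical function and variable names, not the author's actual method), the sketch below scores each sentence vector by cosine similarity to a chapter-level vector and keeps the top candidates for question construction:

```python
import math

def select_question_worthy(sentence_vecs, chapter_vec, top_k=2):
    """Rank sentences by cosine similarity to a chapter-level vector and
    return the indices of the top_k candidates for question construction."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)
    scores = [cos(v, chapter_vec) for v in sentence_vecs]
    # Indices sorted by descending similarity; the best ones are "question-worthy".
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:top_k]

# Toy example: three 2-dim sentence vectors against one chapter vector.
sents = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
chapter = [1.0, 0.1]
print(select_question_worthy(sents, chapter))  # [1, 0]
```

Any sentence representation (e.g., averaged word embeddings) could stand in for the toy vectors; the point is only that selection reduces to ranking sentences against a document-level signal.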
Recently, neural networks have been applied to question generation. Compared to rule-based approaches, the neural question generation (NQG) approach produces more fluent and diverse questions. Prior work on NQG focused on the grammaticality of generated questions and ignored the step of sentence selection. However, sentence selection should play an important role in generating test questions: it identifies question-worthy sentences that carry information or knowledge worth asking about. Therefore, this paper proposes a method of generating course test questions from course materials by combining sentence selection and question construction. In contrast to past approaches, we use multi-source teaching materials to select question-worthy sentences and propose an NQG model called topic-embedding question generation (TE-QG). We base our model on the attention mechanism and the pointer-generator copying mechanism. Furthermore, in order to generate questions that better match each chapter's topic, the TE-QG model incorporates topic features, which are associated with the body of the chapter content, into the framework.
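The abstract leaves the topic-feature injection unspecified. One common way to realize "topic embedding" in a Seq2Seq encoder, shown here as a minimal sketch under that assumption (hypothetical names, not the thesis's exact architecture), is to concatenate a chapter-level topic distribution, such as an LDA document-topic vector, onto every token embedding before encoding:

```python
def augment_with_topic(token_embeddings, topic_distribution):
    """Concatenate the same chapter-level topic vector onto every token
    embedding, yielding the augmented input a Seq2Seq encoder would consume."""
    return [list(emb) + list(topic_distribution) for emb in token_embeddings]

# Toy example: 4 tokens with 5-dim embeddings, 3 topics -> 8-dim inputs.
emb = [[0.1] * 5 for _ in range(4)]
topic = [0.7, 0.2, 0.1]  # e.g. an LDA document-topic distribution
enc_input = augment_with_topic(emb, topic)
print(len(enc_input), len(enc_input[0]))  # 4 8
```

Because the topic vector is constant across time steps, the encoder sees the chapter-level context at every position, which is one plausible reading of how topic features could steer generation toward the chapter's theme.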
We conducted several experiments to examine the effectiveness of our method. The experimental results show that the sentence selection method can indeed select sentences that match the key points of the course, and that our TE-QG model outperforms other NQG models. The human evaluation results also show that our method produces fluent questions that are useful for learning.