Graduate student: 賴昀澤
Lai, Yun-Tse
Thesis title: 使用具注意力機制之序列到序列模型之圍棋棋譜評論生成方法
A Sequence to Sequence Model with Attention Mechanism to Generate Commentary for Go Game Records
Advisor: 王宗一
Wang, Tzone-I
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar, i.e. 2020-2021)
Language: Chinese
Number of pages: 45
Keywords: Go, Commentary Generation, Natural Language, Attention Mechanism, Deep Learning
  • Go is a board game that originated in China and has a long history. Its rules are simple, yet the complexity of play is very high. In recent years, advances in deep learning and reinforcement learning have brought breakthrough progress to artificial intelligence in computer game playing. As computer Go reached a playing strength that is hard for humans to match, many people became interested in learning the game. Most existing Go training software, however, offers only tools such as situation judgment, variation diagrams, and win-rate analysis, and leaves much room for improvement as a teaching aid. Go commentary generation cannot yet let a computer instruct humans directly in its own terms. The goal of this study is to train a deep learning network so that a computer can comment on and explain Go game records in natural language, as an aid to learning Go.
    This study proposes a method for generating commentary for Go game records, in which a deep learning network learns to describe the purpose of a move in natural language. The training data must be game records with human-annotated commentary. We collected about 10,000 such game record files from the Internet, extracted the required features, took the commentaries as labels, and classified them according to the nature of the commentary. The model follows the Sequence to Sequence (Seq2Seq) architecture: a Bidirectional LSTM encoder reads the input features and outputs a context vector, which is then fed into an LSTM decoder that generates the commentary as a sequence. To improve learning, an attention mechanism is added so that each decoding step can focus on the features with high weights. We ran multiple experiments, generating commentary for game record data from the different commentary categories and scoring the results with the machine translation metric Bilingual Evaluation Understudy (BLEU); the "Move Quality" category scored better than the other categories. We also asked three professional Go players to rate each generated commentary against the current board position on a five-level scale: "Very Discrepant", "Discrepant", "General", "Consistent", and "Very Consistent". Ratings of "General", "Consistent", and "Very Consistent" together account for about 80% of all commentaries.
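
    The attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are all assumptions chosen for brevity, and the actual model may use a learned scoring function instead. The sketch only shows the general idea that the decoder state scores every encoder output, the scores are normalized into weights, and the weighted sum becomes the context for that decoding step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_outputs):
    """One attention step: score, normalize, and mix the encoder outputs.

    Dot-product scoring is an assumption made for this sketch; a trained
    model would typically learn its scoring function.
    """
    # One score per encoder position: similarity to the decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_outputs
    ]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder outputs.
    dim = len(encoder_outputs[0])
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_outputs))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy values: three encoder positions, two-dimensional states.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_outputs)
print("attention weights:", weights)
print("context vector:", context)
```

    In this toy case the first and third encoder positions align best with the decoder state, so they receive the larger weights and dominate the context vector.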

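    Abstract
    Extended Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Research Methods
      1.4 Research Contributions
    Chapter 2 Literature Review
      2.1 Introduction to Go
      2.2 Smart Game Format
      2.3 Computer Go Assistance Tools
        2.3.1 Situation Judgment
        2.3.2 Variation Diagrams
        2.3.3 Win Rate
      2.4 Pattern Classification
      2.5 Sequence to Sequence
      2.6 Attention Mechanism
    Chapter 3 Model Design and Architecture
      3.1 Model Architecture
      3.2 Data Preprocessing
        3.2.1 Feature Extraction
        3.2.2 Commentary Classification
        3.2.3 Commentary Word Segmentation
      3.3 Feature Representation
      3.4 Encoder
      3.5 Decoder
      3.6 Attention Mechanism
      3.7 Loss Function
      3.8 Greedy Search
      3.9 Beam Search
    Chapter 4 Experimental Design and Results
      4.1 Training and Experimental Setup
        4.1.1 Dataset
        4.1.2 Parameter Settings and Environment Setup
      4.2 Evaluation Tools
      4.3 Experimental Results
    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References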

    Full-text access on campus: open immediately
    Full-text access off campus: open immediately