Graduate student: | 李沅翰 Li, Yuan-Han |
---|---|
Thesis title: | 以語法語意修正模型技術改善華語轉台語機器翻譯之語意流暢度 Using Grammatical and Semantic Correction Model to Improve Chinese-to-Taiwanese Machine Translation Fluency |
Advisor: | 盧文祥 Lu, Wen-Hsiang |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2022 |
Academic year of graduation: | 110 (ROC calendar) |
Language: | English |
Number of pages: | 28 |
Chinese keywords: | 機器翻譯、台語語法規則、構詞轉換、語序轉換、華語轉台語 |
English keywords: | Machine translation, Taiwanese grammatical rules, Lexical transformation, Syntactic transformation, Chinese-to-Taiwanese |
Chinese-to-Taiwanese machine translation currently faces three major problems: selecting the pronunciation of Taiwanese words that have multiple readings, selecting the pronunciation of unknown words missing from Taiwanese dictionaries, and transforming grammar and semantics from Chinese to Taiwanese. Most prior work addresses the first two problems, and very little addresses grammatical and semantic transformation. Yet Taiwanese has syntactic rules of its own that affect translation quality: even when word choice and pronunciation are correct, a translation that ignores the grammatical differences between the two languages yields Chinese-style Taiwanese, which reads unnaturally to native speakers and can distort the meaning of the original sentence. This thesis collects and organizes common Taiwanese sentence patterns and grammar rules and builds from them a grammatical and semantic correction model, which detects and corrects grammatical and semantic errors in the output of Chinese-to-Taiwanese machine translation and thereby improves translation fluency.
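The idea of rule-based post-correction can be illustrated with a minimal sketch. It is not the thesis's model: the rule tables, the toy adjective list, and the function name `correct` are all assumptions made for illustration. It shows one lexical transformation (compounds whose morpheme order is reversed in Taiwanese, such as 客人 → 人客) and one syntactic transformation (the Taiwanese comparative, which takes 較 before the adjective), and it reports which rules fired so that detection and correction are both visible.

```python
import re

# Illustrative lexical rules (assumed, not taken from the thesis): compounds
# whose morpheme order is reversed between Mandarin and Taiwanese.
LEXICAL_RULES = {"客人": "人客", "颱風": "風颱", "熱鬧": "鬧熱"}

# Toy adjective list for the comparative rule (assumed).
ADJECTIVES = ("懸", "大", "細")

# Illustrative syntactic rule: "比 B ADJ" -> "比 B 較 ADJ".
# Excluding 較 from B keeps the rule from firing on already-correct text.
COMPARATIVE = re.compile(
    r"比(?P<b>[^較,。]+?)(?P<adj>" + "|".join(ADJECTIVES) + r")"
)


def correct(text):
    """Return (corrected_text, names_of_rules_that_fired)."""
    applied = []
    # Lexical pass: plain substring replacement.
    for src, tgt in LEXICAL_RULES.items():
        if src in text:
            text = text.replace(src, tgt)
            applied.append(f"lexical:{src}->{tgt}")
    # Syntactic pass: insert 較 before the adjective of a comparative.
    fixed = COMPARATIVE.sub(
        lambda m: f"比{m.group('b')}較{m.group('adj')}", text
    )
    if fixed != text:
        applied.append("syntactic:comparative")
    return fixed, applied


print(correct("客人來矣"))    # ('人客來矣', ['lexical:客人->人客'])
print(correct("我比你懸"))    # ('我比你較懸', ['syntactic:comparative'])
print(correct("我比你較懸"))  # ('我比你較懸', [])
```

A real system would presumably rely on word segmentation and part-of-speech information rather than substring and regex matching, since plain string rules can misfire inside longer words.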