
Author: Li, Wei-Te (李威德)
Thesis title: A Phrase Reordering Model Using Three-level Syntactic Structures in A Parse Tree (利用剖析樹中三層語法結構為基礎之詞組重排模型)
Advisor: Lu, Wen-Hsiang (盧文祥)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: Chinese
Number of pages: 48
Keywords: Phrase Reordering, Phrase Substitution
  • Different languages have different syntactic structures, so word-by-word translation alone cannot produce a fluent target sentence; phrases must be reordered. In theory, the best translation could be found by searching all possible permutations of phrases, but the computational cost is prohibitive, so translation systems typically impose constraints to reduce complexity. For example, the traditional distortion model (Brown et al., 1993) uses a “distortion limit” to restrict how far apart two target words may be after reordering, but this works poorly for language pairs whose syntactic structures differ greatly. Recent work has therefore increasingly used syntactic structure to determine the relative positions of phrases. In this thesis, we use 3-level syntactic sub-structures of a parse tree, together with function words, to determine the order of phrases in the translated sentence. We also apply phrase substitution to make the translation more fluent.
    Experimental results show that our methods effectively improve translation fluency. Moreover, at the scale of our training corpus, 3-level syntactic sub-structures perform better than sub-structures of four or more levels.
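
The trade-off the abstract describes, between searching every phrase permutation and restricting the search with a distortion limit, can be illustrated with a minimal sketch. It is not the thesis's model: the function name `allowed_orders` and the simplified rule (no phrase may end up more than `limit` positions from where it started) are assumptions made here for illustration only.

```python
from itertools import permutations


def allowed_orders(n_phrases, limit):
    """Return every ordering of n_phrases phrases in which no phrase
    moves more than `limit` positions from its original position.

    This displacement rule is a simplified, hypothetical stand-in for a
    distortion limit, not the definition used in the thesis.
    """
    return [
        order
        for order in permutations(range(n_phrases))
        if all(abs(new_pos - old_pos) <= limit
               for new_pos, old_pos in enumerate(order))
    ]


if __name__ == "__main__":
    # Compare the constrained search space with the full 5! = 120 orderings.
    for limit in (0, 1, 2, 4):
        print(f"limit={limit}: {len(allowed_orders(5, limit))} orderings")
```

For five phrases, the limits 0, 1, 2, and 4 admit 1, 8, 31, and 120 orderings respectively. A tight limit keeps the search small but rules out the long-distance moves that a language pair with very different syntax may need, which is the motivation the abstract gives for syntax-based reordering.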

    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Background
      1.3 Method
      1.4 Thesis Organization
    Chapter 2 Related Work
    Chapter 3 Translation Correction
      3.1 Framework
      3.2 Translation correction model
        3.2.1 Coarse translation model
        3.2.2 Syntax parsing model
        3.2.3 Target-phrase reordering model
        3.2.4 Target-phrase substitution model
    Chapter 4 Experiments on the Translation Correction Model
      4.1 Experimental Data
        4.1.1 Training corpus and test data
        4.1.2 Evaluation method
      4.2 Experimental Results
        4.2.1 Phrase reordering
        4.2.2 Phrase substitution
        4.2.3 Translation correction
      4.3 Discussion
        4.3.1 Effect of the evaluation metric
        4.3.2 Effect of syntactic structure
        4.3.3 Effect of function words and high-frequency words
        4.3.4 Effect of the translation system
    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future work
    References

    Yaser Al-Onaizan and Kishore Papineni. 2006. Distortion Models for Statistical Machine Translation. In Proceedings of ACL 2006.

    Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2):263–311.

    Pi-Chuan Chang and Kristina Toutanova. 2007. A Discriminative Syntactic Word Order Model for Machine Translation. In Proceedings of ACL 2007.

    David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of ACL 2005, pages 263–270.

    Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In Proceedings of HLT/NAACL.

    Philipp Koehn, Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Miles Osborne and David Talbot. 2005. Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation. In International Workshop on Spoken Language Translation.

    Shankar Kumar and William Byrne. 2005. Local phrase reordering models for statistical machine translation. In Proceedings of HLT-EMNLP.

    John Lafferty, Andrew McCallum, and Fernando Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the International Conference on Machine Learning, pages 591–598.

    Chi-Ho Li, Dongdong Zhang, Mu Li, Ming Zhou, Minghui Li, and Yi Guan. 2007. A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation. In Proceedings of ACL 2007.

    Masaaki Nagata, Kuniko Saito, Kazuhide Yamamoto, and Kazuteru Ohashi. 2006. A Clustered Global Phrase Reordering Model for Statistical Machine Translation. In Proceedings of ACL/COLING 2006, pages 713–720.

    Franz Josef Och, Christoph Tillmann, and Hermann Ney. 1999. Improved alignment models for statistical machine translation. In Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/WVLC-99), pages 20–28.

    F. J. Och and H. Ney. 2002. Discriminative training and maximum entropy models for statistical machine translation. In ACL.

    Franz Josef Och and Hermann Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics, 30:417-449.

    Franz Josef Och, Ignacio Thayer, Daniel Marcu, Kevin Knight, Dragos Stefan Munteanu, Quamrul Tipu, Michel Galley, and Mark Hopkins. 2004. Arabic and Chinese MT at USC/ISI. Presentation given at NIST Machine Translation Evaluation Workshop.

    Hendra Setiawan, Min-Yen Kan, and Haizhou Li. 2007. Ordering Phrases with Function Words. In Proceedings of ACL 2007.

    Christoph Tillmann and Tong Zhang. 2005. A localized prediction model for statistical machine translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05), pages 557–564.

    Christoph Tillmann. 2004. A block orientation model for statistical machine translation. In HLT-NAACL, Boston, MA, USA.

    D. Xiong, Q. Liu, and S. Lin. 2006. Maximum entropy based phrase reordering model for statistical machine translation. In ACL.

    Full-text availability: on campus, available immediately
    Off campus: available from 2008-09-10