
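Graduate student: 石家嶸
Shih, Jia-Rong
Thesis title: 基於自發性語音之半監督式學習對話系統
Semi-Supervised Learning Spoken Dialogue System Based on Spontaneous Speech Recognition
Advisor: 王駿發
Wang, Jhing-Fa
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: English
Number of pages: 41
Chinese keywords: 自發性語音辨識 (spontaneous speech recognition), 關鍵詞辨識 (keyword spotting), 語音對話系統 (spoken dialogue system), 半自動學習機制 (semi-automatic learning mechanism)
English keywords: spontaneous speech recognition, keyword-spotting, semi-supervised learning, speech dialogue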
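  • Spoken dialogue is the most natural way for people to communicate with each other, and also the most natural way for people to communicate with machines. However, existing spoken dialogue systems are all restricted to fixed dialogue content and cannot cope with the speech variations found in natural spoken language.
    This thesis proposes a semi-supervised learning spoken dialogue system for natural speech, which uses a two-stage keyword extraction method to handle the recognition of natural speech. The system can be used for telephone services or human-robot interaction. Previous studies usually recognize the content of the user's speech using fixed phrases or short sentences. In real environments, however, recognition based on fixed phrases and short sentences has certain shortcomings and runs into the problems of spontaneous speech. We therefore use formants and redundant-word removal to address spontaneous speech recognition, and semi-supervised learning to add extra vocabulary. The target system can handle spontaneous speech recognition problems such as continuous sounds, nasals, syllable contraction, interjections, modal particles, and stuttering. The semi-supervised learning system can semi-automatically add new vocabulary to the system's dialogue keyword class database.
    In the experiments, we found that our system raises the spontaneous speech recognition rate from 16.7% to 70%. The semi-supervised learning system makes the dialogue more fluent and appropriately adds new words to extend our spontaneous speech dialogue system.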

    Speech is the most natural mode of communication between humans, and the same holds between humans and machines. However, current spoken dialogue systems are restricted to fixed conversation content and cannot respond to pronunciation variations in spontaneous speech.
    This thesis presents a semi-supervised learning spoken dialogue system that combines several effective methods for spontaneous speech recognition. The proposed system can be used in telephone services or human-computer interaction (HCI). Previous research on speech recognition usually relies on fixed phrases or short sentences to recognize speech content. However, recognition based on fixed phrases and short sentences has deficiencies in real environments because of the problems posed by spontaneous speech. Therefore, we adopt several approaches to address the problems of spontaneous speech recognition, and use semi-supervised learning to add vocabulary to the proposed system. The proposed system can detect lengthened sounds, nasals, syllable contraction, interjections, and stuttering. The semi-supervised learning in this system can automatically add new vocabulary to the system through the dialogue manager scheme.
    In the experiments, the proposed system improves the recognition rate for spontaneous speech from 16.7% to 70%. The semi-supervised learning makes the dialogue smoother and appropriately adds new words to the spontaneous speech dialogue system.
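
    The abstract names three ideas: removing redundant words, spotting keywords, and adding confirmed new words to a keyword class database. The following is only a minimal sketch of how those ideas might fit together, written under strong assumptions: it works on already-recognized text tokens rather than audio, and the filler list, the `KeywordStore` class, and the confirmation flag are all hypothetical and not taken from the thesis.

```python
from dataclasses import dataclass, field

# Hypothetical filler words; the thesis's actual redundancy list is not given here.
FILLERS = {"um", "uh", "er", "well"}


def remove_redundancy(tokens):
    """Drop fillers and immediate repetitions (a crude stand-in for stutter handling)."""
    cleaned = []
    for tok in tokens:
        if tok in FILLERS:
            continue  # skip interjections and particles
        if cleaned and cleaned[-1] == tok:
            continue  # skip a word repeated right after itself
        cleaned.append(tok)
    return cleaned


@dataclass
class KeywordStore:
    """Hypothetical keyword class database: class name -> known words."""
    classes: dict[str, set[str]] = field(default_factory=dict)

    def spot(self, tokens):
        """Return (class, word) pairs for tokens found in any keyword class."""
        hits = []
        for tok in tokens:
            for name, words in self.classes.items():
                if tok in words:
                    hits.append((name, tok))
        return hits

    def learn(self, word, class_name, confirmed):
        """Add a new word only when the user has confirmed its class (the supervised half)."""
        if confirmed:
            self.classes.setdefault(class_name, set()).add(word)
        return confirmed


store = KeywordStore({"food": {"noodles"}})
tokens = remove_redundancy("um i i want noodles".split())
hits = store.spot(tokens)
store.learn("dumplings", "food", confirmed=True)
```

    In this sketch the user confirmation stands in for the dialogue manager's role: a candidate word enters the database only after the dialogue confirms which class it belongs to, which is one plausible reading of "semi-supervised" in the abstract.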

    Abstract
    1. Introduction
      1.1 Thesis Motivation
      1.2 Thesis Objective
      1.3 Thesis Organization
    2. Related Works
      2.1 Overview of Spoken Dialogue System
      2.2 Learning Method for Spoken Dialogue System
        2.2.1 Markov Decision Process Method
        2.2.2 Q-Learning Method
      2.3 Pronunciation Variance of Spontaneous Speech Recognition
        2.3.1 Syllable Contraction
        2.3.2 Lengthening and Nasal
        2.3.3 Keyword Spotting
    3. Semi-Supervised Learning Method for Spoken Dialogue System
      3.1 Framework of the Proposed System
      3.2 Definition for Keyword Classification of Semantic
      3.3 Semi-Supervised Learning Method
        3.3.1 Dialogue Manager and Strategy
        3.3.2 Semantic Analysis and Detection
        3.3.3 Sentence Redundancy Delete
        3.3.4 Vocabulary Detection and Keyword Learning
    4. Improvement of The Spontaneous Speech Recognition
      4.1 Age Group Identification with Spontaneous Speech Processing
      4.2 Noise Reduction Processing
        4.2.1 Voice Activity Detection (VAD)
        4.2.2 Dynamic Spectrum Subtraction (DSS)
      4.3 Multiple Age-based Acoustic Models
      4.4 Keyword Spotting and Syllable Contraction
    5. Experiments and Comparisons
      5.1 Experimental Setup for Semi-Supervised Learning Dialogue System Method
      5.2 Experimental Results
        5.2.1 Results of Spontaneous Speech Recognition
        5.2.2 Keyword Spotting Accuracy of Adult Speech
        5.2.3 Results of Spoken Dialogue System with Semi-Supervised Learning Method
    6. Conclusions and Future Works
    References


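    On-campus access: public from 2016-09-06
    Off-campus access: not public
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalogue.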