
Graduate student: 陳柏誠
Chen, Bo-Cheng
Thesis title: 新穎獨立成份分析應用於隱藏式馬可夫模型分群及未知訊號分離
A novel independent component analysis approach to hidden Markov model clustering and blind source separation
Advisor: 簡仁宗
Chien, Jen-Tzung
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: Chinese
Number of pages: 87
Keywords (Chinese): 未知訊號分離, 語音辨識, 發音差異, 獨立成份分析
Keywords (English): blind source separation, speech recognition, pronunciation variation, independent component analysis
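      Conventional independent component analysis (ICA) uses higher-order statistics or information-theoretic methods to define a measurement criterion for non-Gaussianity or independence, and then applies an optimization algorithm to find a transformation matrix that maps the random variables into a vector space of linearly independent components. ICA has been widely applied to blind source separation, data clustering, feature extraction, and many pattern recognition tasks, including EEG signal separation, image denoising, and speech feature extraction. The contribution of this thesis is to use the test of independence from statistical hypothesis testing: in verifying whether the components are mutually independent, we derive an independence measure based on the likelihood ratio. Because ICA does not allow the probability density functions of the components to be assumed Gaussian, we represent these densities with nonparametric kernel density estimation and thereby obtain the measurement criterion. Finally, a gradient descent algorithm iteratively estimates the ICA transformation matrix.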
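The criterion described above can be sketched as follows. This is only an illustrative reconstruction under stated assumptions, not the thesis's own derivation: the symbols (observation x_t, unmixing matrix W with rows w_i^T, components y_t = W x_t, kernel bandwidth h, step size eta, T samples, M components) and the choice of a Gaussian kernel are all assumptions introduced here.

```latex
% Assumed notation: y_t = W x_t, w_i^T is the i-th row of W.
\begin{align}
H_0 &: p(\mathbf{y}) = \prod_{i=1}^{M} p_i(y_i), \qquad
H_1 : p(\mathbf{y}) \neq \prod_{i=1}^{M} p_i(y_i) \\
\log \Lambda(W) &= \sum_{t=1}^{T} \Big[ \sum_{i=1}^{M} \log p_i(y_{it}) - \log p(\mathbf{y}_t) \Big] \\
&= \sum_{t=1}^{T} \sum_{i=1}^{M} \log p_i(\mathbf{w}_i^{\top}\mathbf{x}_t)
   + T \log \lvert \det W \rvert - \sum_{t=1}^{T} \log p_x(\mathbf{x}_t) \\
\hat{p}_i(y) &= \frac{1}{T h} \sum_{s=1}^{T} \phi\!\Big( \frac{y - y_{is}}{h} \Big),
   \qquad \phi(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2} \\
J(W) &= -\log \Lambda(W), \qquad
W \leftarrow W - \eta\, \frac{\partial J(W)}{\partial W}
\end{align}
```

The third line uses p(y_t) = p_x(x_t) / |det W|, so the last term does not depend on W and can be dropped during optimization. Replacing each p_i with the kernel estimate gives an objective that needs no parametric assumption about the source densities.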
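      We apply the proposed ICA to the analysis of pronunciation variation in speech. The training data are projected by ICA onto the vector space spanned by the independent components. Each independent component represents one speech characteristic, so projected speech feature vectors that share the same pronunciation variation cluster together. A vector quantization algorithm then partitions the training data into clusters, and a hidden Markov model (HMM) is trained on the data in each cluster. In the experiments, we use the TCC300 continuous speech corpus to evaluate the effect of HMM clustering on speech recognition. In the blind source separation experiments, we take one speech signal and one music signal and mix them with a randomly generated mixing matrix. The original signals are regarded as the independent sources; we reconstruct them with ICA and evaluate separation performance by the signal-to-interference ratio (SIR) between the original and reconstructed signals. The results show that the independence test can indeed verify whether the components are independent and that, when the source distributions are unknown, nonparametric density approximation separates the sources better than conventional blind source separation methods. In the continuous speech recognition experiments, the clustered HMMs significantly improve recognition performance. The experiments demonstrate that the independent component subspace does capture latent pronunciation variation in speech, and that projecting onto the independent components increases the independence between clusters, which yields more accurate clustering of data with different pronunciation characteristics and HMMs that fit pronunciation variation more closely.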

      Independent component analysis (ICA) is increasingly important in signal processing applications such as data clustering and blind source separation. Traditionally, ICA uses higher-order statistics or information-theoretic criteria to measure the non-Gaussianity or independence of random variables. These approaches formulate an objective function that measures the degree of independence and then optimize it to find the unmixing matrix for data transformation. In this thesis, we propose a novel objective function for the ICA framework. Statistical hypothesis testing is applied to test whether the random variables are mutually independent, which leads to a likelihood ratio criterion. To avoid assuming a Gaussian distribution in the hypothesis test, we adopt a nonparametric approach in which the distributions of the random variables are estimated with kernel density functions. With this nonparametric objective function, we find the optimal unmixing matrix for speech recognition and blind source separation.
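To make the idea of a kernel-density-based ICA objective concrete, here is a minimal two-source sketch. It is not the thesis's algorithm: the data are synthetic, the helper names are hypothetical, the bandwidth uses Silverman's rule of thumb, and a grid search over a rotation angle on whitened data is swapped in for the gradient-based optimization, purely to keep the example short.

```python
import numpy as np

def kde_mean_log_density(y):
    """Mean log of a Gaussian kernel density estimate of y at its own samples."""
    y = np.asarray(y, dtype=float)
    n = y.size
    h = 1.06 * y.std() * n ** (-0.2)           # Silverman's rule of thumb
    u = (y[:, None] - y[None, :]) / h          # pairwise scaled differences
    dens = np.exp(-0.5 * u ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
    return np.log(dens).mean()

def whiten(x):
    """Center x (rows are channels) and transform it to identity covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    vals, vecs = np.linalg.eigh(np.cov(x))
    return (vecs / np.sqrt(vals)) @ vecs.T @ x  # cov^{-1/2} @ x

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def objective(theta, z):
    """Sum of marginal KDE log-likelihoods; log|det W| is 0 for a rotation."""
    y = rotation(theta) @ z
    return sum(kde_mean_log_density(row) for row in y)

rng = np.random.default_rng(0)
n = 600
sources = np.vstack([rng.uniform(-1.0, 1.0, n),   # sub-Gaussian source
                     rng.laplace(size=n)])        # super-Gaussian source
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])       # assumed mixing matrix
z = whiten(mixing @ sources)

# After whitening, the unmixing matrix is a rotation (up to order and sign),
# so searching angles in [0, pi/2) is enough for two sources.
thetas = np.linspace(0.0, np.pi / 2.0, 90, endpoint=False)
best = max(thetas, key=lambda th: objective(th, z))
recovered = rotation(best) @ z

# Absolute correlation between each true source and each recovered component.
corr = np.abs(np.corrcoef(np.vstack([sources, recovered]))[:2, 2:])
print(np.round(corr, 3))
```

The grid search only works because two whitened sources leave a single free angle; for more sources a gradient-based update of the full unmixing matrix, as the abstract describes, is the practical choice.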
      For speech recognition, we use ICA to cluster hidden Markov models (HMMs). Higher-order statistics are measured to represent the latent factors in pronunciation variation, and these variations are compensated via ICA-based HMM clustering. We carry out ICA and use the resulting unmixing matrix to project the speech features of the same HMM into the independent component subspace, and then perform HMM-level clustering. The clustered HMMs are trained to cover different pronunciation variations. The proposed nonparametric ICA is also applied to blind source separation: we randomly generate a mixing matrix to mix speech and audio signals. Our approach effectively estimates the unmixing matrix and improves BSS performance as measured by the signal-to-interference ratio. In the speech recognition experiments, performance is also improved by using the clustered HMMs.
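The clustering step can be sketched as a plain k-means style vector quantizer applied to features that have already been projected into the independent component subspace. This is an assumption-laden illustration: the function `vector_quantize`, the farthest-point initialization, and the synthetic two-cluster data are all hypothetical, and the abstract does not say which quantization algorithm or settings were used. Training one HMM per resulting cluster is outside the scope of the sketch.

```python
import numpy as np

def vector_quantize(features, k, iters=20):
    """Cluster row vectors with Lloyd's algorithm (hypothetical helper).

    Returns (labels, centroids). Initial centroids are chosen by a
    deterministic farthest-point rule, starting from the first row.
    """
    features = np.asarray(features, dtype=float)
    centroids = [features[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen centroid.
        d = np.min([np.sum((features - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(features[np.argmax(d)])
    centroids = np.array(centroids)

    def assign(cents):
        dist = ((features[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        return dist.argmin(axis=1)

    for _ in range(iters):
        labels = assign(centroids)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters unchanged
                centroids[j] = features[labels == j].mean(axis=0)
    return assign(centroids), centroids

rng = np.random.default_rng(0)
# Synthetic stand-in for ICA-projected feature vectors from two
# pronunciation styles; real features would come from the speech corpus.
style_a = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(100, 2))
style_b = rng.normal(loc=[3.0, 0.0], scale=0.5, size=(100, 2))
projected = np.vstack([style_a, style_b])

labels, centroids = vector_quantize(projected, k=2)
print(np.bincount(labels))
```

Each group of training utterances sharing a label would then be used to train its own HMM, so that one acoustic unit is covered by several models, one per pronunciation cluster.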

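    Chapter 1 Introduction
      1.1 Preface
      1.2 Research Motivation and Objectives
      1.3 Overview of the Research Method
      1.4 Chapter Outline
    Chapter 2 Literature Review of Independent Component Analysis
      2.1 Preface
      2.2 Fundamental Theory of Independent Component Analysis
      2.3 Independent Component Analysis Algorithms
      2.4 Applications of Independent Component Analysis
    Chapter 3 Hidden Markov Model Clustering and Independent Component Analysis
      3.1 Mandarin Speech Recognition and Hidden Markov Models
      3.2 Viterbi Algorithm
      3.3 EM (Expectation-Maximization) Algorithm
      3.4 Hidden Markov Model Clustering
        3.4.1 Supervised Classification
        3.4.2 Unsupervised Clustering
        3.4.3 Relationship Between Pronunciation Variation and Speaker Clustering
      3.5 Independent Component Analysis
        3.5.1 Data Preprocessing
        3.5.2 Non-Gaussianity as Independence
        3.5.3 Measurement Criteria
        3.5.4 Optimization Algorithms
    Chapter 4 A Novel Independent Component Analysis
      4.1 Measuring Independence
      4.2 Estimating the Probability Distributions of the Independent Components
      4.3 Derivation of the Measurement Criterion
      4.4 The Novel Independent Component Analysis Algorithm
    Chapter 5 Independent Component Analysis Applied to Hidden Markov Model Clustering
      5.1 Hidden Markov Model Clustering
      5.2 Independent Component Analysis Applied to the Analysis of Speech Characteristics
    Chapter 6 Experiments
      6.1 Experimental Setup
        6.1.1 Test Signals for Blind Source Separation
        6.1.2 Speech Corpus
        6.1.3 Acoustic Model Configuration
      6.2 Experimental Results
        6.2.1 Blind Source Separation
        6.2.2 Speech Recognition
      6.3 Discussion
      6.4 System Demonstration
    Chapter 7 Conclusions and Future Work
      7.1 Conclusions
      7.2 Future Work
    References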


    Full-text availability: on campus, available immediately
    Off campus: available from 2004-07-20