
Graduate student: 廖志彬
Liao, Chih-Pin
Thesis title: 鑑別性模型應用於人臉及動作辨識
Discriminative Models for Face and Motion Recognition
Advisor: 簡仁宗
Chien, Jen-Tzung
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: English
Number of pages: 97
Chinese keywords: 隱藏式馬可夫模型、信賴度量測、鑑別性特徵擷取、鑑別式訓練、圖形分類、人臉辨識、動作估測、條件式隨機域、圖形模型、變異推論方法
English keywords: hidden Markov model, confidence measure, discriminative feature extraction, discriminative training, pattern classification, face recognition, motion estimation, conditional random field, graphical model, variational inference
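  • Discriminative learning is an important topic in pattern recognition. In general, training a pattern classifier involves first extracting suitable, discriminative features from the training set, then selecting a discriminative model and training it under a discriminative criterion. To improve classifier discriminability, this dissertation investigates three directions: (1) discriminative feature extraction, (2) discriminative model training, and (3) novel discriminative models. It proposes several discriminative algorithms for face recognition and human motion recognition that improve recognition performance.
    For discriminative feature extraction, linear discriminant analysis (LDA) is the most commonly used method in pattern recognition. Its main idea is to maximize the ratio of between-class scatter to within-class scatter in order to obtain an optimal linear transformation matrix that maps the original feature vectors into a more discriminative space. Its drawback is that, with a high-dimensional feature space and a small training set, the scatter matrices become singular and the method cannot be carried out. In this dissertation we address this problem by applying a nonsingular transformation before conventional discriminant analysis. The method first transforms the facial features in the original space with a transformation matrix composed of the eigenvectors that correspond to the nonzero eigenvalues of the scatter matrix, which keeps the scatter matrix in the new space nonsingular. LDA is then performed with the scatter matrices of the new space. This approach not only resolves the singularity problem but also makes the new feature space more discriminative, because the within-class scatter matrix shrinks. In experiments on the ORL and FERET face databases, the proposed method achieves better recognition results than the original LDA-related methods across different training set sizes and numbers of classes.
    For discriminative model training, we propose discriminative training that combines feature extraction with two-dimensional hidden Markov models (HMMs). We propose a novel discriminative training criterion that yields models with lower dimensionality and greater discriminability. The new criterion is derived from the maximum confidence (MC) criterion in hypothesis testing theory: by maximizing a confidence function, it accepts the hypothesis that an image block is generated from the target HMM state rather than from competing HMM states. The MC-HMM parameters and the feature transformation matrix are estimated under this criterion. Within the proposed framework, the same training criterion is used both to obtain the transformation matrix that extracts discriminative facial features and to train the MC-HMM parameters, which have closed-form solutions, so that two-dimensional face images can be modeled and a discriminative face recognition system can be built. In the recognition stage, we propose a two-level Viterbi segmentation to obtain the optimal HMM state sequence and mixture component sequence. The experimental results show that the proposed method gives better facial part segmentation for faces with different expressions and orientations, and that the trained models match the true characteristics of faces. In addition, compared with models trained by maximum likelihood and by minimum classification error, the proposed method achieves higher face recognition accuracy.
    For novel discriminative models, we study conditional random fields (CRFs). We propose new inference methods for CRF model training and apply them to pattern recognition systems with contextual information, such as human body motion. CRFs are a well-known mechanism for modeling contextual information in pattern recognition. However, the model of the variables and the complexity of the probability computation depend on the graph topology that describes the relations between variables. We therefore use junction tree (JT) theory from graphical models to propose a new model that solves the conditional probability inference problem for human motion classification with complex topologies that contain cycles. For inference, in addition to the inference method for tree-structured graphical models, we propose factorized variational inference and structured variational inference to approximate the conditional probability directly. In the implementation, we use continuous-valued features to build the junction tree CRF (JT-CRF) and evaluate it on the IDIAP gesture video database and the CMU human motion video database. The experiments show that the proposed method classifies motions better than the original HMM, the maximum entropy Markov model, and the original CRF. The discriminative algorithms proposed in this dissertation can be applied not only to face and motion recognition but also to other practical pattern recognition systems.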

    Discriminative learning is an important and challenging research topic in pattern recognition. To build a discriminative pattern classifier, the training procedure should involve (1) extraction of discriminative features, (2) selection of discriminative models, and (3) estimation of the model parameters according to a discriminative criterion. Accordingly, we present discriminative learning algorithms for feature extraction, model construction and model training, and apply them to face recognition and human motion recognition.
    First, we present a new Fisher linear discriminant analysis (LDA) method to extract discriminative facial features for face recognition. LDA aims to find an optimal discriminant transformation matrix that maximizes the ratio of between-class scatter to within-class scatter. However, with small sample sizes and high-dimensional data, LDA often cannot be carried out because the scatter matrices are singular. In this study, we apply a nonsingular transformation before performing LDA. This method transforms the facial features using all eigenvectors of the scatter matrix that have nonzero eigenvalues. As a result, the scatter matrix of the transformed features is nonsingular. Next, the discriminant transformation is applied according to LDA using the new scatter matrices. The advantage of nonsingular discriminant analysis of the between-class matrix comes from the shrinkage of the within-class scatter and the resulting enhancement of Fisher class separability. In experiments on the ORL and FERET facial databases, the nonsingular discriminant feature extraction achieves better face recognition performance than other LDA-related methods over a wide range of sample sizes and class numbers.
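    The two-stage idea above (a nonsingular transformation followed by a discriminant transformation) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the dissertation's exact formulation: the abstract does not say which scatter matrix supplies the eigenvectors, so the sketch assumes the total scatter matrix, and it uses the equivalent Fisher criterion that compares between-class scatter with total scatter. The function name `nonsingular_lda` and its parameters are hypothetical.

```python
import numpy as np

def nonsingular_lda(X, y, n_components, tol=1e-10):
    """Sketch: project onto the nonzero-eigenvalue subspace, then run LDA there.

    Assumption (not from the source): the eigenvectors come from the total
    scatter matrix, obtained through an SVD of the centered data.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    Xc = X - mean

    # Stage 1: nonsingular transformation. The right singular vectors of the
    # centered data are the eigenvectors of the total scatter Xc.T @ Xc, and
    # the squared singular values are its eigenvalues.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > tol * s.max()          # keep nonzero eigenvalues only
    P = Vt[keep].T / s[keep]          # also rescales so total scatter is I
    Z = Xc @ P

    # Stage 2: discriminant transformation in the reduced space. Because the
    # total scatter is the identity there, maximizing the Fisher ratio reduces
    # to a symmetric eigenproblem on the between-class scatter.
    Sb = np.zeros((Z.shape[1], Z.shape[1]))
    for label in np.unique(y):
        Zc = Z[y == label]
        m = Zc.mean(axis=0)           # the global mean of Z is zero
        Sb += len(Zc) * np.outer(m, m)
    _, evecs = np.linalg.eigh(Sb)     # eigenvalues in ascending order
    V = evecs[:, ::-1][:, :n_components]

    return P @ V, mean                # overall projection and data mean
```

    With fewer samples than dimensions, for example `W, mean = nonsingular_lda(X, y, 1)` on 6 samples of dimension 20, the projection `(X - mean) @ W` is computed without inverting any singular matrix.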
    For discriminative model training, we present approaches to face recognition based on hidden Markov models (HMMs). We propose a hybrid framework of feature extraction and HMM training for two-dimensional pattern recognition. Importantly, we explore a new discriminative training criterion to ensure model compactness and discriminability. This criterion is derived from hypothesis test theory by maximizing the confidence of accepting the hypothesis that observations come from the target HMM states rather than from competing HMM states. We accordingly develop maximum confidence hidden Markov modeling (MC-HMM) for face recognition. Under this framework, we incorporate a transformation matrix to extract discriminative facial features. Closed-form solutions for the continuous-density HMM parameters are formulated. Notably, the MC-HMM and feature transformation parameters are estimated under the same criterion, and the estimation converges through the expectation-maximization procedure. In experiments on the ORL and FERET facial databases, the proposed method obtains robust segmentation in the presence of different facial expressions, orientations, etc. In comparison with maximum likelihood and minimum classification error HMMs, the proposed MC-HMM achieves higher recognition accuracies with lower feature dimensions.
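    The hypothesis-test view in this paragraph can be illustrated with a log-likelihood ratio between a target state and its competing states. The sketch below is a generic likelihood-ratio confidence score with diagonal-covariance Gaussian states, offered only as an assumption about the general shape of such a measure; the abstract does not give the MC-HMM objective itself, and the function names are hypothetical.

```python
import numpy as np

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def log_likelihood_ratio(x, target, competitors):
    """Confidence that x came from the target state rather than competitors.

    target is a (mean, var) pair; competitors is a list of such pairs.
    Averaging the competing likelihoods is an illustrative choice.
    """
    log_target = log_gaussian_diag(x, *target)
    log_comp = np.array([log_gaussian_diag(x, m, v) for m, v in competitors])
    # Numerically stable log of the average competing likelihood.
    peak = log_comp.max()
    log_avg = peak + np.log(np.mean(np.exp(log_comp - peak)))
    return log_target - log_avg
```

    A positive score favors the target state and a negative score favors the competing states; a discriminative criterion of this kind would adjust model parameters to push the score up on correctly labeled observations.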
    Additionally, we present a novel discriminative model training approach based on conditional random fields (CRFs). CRFs are known as a discriminative and contextual model for many pattern recognition applications. However, the model complexity and computational cost depend heavily on the topology that characterizes the dependences between variables. We propose a new graphical model that establishes CRFs with cycles for the representation of human motions. The dependent variables are integrated into a clique for clique tree construction. The cycles of sequential variables are handled by the junction tree algorithm. A tree-based inference algorithm is developed to calculate the joint probability of sequential variables in the clique tree. Furthermore, factorized variational inference (FVI) and structured variational inference (SVI) are presented for direct approximation of the conditional probability, rather than calculation of the marginal probability, in CRFs. In the implementation, continuous-valued feature functions are specified and adopted to build the junction tree CRF (JT-CRF) algorithm. In experiments on the IDIAP and CMU human motion databases, the JT-CRF combined with the FVI and SVI schemes achieves the highest classification accuracies in comparison with the HMM, the maximum entropy Markov model and the baseline CRF over different contextual ranges. The proposed machine learning approaches are general: they are useful not only for face and human motion recognition but also for many other pattern recognition systems.
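    For orientation, the conditional probability that a CRF assigns to a label sequence can be computed exactly when the graph is a simple chain. The sketch below covers only that baseline linear-chain case, not the JT-CRF or the variational schemes described above. It assumes the feature functions have already been combined into a table of per-position scores `unary[t, s]` and a table of transition scores `pairwise[a, b]`; those names and the function names are hypothetical.

```python
import numpy as np

def log_sum_exp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    peak = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(peak, axis=axis) + np.log(np.sum(np.exp(a - peak), axis=axis))

def sequence_score(unary, pairwise, labels):
    """Unnormalized log score of one label sequence."""
    score = sum(unary[t, s] for t, s in enumerate(labels))
    score += sum(pairwise[a, b] for a, b in zip(labels[:-1], labels[1:]))
    return score

def log_partition(unary, pairwise):
    """Log normalizer over all label sequences, by the forward recursion."""
    alpha = unary[0]
    for t in range(1, len(unary)):
        # alpha[b] = unary[t, b] + log sum_a exp(alpha[a] + pairwise[a, b])
        alpha = unary[t] + log_sum_exp(alpha[:, None] + pairwise, axis=0)
    return log_sum_exp(alpha, axis=0)

def conditional_log_prob(unary, pairwise, labels):
    """log p(labels | observations) for a linear-chain CRF."""
    return sequence_score(unary, pairwise, labels) - log_partition(unary, pairwise)
```

    The forward recursion costs time proportional to the sequence length times the square of the number of labels. Graphs with cycles do not admit this simple recursion, which is the situation that motivates junction tree construction and variational approximations.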

    Chinese Abstract
    Abstract
    Acknowledgements
    LIST OF CONTENTS
    LIST OF FIGURES
    Chapter 1 Introduction
        1.1 MOTIVATIONS
        1.2 OUTLINE OF THIS DISSERTATION
    Chapter 2 Nonsingular Discriminant Feature Extraction for Face Recognition
        2.1 LINEAR DISCRIMINANT ANALYSIS
        2.2 NONSINGULAR DISCRIMINANT ANALYSIS
            2.2.1 Nonsingular Transformation
            2.2.2 Discriminant Transformation
        2.3 EXPERIMENTS
            2.3.1 Experimental Setup
            2.3.2 Evaluation of Different Methods on ORL Database
            2.3.3 Evaluation of Baseline LDA and NDT on FERET Database
        2.4 SUMMARY
    Chapter 3 Maximum Confidence Hidden Markov Modeling for Face Recognition
        3.1 DISCRIMINATIVE TRAINING FOR 2D PATTERN CLASSIFICATION
            3.1.1 Minimum Classification Error Training
            3.1.2 Discriminative Feature Extraction
            3.1.3 Hidden Markov Model for Object Recognition
        3.2 MAXIMUM CONFIDENCE HIDDEN MARKOV MODEL
            3.2.1 Maximum Confidence Criterion
            3.2.2 Hybrid Feature Extraction and HMM Modeling
            3.2.3 MC-HMM Viterbi Algorithm
            3.2.4 Convergence Property and Implementation Procedure
            3.2.5 Classification Rule and Relation to Other Methods
        3.3 EXPERIMENTS
            3.3.1 Experimental Setup and Implementation
            3.3.2 Evaluation of Convergence Property and Facial Segmentation
            3.3.3 Classification Performance Using FERET Database
            3.3.4 Classification Performance Using ORL Database
        3.4 SUMMARY
    Chapter 4 Modeling and Inference in Conditional Random Fields
        4.1 SURVEY OF RELATED MODELS
            4.1.1 Maximum Entropy Markov Models
            4.1.2 Conditional Random Fields
        4.2 GRAPHICAL MODELS AND VARIATIONAL INFERENCE
            4.2.1 Graphical Models in CRFs
            4.2.2 Parameter Estimation in CRFs
            4.2.3 Variational Inference in CRFs
        4.3 EXPERIMENTS
            4.3.1 Experimental Setup
            4.3.2 Implementation Issues
            4.3.3 Evaluation of Objective Function
            4.3.4 Evaluation of Classification Accuracy
            4.3.5 Different Evaluations on CMU Database
        4.4 SUMMARY
    Chapter 5 Conclusions and Future Works
        5.1 CONCLUSIONS
        5.2 FUTURE WORKS
    Bibliography
    Appendix: EXPERIMENTAL SAMPLES
        1. FERET Facial Database
        2. ORL Facial Database
        3. CMU Graphics Lab Motion Capture Database
        4. IDIAP TwoHandManip Database


    Full-text availability: on campus, open immediately; off campus, open immediately.