
Graduate Student: Wu, Meng-Sung (吳孟淞)
Thesis Title: Bayesian Discriminative Models for Information Retrieval (貝氏鑑別性資訊檢索模型)
Advisor: Chien, Jen-Tzung (簡仁宗)
Degree: Doctoral
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2009
Graduation Academic Year: 97 (ROC calendar; 2008-2009)
Language: English
Number of Pages: 120
Keywords (Chinese, translated): Bayesian learning, information retrieval, discriminative language model, probabilistic latent semantic analysis, topic model, clustering, minimum rank error
Keywords (English): discriminative, clustering, Bayesian learning, information retrieval, minimum rank error, topic model, probabilistic latent semantic analysis, language model, latent Dirichlet allocation
    Statistical language modeling is an important mechanism for document generation and is widely used in speech recognition and other related information systems. Over the past few years, statistical language modeling theory has been applied extensively within information retrieval frameworks. As data collections grow rapidly, providing robust, interpretable and effective document models to assist users in retrieval has become a central goal of information retrieval systems. For traditional information retrieval techniques, discovering useful information in huge data repositories is a major challenge. In recent years, an increasing number of machine-learning-based methods have been widely applied to language modeling and information retrieval tasks. This study aims to provide a complete and thorough investigation of the problems that arise in modeling natural language and retrieving information.
    This dissertation proposes a series of Bayesian and discriminative information retrieval models whose main purpose is to effectively improve the performance of document modeling and retrieval. The theoretical foundations of this work include language modeling, information retrieval, machine learning and probabilistic models. The research objectives are: (1) to develop a high-performance language model based on latent semantic analysis, addressing the problems of insufficient training data and long-distance information, and to evaluate combinations of different language models and smoothing methods; (2) to build a discriminative language model that minimizes the Bayes risk or the expected rank loss in information retrieval systems, strengthening the discriminative power of retrieval models and thereby improving retrieval performance; (3) to develop a Bayesian learning model that incrementally extracts up-to-date latent semantic information, improving document model performance in natural language systems and yielding adapted document models; (4) to develop a hierarchical model based on document features that constructs document clustering information in an unsupervised manner, improving system performance; and (5) to implement the proposed methods and examine their feasibility.
    The experimental evaluation focuses on document modeling, document retrieval performance and document classification accuracy. Across various subjective and objective evaluations, the results show that the proposed methods achieve significant performance gains. In language modeling, the latent semantic language model reduces model perplexity at moderate computational cost, and parameter smoothing yields perplexity improvements superior to those of parameter modeling. The proposed Bayesian latent topic clustering model is also evaluated on document modeling and classification, where it outperforms previous topic models. For information retrieval, the proposed discriminative language model is trained on the discriminative information of individual relevant documents relative to their irrelevant documents. In document retrieval experiments, the proposed minimum rank error model significantly improves retrieval performance on new test queries. In addition, the proposed Bayesian probabilistic latent semantic analysis (Bayesian PLSA) demonstrates incremental learning of continuously changing domain knowledge and effectively updates out-of-date words and documents. Compared to the maximum likelihood estimation of the original PLSA model, the proposed quasi-Bayes (QB) estimation algorithm is better suited to adding documents dynamically and building incremental indexes. The experimental results and discussion provide an important reference for researchers in document retrieval and classification, and offer an indicative, systematic conceptual design framework for integrating and developing Internet technologies. Future work will focus on large-scale World Wide Web data and on applying the Bayesian discriminative information retrieval methods proposed here to improve system development and design.
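    For reference, the perplexity repeatedly cited in both abstracts is the standard held-out measure of language model quality; the definition below is the textbook one rather than a formulation specific to this dissertation:

```latex
\mathrm{PP}(W) \;=\; P(w_1, \ldots, w_N)^{-1/N}
\;=\; \exp\!\left( -\frac{1}{N} \sum_{n=1}^{N} \log P(w_n \mid h_n) \right)
```

    Here W = w_1, ..., w_N is a held-out word sequence and h_n is the history the model conditions on (the preceding n-1 words for an n-gram model, augmented with latent semantic information in the proposed models). Lower perplexity means the model assigns higher probability to unseen text.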

    Statistical language modeling is a probabilistic mechanism for generating text documents. This mechanism has been popular in speech recognition and many other applications, including optical character recognition and information retrieval. However, the traditional n-gram language model suffers from several weaknesses. Finding useful information in huge text databases has been a major challenge for the information retrieval community, and an increasing number of studies have focused on machine learning approaches to information retrieval that differ from traditional retrieval methods. This thesis deals with these issues and presents solutions for developing an adaptive and discriminative information retrieval system.
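    To make the language-modeling approach to retrieval concrete, the sketch below ranks documents by query likelihood under a smoothed unigram document model, the baseline formulation surveyed in Chapter 2 (see Ponte and Croft, 1998, and Zhai and Lafferty, 2004, in the bibliography). It is a minimal illustration under these assumptions, not the implementation used in this dissertation; the Jelinek-Mercer interpolation weight and all names are illustrative.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, lam=0.5):
    """Log P(query | doc) under a unigram document model with
    Jelinek-Mercer smoothing against the whole-collection model."""
    doc_counts, coll_counts = Counter(doc), Counter(collection)
    doc_len, coll_len = len(doc), len(collection)
    score = 0.0
    for w in query:
        p_doc = doc_counts[w] / doc_len if doc_len else 0.0
        p_coll = coll_counts[w] / coll_len if coll_len else 0.0
        p = lam * p_doc + (1.0 - lam) * p_coll
        score += math.log(p) if p > 0.0 else math.log(1e-12)  # floor unseen terms
    return score

# Toy usage: rank two tokenized documents for a tokenized query.
docs = [["bayesian", "learning", "for", "document", "retrieval"],
        ["gradient", "descent", "for", "ranking"]]
collection = [w for d in docs for w in d]
query = ["bayesian", "retrieval"]
ranking = sorted(range(len(docs)), reverse=True,
                 key=lambda i: query_log_likelihood(query, docs[i], collection))
print(ranking)  # expect [0, 1]: the first document matches the query better
```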
    In this thesis, we present several Bayesian discriminative approaches to building probabilistic information retrieval models. The purpose of this dissertation is to improve the effectiveness and discriminative capability of language models for information retrieval. The machine learning theories of Bayesian learning and discriminative training provide the underlying principles. More specifically, this study focuses on: (1) developing a new language model based on latent semantic analysis to overcome the insufficiency of training data and to capture the long-distance dependencies missed by n-gram models, and comparing different language models and smoothing methods; (2) establishing a novel discriminative language model for ad-hoc information retrieval by minimizing the Bayes risk or the expected rank loss; (3) developing Bayesian learning of document models that incrementally extracts up-to-date latent semantic information to match changing domains at run time in natural language systems; (4) developing a new method of extracting document feature terms and building a hierarchical structure by incorporating topic taxonomy information; and (5) implementing and evaluating the proposed systems.
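    The expected rank loss in objective (2) admits a simple smoothed form. The version below is one common formulation assumed here for illustration; the dissertation's exact objective (Chapter 4) may differ in weighting and normalization:

```latex
\mathcal{L}(q) \;=\; \sum_{d \in \mathcal{R}_q} \sum_{d' \in \bar{\mathcal{R}}_q}
\mathbb{1}\!\left[ s(q, d') > s(q, d) \right]
\;\approx\; \sum_{d \in \mathcal{R}_q} \sum_{d' \in \bar{\mathcal{R}}_q}
\frac{1}{1 + e^{-\gamma \left( s(q, d') - s(q, d) \right)}}
```

    Here R_q and its complement are the relevant and irrelevant document sets for query q, s(q, d) is the retrieval score, and gamma controls the sharpness of the sigmoid. Replacing the indicator function with the sigmoid makes the rank error differentiable, so the model parameters can be trained by gradient descent.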
    To assess the proposed methods, several objective measurements were performed to investigate the document models in information retrieval systems. Experimental results showed that the proposed approaches yielded improvements across different tasks. In language modeling, the proposed latent semantic language model significantly reduced perplexity at moderate computational cost, and the perplexity improvement from parameter smoothing was significant compared to that from parameter modeling. Moreover, the proposed Bayesian latent topic clustering model, built by modeling the hierarchy of words and documents, achieved desirable performance in model perplexity and document categorization. In document retrieval, a discriminative language model was proposed that integrates the discriminative information of individual relevant documents relative to their corresponding irrelevant documents. Experimental results indicated that the proposed minimum rank error model improved retrieval performance for new test queries. In addition, the Bayesian probabilistic latent semantic analysis provided an incremental learning mechanism in which new domain knowledge was continuously incorporated while out-of-date words and documents gradually faded away. Compared to standard probabilistic latent semantic analysis, the proposed quasi-Bayes parameter estimation is capable of dynamic document indexing and modeling, and its performance was better than that of the other adaptation methods. The methods proposed in this thesis should be helpful for researchers working on related topics.
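    As a rough illustration of how Dirichlet priors enter PLSA training, the sketch below performs one EM iteration in which symmetric Dirichlet pseudo-counts are folded into the M-step, i.e. a MAP-style update (see Hofmann, 2001, and Chien and Wu, 2008, in the bibliography). The incremental quasi-Bayes procedure additionally carries accumulated statistics forward as hyperparameters across data batches; that bookkeeping is omitted here. The symmetric prior, array shapes and names are assumptions for illustration only.

```python
import numpy as np

def plsa_map_em_step(counts, p_w_given_z, p_z_given_d, alpha=1.1):
    """One EM iteration of PLSA with symmetric Dirichlet pseudo-counts
    (alpha - 1) added in the M-step, i.e. a MAP-style update.
    counts: (n_docs, n_words) term-frequency matrix."""
    # E-step: responsibilities P(z | d, w), shape (n_docs, n_words, n_topics).
    joint = p_z_given_d[:, None, :] * p_w_given_z.T[None, :, :]
    joint /= joint.sum(axis=2, keepdims=True) + 1e-12

    # Expected counts n(d, w) * P(z | d, w).
    expected = counts[:, :, None] * joint

    # M-step with Dirichlet prior: add (alpha - 1) pseudo-counts, then normalize.
    new_p_w_given_z = expected.sum(axis=0).T + (alpha - 1.0)   # (topics, words)
    new_p_w_given_z /= new_p_w_given_z.sum(axis=1, keepdims=True)
    new_p_z_given_d = expected.sum(axis=1) + (alpha - 1.0)     # (docs, topics)
    new_p_z_given_d /= new_p_z_given_d.sum(axis=1, keepdims=True)
    return new_p_w_given_z, new_p_z_given_d

# Toy usage: 3 documents, 4 words, 2 topics.
rng = np.random.default_rng(0)
counts = rng.integers(0, 5, size=(3, 4)).astype(float)
p_w_given_z = rng.dirichlet(np.ones(4), size=2)  # (topics, words)
p_z_given_d = rng.dirichlet(np.ones(2), size=3)  # (docs, topics)
for _ in range(20):
    p_w_given_z, p_z_given_d = plsa_map_em_step(counts, p_w_given_z, p_z_given_d)
```

    With alpha = 1.0 the pseudo-counts vanish and the step reduces to the standard maximum likelihood EM update of PLSA.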

    Chinese Abstract I
    ABSTRACT III
    Acknowledgements V
    TABLE OF CONTENTS VI
    LIST OF TABLES IX
    LIST OF FIGURES XI
    Chapter 1 Introduction 1
      1.1 Motivations 6
      1.2 Outline of This Dissertation 6
      1.3 Contributions of This Dissertation 7
    Chapter 2 Survey of Information Retrieval Models 10
      2.1 Information Retrieval 10
        2.1.1 Vector Space Model 11
        2.1.2 Latent Semantic Analysis 12
      2.2 Language Model 14
      2.3 Probabilistic Topic Model 16
        2.3.1 Probabilistic Latent Semantic Analysis 16
        2.3.2 Latent Dirichlet Allocation 19
      2.4 Databases and Performance Measures 21
      2.5 Summary 26
    Chapter 3 Latent Semantic Language Model 27
      3.1 N-gram Language Model 28
      3.2 Latent Semantic Language Modeling and Smoothing 29
        3.2.1 Parameter Modeling 29
        3.2.2 Parameter Smoothing 33
      3.3 Experiments on Latent Semantic Language Model 36
        3.3.1 Experimental Setup 36
        3.3.2 Evaluation of Model Perplexity and Computation Time 39
      3.4 Summary 42
    Chapter 4 Discriminative Retrieval Model 43
      4.1 Language Models for Information Retrieval 44
        4.1.1 Maximum Likelihood Model 45
        4.1.2 Minimum Classification Error Model 46
      4.2 Information Retrieval Measures 48
      4.3 Minimum Rank Error Model 51
        4.3.1 Average Precision versus Rank Error 51
        4.3.2 Minimum Rank Error Model 52
      4.4 Implementation and Interpretation 55
      4.5 Experiments on Discriminative Retrieval Model 57
        4.5.1 Experimental Setup 57
        4.5.2 Retrieval Performance for Different Methods 59
        4.5.3 Case Study for MCE and MRE Methods 64
      4.6 Summary 67
    Chapter 5 Bayesian Latent Semantic Analysis 68
      5.1 Bayesian Learning for Latent Semantic Analysis 68
        5.1.1 PLSA Adaptation 69
        5.1.2 MAP Estimation for Corrective Training 70
        5.1.3 QB Estimation for Incremental Learning 73
      5.2 Implementation Issues 75
      5.3 Experiments on the Bayesian Latent Semantic Analysis 77
        5.3.1 Experimental Setup 77
        5.3.2 Computation Complexity, EM Convergence and Model Perplexity 80
        5.3.3 Evaluation of Information Retrieval and Document Categorization 83
      5.4 Summary 92
    Chapter 6 Bayesian Document Clustering Model 93
      6.1 Bayesian Latent Topic Clustering Model 94
        6.1.1 New Topic Model 94
        6.1.2 Approximate Inference by Gibbs Sampling 96
      6.2 Comparison of Different Topic Models 98
      6.3 Experiments on Bayesian Document Clustering Model 100
        6.3.1 Evaluation of Model Perplexity 100
        6.3.2 Evaluation of Document Categorization 101
      6.4 Summary 102
    Chapter 7 Conclusions and Future Works 103
    Bibliography 106
    Author's Biographical Notes 117

    Akita, Y. and Kawahara, T., “Language modeling adaptation based on PLSA of topics and speakers”, Proceedings of International Conference on Spoken Language Processing, pp. 1045-1048, 2004.
    Bahl, L., Brown, P., de Souza, P. and Mercer, R., “Maximum mutual information estimation of hidden Markov model parameters for speech recognition”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 11, pp. 49-52, 1986.
    Baker, L. D. and McCallum, A. K. “Distributional clustering of words for text classification”, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 96-103, 1998.
    Bellegarda, J. R., “A statistical language modeling approach integrating local and global constraints”, Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 262-269, 1997.
    Bellegarda, J. R., “A multi-span language modeling framework for large vocabulary speech recognition”, IEEE Transactions on Speech and Audio Processing, vol. 6, no. 5, pp. 456-467, 1998.
    Bellegarda, J. R., “Exploiting latent semantic information in statistical language modeling”, Proceedings of the IEEE, vol. 88, no. 8, pp. 1279-1296, 2000a.
    Bellegarda, J. R., “Large vocabulary speech recognition with multi-span statistical language models”, IEEE Transactions on Speech and Audio Processing, vol. 8, no. 1, pp. 76-84, 2000b.
    Bellegarda, J. R., “Fast update of latent semantic spaces using a linear transform framework”, Proceedings of International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 769-772, 2002.
    Berry, M. W., Dumais, S. T. and O’Brien, G. W., “Using linear algebra for intelligent information retrieval”, Society for Industrial and Applied Mathematics (SIAM): Review, vol. 37, no. 4, pp. 573-595, 1995.
    Berry, M. W., “Large scale singular value computations”, International Journal of Supercomputer Applications, vol. 6, pp. 13-49, 1992.
    Bishop, C. M., Pattern Recognition and Machine Learning, Springer Science, 2006.
    Blei, D. M., Ng, A. Y. and Jordan, M. I., “Latent Dirichlet allocation”, Journal of Machine Learning Research, vol. 3, no. 5, pp. 993-1022, 2003.
    Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N. and Hullender, G., “Learning to rank using gradient descent”, Proceedings of the 22nd International Conference on Machine Learning, pp. 89-96, 2005.
    Brochu, E., Freitas, N. and Bao, K., “The sound of an album cover: probabilistic multimedia and IR”, Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics, 2003.
    Brown, P. F., Della Pietra, V. J., deSouza, P. V., Lai, J. C. and Mercer, R. L., “Class-based n-gram models of natural language”, Computational Linguistics, vol. 18, no. 4, pp. 467-477, 1992.
    Cai, L. and Hofmann, T., “Text categorization by boosting automatically extracted concepts”, Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 182-189, 2003.
    Cao, Y., Xu, J., Liu, T.-Y., Li, H., Huang, Y. and Hon, H.-W., “Adapting ranking SVM to document retrieval”, Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 186-193, 2006.
    Chen, S. and Goodman, J., “An empirical study of smoothing techniques for language modeling”, Computer Speech and Language, vol. 13, no. 4, pp. 359-394, 1999.
    Chien, J.-T., “Online hierarchical transformation of hidden Markov models for speech recognition”, IEEE Transactions on Speech and Audio Processing, vol. 7, no. 6, pp. 656-667, 1999.
    Chien, J.-T., “Association pattern language modeling”, IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1719-1728, 2006.
    Chien, J.-T., Huang, C.-H., Shinoda, K. and Furui, S., “Towards optimal Bayes decision for speech recognition”, IEEE Proceedings of International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 45-48, 2006.
    Chien, J.-T. and Wu, M.-S., “Adaptive Bayesian latent semantic analysis”, IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 1, pp. 198-207, 2008.
    Chien, J.-T. and Wu, M.-S., “Minimum rank error language modeling”, accepted for publication in IEEE Transactions on Audio, Speech and Language Processing, 2009.
    Chien, J.-T., Wu, M.-S. and Peng, H.-J., “On latent semantic language modeling and smoothing”, Proceedings of International Conference on Spoken Language Processing, vol. 2, pp. 1373-1376, Jeju Island-Korea, 2004.
    Chien, J.-T., Wu, M.-S. and Peng, H.-J., “Latent semantic language modeling and smoothing”, International Journal of Computational Linguistics and Chinese Language Processing, vol. 9, no. 2, pp. 29-44, 2004.
    Chueh, C.-H., Wu, M.-S. and Chien, J.-T., “Some advances in language modeling”, Advances in Chinese Spoken Language Processing (edited by C.-H. Lee, H. Li, L. Lee, R.-H. Wang and Q. Huo), pp. 201-226 (Chapter 9), World Scientific Publishing Co., 2007.
    Cortes, C. and Mohri, M., “AUC optimization vs. error rate minimization”, Advances in Neural Information Processing Systems, vol. 15, pp. 313-320, 2003.
    Daumé III, H. and Marcu, D., “Domain adaptation for statistical classifiers”, Journal of Artificial Intelligence Research, vol. 26, pp. 101–126, 2006.
    Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W. and Harshman, R., “Indexing by latent semantic analysis”, Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391-407, 1990.
    DeGroot, M. H., Optimal Statistical Decisions, McGraw-Hill, 1970.
    Dempster, A. P., Laird, N. M., and Rubin, D. B., “Maximum likelihood from incomplete data via the EM algorithm”, Journal of the Royal Statistical Society (B), vol. 39, no. 1, pp. 1-38, 1977.
    Dietz, L., Bickel, S. and Scheffer, T., “Unsupervised prediction of citation influences”, Proceedings of the 24th International Conference on Machine Learning, pp. 233–240, 2007.
    Ding, C. H. Q., “A similarity-based probability model for latent semantic indexing”, Proceedings of 22nd Annual International ACM SIGIR Conference, pp. 58-65, 1999.
    Federico, M., “Bayesian estimation methods for n-gram language model adaptation”, Proceedings of the International Conference on Spoken Language Processing, vol. 1, pp. 240-243, 1996.
    Freitas, N. and Barnard, K., “Bayesian latent semantic analysis of multimedia databases”, Technical Report, University of British Columbia, TR-2001-15, 2001.
    Freund, Y., Iyer, R., Schapire, R.E. and Singer, Y., “An efficient boosting algorithm for combining preferences”, Journal of Machine Learning Research, vol. 4, no. 6, pp. 933-969, 2003.
    Gao, J., Qi, H., Xia, X. and Nie, J.-Y., “Linear discriminant model for information retrieval”, Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 290-297, 2005.
    Gildea, D. and Hofmann, T., “Topic based language models using EM”, Proceedings of European Conference on Speech Communication and Technology, pp. 2167-2170, 1999.
    Gill, A.J., Gergle, D., French, R.M. and Oberlander, J., “Emotion rating from short blog texts”, Proceedings of ACM Conference on Human Factors in Computing Systems, pp. 1121–1124, 2008.
    Girolami, M. and Kaban, A., “On an equivalence between PLSI and LDA”, Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 433-434, 2003.
    Goel, V., Byrne, W. and Khudanpur, S., “LVCSR rescoring with modified loss function: a decision theoretic perspective”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 425-428, 1998.
    Golub, G. and Van Loan, C., Matrix Computations, 2nd ed. Baltimore, MD: Johns Hopkins, 1989.
    Griffiths, T. L. and Steyvers, M., “Finding scientific topics”, Proceedings of the National Academy of Sciences, vol. 101, pp. 5228-5235, 2004.
    Hancock, J.T., Landrigan, C. and Silver, C., “Expressing emotion in text-based communication”, Proceedings of ACM Conference on Human Factors in Computing Systems, pp. 929-932, 2007.
    Hofmann, T., “Probabilistic latent semantic indexing”, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 50-57, 1999a.
    Hofmann, T., “The cluster-abstraction model: Unsupervised learning of topic hierarchies from text data”, Proceedings of the International Joint Conference in Artificial Intelligence, pp. 682-687, 1999b.
    Hofmann, T., “Unsupervised learning by probabilistic latent semantic analysis”, Machine Learning, vol. 42, no. 1-2, pp. 177–196, 2001.
    Hofmann, T., “Latent semantic models for collaborative filtering”, ACM Transaction on Information Systems, vol. 22, no. 1, pp. 89-115, 2004.
    Hull, D., “Using statistical testing in the evaluation of retrieval experiments”, Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 329-338, 1993.
    Huo, Q. and Lee, C.-H., “On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate”, IEEE Transactions on Speech and Audio Processing, vol. 5, pp. 161-172, 1997.
    Iyer, R. M. and Ostendorf, M., “Modeling long distance dependence in language: Topic mixtures versus dynamic cache models,” IEEE Transactions on Speech and Audio Processing, vol. 7, no. 1, pp. 30-39, 1999.
    Jelinek, F., “Self-organized language modeling for speech recognition”, Readings in Speech Recognition, Morgan-Kaufmann Publishers, pp. 450-506, 1990.
    Jelinek, F., “Up from trigrams”, Proceedings of European Conference on Speech Communication and Technology, vol. 3, pp. 1037-1040, 1991.
    Jelinek, F. and Mercer, R., “Interpolated estimation of Markov source parameters from sparse data”, Pattern Recognition in Practice, pp. 381-397, 1980.
    Joachims, T., “Optimizing search engines using clickthrough data”, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining, pp. 133-142, 2002.
    Juang, B.-H., Chou, W. and Lee, C.-H., “Minimum classification error rate methods for speech recognition”, IEEE Transactions on Speech and Audio Processing, vol. 5, no. 3, pp. 257-265, 1997.
    Juang, B.-H. and Katagiri, S., “Discriminative learning for minimum error classification”, IEEE Transactions on Signal Processing, vol. 40, no. 12, pp. 3043-3054, 1992.
    Katz, S. M., “Estimation of probabilities from sparse data for the language model component of a speech recognizer”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 35, no. 3, pp. 400-401, 1987.
    Kawabata, T. and Tamoto, M., “Back-off method for n-gram smoothing based on binomial posteriori distribution”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 192-195, 1996.
    Kolda, T. G. and O’Leary, D. P., “A semidiscrete matrix decomposition for latent semantic indexing in information retrieval”, ACM Transactions on Information Systems, vol. 16, no. 4, pp. 322-346, 1998.
    Kuo, H.-K. J., Fosler-Lussier, E., Jiang, H. and Lee, C.-H., “Discriminative training of language models for speech recognition”, Proceedings of International Conference on Acoustics, Speech, and Signal Processing, pp. 325-328, 2002.
    Lau, R., Rosenfeld, R. and Roukos, S., “Trigger-based language models: A maximum entropy approach”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 45-48, 1993.
    Lavrenko, V. and Croft, W. B., “Relevance-based language models”, Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 120-127, 2001.
    Li, W. and McCallum, A., “Pachinko allocation: DAG-structured mixture models of topic correlations”, Proceedings of the 23rd International Conference on Machine Learning, pp. 577–584, 2006.
    Masataki, H., Sagisaka, Y., Hisaki, K. and Kawahara, T., “Task adaptation using MAP estimation in n-gram language modeling”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 783-786, 1997.
    Miller, D. R. H., Leek, T. and Schwartz, R. M., “A hidden Markov model information retrieval system”, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 214-221, 1999.
    Minka, T. and Lafferty, J., “Expectation-propagation for the generative aspect model”, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, pp. 352-359, 2002.
    Nallapati, R., Ahmed. A., Xing, E. and Cohen, W., “Joint Latent Topic models for text and citations”, Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 542-550, 2008.
    Nigam, K., McCallum, A., Thrun, S. and Mitchell, T., “Text classification from labeled and unlabeled documents using EM”, Machine Learning, vol. 39, no. 2-3, pp. 103-134, 2000.
    Novak, M. and Mammone, R., “Use of non-negative matrix factorization for language model adaptation in a lecture transcription task”, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 541-544, 2001.
    Pereira, F., Tishby, N. and Lee, L. “Distributional clustering of English words”, Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pp. 183-190, 1993.
    Ponte, J. M. and Croft, W. B., “A language modeling approach to information retrieval”, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 275-281, 1998.
    Povey, D. and Woodland, P. C., “Minimum phone error and I-smoothing for improved discriminative training”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 105-108, 2002.
    Rabiner, L. R., “A tutorial on hidden Markov models and selected applications in speech recognition”, Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
    Rahim, M. G., Lee, C.-H. and Juang, B.-H., “Discriminative utterance verification for connected digit recognition”, IEEE Transactions on Speech and Audio Processing, vol. 5, no. 3, pp. 266-277, 1997.
    Baeza-Yates, R. and Ribeiro-Neto, B., Modern Information Retrieval, Addison-Wesley, 1999.
    Salton, G. and Buckley, C., “Term-weighting approaches in automatic text retrieval”, Information Processing and Management, vol. 24, no. 5, pp. 513-523, 1988.
    Salton, G. and McGill, M. J., Introduction to Modern Information Retrieval, McGraw-Hill, New York, 1983.
    Salton, G., Wong, A. and Yang, C. S., “A vector space model for automatic indexing”, Communications of the ACM, vol. 18, no. 11, pp. 613-620, 1975.
    Savoy, J., “Statistical inference in retrieval effectiveness evaluation”, Information Processing and Management, vol. 33, no. 4, pp. 495-512, 1997.
    Slonim, N. and Tishby, N., “The power of word clusters for text classification”, Proceedings of the 23rd European Colloquium on Information Retrieval Research, 2001.
    Smith, A. F. M. and Makov, U. E., “A quasi-Bayes sequential procedure for mixtures”, Journal of the Royal Statistical Society (B), vol. 40, no. 1, pp. 106-112, 1978.
    Smucker, M. A., Allan, J. and Carterette, B., “A comparison of statistical significance tests for information retrieval evaluation”, Proceedings of the 16th ACM Conference on Information and Knowledge Management, pp. 623-632, 2007.
    Tam, Y.-C. and Schultz, T., “Dynamic language model adaptation using variational Bayes inference”, Proceedings of European Conference on Speech Communication and Technology, pp. 5-8, 2005.
    Tishby, N., Pereira, F. C. and Bialek, W., “The information bottleneck method”, Proceedings of the 37th Annual Allerton Conference on Communication, Control, and Computing, pp. 368-377, 1999.
    Voorhees, E. M. and Harman, D. K., TREC: Experiment and Evaluation in Information Retrieval, Cambridge, Massachusetts: MIT Press, 2005.
    Witter, D. I. and Berry, M. W., “Downdating the latent semantic indexing model for conceptual information retrieval”, The Computer Journal, vol. 41, no. 8, pp. 589-601, 1998.
    Wu, M.-S. and Chien, J.-T., “Minimum rank error training for language modeling”, Proceedings of the European Conference on Speech Communication and Technology, pp. 614-617, Antwerp-Belgium, 2007.
    Wu, M.-S. and Chien, J.-T., “Bayesian latent topic clustering model”, Proceedings of International Conference on Spoken Language Processing (INTERSPEECH), pp. 2162-2165, Brisbane-Australia, 2008.
    Witten, I. H. and Bell, T. C., “The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression”, IEEE Transactions on Information Theory, vol. 37, no. 4, pp. 1085-1094, 1991.
    Xu, J. and Croft, W. B., “Cluster-based language models for distributed retrieval”, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 254-261, 1999.
    Yang, Y. and Liu, X., “A re-examination of text categorization methods”, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 42-49, 1999.
    Yue, Y., Finley, T., Radlinski, F. and Joachims, T., “A support vector method for optimizing average precision”, Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 271-278, 2007.
    Zha, H. and Simon, H. D., “On updating problems in latent semantic indexing”, SIAM Journal on Scientific Computing, vol. 21, no. 2, pp. 782-791, 1999.
    Zhai, C. and Lafferty, J., “A study of smoothing methods for language models applied to information retrieval”, ACM Transaction on Information Systems, vol. 22, no. 2, pp. 179-214, 2004.
    Zhou, G. D. and Lua, K. T., “Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition”, Computer Speech and Language, vol. 13, no. 2, pp. 125-141, 1999.
