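| Graduate student: | 柯映竹 Ko, Ying-Chu |
|---|---|
| Thesis title: | 變數間正負相關對簡易貝氏分類器學習正確率之影響 (The Effect of Positive and Negative Correlation among Variables on the Learning Accuracy of the Naive Bayesian Classifier) |
| Advisor: | 翁慈宗 Wong, Tzu-Tsung |
| Degree: | Master |
| Department: | College of Management - Department of Industrial Management Science |
| Year of publication: | 2003 |
| Academic year of graduation: | 91 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 52 |
| Keywords (Chinese): | 簡易貝氏分類器, 狄氏分配, 廣義狄氏分配 |
| Keywords (English): | naive Bayesian classifier, generalized Dirichlet distribution, Dirichlet distribution |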
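The naive Bayesian classifier performs classification under a Dirichlet prior distribution and is a computationally simple yet effective classification tool. Under the Dirichlet distribution, however, every pair of variables is negatively correlated, whereas in practice variables may be positively correlated. Existing studies apply the naive Bayesian classifier under the assumption that the variables are negatively correlated, and have not examined whether positive correlation among variables affects the classifier's accuracy. This study uses the Dirichlet distribution and the generalized Dirichlet distribution to simulate data with negative and with positive correlation, in order to investigate whether the correlation among variables affects the accuracy of the naive Bayesian classifier. Real data sets are then analyzed in the same way.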
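The correlation property described in the abstract can be checked numerically. The sketch below is only an illustration, not the simulation procedure used in the thesis: it draws Dirichlet vectors by normalizing gamma variates and draws generalized Dirichlet vectors with the Connor-Mosimann stick-breaking construction, then compares the sample correlation of one pair of variables. The function names, the parameter values, and the sample size are all assumptions chosen for the demonstration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet vector by normalizing independent gamma variates."""
    gammas = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_generalized_dirichlet(a, b, rng):
    """Connor-Mosimann construction: X_j = Z_j * prod_{i<j}(1 - Z_i),
    with independent Z_j ~ Beta(a_j, b_j)."""
    xs = []
    remaining = 1.0  # share of the unit "stick" not yet allocated
    for a_j, b_j in zip(a, b):
        z = rng.betavariate(a_j, b_j)
        xs.append(z * remaining)
        remaining *= 1.0 - z
    return xs


def pair_correlation(samples, i, j):
    """Sample Pearson correlation between components i and j."""
    u = [s[i] for s in samples]
    v = [s[j] for s in samples]
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((x - mean_u) * (y - mean_v) for x, y in zip(u, v))
    var_u = sum((x - mean_u) ** 2 for x in u)
    var_v = sum((y - mean_v) ** 2 for y in v)
    return cov / (var_u * var_v) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 20000       # hypothetical sample size

# Dirichlet with illustrative parameters (2, 2, 2, 2).
dir_samples = [sample_dirichlet([2, 2, 2, 2], rng) for _ in range(n_samples)]

# Generalized Dirichlet with illustrative parameters: a highly variable Z_1
# and tightly concentrated Z_2, Z_3, so X_2 and X_3 both scale with (1 - Z_1).
gd_samples = [
    sample_generalized_dirichlet([1, 50, 50], [1, 50, 50], rng)
    for _ in range(n_samples)
]

dir_corr = pair_correlation(dir_samples, 1, 2)
gd_corr = pair_correlation(gd_samples, 1, 2)
print("Dirichlet corr(X2, X3):", round(dir_corr, 3))
print("Generalized Dirichlet corr(X2, X3):", round(gd_corr, 3))
```

With these assumed parameters the Dirichlet pair should come out negatively correlated and the generalized Dirichlet pair positively correlated, which is the contrast the study relies on when generating its two kinds of simulated data.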