
Author: 楊乃玉 (Yang, Nai-Yu)
Title: 不同離散化方法對於具先驗分配的簡易貝氏分類器之影響評估
(An Evaluation Study for the Impact of Discretization Methods on the Performance of Naive Bayesian Classifiers with Prior Distributions)
Advisor: 翁慈宗 (Wong, Tzu-Tsung)
Degree: Master
Department: College of Management, Department of Industrial and Information Management (on-the-job class)
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: Chinese
Pages: 61
Keywords (Chinese): 狄氏分配、離散化、廣義狄氏分配、簡易貝氏分類器、先驗分配
Keywords (English): Dirichlet distribution, discretization, generalized Dirichlet distribution, naïve Bayesian classifier, prior distribution
    Because they are simple to use, computationally fast, and competitive in classification accuracy, naive Bayesian classifiers are widely applied to many classification tasks. However, naive Bayesian classifiers operate mainly on discrete data, so when predictions must be made from continuous data, a discretization method is usually applied as a preprocessing step. Moreover, to raise the classification accuracy of a naive Bayesian classifier, the prior distributions of its attributes are generally assumed to follow a Dirichlet or generalized Dirichlet distribution. Previous studies tested the classification accuracy of naive Bayesian classifiers with Dirichlet priors under different discretization methods and found no significant differences among the methods; a likely reason is that the constraints of the Dirichlet distribution are too strict in practice, which affects the classification results. The generalized Dirichlet distribution relaxes these assumptions and can therefore be applied more broadly in practice. This study accordingly adopts four common discretization methods (equal width, equal frequency, proportional, and minimum entropy) and evaluates, on 23 data sets with continuous attributes from the UCI repository, how each method affects classification accuracy when the prior of the naive Bayesian classifier is the best Dirichlet or the best generalized Dirichlet distribution. The results show that the equal-width, equal-frequency, and proportional discretization methods improve classification accuracy more when paired with the best generalized Dirichlet prior, while for the minimum-entropy method the accuracy under the best generalized Dirichlet prior differs little from that under the best Dirichlet prior. This study therefore suggests pairing the minimum-entropy discretization method with the best generalized Dirichlet prior only when both the number of class values and the maximum number of possible attribute values after discretization are large; otherwise the best Dirichlet prior suffices.
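    The two unsupervised discretization methods named above, equal width and equal frequency, are standard preprocessing steps and can be sketched in plain Python. The function names and toy data below are illustrative, not from the thesis:

```python
def equal_width_bins(values, k):
    """Assign each value to one of k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # Values at the upper boundary fall into the last interval.
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Assign each value to one of k intervals holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(values), k - 1)
    return bins

x = [1.0, 2.0, 2.5, 3.0, 10.0, 11.0]
print(equal_width_bins(x, 2))      # [0, 0, 0, 0, 1, 1]
print(equal_frequency_bins(x, 2))  # [0, 0, 0, 1, 1, 1]
```

    The example shows why the choice matters: the outlying gap pushes most points into one equal-width interval, while equal-frequency cut points balance the counts, so the two methods hand the classifier different discrete attributes.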

    Naïve Bayesian classifiers are widely employed for classification tasks because of their computational efficiency and competitive accuracy. Discretization is a major approach for processing continuous attributes for naïve Bayesian classifiers. In addition, the prior distributions of attributes in the naïve Bayesian classifier are implicitly or explicitly assumed to follow either Dirichlet or generalized Dirichlet distributions. Previous studies have found that discretization methods for continuous attributes do not have a significant impact on the performance of the naïve Bayesian classifier with noninformative Dirichlet priors. Since the generalized Dirichlet distribution is a more appropriate prior for the naïve Bayesian classifier, the purpose of this thesis is to investigate the impact of four well-known discretization methods, equal width, equal frequency, proportional, and minimum entropy, on the performance of naïve Bayesian classifiers with either noninformative Dirichlet or noninformative generalized Dirichlet priors. The experimental results on 23 data sets demonstrate that the equal-width, equal-frequency, and proportional discretization methods achieve higher classification accuracy when the priors follow generalized Dirichlet distributions. However, generalized Dirichlet and Dirichlet priors have similar performance for the minimum-entropy discretization method. The experimental results suggest that noninformative generalized Dirichlet priors should be employed for the minimum-entropy discretization method only when neither the number of classes nor the number of intervals is small.
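    The role of the prior can be illustrated with a minimal naive Bayes sketch in which a symmetric Dirichlet prior enters the probability estimates as pseudo-counts (`alpha`). This is only a simplified stand-in: the noninformative Dirichlet and generalized Dirichlet priors studied in the thesis use more elaborate parameter settings, and all names and data below are hypothetical:

```python
from collections import Counter, defaultdict
import math

def train_nb(X, y, n_values, alpha=1.0):
    """Estimate class priors and per-attribute conditionals; a symmetric
    Dirichlet prior adds `alpha` pseudo-counts to every cell."""
    classes = sorted(set(y))
    class_count = Counter(y)
    n = len(y)
    log_prior = {c: math.log((class_count[c] + alpha) / (n + alpha * len(classes)))
                 for c in classes}
    # counts[j][(c, v)] = number of class-c samples with attribute j equal to v
    counts = [defaultdict(int) for _ in range(len(X[0]))]
    for xs, c in zip(X, y):
        for j, v in enumerate(xs):
            counts[j][(c, v)] += 1
    def log_cond(j, c, v):
        # n_values[j] = number of intervals attribute j was discretized into
        return math.log((counts[j][(c, v)] + alpha) /
                        (class_count[c] + alpha * n_values[j]))
    return classes, log_prior, log_cond

def predict(model, xs):
    classes, log_prior, log_cond = model
    return max(classes, key=lambda c: log_prior[c] +
               sum(log_cond(j, c, v) for j, v in enumerate(xs)))

# Two discretized attributes with 2 and 3 possible values respectively.
X = [(0, 0), (0, 1), (1, 2), (1, 2)]
y = ["a", "a", "b", "b"]
model = train_nb(X, y, n_values=[2, 3])
print(predict(model, (0, 1)))  # -> "a"
```

    The denominator term `alpha * n_values[j]` makes the smoothing depend on how many intervals discretization produced, which is one way to see why the choice of discretization method and the choice of prior interact.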

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    List of Tables
    List of Figures
    Chapter 1: Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Research Process and Structure
    Chapter 2: Literature Review
        2.1 Bayesian Classifiers
            2.1.1 The Naive Bayesian Classifier
            2.1.2 Applications of the Naive Bayesian Classifier
        2.2 Prior Distributions
            2.2.1 The Dirichlet Distribution
            2.2.2 The Generalized Dirichlet Distribution
        2.3 Discretization Methods
            2.3.1 Unsupervised Discretization Methods
            2.3.2 Supervised Discretization Methods
        2.4 Attribute Ranking Methods
    Chapter 3: Research Methods
        3.1 Discretization Preprocessing
        3.2 The Naive Bayesian Classifier
        3.3 Attribute Ranking
        3.4 Parameter Adjustment for Prior Distributions
            3.4.1 The Dirichlet Distribution
            3.4.2 The Generalized Dirichlet Distribution
        3.5 Evaluation of Results
    Chapter 4: Empirical Study
        4.1 Characteristics of the Data Sets
        4.2 Numbers of Possible Attribute Values after Discretization
        4.3 Test Results for Prior Distributions after Discretization
        4.4 Summary
    Chapter 5: Conclusions and Suggestions
    References

    Chinese
    張良豪 (2009). Using a Bayesian attribute selection method and prior distributions to improve the performance of naive Bayesian classifiers. Master's thesis, Department of Industrial and Information Management, National Cheng Kung University.
    English
    Addin, O., Sapuan, S. M., Mahdi, E., and Othman, M. (2007). A naïve Bayes classifier for damage detection in engineering materials. Materials and Design, 28(8), 2379-2386.
    Aitchison, J. (1985). A general class of distributions on the simplex. Journal of the Royal Statistical Society Series B, 47(1), 136-146.
    Asuncion, A. and Newman, D.J. (2007). UCI machine learning repository http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, School of Information and Computer Science.
    Battiti, R. (1994). Using mutual information for selecting features in supervised neural-net learning. IEEE Transactions on Neural Networks, 5(4), 537-550.
    Bier, V. M. and Yi, W. (1995). A Bayesian method for analyzing dependencies in precursor data. International Journal of Forecasting, 11(1), 25-41.
    Biesiada, J., Duch, W., Kachel, A., Maczka, K., and Palucha, S. (2005). Feature ranking methods based on information entropy with Parzen window. International Conference on Research in Electrotechnology and Applied Informatics, 109-118, Katowice, Poland.
    Catlett, J. (1991). On changing continuous attributes into ordered discrete attributes. Proceedings of the 5th European Working Session on Learning on Machine Learning, 164-178, Porto, Portugal.
    Cestnik, B. and Bratko, I. (1991). On estimating probabilities in tree pruning. Proceedings of the 5th European Working Session on Learning on Machine Learning, 138-150, Porto, Portugal.
    Chen, K., Kurgan, L., and Rahbari, M. (2007). Prediction of protein crystallization using collocation of amino acid pairs. Biochemical and Biophysical Research Communications, 355(3), 764-769.
    Connor, R. J. and Mosimann, J. E. (1969). Concepts of independence for proportions with a generalization of the Dirichlet distribution. Journal of the American Statistical Association, 64, 194-206.
    Domingos, P. and Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29, 103-130.
    Dougherty, J., Kohavi, R., and Sahami, M. (1995). Supervised and unsupervised discretization of continuous features. Proceedings of the 12th International Conference on Machine Learning, 194-202, San Francisco, Morgan Kaufmann.
    Elomaa, T. and Rousu, J. (2003). On decision boundaries of naive Bayes in continuous domains. Proceedings of the 7th European Conference on Knowledge Discovery in Databases, 2838, 144-155, Berlin, Heidelberg.
    Fayyad, U. M. and Irani, K. B. (1993). Multi-interval discretization of continuous valued attributes for classification learning. Proceedings of the 13th International Joint Conference on Artificial Intelligence, 1022-1027, Chambéry, France.
    Good, I. J. (1950). Probability and the Weighing of Evidence. Charles Griffin, London.
    Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11, 63-91.
    Hsu, C. N., Huang, H. J., and Wong, T. T. (2000). Why discretization works for naïve Bayesian classifiers. Proceedings of the 17th International Conference on Machine Learning, 309-406.
    Hsu, C. N., Huang, H. J., and Wong, T. T. (2003). Implications of the Dirichlet assumption for discretization of continuous attributes in naïve Bayesian classifiers. Machine Learning, 53, 235-263.
    Huang, J. J., Cai, Y. Z., and Xu, X. M. (2008). A parameterless feature ranking algorithm based on MI. Neurocomputing, 71, 1656-1668.
    John, G. H., Kohavi, R., and Pfleger, K. (1994). Irrelevant features and the subset selection problem. Proceedings of the 11th International Conference on Machine Learning, 121-129.
    Kerber, R. (1992). ChiMerge: discretization of numeric attributes. Proceedings of the 10th National Conference on Artificial Intelligence, 123-128.
    Keren, D. (2003). Recognizing image style and activities in video using local features and naive Bayes. Pattern Recognition Letters, 24, 2913-2922.
    Kohavi, R. and Sahami, M. (1996). Error-based and entropy-based discretization of continuous features. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, 114-119.
    Kwak, N. and Choi, C. H. (2002). Input feature selection for classification problems. IEEE Transactions on Neural Networks, 13(1), 143-159.
    Langley, P. and Sage, S. (1994). Induction of selective Bayesian classifiers. Proceedings of the 10th International Conference on Uncertainty in Artificial Intelligence, 399-406.
    Liang, J., Yang, S., and Winstanley, A. (2008). Invariant optimal feature selection: a distance discriminant and feature ranking based solution. Pattern Recognition, 41, 1429-1439.
    Lopez de Mantaras, R. (1991). A distance-based attribute selecting measure for decision tree induction. Machine Learning, 6, 81-92.
    Menzies, T., Greenwald, J., and Frank, A. (2007). Data mining static code attributes to learn defect predictors. IEEE Transactions on Software Engineering, 33(1), 2-13.
    Peng, H., Long, F., and Ding, C. (2005). Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1226-1238.
    Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1988). Numerical Recipes in C. Cambridge University Press, Cambridge.
    Robnik, M. and Kononenko, I. (2003). Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning, 53, 23-69.
    Setiono, R. and Liu, H. (1995). Chi2: feature selection and discretization of numeric attributes. Proceedings of the 7th IEEE International Conference on Tools with Artificial Intelligence, 388-391.
    Spector, P. (1990). An Introduction to S and S-PLUS. Duxbury Press, Belmont.
    Sridhar, D. V., Bartlett, E. B., and Seagrave, R. C. (1998). Information theoretic subset selection. Computers in Chemical Engineering, 22, 613-626.
    Terribilini, M., Sander, J. D., Lee, J. H., Zaback, P., Jernigan, R. L., Honavar, V., and Dobbs, D. (2007). Nucleic Acids Research, 35, 578-584.
    Turhan, B. and Bener, A. (2009). Analysis of naive Bayes’ assumptions on software fault data: an empirical study. Data and Knowledge Engineering, 68, 278-290.
    Wilks, S. S. (1962). Mathematical Statistics. Wiley, New York.
    Wong, T. T. (1998). Generalized Dirichlet distribution in Bayesian analysis. Applied Mathematics and Computation, 97, 165-181.
    Wong, T. T. (2009). Alternative prior assumptions for improving the performance of naive Bayesian classifiers. Data Mining and Knowledge Discovery, 18, 183-213.
    Yang, Y. and Webb, G. I. (2002a). Non-disjoint discretization for naïve Bayes classifiers. Proceedings of the 19th International Conference on Machine Learning, 666-673.
    Yang, Y. and Webb, G. I. (2002b). A comparative study of discretization methods for naïve Bayes classifiers. Proceedings of the Pacific Rim Knowledge Acquisition Workshop, 159-173.
    Yang, Y. and Webb, G. I. (2009). Discretization for naïve Bayes learning: managing discretization bias and variance. Machine Learning, 74, 39-74.
    Yousef, M., Nebozhyn, M., Shatkay, H., Kanterakis, S., Showe, L. C., and Showe, M. K. (2006). Combining multi-species genomic data for microRNA identification using a naive Bayes classifier. Bioinformatics, 22(11), 1325-1334.

    Full text available on campus: 2011-07-16; off campus: 2012-07-16.