| Graduate Student: | 林韋成 Lin, Wei-Cheng |
|---|---|
| Thesis Title: | 應用一對一與一對多方法於多類別資料的隨機生成模型之集成方法 (Ensemble Algorithms Based on One-Against-One and One-Against-All Strategies with Randomly Generated Models for Multi-Class Data) |
| Advisor: | 翁慈宗 Wong, Tzu-Tsung |
| Degree: | 碩士 Master |
| Department: | Institute of Information Management, College of Management |
| Year of Publication: | 2025 |
| Academic Year of Graduation: | 113 |
| Language: | Chinese |
| Number of Pages: | 64 |
| Chinese Keywords: | 集成學習、多類別分類、羅吉斯迴歸分類模型、二元化策略、隨機生成基本模型 |
| English Keywords: | Ensemble learning, Multi-class classification, Logistic regression, Binarization strategy, Randomly-generated base model |
In prior studies on randomly generated models, such models performed well on binary classification problems but poorly on multi-class classification problems. To address this, earlier work combined the random-generation approach with Ensembles of Nested Dichotomies, a binarization strategy that decomposes a multi-class problem into several binary classification problems, and generated base models with a specific structure to improve the predictive performance of the random-generation approach on multi-class datasets. However, compared with conventionally trained models, the base models built by the random-generation approach under Ensembles of Nested Dichotomies have relatively unstable classification performance. This study uses two other binarization strategies, One-Against-One and One-Against-All, together with the random-generation approach, to build the base models of an ensemble, and evaluates whether this can improve the classification performance of the base models and the overall ensemble, or the computational efficiency, relative to the original multi-class classifier and Ensembles of Nested Dichotomies. In addition, this study examines whether ensembles that mix trained and randomly generated models in different proportions can improve classification performance, as found in previous studies.
From the perspective of binarization strategies, the One-Against-One ensemble built from randomly generated logistic regression models proposed in this study outperforms the One-Against-All ensemble and the previously proposed Ensembles of Nested Dichotomies on 30 multi-class datasets. When different ensemble techniques are further compared, the ensemble that mixes randomly generated models with models produced by bagging performs best, achieving the highest average classification accuracy on most datasets; among all methods, the One-Against-One ensemble with hybrid model generation has the best classification performance. In terms of computation time, however, Ensembles of Nested Dichotomies runs faster than the other binarization strategies, whether the base models are randomly generated or produced by the hybrid approach.
Ensemble algorithms built on randomly generated models perform well on binary classification problems, while their performance on multi-class classification tasks is relatively poor. A prior study therefore integrated randomly generated models with Nested Dichotomies, a binarization strategy that decomposes a multi-class problem into several binary classification problems. However, the base models constructed with Nested Dichotomies have relatively unstable classification accuracies. This study explores whether combining the One-Against-One and One-Against-All binarization strategies with randomly generated models can enhance classification performance or reduce computational cost, and proposes four ensemble methods with randomly generated base models for logistic regression. Compared with nine other ensemble algorithms on 30 multi-class datasets, the experimental results show that combining the One-Against-One strategy with a hybrid of randomly generated base models and models produced by the bagging approach achieves the highest accuracy on most datasets. However, Nested Dichotomies has the lowest computational cost among the three binarization strategies, regardless of how the base models are produced.
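The one-against-one strategy with randomly generated logistic regression base models described above can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: the coefficient range, the number of random candidates per pair, and the rule of keeping the candidate with the best training accuracy are all assumptions made here for demonstration.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def random_logistic_model(X, y, n_candidates=50):
    """Draw random coefficient vectors (bias + weights) for a binary
    subproblem and keep the one with the best training accuracy."""
    best_w, best_acc = None, -1.0
    for _ in range(n_candidates):
        w = rng.uniform(-1.0, 1.0, size=X.shape[1] + 1)
        acc = np.mean((sigmoid(w[0] + X @ w[1:]) >= 0.5) == y)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w

class OneVsOneRandomEnsemble:
    """One-against-one decomposition: one randomly generated logistic
    model per unordered class pair; predictions by majority vote."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = {}
        for a, b in combinations(self.classes_, 2):
            mask = (y == a) | (y == b)
            # Relabel the pair as {0, 1} for the binary model.
            self.models_[(a, b)] = random_logistic_model(
                X[mask], (y[mask] == b).astype(int))
        return self

    def predict(self, X):
        index = {c: i for i, c in enumerate(self.classes_)}
        votes = np.zeros((X.shape[0], len(self.classes_)), dtype=int)
        for (a, b), w in self.models_.items():
            winner = np.where(sigmoid(w[0] + X @ w[1:]) >= 0.5, b, a)
            for c in (a, b):
                votes[winner == c, index[c]] += 1
        return self.classes_[votes.argmax(axis=1)]
```

A one-against-all variant would replace the pairwise loop with one class-versus-rest model per class, and the hybrid generation studied in the thesis would mix these random candidates with models trained by bagging in a chosen proportion.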
黃中立 (2023). 以簡易貝氏分類器隨機生成基本模型之集成方法 [An ensemble method with randomly generated base models for the naive Bayes classifier]. Master's thesis, Institute of Information Management, National Cheng Kung University, Tainan.
徐心縈 (2023). 用羅吉斯迴歸建構隨機分類模型之集成方法 [An ensemble method with random classification models built by logistic regression]. Master's thesis, Institute of Information Management, National Cheng Kung University, Tainan.
陳昱嘉 (2023). 隨機生成決策樹以進行集成學習之研究 [A study on randomly generating decision trees for ensemble learning]. Master's thesis, Institute of Information Management, National Cheng Kung University, Tainan.
蔡哲倫 (2024). 適用於多類別資料的隨機生成模型之集成嵌套二分法 [Ensembles of nested dichotomies with randomly generated models for multi-class data]. Master's thesis, Institute of Information Management, National Cheng Kung University, Tainan.
何政賢 (2024). 以粒子群最佳化方法優化應用於二類別資料之隨機集成演算法 [Optimizing random ensemble algorithms for binary-class data with particle swarm optimization]. Master's thesis, Institute of Information Management, National Cheng Kung University, Tainan.
Arun Kumar, M. & Gopal, M. (2010). Fast multiclass SVM classification using decision tree based one-against-all method. Neural Processing Letters, 32(3), 311-323.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
Clark, P. & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. In Machine Learning—EWSL-91: European Working Session on Learning (pp. 151-163). Springer, Berlin, Heidelberg.
David, F. N. & Tukey, J. (1977). Exploratory data analysis. Biometrics, 33(4), 768.
Dietterich, T. G. (2000). Ensemble methods in machine learning. Proceedings of the 1st International Workshop on Multiple Classifier Systems (pp. 1-15). Berlin, Heidelberg.
Du Jardin, P. (2018). Failure pattern-based ensembles applied to bankruptcy forecasting. Decision Support Systems, 107, 64-77.
Eberhart, R. & Shi, Y. (2000). Comparing inertia weights and constriction factors in particle swarm optimization. Proceedings of the 2000 Congress on Evolutionary Computation, 1, 84-88.
Fei, B. & Liu, J. (2006). Binary tree of SVM: A new fast multiclass training and classification algorithm. IEEE Transactions on Neural Networks, 17(3), 696-704.
Frank, E. & Kramer, S. (2004). Ensembles of nested dichotomies for multi-class problems. Proceedings of the Twenty-first International Conference on Machine Learning, 39.
Freund, Y. & Schapire, R. E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting. In European Conference on Computational Learning Theory (pp. 23-37). Springer, Berlin, Heidelberg.
Freund, Y. & Schapire, R. E. (1996). Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning (pp. 148-156). Morgan Kaufmann.
Fürnkranz, J. (2002). Round robin classification. The Journal of Machine Learning Research, 2, 721-747.
Fürnkranz, J. (2003). Round robin ensembles. Intelligent Data Analysis, 7(5), 385-403.
Galar, M., Fernández, A., Barrenechea, E., Bustince, H., & Herrera, F. (2011). An overview of ensemble methods for binary classifiers in multi-class problems: Experimental study on one-vs-one and one-vs-all schemes. Pattern Recognition, 44(8), 1761-1776.
Galar, M., Fernández, A., Barrenechea, E., & Herrera, F. (2015). DRCW-OVO: Distance-based relative competence weighting combination for One-vs-One strategy in multi-class problems. Pattern Recognition, 48(1), 28-42.
Gao, X., He, Y., Zhang, M., Diao, X., Jing, X., Ren, B., & Ji, W. (2021). A multiclass classification using one-versus-all approach with the differential partition sampling ensemble. Engineering Applications of Artificial Intelligence, 97, 104034.
García-Pedrajas, N., & Ortiz-Boyer, D. (2011). An empirical study of binary classifier fusion methods for multiclass classification. Information Fusion, 12(2), 111-130.
Hsu, C.-W. & Lin, C.-J. (2002). A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2), 415-425.
Jaiyeoba, O., Ogbuju, E., Yomi, O. T., & Oladipo, F. (2024). Development of a model to classify skin diseases using stacking ensemble machine learning techniques. Journal of Computing Theories and Applications, 2(1), 22-38.
Kang, S., Cho, S., & Kang, P. (2015). Constructing a multi-class classifier using one-against-one approach with different binary classifiers. Neurocomputing, 149, 677-682.
Kleinbaum, D. G., Klein, M., Kleinbaum, D. G., & Klein, M. (2010). Introduction to logistic regression. Logistic Regression: A Self-learning Text, 1-39.
Knerr, S., Personnaz, L., & Dreyfus, G. (1990). Single-layer learning revisited: A stepwise procedure for building and training a neural network. In Neurocomputing: Algorithms, Architectures and Applications (pp. 41-50). Springer.
Mao, S. H. & Myint, E. E. (2013). Performance comparison of multi-class SVM classification for music cultural style tagging. International Journal of Computer Theory and Engineering, 5(2), 317-320.
Ndirangu, D., Mwangi, W., & Nderu, L. (2019). An ensemble model for multiclass classification and outlier detection method in data mining. Journal of Information Engineering and Applications, 9(2), 38-42.
Rifkin, R. & Klautau, A. (2004). In defense of one-vs-all classification. The Journal of Machine Learning Research, 5, 101-141.
Rodríguez, J. J., García-Osorio, C., & Maudes, J. (2010). Forests of nested dichotomies. Pattern Recognition Letters, 31(2), 125-132.
Tsai, C., Sue, K., Hu, Y., & Chiu, A. (2021). Combining feature selection, instance selection, and ensemble classification techniques for improved financial distress prediction. Journal of Business Research, 130, 200-209.
Wever, M., Mohr, F., & Hüllermeier, E. (2018). Ensembles of evolved nested dichotomies for classification. Proceedings of the Genetic and Evolutionary Computation Conference, 561-568.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259.
Xie, X., Zhang, W., & Yang, L. (2003). Particle swarm optimization. Control and Decision, 18, 129-134.
Yan, J., Zhang, Z., Lin, K., Yang, F., & Luo, X. (2020). A hybrid scheme-based one-vs-all decision trees for multi-class classification tasks. Knowledge-Based Systems, 198, 105922.
Yang, X., Yu, Q., He, L., & Guo, T. (2013). The one-against-all partition based binary tree support vector machine algorithms for multi-class classification. Neurocomputing, 113, 1-7.
Shah Zainudin, M. N., Nasir Sulaiman, M., Mustapha, N., Perumal, T., & Mohamed, R. (2018). Solving classification problem using ensemble binarization classifier. International Journal of Engineering & Technology, 7(4), 280-284.