| Graduate Student: | 張書嘉 Chang, Shu-Chia |
|---|---|
| Thesis Title: | 基於支持向量機隨機生成基本模型之集成方法 (Ensemble Algorithms with Randomly Generated Base Models Based on Support Vector Machine) |
| Advisor: | 翁慈宗 Wong, Tzu-Tsung |
| Degree: | Master |
| Department: | Department of Industrial and Information Management, College of Management |
| Year of Publication: | 2024 |
| Graduation Academic Year: | 112 (ROC calendar) |
| Language: | Chinese |
| Number of Pages: | 65 |
| Keywords (Chinese): | 支持向量機、集成學習、隨機生成基本模型、獨立性 |
| Keywords (English): | ensemble learning, independence, randomly generated base models, support vector machine |
With the rapid development of machine learning techniques, ensemble learning has become an important way to solve complex problems and to improve the predictive performance of models. According to previous literature, a key factor in the classification performance of an ensemble is that its base models should be as independent of one another as possible. Traditional ensemble methods, however, train all base models from the same original data set, which limits the independence among them. To overcome this limitation, this study proposes an ensemble method based on randomly generated support vector machine base models, named ERSVM. It departs from the conventional learning framework by generating SVM models directly at random rather than inducing them from a data set. In addition, because randomly generated models carry higher uncertainty, this study also proposes an ensemble that mixes bagging-trained and randomly generated SVM base models in different proportions, named EBRSVM, to evaluate whether classification performance can be improved without fully relying on the data set.
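The abstract does not give implementation details, but the core idea of ERSVM can be illustrated with a minimal sketch: linear decision functions f(x) = sign(wᵀx + b) are drawn at random instead of being fitted, screened against an accuracy threshold, and combined by majority voting. The sampling distributions, the threshold value, the binary {-1, +1} labels, and the assumption of roughly standardized features are all illustrative choices here, not the thesis's actual settings.

```python
import numpy as np


def generate_random_base_models(X, y, n_models=50, threshold=0.55, seed=None):
    """Draw random hyperplanes and keep those whose training accuracy
    clears the threshold.  Labels are assumed to be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    kept = []
    # Rejection sampling: the higher the threshold, the more candidates
    # are discarded, which is the training-time cost the abstract reports.
    while len(kept) < n_models:
        w = rng.standard_normal(n_features)   # random orientation
        b = rng.uniform(-1.0, 1.0)            # random offset (assumed feature scale)
        preds = np.where(X @ w + b >= 0, 1, -1)
        if np.mean(preds == y) >= threshold:
            kept.append((w, b))
    return kept


def majority_vote(models, X):
    """Unweighted majority vote over the kept random hyperplanes."""
    votes = np.array([np.where(X @ w + b >= 0, 1, -1) for w, b in models])
    return np.where(votes.sum(axis=0) >= 0, 1, -1)
```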
Experiments on 30 data sets show that ERSVM can improve classification performance over 4 iterations, and on some data sets it even outperforms existing classification methods. Moreover, compared with existing methods, EBRSVM achieves the highest average classification accuracy over all data sets. In terms of training time, however, the threshold used to screen randomly generated models causes many of them to be discarded, which lengthens training. Considering computational resources and time cost, mixing in a suitable proportion of bagging-trained base models can shorten training time while improving both training and classification performance.
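A corresponding hedged sketch of the EBRSVM mixing idea: a fraction of the ensemble is produced by ordinary bagging (here, scikit-learn's SVC fitted on bootstrap resamples) and the remainder by the random-generation routine above. The mixing ratio, the linear kernel, and the reuse of {-1, +1} labels are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVC


def build_ebrsvm(X, y, n_models=50, bagging_ratio=0.5, seed=None):
    """Mix bagging-trained SVMs with randomly generated hyperplanes."""
    rng = np.random.default_rng(seed)
    n_bagged = int(round(n_models * bagging_ratio))
    bagged = []
    for _ in range(n_bagged):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap resample
        bagged.append(SVC(kernel="linear").fit(X[idx], y[idx]))
    random_part = generate_random_base_models(
        X, y, n_models=n_models - n_bagged, seed=rng)
    return bagged, random_part


def predict_ebrsvm(bagged, random_part, X):
    """Majority vote over both kinds of base models (labels in {-1, +1})."""
    votes = [clf.predict(X) for clf in bagged]
    votes += [np.where(X @ w + b >= 0, 1, -1) for w, b in random_part]
    return np.where(np.sum(votes, axis=0) >= 0, 1, -1)
```

The sketch also makes the trade-off reported in the abstract concrete: raising `threshold` in the rejection loop discards more candidates and lengthens training, while raising `bagging_ratio` shortens that loop at the price of less independence among base models.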
With the rapid development of machine learning technologies, ensemble learning has become crucial for solving complex problems and enhancing model prediction performance. Previous literature indicates that the independence among base models significantly influences the classification performance of ensemble models. However, traditional ensemble methods often induce base models from instance sets sampled from the same data set, so the predictions of those base models are not independent. To address this, this study proposes an ensemble algorithm, named ERSVM, that randomly generates base models for support vector machines (SVMs). In addition, another ensemble algorithm, EBRSVM, which combines the randomly generated base models with ones produced by the bagging approach, is also introduced. The experimental results on 30 data sets show that ERSVM can outperform previous ensemble algorithms on some data sets, and EBRSVM achieves the highest average classification accuracy among the five ensemble algorithms compared. However, the computational costs of ERSVM and EBRSVM are both high because of the effort spent on filtering base models. EBRSVM is recommended as a balance between computational cost and classification performance.