
Author: Wang, Ting-Yun (王亭云)
Title: An Efficient Monotonic One-class SVDD Model (高效能單調性單類別支援向量資料描述模型之研究)
Advisor: Li, Sheng Tun (李昇暾)
Co-advisor: Lin, Chinho (林清河)
Degree: Master
Department: Department of Industrial and Information Management, College of Management
Year of Publication: 2019
Academic Year of Graduation: 107 (2018-2019)
Language: English
Number of Pages: 58
Chinese Keywords: 單類別分類、支援向量機資料描述、單調性限制式、決策樹、最大期望演算法
Keywords: One-class Classification, Support Vector Data Description, Monotonicity Constraint, Expectation-Maximization Algorithm, Decision Tree
Abstract (Chinese):
    在資訊爆炸的時代,各領域資料大量累積,資料探勘成為相當熱門的技術,能從大量資料中找尋隱藏其中的資訊。其中分類技術可以透過資料的特徵與相關性做決策預測,備受歡迎。與傳統分類不同,單類別分類常用於異常值檢測與模式識別,而支援向量資料描述(SVDD)模型則是常用的模型之一。然而支援向量資料描述(SVDD)模型易受資料特性影響,對於數據驅動的分類模型,可行性約束變得相當重要。
    融入含有專家知識的單調性限制式,已證實能縮小搜尋可行區域以優化模型。但過往單調性限制式的生成方式常採取隨機取樣,需要做多次實驗找出對模型最好的限制式數目。故本研究提出利用單調性單類別決策樹方法,透過最大期望演算法(EM)得到目標事例的兩群,並用決策樹提取單調性規則,生成用於支援向量資料描述SVDD的單調性限制式。透過數據導向與知識導向的建構限制式方式,可以更有效率找到模型的邊界。結果顯示使用單調性單類別決策樹方法能降低模型運算時間。

Abstract (English):
    Living in an era of information explosion and digital revolution, data from many fields are accumulating rapidly. Data mining has therefore become a popular technique for uncovering information hidden in large volumes of data. Among data-mining techniques, classification, which predicts decisions from the features of and relationships within the data, is especially widely used. Unlike traditional classification, one-class classification is typically applied to outlier detection and pattern recognition, and the support vector data description (SVDD) model is one of the models most commonly used for it. However, the SVDD model is sensitive to the characteristics of the data, so for a data-driven classification model, feasible constraints become quite important.
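    For reference, the standard SVDD formulation (Tax & Duin, 1999, listed in the references below) finds the smallest hypersphere, with center a and radius R, that encloses the target data, using slack variables \xi_i and a trade-off parameter C:

        \min_{R,\,a,\,\xi}\; R^{2} + C \sum_{i} \xi_{i}
        \quad \text{s.t.} \quad \lVert x_{i} - a \rVert^{2} \le R^{2} + \xi_{i},\; \xi_{i} \ge 0,\; \forall i

    The monotonicity-constrained variant studied in this thesis adds constraints, derived from expert knowledge and from the data, to this optimization problem.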
    Monotonicity constraints that encode expert knowledge have been shown to narrow the search of the feasible region and thereby improve the model. However, previous methods typically generated monotonicity constraints by random sampling, which requires repeated experiments to find the number of constraints that works best for the model. Consequently, this study proposes a monotonicity-constraint generation method for SVDD. First, the Expectation-Maximization (EM) algorithm is used to split the target class into two groups. Second, a decision tree is applied to extract monotonic rules, from which the SVDD constraints are generated. With this combination of data-driven and knowledge-driven constraint construction, the boundary of the model can be found more efficiently. The results show that the proposed monotonic one-class decision-tree approach reduces the model's computation time.
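    To make the constraint-generation pipeline concrete, the sketch below illustrates the idea under stated assumptions rather than reproducing the thesis's implementation: it assumes Python with scikit-learn, and the helper names generate_constraint_rules and extract_split_rules are hypothetical. It fits a two-component Gaussian mixture to the target class by EM, separates the two groups with a shallow decision tree, and reads the tree's split thresholds as candidate monotonic rules from which SVDD constraints could be built.

# Minimal sketch of the EM + decision-tree constraint-generation idea; not the thesis code.
import numpy as np
from sklearn.mixture import GaussianMixture          # EM for Gaussian mixture models
from sklearn.tree import DecisionTreeClassifier

def extract_split_rules(tree_clf, feature_names):
    """Collect (feature, threshold) pairs used at the internal nodes of a fitted tree."""
    t = tree_clf.tree_
    rules = []
    for node in range(t.node_count):
        if t.children_left[node] != t.children_right[node]:   # internal (non-leaf) node
            rules.append((feature_names[t.feature[node]], float(t.threshold[node])))
    return rules

def generate_constraint_rules(X_target, feature_names, random_state=0):
    """EM splits the target class into two groups; a shallow tree yields candidate monotonic rules."""
    gm = GaussianMixture(n_components=2, random_state=random_state).fit(X_target)
    groups = gm.predict(X_target)                     # group label for each target example
    dt = DecisionTreeClassifier(max_depth=3, random_state=random_state)
    dt.fit(X_target, groups)                          # separate the two EM groups
    return extract_split_rules(dt, feature_names)     # rules from which SVDD constraints could be built

# Example usage on synthetic single-class data with two underlying modes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(3.0, 1.0, (100, 2))])
print(generate_constraint_rules(X, ["x1", "x2"]))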

    Table of Contents:
    摘要 I
    ABSTRACT II
    CONTENTS V
    List of Table VI
    List of Figures VI
    Chapter 1 Introduction 1
      1.1 Background and Motivation 1
      1.2 Research Objectives and Limitations 3
      1.3 Structure of Research 4
    Chapter 2 Literature Review 6
      2.1 One-Class Classification 6
      2.2 Support Vector Machine (SVM) 8
      2.3 Support Vector Data Description (SVDD) 9
        2.3.1 Introduction of SVDD 9
        2.3.2 Mathematical of SVDD 10
        2.3.3 Application of SVDD 12
      2.4 Monotonicity Constraints 13
      2.5 Construct the Constraint 15
    Chapter 3 Research Methodology 17
      3.1 Monotonicity Constraints Construction 18
        3.1.1 Concept of Monotonicity 18
        3.1.2 Monotone Constraints in One-Class Classification 19
        3.1.3 Constructing the Monotonicity Constraints 20
      3.2 Derivation of the Monotonicity Constrained of SVDD 29
    Chapter 4 Experimental Results and Analysis 35
      4.1 Experiment Design 35
      4.2 Experimental Dataset 37
      4.3 Predictive assessment metrics 40
      4.4 Model Performance 42
        4.4.1 EM algorithm for two mixture Gaussian 42
        4.4.2 Decision boundary of one-class classifier 43
        4.4.3 Relation of number of constraint and time 43
        4.4.4 The statistical analysis in experimental datasets 48
    References 56

    Bioch, J. C., & Popova, V. (2002). Monotone decision trees and noisy data.
    Bishop, C. M. (1995). Neural networks for pattern recognition: Oxford University Press.
    Bonchi, F., Giannotti, F., Mazzanti, A., & Pedreschi, D. (2003). ExAMiner: Optimized level-wise frequent pattern mining with monotone constraints. Paper presented at the Third IEEE International Conference on Data Mining (ICDM 2003).
    Bonchi, F., & Lucchese, C. (2007). Extending the state-of-the-art of constraint-based pattern discovery. Data & Knowledge Engineering, 60(2), 377-399.
    Calvetti, D., Morigi, S., Reichel, L., & Sgallari, F. (2000). Tikhonov regularization and the L-curve for large discrete ill-posed problems. Journal of Computational and Applied Mathematics, 123(1-2), 423-446. doi:10.1016/S0377-0427(00)00414-3
    Cao‐Van, K., & De Baets, B. (2003). Growing decision trees in an ordinal setting. International Journal of Intelligent Systems, 18(7), 733-750.
    Carrizosa, E., & Morales, D. R. (2013). Supervised classification and mathematical optimization. Computers & Operations Research, 40(1), 150-165.
    Chen, C. C., & Li, S. T. (2014). Credit rating with a monotonicity-constrained support vector machine model. Expert Systems with Applications, 41(16), 7235-7247.
    Daniels, H., & Kamp, B. (1999). Application of MLP networks to bond rating and house pricing. Neural Computing & Applications, 8(3), 226-234.
    Daniels, H. A. M., & Velikova, M. V. (2006). Derivation of monotone decision models from noisy data. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 36(5), 705-710.
    Debar, H., Dacier, M., & Wespi, A. (2000). A revised taxonomy for intrusion-detection systems. Paper presented at the Annales des télécommunications.
    Duin, R. P. W. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers, C-25(11), 1175-1179.
    Feelders, A., & Pardoel, M. (2003). Pruning for monotone classification trees. Paper presented at the International Symposium on Intelligent Data Analysis.
    Galvan, E., Malak, R. J., Gibbons, S., & Arroyave, R. (2017). A constraint satisfaction algorithm for the generalized inverse phase stability problem. Journal of Mechanical Design, 139(1), 011401.
    Hu, Q., Che, X., Zhang, L., Zhang, D., Guo, M., & Yu, D. (2012). Rank entropy-based decision trees for monotonic classification. IEEE Transactions on Knowledge and Data Engineering, 24(11), 2052-2064. doi:10.1109/TKDE.2011.149
    Incer, I., Theodorides, M., Afroz, S., & Wagner, D. (2018). Adversarially Robust Malware Detection Using Monotonic Classification. Paper presented at the Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics.
    Jiang, Z., Feng, X., Feng, X., & Li, L. (2012). A study of SVDD-based algorithm to the fault diagnosis of mechanical equipment system. Physics Procedia, 33, 1068-1073.
    Khan, S. S., & Madden, M. G. (2009). A survey of recent trends in one class classification. Paper presented at the Irish conference on artificial intelligence and cognitive science.
    Kou, Y., Lu, C. T., Sirwongwattana, S., & Huang, Y. P. (2004). Survey of fraud detection techniques. Paper presented at the IEEE International Conference on Networking, Sensing and Control, 2004.
    Kudła, P., & Pawlak, T. P. (2018). One-class synthesis of constraints for Mixed-Integer Linear Programming with C4.5 decision trees. Applied Soft Computing, 68, 1-12.
    Li, K. L., Huang, H. K., Tian, S. F., & Xu, W. (2003). Improving one-class SVM for anomaly detection. Paper presented at the 2003 International Conference on Machine Learning and Cybernetics.
    Liu, B., Xiao, Y., Cao, L., Hao, Z., & Deng, F. (2013). SVDD-based outlier detection on uncertain data. Knowledge and Information Systems, 34(3), 597-618.
    Manevitz, L. M., & Yousef, M. (2001). One-class SVMs for document classification. Journal of Machine Learning Research, 2(Dec), 139-154.
    Mannila, H., & Toivonen, H. (1997). Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery, 1(3), 241-258.
    Tax, D. M. J. (2001). One-class classification: Concept-learning in the absence of counter-examples. ASCI dissertation series, 65.
    Mavroforakis, M. E., & Theodoridis, S. (2006). A geometric approach to support vector machine (SVM) classification. IEEE Transactions on Neural Networks, 17(3), 671-682.
    McWilliam, S. (2001). Anti-optimisation of uncertain structures using interval analysis. Computers & Structures, 79(4), 421-430.
    Moya, M. M., & Hush, D. R. (1996). Network constraints and multi-objective optimization for one-class classification. Neural Networks, 9(3), 463-474.
    Moya, M. M., Koch, M. W., & Hostetler, L. D. (1993). One-class classifier networks for target recognition applications (Technical Report SAND-93-0084C). Sandia National Laboratories.
    Nigam, K., McCallum, A. K., Thrun, S., & Mitchell, T. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2-3), 103-134.
    Pawlak, T. P. (in press). Synthesis of Mathematical Programming models with one-class evolutionary strategies. Swarm and Evolutionary Computation. doi:10.1016/j.swevo.2018.04.007
    Potharst, R., & Bioch, J. C. (2000). Decision trees for ordinal classification. Intelligent Data Analysis, 4(2), 97-111.
    Potharst, R., & Feelders, A. J. (2002). Classification trees for problems with monotonicity constraints. ACM SIGKDD Explorations Newsletter, 4(1), 1-10. doi:10.1145/568574.568577
    Schölkopf, B., Williamson, R. C., Smola, A. J., Shawe-Taylor, J., & Platt, J. C. (2000). Support vector method for novelty detection. Paper presented at the Advances in neural information processing systems.
    Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J., & Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Journal Neural Computation, 13(7), 1443-1471.
    Shin, H. J., Eom, D. H., & Kim, S. S. (2005). One-class support vector machines—an application in machine fault detection and classification. Computers & Industrial Engineering, 48(2), 395-408.
    Tax, D. M., & Duin, R. P. (1999). Support vector domain description. Pattern Recognition Letters, 20(11-13), 1191-1199.
    Vapnik, V. (1995). The nature of statistical learning theory: Springer Science & Business Media.
    Vapnik, V. (1998). The support vector method of function estimation. In Nonlinear Modeling (pp. 55-85): Springer.
    Yin, L., Wang, H., & Fan, W. (2018). Active learning based support vector data description method for robust novelty detection. Knowledge-Based Systems, 153, 40-52.
    Zhou, Y., Wu, K., Meng, Z., & Tian, M. (2017). Fault detection of aircraft based on support vector domain description. Computers & Electrical Engineering, 61, 80-94.
    Zhu, H., Tsang, E. C., Wang, X. Z., & Ashfaq, R. A. R. (2017). Monotonic classification extreme learning machine. Neurocomputing, 225, 205-213.

    Full-text availability: on campus, open from 2024-06-30; off campus, not available. The electronic thesis has not been authorized for public release; please consult the library catalog for the print copy.