| Graduate Student: | 陳麗君 Chen, Li-Chun |
|---|---|
| Thesis Title: | 基於支持向量資料描述的一新邊界距離量測值及其於分類上之應用 (A New Boundary Distance Measure via Support Vector Domain Description and Its Application to Classification) |
| Advisor: | 郭淑美 Guo, Shu-Mei |
| Degree: | 碩士 (Master) |
| Department: | 電機資訊學院 College of Electrical Engineering and Computer Science - 資訊工程學系 Department of Computer Science and Information Engineering |
| Year of Publication: | 2008 |
| Academic Year of Graduation: | 96 (ROC calendar) |
| Language: | English |
| Pages: | 51 |
| Chinese Keywords: | 多類別分類 (multiclass classification), 單一類別分類 (one-class classification), 支持向量資料描述 (support vector domain description) |
| English Keywords: | one-class classification, support vector domain description, multiclass classification |
Support vector domain description (SVDD) constructs a boundary description for target data of a single class, where the boundary is determined by only a few target objects, known as support vectors. This thesis proposes a new boundary distance measure: based on the geometric interpretation of the inner product, it locates, for each object, the nearest point on the decision boundary in the data space, yielding that object's boundary distance. The measure is then applied to refine the boundary originally constructed from the support vectors. Compared with the existing approach, which obtains a better decision boundary through the costly step of normalizing the data variance via kernel principal component analysis, the proposed method achieves comparable or better classification results with higher computational efficiency. The boundary distance also extends to multiclass classification: the proposed method builds a boundary description for each class via SVDD, computes the boundary distance between a test object and each class, and assigns the object to the class with the minimum distance. Experimental results show that, compared with other support vector classifiers, the proposed multiclass classifier attains comparable or even better classification accuracy.
A new boundary distance measure based on support vector domain description (SVDD) is proposed in this thesis. It measures the distance between an object and the boundary, which in the SVDD is described by a small number of training objects called support vectors. In the first part of the thesis, the boundary distance measure is derived by locating an object's nearest boundary point with the help of the geometric interpretation of the inner product; a simple post-processing method based on this measure is then proposed, which modifies the SVDD boundary to achieve a tighter data description. Experimental results show that the proposed decision boundary fits the shape of synthetic data distributions closely and achieves comparable or better classification performance on real-world datasets. Compared with kernel whitening, which is often applied to remedy the slack-boundary problem of the SVDD, the proposed method constructs a better decision boundary more efficiently. In the second part of the thesis, the boundary distance measure is applied to multiclass classification: multiple SVDDs, one per class, are combined under a minimum-distance strategy to decide which class a test object belongs to. Experimental results also show that the proposed minimum-distance algorithm achieves comparable or better performance than other support vector based classifiers.
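The one-SVDD-per-class, minimum-distance scheme described in the abstract can be sketched in a few lines. The sketch below is a simplification, not the thesis's actual algorithm: it replaces the quadratic-programming SVDD solution with the special case in which every training object receives equal weight, so the sphere centre reduces to the kernel-space mean, and it uses the distance to that centre rather than the proposed boundary distance. The names `CentroidSVDD` and `min_distance_classify` are illustrative.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and the rows of Y.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-sq / (2.0 * sigma**2))

class CentroidSVDD:
    """Simplified SVDD: with equal weights on all training objects, the
    sphere centre is the kernel-space mean of the training set."""

    def fit(self, X, sigma=1.0):
        self.X, self.sigma = X, sigma
        n = len(X)
        # Constant <centre, centre> term, precomputed from the Gram matrix.
        self.center_term = rbf_kernel(X, X, sigma).sum() / n**2
        return self

    def dist2(self, Z):
        # Squared kernel-space distance from each row of Z to the centre;
        # K(z, z) = 1 for the RBF kernel.
        return 1.0 - 2.0 * rbf_kernel(Z, self.X, self.sigma).mean(axis=1) + self.center_term

def min_distance_classify(models, Z):
    # Assign each test object to the class whose description is nearest.
    d = np.stack([m.dist2(Z) for m in models])
    return d.argmin(axis=0)
```

In a full implementation the uniform weights would be replaced by the Lagrange multipliers obtained from the SVDD quadratic program, and the distance to the decision boundary (via the nearest boundary point) would be used in place of the centre distance.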