| Author: | Fang, Qian (方茜) |
|---|---|
| Thesis title: | Kernel K Medoids Algorithm with Selected Initial Values (有初始值篩選的核函數K中心聚類法) |
| Advisor: | Wen, Miin-Jye (溫敏杰) |
| Degree: | Master |
| Department: | Department of Statistics, College of Management |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 (ROC calendar) |
| Language: | English |
| Number of pages: | 36 |
| Keywords (Chinese): | 核函數 (kernel function), K-中心點 (K-medoids), 初始值 (initial values) |
| Keywords (English): | Kernel function, K medoids, Initialization |
The clustering method proposed in this study combines the Gaussian kernel function with the k-medoids algorithm, and additionally uses the variable Vj (Park and Jun, 2009) to rank the data and select the r middle-ranked objects as initial cluster centers. The selection of initial values makes the clustering process more efficient, while the Gaussian kernel makes the method less sensitive to outliers and noisy data. To evaluate the proposed method, we analyze several real, synthetic, and relational datasets, assess the results with the ARI (Adjusted Rand Index), F1 score, and MSE (Mean Squared Error), and compare them against the clustering results of the k-means and k-medoids algorithms. The evaluation shows that the proposed method achieves better clustering performance than both k-means and k-medoids.
This study proposes a clustering algorithm that combines the Gaussian kernel function with the k-medoids clustering algorithm. Meanwhile, we use a variable called Vj (Park and Jun, 2009) to rank objects and select the r middle values as our initial centers. The selection of initial values makes the clustering process more efficient, and the use of the Gaussian kernel function makes the clustering outcome more resistant to outliers and noise. To evaluate the proposed algorithm, we analyze several real, synthetic, and relational datasets and compare the results with those of other algorithms in terms of the Adjusted Rand Index, F1 score, and Mean Squared Error. The outcomes show that our proposed algorithm has better clustering performance than the other algorithms (k-means, k-medoids) considered in this study.
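The procedure described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the thesis code: it uses the Gaussian-kernel-induced distance d(x, y) = K(x, x) + K(y, y) − 2K(x, y) = 2 − 2·exp(−‖x − y‖²/2σ²) in place of Euclidean distance, and ranks objects by the Park–Jun variable v_j = Σ_i d_ij / Σ_l d_il. Selecting exactly k middle-ranked objects as initial medoids, and all function and parameter names, are assumptions made for this sketch.

```python
import numpy as np

def gaussian_kernel_distance(X, sigma=1.0):
    """Pairwise kernel-induced distances: d(x, y) = 2 - 2*exp(-||x-y||^2 / (2*sigma^2)),
    since K(x, x) = K(y, y) = 1 for the Gaussian kernel."""
    sq = np.sum(X**2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # guard against tiny negatives
    return 2.0 - 2.0 * np.exp(-sq_dists / (2.0 * sigma**2))

def select_initial_medoids(D, k):
    """Rank objects by v_j = sum_i d_ij / sum_l d_il (Park and Jun, 2009)
    and take the k middle-ranked objects as initial medoids
    (the thesis selects r middle values; here r = k for illustration)."""
    v = np.sum(D / np.sum(D, axis=1, keepdims=True), axis=0)
    order = np.argsort(v)
    start = max(0, len(order) // 2 - k // 2)
    return order[start:start + k]

def kernel_k_medoids(X, k, sigma=1.0, max_iter=100):
    """K-medoids in the kernel-induced distance space with Park-Jun-style
    initialization; returns (labels, medoid indices)."""
    D = gaussian_kernel_distance(X, sigma)
    medoids = select_initial_medoids(D, k)
    for _ in range(max_iter):
        labels = np.argmin(D[:, medoids], axis=1)   # assign to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):                          # update each medoid
            members = np.where(labels == c)[0]
            if len(members) == 0:
                continue
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):    # converged
            break
        medoids = new_medoids
    return np.argmin(D[:, medoids], axis=1), medoids
```

Because the kernel distance saturates at 2 for far-apart points, a single extreme outlier contributes a bounded amount to every within-cluster sum, which is one way to see the claimed robustness to outliers.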
[1] Agrawal, K. P., Garg, S. and Patel, P. Performance Measures for Densed and Arbitrary Shaped Clusters. CS-Journals, Vol. 6, pp. 388-350, 2015.
[2] Chang, C. C. and Lin, C. J. Training ν-support vector classifiers: Theory and algorithms. Neural Computation, 13(9), 2119–2147, 2001.
[3] Duda, R., Hart, P. and Stork, D. Pattern Classification, second ed. John Wiley and Sons, New York, 2001.
[4] Hubert, L. and Arabie, P. Comparing partitions. Journal of Classification, 2, 193–218, 1985.
[5] Jain, A. K. Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 31, 651–666, 2010.
[6] Kaufman, L. and Rousseeuw, P. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley, New York, ISBN 0-471-87876-6, 1990.
[7] Lance, G. N. and Williams, W. T. A General Theory of Classificatory Sorting Strategies: 1. Hierarchical Systems. The Computer Journal, 9(4), 373–380, 1967.
[8] MacQueen, J. Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, pp. 281–297, 1967.
[9] Mei, J. P. and Chen, L. Fuzzy clustering with weighted medoids for relational data. Pattern Recognition, 43, 1964–1974, 2010.
[10] Park, H. S. and Jun, C. H. A simple and fast algorithm for K-medoids clustering. Expert Systems with Applications, 36, 3336–3341, 2009.
[11] Saltelli, A., Tarantola, S., Campolongo, F. and Ratto, M. Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. New York: Wiley, 2004.
[12] Wu, K. L. and Lin, Y. J. Kernelized K-Means Algorithm Based on Gaussian Kernel. Advances in Control and Communication, pp. 657–664, 2012.