
Author: Tsai, Zung-En (蔡宗恩)
Thesis Title: Two-stage Feature Selection: Mass Detection in Mammograms
Advisor: Guo, Shu-Mei (郭淑美)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2005
Academic Year: 93 (2004-2005)
Language: English
Number of Pages: 70
Keywords: feature selection, two-stage

A computer-aided diagnosis (CAD) system for mass detection is commonly composed of four parts: (1) image pre-processing, (2) feature extraction, (3) feature selection, and (4) machine learning (detection and classification). For a mass detection system, the primary requirement is high accuracy; computing time comes second. Two-stage feature selection is therefore proposed to improve the performance of the feature selection step.
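As a structural illustration of this four-part pipeline, the short Python sketch below wires the sub-systems together. Every function in it is a hypothetical toy placeholder (intensity normalization, global statistics, a fixed subset, a threshold rule), not the pre-processing, texture features, selector, or classifier actually used in the thesis.

```python
import numpy as np

# All four functions are toy stand-ins for the sub-systems named above;
# the thesis components are far more elaborate.
def preprocess(image):
    """(1) Image pre-processing: normalize intensities to [0, 1]."""
    span = image.max() - image.min()
    return (image - image.min()) / (span + 1e-12)

def extract_features(image):
    """(2) Feature extraction: toy global statistics."""
    return np.array([image.mean(), image.std(), image.max()])

def select_features(features, keep=(0, 1)):
    """(3) Feature selection: fixed subset, standing in for the
    two-stage method sketched after the next paragraph."""
    return features[list(keep)]

def detect(features, threshold=0.5):
    """(4) Detection/classification: trivial decision rule."""
    return bool(features.mean() > threshold)

patch = np.random.rand(64, 64)  # stand-in mammogram patch
result = detect(select_features(extract_features(preprocess(patch))))
```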
In the first stage, we apply two measures, the correlation coefficient and information gain, to gather highly correlated features into clusters, and then choose each cluster's representative feature by class separability. Sequential forward search (SFS) or sequential forward floating search (SFFS) is then applied to pick the optimal set of representative features for the first stage. At the beginning of the second stage, we aggregate the selected representative features together with the features highly correlated with them. SFS or SFFS is then applied once more to obtain the final, refined optimal feature set.
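The following Python sketch makes the two-stage flow concrete. It is a minimal illustration under assumptions, not the thesis implementation: clustering here uses only the correlation coefficient (the thesis also uses information gain), only plain SFS is shown (the thesis alternatively uses SFFS), the 0.9 correlation threshold and the Fisher-style separability ratio are illustrative choices, and score_fn abstracts the thesis's wrapper evaluation (an ANN or PNN assessed by k-fold cross validation).

```python
import numpy as np

def correlation_clusters(X, threshold=0.9):
    """Greedily group features whose |Pearson correlation| with the
    cluster seed exceeds the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = set(range(X.shape[1]))
    clusters = []
    while unassigned:
        seed = unassigned.pop()
        cluster = [seed] + [j for j in unassigned if corr[seed, j] >= threshold]
        unassigned -= set(cluster)
        clusters.append(cluster)
    return clusters

def class_separability(x, y):
    """Fisher-style ratio for one feature in a two-class problem:
    between-class scatter over within-class scatter."""
    x0, x1 = x[y == 0], x[y == 1]
    return (x0.mean() - x1.mean()) ** 2 / (x0.var() + x1.var() + 1e-12)

def sfs(X, y, candidates, score_fn):
    """Plain sequential forward search: repeatedly add the candidate
    that most improves the wrapper score; stop when nothing helps."""
    selected, best = [], -np.inf
    while True:
        gains = [(score_fn(X[:, selected + [f]], y), f)
                 for f in candidates if f not in selected]
        if not gains:
            break
        score, f = max(gains)
        if score <= best:
            break
        best = score
        selected.append(f)
    return selected

def two_stage_selection(X, y, score_fn, threshold=0.9):
    # Stage 1: cluster correlated features, keep the most separable
    # member of each cluster, and run SFS over those representatives.
    clusters = correlation_clusters(X, threshold)
    reps = [max(c, key=lambda f: class_separability(X[:, f], y))
            for c in clusters]
    stage1 = sfs(X, y, reps, score_fn)
    # Stage 2: re-admit the features correlated with each selected
    # representative, then refine the enlarged pool with a second SFS pass.
    pool = sorted({f for c in clusters if set(c) & set(stage1) for f in c})
    return sfs(X, y, pool, score_fn)
```

In use, score_fn would be something like cross-validated classifier accuracy computed on the candidate feature columns, mirroring the wrapper evaluation described in the thesis.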
While investigating two-stage feature selection, we found that the optimal feature set selected in the second stage is nearly identical to the one selected in the first stage, and the recognition rates of the two stages are usually the same as well. We conclude that a representative feature largely captures the behavior of the highly correlated features in its cluster. Furthermore, computing time can be reduced by pruning specific steps of two-stage feature selection; however, the more steps are pruned, the worse the accuracy becomes. Speed and accuracy are therefore a tradeoff. Since CAD systems place strict demands on accuracy, any pruning of two-stage feature selection must be considered carefully.

Chapter 1 Introduction 1
  1.1 Survey of Feature Selection Methods 1
  1.2 Motivation 3
  1.3 Organization of the Thesis 5
Chapter 2 Feature Extraction 8
Chapter 3 Two-stage Feature Selection 13
  3.1 Reducing Redundancy 15
    3.1.1 Correlation Coefficient 15
    3.1.2 Information Gain 19
    3.1.3 Comparison between Correlation Coefficient and Information Gain 21
  3.2 Choosing Representative Feature 23
    3.2.1 Principal Component Analysis 23
    3.2.2 Class Separability 25
  3.3 Feature Selection Strategies 33
    3.3.1 Sequential Forward Search 33
    3.3.2 Sequential Forward Floating Search 33
    3.3.3 Detection Methodologies 34
      3.3.3.1 Artificial Neural Network and Probabilistic Neural Network 34
      3.3.3.2 K-Fold Cross Validation 37
Chapter 4 Experiments 38
  4.1 Database Description 38
  4.2 Algorithm Description 41
  4.3 Experimental Results and Discussion 44
Chapter 5 Conclusions 54

Full text available on campus: 2006-07-27
Full text available off campus: 2006-07-27