| Author: | 吳咏璁 Wu, Yung-Tsung |
|---|---|
| Thesis title: | A Study on Methods for Selecting Splitting Attributes in Model Trees (模式樹選取分割屬性方法之研究) |
| Advisor: | 翁慈宗 Wong, Tzu-Tsung |
| Degree: | Master |
| Department: | College of Management - Department of Industrial and Information Management |
| Year of publication: | 2004 |
| Academic year of graduation: | 92 (2003-04) |
| Language: | Chinese |
| Pages: | 49 |
| Keywords (Chinese): | coefficient of determination, linear regression, mean squared error, model tree, feature selection |
| Access counts: | Views: 82, Downloads: 5 |
In data mining, two main methods are used for numeric prediction: regression trees and model trees. A model tree resembles a decision tree, with one difference: instead of storing a class label for classification at each leaf node, a model tree stores a linear regression equation for numeric prediction, which makes model trees very useful for real-world numeric prediction problems. Prior research offers three model-tree growing methods, SDR, FAR, and RETIS, each with its own splitting mechanism and learning performance. This study compares these growing methods on four test criteria: accuracy, recovery of the original data structure, similarity to the original model tree, and computation time, aiming to use broader experimental analysis to clarify when each growing method is appropriate.
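To make the splitting step concrete, the following is a minimal sketch of how the SDR criterion chooses a split threshold on one numeric attribute, following the standard-deviation-reduction formula from Quinlan's M5 algorithm (Quinlan, 1992). The function names and the toy data are illustrative, not from the thesis, and the full algorithm would additionally build leaf regressions, prune, and smooth.

```python
import numpy as np

def sdr(y, y_left, y_right):
    """Standard deviation reduction of a candidate split:
    sd(T) - sum_i |T_i|/|T| * sd(T_i), as in M5."""
    n = len(y)
    return (np.std(y)
            - (len(y_left) / n) * np.std(y_left)
            - (len(y_right) / n) * np.std(y_right))

def best_split(x, y):
    """Scan midpoints between consecutive sorted values of one
    numeric attribute; return the threshold with the largest SDR."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_t, best_gain = None, -np.inf
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no valid threshold between equal values
        t = (xs[i] + xs[i - 1]) / 2
        gain = sdr(ys, ys[:i], ys[i:])
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain

# Toy usage: targets jump between the two clusters of x values,
# so the best threshold falls between 3 and 10.
x = np.array([1.0, 2, 3, 10, 11, 12])
y = np.array([1.0, 1, 1, 5, 5, 5])
t, gain = best_split(x, y)  # t == 6.5
```

RETIS differs from this sketch by evaluating candidate splits with the residual error of linear models fitted in each child rather than with raw standard deviations, which is why its cost grows quickly with the number of attributes, as the comparison below notes.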
The comparison results suggest that, for numeric prediction in data mining, RETIS is the more suitable growing method when a data set contains few attributes: it performs well in both accuracy and recovery of the original data structure, and its computation time remains acceptable because the attributes are few. When the number of attributes is large, SDR and FAR are the more suitable growing methods, each with its own strengths: SDR performs better in accuracy and computation time, while FAR performs better in recovering the original data structure and in model-tree similarity. Researchers can therefore choose the growing method that best fits the state of their data and the aspects they value most.
陳子立 (2003). A method for constructing model trees combining feature selection and the coefficient of determination (in Chinese), Master's thesis, Institute of Industrial Management Science, National Cheng Kung University.
Alexander, W.P. and Grimshaw, S.D. (1996). Treed regression, Journal of Computational and Graphical Statistics, 5, 156-175.
Ari, B. and Guvenir, H. A. (2002). Clustered linear regression, Knowledge-Based Systems, 15, 169-175.
Berikov, V.B. and Rogozin, I.B. (1999). Regression trees for analysis of mutational spectra in nucleotide sequences, Bioinformatics, 15, 553-562.
Blum, A. L. and Langley, P. (1997). Selection of relevant features and examples in machine learning, Artificial Intelligence, 97, 245-271.
Breiman, L. (1996). Bagging predictors, Machine Learning, 24, 123-140.
Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and Regression Trees, Belmont, CA: Wadsworth International Group.
Chipman, H. A., George, E. I., and McCulloch, R. E. (1998). Bayesian CART model search, Journal of the American Statistical Association, 93, 935-960.
Chipman, H., George, E.I., and McCulloch, R.E. (2002). Bayesian treed models, Machine Learning, 48:1-3, 299-320.
Kampichler, C., Dzeroski, S., and Wieland, R. (2000). Application of machine learning techniques to the analysis of soil ecological data bases: relationships between habitat features and Collembolan community characteristics, Soil Biology and Biochemistry, 32:2, 197-209.
Karalic, A. (1992). Linear regression in regression tree leaves. European Conference on Artificial Intelligence.
LeBlanc, M. and Tibshirani, R. (1998). Monotone shrinkage of trees, Journal of Computational and Graphical Statistics, 7, 417-433.
Neter, J., Kutner, M. H., Nachtsheim, C. J., and Wasserman, W. (1996). Applied Linear Regression Models, Burr Ridge, IL: Irwin.
Peters, G., Morrissey, M. T., Sylvia, G., and Bolte, J. (1996). Linear regression, neural network and induction analysis to determine harvesting and processing effects on surimi quality, Journal of Food Science, 61:5, 876-880.
Quinlan, J. R. (1992). Learning with continuous classes, Proceedings of the Australian Joint Conference on Artificial Intelligence, 343-348.
Quinlan, J. R. (1993). Combining instance-based and model-based learning, Proceedings on the Tenth International Conference of Machine Learning, 236-243.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society, Series B, 58, 267-288.
Torgo, L. (1997). Functional models for regression tree leaves, Proceedings of the International Machine Learning Conference, 385-393.
Wang, Y. and Witten, I. H. (1997). Inducing model trees for continuous classes, Proceedings of the poster papers of the 9th European Conference on Machine Learning.