
Graduate Student: Shih, I-Yu (施奕羽)
Thesis Title: Using the M5'-based procedure for learning from small samples (使用M5'模式樹為基礎學習小樣本之研究)
Advisor: Li, Der-Chiang (利德江)
Degree: Master
Department: College of Management - Department of Industrial and Information Management (on-the-job class)
Year of Publication: 2011
Academic Year of Graduation: 99
Language: Chinese
Number of Pages: 59
Chinese Keywords: small sample learning, nominal attributes (小樣本學習、名目屬性)
English Keywords: small data learning, nominal attributes
Chinese Abstract: In real life, many cases are limited by an insufficient number of samples, which leads to incomplete data structures and ambiguous information and thus constrains decision making. How to extract meaningful information from the small amount of data available, so as to build reliable and stable knowledge models and thereby obtain effective information, has become an important issue in recent years. Based on prior knowledge, this study exploits the learning process of the M5' model tree to propose a virtual sample generation method. First, the possible value ranges of the attributes are obtained during M5''s data-splitting process; virtual samples are then generated from this prior knowledge combined with a possibility assessment mechanism. The model can handle both numeric and nominal attributes. The experimental results show that prediction models built after adding the virtual samples generated by this method to the original small sample clearly improve the prediction error and accuracy on the unknown population compared with models built from the original small sample alone.
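The possibility assessment idea described in the abstract can be sketched in miniature: candidate values are drawn from a slightly extended attribute range and each is accepted with probability equal to its triangular membership degree. This is an illustrative stand-in, not the thesis's actual algorithm; the 10% range-extension factor and the use of the sample mean as the membership peak are assumptions made here for the example.

```python
import random

def triangular_membership(x, lo, peak, hi):
    """Membership degree of x in a triangular fuzzy set (lo, peak, hi)."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)

def generate_virtual_samples(data, n_virtual, extend=0.1, rng=None):
    """Acceptance-rejection sampling: draw candidates from an extended
    range and keep each with probability equal to its membership degree.
    Assumes the observed range is non-degenerate (max > min)."""
    rng = rng or random.Random(0)
    lo, hi = min(data), max(data)
    peak = sum(data) / len(data)          # assumed peak: sample mean
    span = hi - lo
    ext_lo, ext_hi = lo - extend * span, hi + extend * span
    samples = []
    while len(samples) < n_virtual:
        cand = rng.uniform(ext_lo, ext_hi)
        if rng.random() <= triangular_membership(cand, ext_lo, peak, ext_hi):
            samples.append(cand)
    return samples
```

Values near the center of the observed data are accepted far more often than values near the extended boundaries, so the virtual sample reflects the assumed population trend rather than uniform noise.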

English Abstract: This research aims to develop an effective procedure for learning more knowledge from small datasets. In most small dataset learning tasks, the incomplete data structure limits the explicit information available to decision makers. The proposed procedure, based on the prior knowledge obtained by the M5' model tree, generates additional training samples to uncover information hidden inside small datasets. The performance of five modeling tools, M5', back-propagation neural network, support vector machine for regression, multiple linear regression, and the C4.5 decision tree, is thereby improved. The proposed procedure can handle both numeric and nominal attributes, which fuzzy theory-based VSG algorithms cannot. Nine public datasets are used to form six kinds of learning problems for performance evaluation; in addition, two real cases are used for comparison with the Mega-Trend-Diffusion method, a state-of-the-art virtual sample generation algorithm. All the results show significant improvements.
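The abstract's evaluation design, fitting a predictor on the small sample alone and then on the sample augmented with virtual samples, and comparing test error, can be sketched with a toy harness. Everything below is assumed for illustration: the data are made up, ordinary least squares stands in for the five modeling tools, and midpoint interpolation stands in for the thesis's M5'-based generator.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope  # (intercept, slope)

def mae(model, xs, ys):
    """Mean absolute error of a fitted line on a held-out set."""
    a, b = model
    return sum(abs(a + b * x - y) for x, y in zip(xs, ys)) / len(xs)

# Assumed small training sample and test set (not from the thesis).
train_x, train_y = [0.0, 1.0, 3.0, 4.0], [1.2, 2.9, 7.1, 9.2]
test_x, test_y = [0.5, 2.0, 3.5], [2.1, 5.0, 8.1]

# Virtual samples: midpoints of adjacent observations, a crude
# placeholder for the M5'-based generator evaluated in the thesis.
virt_x = [(a + b) / 2 for a, b in zip(train_x, train_x[1:])]
virt_y = [(a + b) / 2 for a, b in zip(train_y, train_y[1:])]

baseline = mae(fit_line(train_x, train_y), test_x, test_y)
augmented = mae(fit_line(train_x + virt_x, train_y + virt_y), test_x, test_y)
```

On toy data an improvement is not guaranteed; the thesis's claim rests on the six learning problems and two real cases it evaluates, not on this sketch.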

Table of Contents
Abstract (Chinese) I
Abstract (English) II
Acknowledgements III
Table of Contents IV
List of Figures VII
List of Tables IX
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 3
1.3 Research Objectives 4
1.4 Research Framework and Procedure 4
Chapter 2 Literature Review 7
2.1 Virtual Sample Generation Methods 7
2.1.1 Information Diffusion 7
2.1.2 Other Virtual Sample Generation Methods 13
2.2 Prediction Models 17
2.2.1 Model Trees 17
2.2.2 Back-Propagation Neural Networks 20
2.2.3 Support Vector Regression 21
2.2.4 Multiple Linear Regression 22
2.2.5 The C4.5 Decision Tree 22
Chapter 3 Research Method 24
3.1 The M5' Model Tree 24
3.1.1 Attribute Transformation 25
3.1.2 Data Splitting 26
3.1.3 Regression Model Construction 27
3.1.4 Smoothing of Predicted Values 27
3.2 Sample Generation Method 28
3.2.1 Estimating the Range of the Output Attribute 28
3.2.2 Output-Attribute Ranges Corresponding to Categorical Values 29
3.2.3 Estimating the Ranges of Numeric Input Attributes 30
3.2.4 Possibility Assessment Mechanism 30
3.3 Classification with Numeric Prediction Tools 32
Chapter 4 Empirical Validation 34
4.1 Software Selection 34
4.2 Data Collection and Description 35
4.2.1 Description of the UCI Data 35
4.2.2 Description of the Literature Data 36
4.2.3 Experimental Procedure 37
4.3 Sample-Size Sensitivity Analysis and Evaluation 43
4.3.1 Method Validation 43
4.3.2 Method Comparison 49
4.4 Further Study and Discussion 51
Chapter 5 Conclusions and Suggestions 53
References 54
List of Figures
Figure 1-1 Information gaps hidden between observations 2
Figure 1-2 Relationships among the population, the small sample, and virtual samples 3
Figure 1-3 Procedure of this research 6
Figure 2-1 (a) Individual fuzzification (Huang and Moraga, 2004); (b) overall fuzzification (Li et al., 2005) 9
Figure 2-2 Schematic of range extension by the mega-trend-diffusion technique 10
Figure 2-3 Mapping the input space to the feature space through the function ψ 21
Figure 2-4 Maximizing the distance between the two classes of training data and the hyperplane 21
Figure 3-1 Virtual sample generation procedure 24
Figure 3-2 Fuzzy triangular membership function of the output attribute based on min, , and max 29
Figure 3-3 New fuzzy triangular membership function when a new categorical value affects the initial output attribute 30
Figure 3-4 Fuzzy triangular membership function of a numeric input attribute based on , , and  30
Figure 3-5 Computing the triangular membership value of tv 32
Figure 3-6 Model converting the Credit classification problem into a numeric prediction problem 33
Figure 3-7 Predicting the class value of Credit by numeric prediction 33
Figure 4-1 Trend chart of the M5' model tree experimental results 45
Figure 4-2 Trend chart of the BPN experimental results 46
Figure 4-3 Trend chart of the SVR experimental results 47
Figure 4-4 Trend chart of the MLR experimental results 48
Figure 4-5 Trend chart of the C4.5 experimental results 49
Figure 4-6 Decision information produced by M5' from the virtual samples of this research 52
List of Tables
Table 2-1 An example of time-dependent time-series data 16
Table 2-2 Training and test datasets composed of time-dependent data 16
Table 2-3 Current disciplines related to model trees 18
Table 3-1 Sample data stored in a leaf node 28
Table 4-1 Number of records and attributes in each dataset 36
Table 4-2 The 20 Servo samples 38
Table 4-3 Servo attributes and value ranges 38
Table 4-4 Converted values of the nominal attribute motor 38
Table 4-5 The 20 Servo records after conversion 39
Table 4-6 Group 1 and Group 2 data 40
Table 4-7 Value ranges of each attribute in data subsets #1 and #2 40
Table 4-8 Group 1-1 and Group 1-2 data 41
Table 4-9 Attributes and value ranges of the Group 1-1 and 1-2 data 41
Table 4-10 M5' model tree experimental results 44
Table 4-11 BPN experimental results 45
Table 4-12 SVR experimental results 46
Table 4-13 MLR experimental results 47
Table 4-14 C4.5 experimental results 49
Table 4-15 Results of this method and MTD with BPN at each sampling size on the MLCC data 50
Table 4-16 Results of this method and MTD with BPN at each sampling size on the bladder cancer data 51

    Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth International Group.
    Chaudhuri, P., Huang, M. C., Loh, W. Y., and Rubin, R. (1994). Piecewise-polynomial regression trees. Statistica Sinica, 4, 143-167.
    Chao, G. Y., Tsai, T. I., Lu, T. J., Hsu, H. C., Bao, B. Y., Wu, W. Y., et al. (2011). A new approach to prediction of radiotherapy of bladder cancer cells in small dataset analysis. Expert Systems with Applications, 38(7), 7963-7969.
    Dobra, A., Gehrke, J. E. (2002). SECRET: A scalable linear regression tree algorithm. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 481-487.
    Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A., and Vapnik, V. (1997). Support vector regression machines. In M. Mozer, M. Jordan, and T. Petsche (Eds.), Advances in Neural Information Processing Systems, Vol. 9 (pp. 155-161). Cambridge, MA: MIT Press.
    Efron, B., Tibshirani, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman & Hall.
    Frank, E., Wang, Y., Inglis, S., Holmes, G., and Witten, I. H. (1998). Technical note: Using model trees for classification. Machine Learning, 32, 63-76.
    Guo, G. D., Dyer, C. R. (2005). Learning from examples in the small sample case: Face expression recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 35(3), 477-488.
    Hong, T. P., Tseng, L. H., and Chien, B. C. (2010). Mining from incomplete quantitative data by fuzzy rough sets. Expert Systems with Applications, 37(3), 2644-2653.
    Huang, C. F., Moraga, C. (2004). A diffusion-neural-network for learning from small samples. International Journal of Approximate Reasoning, 35, 137-161.
    Huang, C. J., Wang, H. F. (2010). Prediction of the Period of Psychotic Episode in Individual Schizophrenics by Simulation-Data Construction Approach.
    Huang, C. F. (1997). Principle of information diffusion. Fuzzy Sets and Systems, 91, 69-90.
    Ivănescu, V. C., Bertrand, J. W. M., Fransoo, J. C., and Kleijnen, J. P. C. (2006). Bootstrapping to solve the limited data problem in production control: an application in batch process industries. Journal of the Operational Research Society, 57(1), 2-9.
    Jang, J.S.R. (1993). ANFIS: adaptive-network-based fuzzy inference systems. IEEE Transactions on Systems, Man and Cybernetics. 23, 665-685.
    Jennrich, R. I., Schluchter, M. D. (1986). Unbalanced repeated-measures models with structured covariance matrices. Biometrics, 42, 805-820.
    Karalic, A. (1992). Employing linear regression in regression tree leaves. Proceedings of the 10th European Conference on Artificial Intelligence, 440-441.
    Kuo, Y., Yang, T., Peters, B. A., and Chang, I. (2007). Simulation metamodel development using uniform design and neural networks for automated material handling systems in semiconductor wafer fabrication. Simulation Modelling Practice and Theory, 15(8), 1002-1015.
    Laird, N. M., Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38, 963-974.
    Lanouette, R., Thibault, J., and Valade, J. L. (1999). Process modeling with neural networks using small experimental datasets. Computers & Chemical Engineering, 23(9), 1167-1176.
    Li, D. C., Chen, L. S., and Lin, Y. S. (2003). Using functional virtual population as assistance to learn scheduling knowledge in dynamic manufacturing environments. International Journal of Production Research, 41(17), 4011-4024.
    Li, D. C., Fang, Y. H., Lai, Y. Y., and Hu, S. C. (2009). Utilization of virtual samples to facilitate cancer identification for DNA microarray data in the early stages of an investigation. Information Sciences, 179(16), 2740-2753.
    Li, D. C., Hsu, H. C., Tsai, T. I., Lu, T. J., and Hu, S. C. (2007a). A new method to help diagnose cancers for small sample size. Expert Systems with Applications, 33, 420-424.
    Li, D. C., Lin, Y. S. (2006). Using virtual sample generation to build up management knowledge in the early manufacturing stages. European Journal of Operational Research, 175, 413-434.
    Li, D. C., Liu, C. W., and Hu, S. C. (2010). A learning method for the class imbalance problem with medical data sets. Computers in Biology and Medicine, 40(5), 509-518.
    Li, D. C., Tsai, T. I., and Shi, S. (2009). A prediction of the dielectric constant of multi-layer ceramic capacitors using the mega-trend-diffusion technique in powder pilot runs: Case study. International Journal of Production Research, 19, 51-69.
    Li, D. C., Wu, C., and Chen, F. M. (2005). Using data-fuzzification technology in small data set learning to improve FMS scheduling accuracy. International Journal of Advanced Manufacturing Technology, 27, 321-328.
    Li, D. C., Wu, C., Tsai, T. I., and Chang, F. M. (2006). Using mega-fuzzification and data trend estimation in small data set learning for early FMS scheduling knowledge. Computers & Operations Research, 33, 1857-1869.
    Li, D. C., Wu, S. S., Tsai, T. I., and Lin, Y. S. (2007b). Using mega-trend-diffusion and artificial samples in small data set learning for early flexible manufacturing system scheduling knowledge. Computers & Operations Research, 34, 966-982.
    Lin, Y. S., Li, D. C. (2010). The generalized-trend-diffusion modeling algorithm for small data sets in the early stages of manufacturing systems. European Journal of Operational Research, 207(1), 121-130.
    Loh, W.Y. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12, 361-386.
    Niyogi, P., Girosi, F., and Poggio, T. (1998). Incorporating prior information in machine learning by creating virtual examples. Proceedings of the IEEE, 275-298.
    Oniśko, A., Druzdzel, M. J., and Wasyluk, H. (2001). Learning Bayesian network parameters from small data sets: Application of Noisy-OR gates. International Journal of Approximate Reasoning, 27(2), 165-182.
    Quinlan, J. R. (1979). Discovering rules from large collections of examples: A case study. In D. Michie (Ed.), Expert Systems in the Microelectronic Age. Edinburgh, Scotland: Edinburgh University Press.
    Quinlan, J. R. (1992). Learning with continuous classes. Proceedings of the Australian Joint Conference on Artificial Intelligence, 343-348.
    Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
    Roiger, R. J., Geatz, M. W. (2003). Data Mining: A Tutorial-Based Primer. New York: Addison-Wesley.
    Thomas, M., Kanstein, A., and Goser, K. (1997). Rare fault detection by possibilistic reasoning. Computational Intelligence - Theory and Applications, 1226, 294-298.
    Tsai, T. I., Li, D. C. (2008). Approximate modeling for high order non-linear functions using small sample sets. Expert Systems with Applications, 34(1), 564-569.
    Vapnik, V. N. (2000). The Nature of Statistical Learning Theory, Springer-Verlag, New York.
    Wang, H. F., Huang, C. J. (2008). Data construction method for the analysis of the spatial distribution of disastrous earthquakes in Taiwan. 189-212.
    Wang, Y., Witten, I.H. (1997). Inducing model trees for continuous classes, Proceedings of poster papers of the 9th European Conference on Machine Learning.
    Witten, I. H., Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques (2nd ed.). San Francisco: Morgan Kaufmann.
    Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241-259.
    Yen, S. M. F., Hsu, Y. L. (2010). Profitability of technical analysis in financial and commodity futures markets - A reality check. Decision Support Systems, 50(1), 128-139.

    Available on campus: 2012-07-25
    Available off campus: 2016-07-25