
Graduate student: Chao, Chan-Hsiang (趙展祥)
Thesis title: Employing Theoretical Samples to Improve Learning Performance of Small Data Sets (應用理論樣本改善小樣本學習之效果)
Advisor: Li, Der-Chiang (利德江)
Degree: Master
Department: College of Management - Department of Industrial and Information Management (on-the-job class)
Year of publication: 2011
Graduation academic year: 99 (ROC calendar, i.e., 2010-11)
Language: Chinese
Number of pages: 43
Chinese keywords: 小樣本學習 (small-sample learning), 虛擬樣本 (virtual samples)
Foreign keywords: small data learning, virtual samples
  • Small-sample learning problems commonly arise in the early stages of system construction, or when environmental factors limit data availability, yet information must still be extracted, or forecasting models built, from the few samples obtained. Most machine learning algorithms, as well as regression models from statistics, impose minimum requirements on the number of training samples; when samples are too scarce, they often fail to produce valid and robust forecasting models. Taking enlargement of the training set as its research direction, this thesis proposes a two-stage virtual sample generation method. First, the splitting criterion of the M5' model tree is used to partition the small data set according to its output attribute values, so that samples with similar outputs fall into the same block (leaf node). Then the expected value of each attribute is computed, and virtual samples, called theoretical samples, are produced by permutation and combination. Experimental results show that when theoretical samples are added to a small data set, the resulting forecasting models are more robust, forecast more accurately, and capture more information than models built from the small data set alone. In addition, this study takes the mega-trend-diffusion technique as a benchmark, and the results show that the proposed method outperforms it on the case presented by Li et al. (2007).

    Small-data-set learning problems can be observed in the early stages of a system or when data are difficult to collect. However, most learning algorithms that can provide robust and precise forecasting models assume that the training sample size meets their minimum requirements. For this reason, a two-stage procedure is proposed to generate additional training samples for these learning tools, improving their performance when the data set is small. In the first stage, a training data set is partitioned into several subsets according to the splitting criterion of the M5' model tree, and prior knowledge is learned from these subsets. In the second stage, this prior knowledge is used to create additional training samples, which we call theoretical samples, by permutation. Experimental results show that forecasting models built from training sets containing theoretical samples are more robust and precise than models built without them. In addition, a virtual sample generation algorithm, the mega-trend-diffusion (MTD) technique, is taken as a benchmark, and the results show that the proposed procedure outperforms MTD on the case presented by Li et al. (2007).
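    The two-stage procedure described in the abstract can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the thesis's exact algorithm: a real M5' tree splits recursively on input attributes, and the thesis's permutation rules are more elaborate; the function names `sdr_split` and `theoretical_samples`, the choice of candidate values (observed values plus the mean), and the leaf-mean labelling are all simplifications introduced here.

```python
import numpy as np
from itertools import product

def sdr_split(y):
    """Index that maximizes standard deviation reduction (SDR),
    a simplified, one-dimensional form of the M5' model tree
    splitting criterion, applied to the sorted output values."""
    n = len(y)
    best_i, best_sdr = 1, -np.inf
    for i in range(1, n):
        sdr = np.std(y) - (i / n) * np.std(y[:i]) - ((n - i) / n) * np.std(y[i:])
        if sdr > best_sdr:
            best_i, best_sdr = i, sdr
    return best_i

def theoretical_samples(X, y):
    """Two-stage sketch: (1) sort the small data set by output value
    and split it into two leaves with SDR; (2) inside each leaf, take
    each attribute's observed values plus its expected value (mean)
    as candidates, and combine them by cartesian product into
    'theoretical' samples labelled with the leaf's mean output."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    order = np.argsort(y)
    X, y = X[order], y[order]
    i = sdr_split(y)
    samples = []
    for Xl, yl in ((X[:i], y[:i]), (X[i:], y[i:])):
        # candidate values per attribute: observed values + their mean
        cands = [sorted(set(Xl[:, j]) | {Xl[:, j].mean()})
                 for j in range(Xl.shape[1])]
        for combo in product(*cands):
            samples.append((list(combo), float(yl.mean())))
    return samples
```

    On a toy set of four samples with two well-separated output clusters, the split lands between the clusters, and each leaf contributes a small grid of theoretical samples that can then be appended to the original training data before model building.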

    Abstract (Chinese); Abstract (English); Acknowledgements; Table of Contents; List of Figures; List of Tables
    Chapter 1 Introduction: 1.1 Research Background and Motivation; 1.2 Research Objectives; 1.3 Research Process
    Chapter 2 Literature Review: 2.1 Virtual Samples (2.1.1 Information Diffusion; 2.1.2 Other Virtual Sample Generation Methods); 2.2 Methods for Small-Sample Learning (2.2.1 Linear Mixed Models; 2.2.2 Bayesian Networks); 2.3 Forecasting Models (2.3.1 M5' Model Tree; 2.3.2 Back-Propagation Neural Network; 2.3.3 Support Vector Regression; 2.3.4 Multiple Linear Regression; 2.3.5 C4.5 Decision Tree); 2.4 Summary
    Chapter 3 Research Method: 3.1 Prior Knowledge Acquisition (3.1.1 Data Preprocessing; 3.1.2 Data Partitioning); 3.2 Theoretical Sample Generation (3.2.1 Obtaining Possible Attribute Values; 3.2.2 Permutation and Combination Criteria); 3.3 Construction of Forecasting Models (3.3.1 M5' Model Tree Algorithm; 3.3.2 Back-Propagation Neural Network; 3.3.3 Support Vector Regression; 3.3.4 Multiple Linear Regression; 3.3.5 Decision Trees); 3.4 Experimental Design (3.4.1 Evaluation Method; 3.4.2 Forecast Error Metrics; 3.4.3 Hypothesis Testing; 3.4.4 Conversion of Classification Problems; 3.4.5 Software Selection)
    Chapter 4 Empirical Validation: 4.1 Software Parameter Settings; 4.2 Data Collection and Description; 4.3 Experimental Results (4.3.1 Results on UCI Data Sets; 4.3.2 Comparison with a Method from the Literature)
    Chapter 5 Conclusions and Suggestions
    References

    陳子立 (2003). A method of constructing model trees by combining feature selection with the coefficient of determination (in Chinese). Master's thesis, Institute of Industrial Management Science, National Cheng Kung University.
    黃漢申 (2002). Learning from scarce data: An approach to Bayesian network parameter learning (in Chinese). Doctoral dissertation, Department of Computer Science and Information Engineering, National Taiwan University.
    葉怡成 (2003). Application and Implementation of Neural Network Models (8th ed., in Chinese). Taipei: 儒林圖書有限公司.
    Anthony, M., & Biggs, N. (1992). Computational Learning Theory. Cambridge: Cambridge University Press.
    Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth International Group.
    Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
    Dobra, A., & Gehrke, J. E. (2002). SECRET: A scalable linear regression tree algorithm. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 481-487). Canada: Association for Computing Machinery.
    Drucker, H., Burges, C. J. C., Kaufman, L., Smola, A., & Vapnik, V. (1997). Support vector regression machines. In M. Mozer, M. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems (Vol. 9, pp. 155-161). Cambridge, MA: MIT Press.
    Frank, E., Wang, Y., Inglis, S., Holmes, G., & Witten, I. H. (1998). Technical note: Using model trees for classification. Machine Learning, 32, 63-76.
    Gunn, S. R. (1998). Support Vector Machines for Classification and Regression. Technical report, Faculty of Engineering and Applied Science, Department of Electronics and Computer Science, University of Southampton.
    Harville, D. (1977). Maximum likelihood approaches to variance component estimation and to related problems. Journal of the American Statistical Association, 72, 320-338.
    Harville, D. A., & Mee, R. W. (1984). A mixed-model procedure for analyzing ordered categorical data. Biometrics, 40, 393-408.
    Huang, C. F. (1997). Principle of information diffusion. Fuzzy Sets and Systems, 91, 69-90.
    Huang, C. F., & Moraga, C. (2004). A diffusion-neural-network for learning from small samples. International Journal of Approximate Reasoning, 35, 137-161.
    Jang, J.-S. R. (1993). ANFIS: adaptive-network-based fuzzy inference systems. IEEE Transactions on Systems, Man and Cybernetics, 23, 665-685.
    Jennrich, R. I., & Schluchter, M. D. (1986). Unbalanced repeated-measures models with structured covariance matrices. Biometrics, 42, 805-820.
    Karalic, A. (1992). Employing linear regression in regression tree leaves. Proceedings of the 10th European Conference on Artificial Intelligence (pp. 440-441).
    Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38, 963-974.
    Li, D. C., Chen, L. S., & Lin, Y. S. (2003). Using functional virtual population as assistance to learn scheduling knowledge in dynamic manufacturing environments. International Journal of Production Research, 41(17), 4011-4024.
    Li, D. C., Hsu, H. C., Tsai, T. I., Lu, T. J., & Hu, S. C. (2007). A new method to help diagnose cancers for small sample size. Expert Systems with Applications, 33, 420-424.
    Li, D. C., & Lin, Y. S. (2006). Using virtual sample generation to build up management knowledge in the early manufacturing stages. European Journal of Operational Research, 175, 413-434.
    Li, D. C., Wu, C., & Chen, F. M. (2005). Using data-fuzzification technology in small data set learning to improve FMS scheduling accuracy. International Journal of Advanced Manufacturing Technology, 27, 321-328.
    Li, D. C., Wu, C., Tsia, T. I., & Chang, F. M. (2006). Using mega-fuzzification and data trend estimation in small data set learning for early FMS scheduling knowledge. Computers & Operations Research, 33, 1857-1869.
    Li, D. C., Wu, S. S., Tsai, T. I., & Lin, Y. S. (2007). Using mega-trend-diffusion and artificial samples in small data set learning for early flexible manufacturing system scheduling knowledge. Computers & Operations Research, 34, 966-982.
    Loh, W.Y. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12, 361-386.
    Niyogi, P., Girosi, F., & Poggio, T. (1998). Incorporating prior information in machine learning by creating virtual examples. Proceedings of the IEEE, 275-298.
    Quinlan, J. R. (1979). Discovering rules from large collections of examples: A case study. In D. Michie (Ed.), Expert Systems in the Microelectronic Age. Edinburgh, Scotland: Edinburgh University Press.
    Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81-106.
    Quinlan, J. R. (1992). Learning with continuous classes. Proceedings of the Australian Joint Conference on Artificial Intelligence, 343-348.
    Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
    Ribeiro, B. (2002). On data based learning using support vector clustering. Proceedings of the 9th International Conference on Neural Information Processing (ICONIP'02), 5, 2516-2521.
    Roiger, R. J., & Geatz, M. W. (2003). Data Mining: A Tutorial-Based Primer. New York: Addison Wesley.
    Shachter, R. D. (1988). Probabilistic inference and influence diagrams. Operations Research, 36, 589-604.
    Sommer, A., Katz, J., & Tarwotjo, I. (1983). Increased mortality in children with mild vitamin A deficiency. American Journal of Clinical Nutrition, 40, 1090-1095.
    Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. New York: Springer-Verlag.
    Wang, Y., & Witten, I. H. (1997). Induction of model trees for predicting continuous classes. Proceedings of the Poster Papers of the European Conference on Machine Learning (pp. 128-137).
    Williams, D. A. (1982). Extra-binomial variation in logistic linear models. Applied Statistics, 31, 144-148.
    Zeger, S., & Karim, R. (1991). Generalized linear models with random effects. Journal of the American Statistical Association, 86, 79-86.

    Full text released: on campus 2016-02-16; off campus 2016-02-16.