| Graduate student: | 鍾宜珍 Chung, Yi-Chen |
|---|---|
| Thesis title: | 基於模糊推論系統發展合成屬性以學習小樣本預測資料 Based on Fuzzy Inference System to Develop Synthetic Attributes to Learn Predictive Model of Small Datasets |
| Advisor: | 利德江 Li, Der-Chiang |
| Degree: | 碩士 Master |
| Department: | 管理學院 - 工業與資訊管理學系 College of Management - Department of Industrial and Information Management |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 (ROC calendar) |
| Language: | English |
| Number of pages: | 44 |
| Keywords (Chinese): | 小樣本學習、合成屬性、模糊分群、整體趨勢擴散、模糊推論系統 |
| Keywords (English): | small dataset learning, synthetic attribute, fuzzy clustering, mega-trend-diffusion, fuzzy inference system |
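In recent years, with the rise of social networks, demand has emerged for extracting valid and meaningful information from the massive data they generate, shifting analysis methods from traditional statistics toward data mining and machine learning algorithms. However, when the sample size is very small, neither statistical methods nor learning algorithms can learn meaningful information from the data. Common past approaches to the small-sample learning problem include improving the computational kernel, generating virtual samples, and synthesizing artificial attributes; of these, virtual samples and artificial attributes both belong to the data preprocessing stage of the knowledge discovery process.
This study proposes a systematic attribute synthesis method for building predictive models from small samples. The procedure comprises three main steps: 1. Apply fuzzy clustering, using the fuzzy silhouette coefficient to determine the most suitable number of clusters. 2. Use the mega-trend-diffusion technique to estimate the triangular membership function of each attribute in each cluster. 3. Use the triangular membership functions and the fuzzy clustering weights to construct the network structure that this study revises from the fuzzy inference system architecture, and learn record by record, so that the synthesized membership information of each attribute value with respect to each cluster serves as the value of the new attributes. A back-propagation neural network is then used to build the small-sample predictive model. In the experiments, thin-film transistor liquid crystal display (TFT-LCD) data and concrete slump test data are used to verify that the proposed approach improves the prediction accuracy of the back-propagation neural network.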
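Step 1 of the procedure can be illustrated with a minimal sketch. It assumes standard fuzzy C-means with fuzzifier `m = 2` and the fuzzy silhouette criterion in its usual form (crisp silhouettes weighted by the gap between each point's two largest memberships). The function names, the random initialization, and the candidate range for the number of clusters are hypothetical choices for illustration, not details taken from the thesis.

```python
import math
import random

def memberships(points, centers, m):
    """Fuzzy C-means membership matrix: one row per point, one column per center."""
    U = []
    for p in points:
        d = [math.dist(p, v) for v in centers]
        if min(d) == 0.0:  # point coincides with a center: give it full membership
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            U.append([h / sum(hits) for h in hits])
        else:
            inv = [di ** (-2.0 / (m - 1.0)) for di in d]
            U.append([x / sum(inv) for x in inv])
    return U

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Alternate membership and center updates until the centers stop moving."""
    centers = random.Random(seed).sample(points, c)  # assumed initialization
    dim = len(points[0])
    for _ in range(iters):
        U = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            w = [u[i] ** m for u in U]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * p[k] for wj, p in zip(w, points)) / total
                for k in range(dim)))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def fuzzy_silhouette(points, U, alpha=1.0):
    """Crisp silhouettes weighted by (largest - second largest membership)^alpha."""
    labels = [max(range(len(u)), key=u.__getitem__) for u in U]
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    num = den = 0.0
    for j, p in enumerate(points):
        own = clusters[labels[j]]
        if len(own) == 1 or len(clusters) < 2:
            s = 0.0  # singleton clusters get silhouette 0 by convention
        else:
            a = sum(math.dist(p, points[k]) for k in own if k != j) / (len(own) - 1)
            b = min(sum(math.dist(p, points[k]) for k in members) / len(members)
                    for lab, members in clusters.items() if lab != labels[j])
            s = (b - a) / max(a, b) if max(a, b) > 0 else 0.0
        top = sorted(U[j], reverse=True)
        weight = (top[0] - top[1]) ** alpha
        num += weight * s
        den += weight
    return num / den if den > 0 else 0.0

def select_cluster_count(points, c_max):
    """Return the candidate c in 2..c_max with the highest fuzzy silhouette."""
    scores = {c: fuzzy_silhouette(points, fuzzy_c_means(points, c)[1])
              for c in range(2, c_max + 1)}
    return max(scores, key=scores.get)
```

Points are passed as tuples, for example `[(1.0,), (1.2,), (5.0,)]` for one attribute. Because fuzzy C-means can settle in local optima, a practical run would likely repeat each candidate with several seeds before comparing scores.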
In recent years, the rise of social media has created demand for extracting meaningful information from big data, shifting analysis methods from statistics toward machine learning algorithms. However, both statistical methods and learning algorithms struggle to learn useful information from small data sets. Several methods have been proposed to overcome this problem, such as improving the computing kernel, creating virtual samples, and generating synthetic attributes; the last two are data preprocessing techniques.
This study proposes a systematic procedure for generating synthetic attributes that extends the information in small data sets, helping learning algorithms build more precise and robust models. The method has three steps. First, it employs fuzzy C-means with the fuzzy silhouette coefficient to identify the possible distributions of the small data. Then, it adopts the mega-trend-diffusion technique to estimate triangular membership functions (MFs). Finally, it feeds the MFs into the proposed network, which is based on the fuzzy inference system, to generate synthetic attributes. The experiments use two real cases: a thin-film transistor liquid crystal display (TFT-LCD) panel data set and a concrete slump test data set. The results show that the proposed method improves the accuracy of back-propagation neural networks.
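The second step, estimating a triangular membership function with mega-trend-diffusion, can be sketched as follows. This is a minimal illustration of the technique as it is commonly formulated in the small-data literature: the triangle peaks at the midpoint of the observed range, and its bounds are diffused outward according to the sample variance and the skewness of the data around that midpoint. It handles a single attribute without cluster weights, whereas the thesis estimates one MF per attribute per cluster and may weight observations differently. The function names and the `epsilon = 1e-20` default are assumptions for illustration.

```python
import math
import statistics

def mtd_bounds(values, epsilon=1e-20):
    """Return (lower, center, upper) of an MTD-style triangular MF for one attribute."""
    if len(values) < 2:
        raise ValueError("need at least two observations to estimate a variance")
    lo, hi = min(values), max(values)
    center = (lo + hi) / 2.0                       # peak of the triangle
    n_low = sum(1 for v in values if v < center)   # observations below the center
    n_up = sum(1 for v in values if v > center)    # observations above the center
    var = statistics.variance(values)              # sample variance
    spread = -2.0 * math.log(epsilon)              # positive because epsilon < 1
    lower, upper = lo, hi
    if lo < hi:  # then both n_low and n_up are at least 1
        total = n_low + n_up
        lower = center - (n_low / total) * math.sqrt(spread * var / n_low)
        upper = center + (n_up / total) * math.sqrt(spread * var / n_up)
    # The diffused range must never be narrower than the observed range.
    return min(lower, lo), center, max(upper, hi)

def triangular_membership(x, lower, center, upper):
    """Membership of x in the triangle (lower, center, upper), with value 1 at center."""
    if x == center:
        return 1.0
    if lower < x < center:
        return (x - lower) / (center - lower)
    if center < x < upper:
        return (upper - x) / (upper - center)
    return 0.0
```

For example, `mtd_bounds([1.0, 2.0, 3.0, 4.0, 10.0])` places the peak at 5.5 and extends the bounds beyond the observed minimum and maximum; evaluating `triangular_membership` on each observation then gives the membership values that a later stage could use as inputs.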