National Cheng Kung University Electronic Theses and Dissertations System


Graduate student:	吳則澍 Wu, Tse-Shu
Thesis title:	基於名目屬性之虛擬樣本產生法 Virtual Sample Generation Based on Nominal Attributes
Advisor:	利德江 Li, Der-Chiang
Degree:	Master
Department:	College of Management - Department of Industrial and Information Management
Year of publication:	2016
Academic year of graduation:	104 (ROC calendar)
Language:	Chinese
Number of pages:	45
Chinese keywords:	小樣本學習、虛擬樣本產生、名目屬性
English keywords:	small dataset learning, virtual sample generation, nominal attributes
Usage:	Views: 221; Downloads: 2

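As global competition intensifies, the life cycles of technology products have generally shortened. Reducing the time and cost of the pilot-run stage is one way to increase an enterprise's competitiveness, but it also gives rise to the small dataset learning problem. Small dataset learning has a critical influence on the early stages of a manufacturing system, yet conventional statistical methods cannot analyze or interpret the data effectively when the sample size is too small. Virtual sample generation methods were developed to address this problem and have been shown to overcome it effectively, with applications in both machine learning and industrial practice. Building on the same concept, this thesis proposes a new virtual sample generation method for nominal attributes: it observes the number of occurrences of each nominal attribute value and uses a fuzzy membership function to estimate the population domain. Unlike earlier virtual sample generation methods, which must assume that attributes are mutually independent and can handle only numerical attributes, the proposed method is free of these restrictions, which underscores its generality. The study evaluates mean absolute error and classification accuracy on a purely nominal dataset and a mixed-attribute dataset. The experimental results show that the method effectively reduces the error in numerical prediction problems and raises the accuracy in classification problems, with statistically significant differences, indicating that the proposed method performs better on small dataset learning.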

As global competition grows more intense, product life cycles become shorter. Reducing the time and cost of pilot runs can effectively enhance an enterprise's competitiveness, but it also gives rise to small dataset learning problems. No appropriate statistical tool exists for evaluating the population when the sample size is too small, but the problem can be addressed through virtual sample generation methods, which are widely used in industry and in machine learning. Very few studies deal with nominal attributes because of the limitations of domain estimation methods. This thesis therefore proposes a method that generates virtual samples based on the degree of dispersion of nominal attributes and then estimates the population domain with a fuzzy membership function. Two learning models, a backpropagation neural network and support vector regression, are used to test the effectiveness of the proposed method, and the Wilcoxon signed-rank test is used to test the difference from the raw dataset. The results show that the proposed method can reduce the mean absolute error (MAE) and improve classification accuracy by generating nominal virtual samples.
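
The two evaluation metrics named in the abstract, mean absolute error and classification accuracy, can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the thesis's own code; the function names and the sample values are assumptions made for the example.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| over all test samples (hypothetical helper)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def classification_accuracy(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of test samples whose predicted label equals the true label."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only; they are not results from the thesis.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
acc = classification_accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])
```

A lower MAE indicates better numerical prediction and a higher accuracy indicates better classification, which is the direction of improvement the abstract reports.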

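Abstract
Extended Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1 Research Background
2 Research Motivation
3 Research Objectives
4 Research Assumptions
5 Research Framework and Process
Chapter 2 Literature Review
1 Small Dataset Learning Methods
1.1 Information Diffusion
1.2 Other Virtual Sample Generation Methods
2 Association Analysis
2.1 Contingency Coefficient
2.2 Fisher's Exact Test
3 Prediction Models
3.1 Backpropagation Neural Network
3.2 Support Vector Regression
Chapter 3 Methodology
1 Notation
2 Test of Attribute Independence
3 Domain Estimation for Nominal Attributes
3.1 Quantification of Nominal Attributes
3.2 Attribute Domain Estimation
4 Overall Procedure of the Proposed Method
Chapter 4 Experimental Validation
1 Experimental Environment
1.1 Software for Building Classification Models
1.2 Experimental Procedure
1.3 Evaluation Metrics
1.4 Hypothesis Testing
2 Case Descriptions
2.1 Case 1: Servo Dataset
2.2 Case 2: Credit Dataset
3 Experimental Results
3.1 Case 1: Servo Dataset Results
3.2 Case 2: Credit Dataset Results
Chapter 5 Conclusions and Suggestions
1 Conclusions
2 Suggestions for Future Research
References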

Publicly available from 2019-07-01
