| Graduate Student: | 洪秀芳 Hung, Meafen |
|---|---|
| Thesis Title: | A Meta-Learning Method to Learn from Small Datasets |
| Advisor: | 利德江 Li, De-Jiang |
| Degree: | Master |
| Department: | College of Management - Institute of Information Management |
| Year of Publication: | 2005 |
| Graduation Academic Year: | 93 (ROC academic year) |
| Language: | English |
| Pages: | 35 |
| Chinese Keywords: | 資料特性 (dataset characteristics), 機器學習 (machine learning), 簡易貝氏分類器 (naïve Bayes classifier) |
| English Keywords: | meta-learning, small dataset learning, naïve Bayes, characterization of datasets, machine learning |
The nature of survival suggests that learning from few examples is often important, yet machine learning has not yet succeeded at learning well from small datasets. In contrast, human beings often learn well from very few examples, even when the number of potentially relevant features is large. To do so, they successfully use previously learned concepts to improve performance on the current task.
This thesis develops an approach to building a classifier for a small dataset by using other datasets and their learning results. Meta-learning studies how learning systems can improve performance through experience, but research on the characterization of datasets is still lacking. We propose a measurement for selecting a support dataset and a process for acquiring prior knowledge and applying it to a new learning problem to improve performance. A proper characterization of datasets that matches the naïve Bayes algorithm is the key to this research.
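The abstract does not spell out the measurement or the transfer process, so the Python sketch below only illustrates the general pipeline under stated assumptions: binary features and binary classes, a per-feature entropy profile as a hypothetical stand-in for the proposed selection measurement, and support-dataset frequencies converted into Dirichlet pseudo-counts that smooth a naïve Bayes fit on the small target dataset. All function names and the prior-strength parameter `s` are illustrative, not taken from the thesis.

```python
import numpy as np

def entropy_profile(X):
    """Hypothetical characterization: per-feature Shannon entropy (bits)."""
    out = []
    for col in X.T:
        _, counts = np.unique(col, return_counts=True)
        p = counts / counts.sum()
        out.append(float(-(p * np.log2(p)).sum()))
    return np.array(out)

def select_support(target_X, candidates):
    """Pick the candidate (X, y) whose entropy profile is closest to the
    target's -- a stand-in for the thesis's selection measurement."""
    t = entropy_profile(target_X)
    d = [np.linalg.norm(entropy_profile(X) - t) for X, _ in candidates]
    return candidates[int(np.argmin(d))]

def nb_fit(X, y, alpha):
    """Naive Bayes for binary features: Dirichlet pseudo-counts `alpha`
    (shape: n_classes x n_features x 2) are added to observed counts."""
    counts = alpha.copy()
    n_classes = alpha.shape[0]
    for c in range(n_classes):
        Xc = X[y == c]
        counts[c, :, 1] += (Xc == 1).sum(axis=0)
        counts[c, :, 0] += (Xc == 0).sum(axis=0)
    theta = counts / counts.sum(axis=2, keepdims=True)   # P(x_j = v | c)
    prior = np.bincount(y, minlength=n_classes) + 1.0    # Laplace class prior
    return theta, prior / prior.sum()

def nb_predict(X, theta, class_prior):
    """Log-space posterior maximization over classes."""
    log_post = np.tile(np.log(class_prior), (X.shape[0], 1))
    for c in range(theta.shape[0]):
        log_post[:, c] += np.where(X == 1,
                                   np.log(theta[c, :, 1]),
                                   np.log(theta[c, :, 0])).sum(axis=1)
    return log_post.argmax(axis=1)

# Toy usage with synthetic data standing in for real datasets.
rng = np.random.default_rng(0)
candidates = [(rng.integers(0, 2, (200, 5)), rng.integers(0, 2, 200))
              for _ in range(3)]
target_X = rng.integers(0, 2, (15, 5))   # the small dataset
target_y = rng.integers(0, 2, 15)

Xs, ys = select_support(target_X, candidates)

# Turn support-dataset frequencies into pseudo-counts; `s` is the prior
# strength (equivalent sample size), an illustrative tunable, not from
# the thesis.
s = 5.0
alpha = np.full((2, 5, 2), 0.5)          # small base count avoids log(0)
for c in range(2):
    freq1 = (Xs[ys == c] == 1).mean(axis=0)
    alpha[c, :, 1] += s * freq1
    alpha[c, :, 0] += s * (1.0 - freq1)

theta, class_prior = nb_fit(target_X, target_y, alpha)
print(nb_predict(target_X, theta, class_prior))
```

The design choice sketched here, using prior experience as Dirichlet pseudo-counts rather than copying model parameters, keeps the small target dataset authoritative: the support dataset only biases the estimates, with `s` controlling how strongly.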