
Graduate student: Mao, Shao-Rui (毛紹睿)
Thesis title: Extracting Cost-Sensitive Classification Rules from Trained Neural Networks
Advisor: Huang, Yu-Xiang (黃宇翔)
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: Chinese
Number of pages: 73
Keywords: Misclassification Cost, Neural Network, Rule Extraction
    Neural networks are one of the techniques used to solve data mining problems. Their learned models usually achieve high accuracy, tolerate noisy data well, and can represent complex relationships among attributes through the network architecture. However, the learned model is a black box that offers users little explanation, which limits the application of neural networks to some degree. This research uses a rule induction algorithm to extract explicit rules from a trained neural network in order to explain what the network has learned, and the proposed rule extraction framework is applicable to different neural network models. The rule extraction process also takes misclassification cost into account, so that the extracted rules reflect the different misclassification costs of different classes and better meet practical needs. The framework uses the PRISM algorithm proposed by Cendrowska as the basis for rule extraction and makes the extracted rules cost-sensitive in three ways: AdaCost, MetaCost, and a modified PRISM information function. The proposed method is compared with REFNE, the rule extraction framework proposed by Zhou et al., on data sets from the UCI-ML repository in terms of the number of rules generated, accuracy, and misclassification cost.
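    The covering idea behind PRISM, applied in the pedagogical style the abstract describes (the trained network supplies the labels, and rules are induced from those labels), can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thesis's implementation: `black_box` is a hypothetical stand-in for a trained network's prediction function, the three categorical attributes are invented toy data, and the cost-sensitive variants (AdaCost, MetaCost, the modified information function), discretization of continuous attributes, and rule pruning are all left out.

```python
from itertools import product


def matches(conditions, x):
    """True if instance x satisfies every attribute test of a rule."""
    return all(x[a] == v for a, v in conditions.items())


def prism(rows, labels):
    """Induce rules with a PRISM-style covering loop.

    rows   -- list of dicts mapping attribute name -> categorical value
    labels -- class label for each row (here: the black box's predictions)
    Returns a list of (conditions, class) pairs, conditions being a dict.
    """
    attributes = list(rows[0])
    rules = []
    for target in sorted(set(labels)):
        remaining = list(zip(rows, labels))  # fresh copy for every class
        while any(y == target for _, y in remaining):
            conditions, covered = {}, remaining
            # Add one attribute test at a time until the rule is pure
            # or every attribute has been used.
            while (any(y != target for _, y in covered)
                   and len(conditions) < len(attributes)):
                best = None
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({x[attr] for x, _ in covered}):
                        subset = [y for x, y in covered if x[attr] == value]
                        hits = subset.count(target)
                        # Prefer higher precision, then higher coverage.
                        score = (hits / len(subset), hits)
                        if best is None or score > best[0]:
                            best = (score, attr, value)
                _, attr, value = best
                conditions[attr] = value
                covered = [(x, y) for x, y in covered if x[attr] == value]
            rules.append((conditions, target))
            # Drop everything the new rule covers before inducing the next one.
            remaining = [(x, y) for x, y in remaining
                         if not matches(conditions, x)]
    return rules


def classify(rules, x):
    """Return the class of the first rule that matches x, or None."""
    for conditions, cls in rules:
        if matches(conditions, x):
            return cls
    return None


def black_box(x):
    # Hypothetical stand-in for a trained network's predict function.
    if x["outlook"] == "overcast":
        return "yes"
    if x["windy"] == "false" and x["humidity"] == "normal":
        return "yes"
    return "no"


# Toy data: enumerate every combination of three invented attributes
# and let the black box label each one.
names = ("outlook", "humidity", "windy")
values = (("sunny", "overcast", "rainy"), ("high", "normal"), ("true", "false"))
rows = [dict(zip(names, combo)) for combo in product(*values)]
labels = [black_box(x) for x in rows]

rules = prism(rows, labels)
for conditions, cls in rules:
    print(conditions, "->", cls)
```

    Because the toy labels are consistent, every induced rule is pure and `classify` reproduces the black box on all enumerated rows. A cost-sensitive variant would have to change either the labels fed to the loop or the score used to pick attribute tests, which this sketch does not attempt.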

    Table of Contents
    List of Tables
    List of Figures
    Abstract
    Acknowledgements
    Chapter 1 Introduction
      Section 1 Research Background
      Section 2 Research Motivation
      Section 3 Research Objectives
      Section 4 Thesis Organization
    Chapter 2 Literature Review
      Section 1 Data Mining
        1. Basic Concepts
        2. Related Research on Cost Factors
      Section 2 Artificial Neural Network
        1. Basic Concepts
        2. Neural Networks and Data Mining
        3. Comparison of Neural Networks with Other Data Mining Techniques
      Section 3 Methods for Interpreting the Learning Results of Neural Networks
        1. Decomposition Methods
        2. Pedagogical Methods
        3. Forms of Rule Representation
    Chapter 3 Research Method
      Section 1 Problem Description
        1. Neural Networks
        2. Covering Algorithm
        3. Cost Matrix
      Section 2 Research Framework
      Section 3 Classification Rule Extraction
        1. Data Generation
        2. Handling Continuous Attributes
        3. Rule Extraction and Pruning
      Section 4 Cost-Sensitive Classification Methods
      Section 5 Summary
    Chapter 4 Empirical Study
      Section 1 Experimental Environment
      Section 2 Analysis and Comparison of Experimental Results
      Section 3 Discussion
    Chapter 5 Conclusions
      Section 1 Research Results
      Section 2 Research Limitations
      Section 3 Future Research Directions
    References

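    葉怡成 (1999) 類神經網路模式應用與實作 [Neural Network Models: Application and Implementation], 儒林, Taipei.
    羅華強 (2001) 類神經網路Matlab的應用 [Neural Networks: Applications with Matlab], 清蔚科技, Hsinchu.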
    Andrews, R. and Geva, S. (2002) “Rule extraction from local cluster neural nets,” Neurocomputing, Vol.47, pp.1-20.
    Bose, I. and Mahapatra, R.K. (2001) “Business data mining - a machine learning perspective,” Information & Management, Vol.39, pp.211-225.
    Boz, O. (2002) “Extracting decision trees from trained neural networks,” ACM SIGKDD ’02, July, pp.23-26.
    Castro, J.L., Mantas, C.J. and Benítez, J.M. (2002) “Interpretation of artificial neural networks by means of fuzzy rules,” IEEE Transactions on Neural Networks, Vol.13, No.1, January, pp.101-110.
    Cendrowska, J. (1987) “PRISM: An algorithm for inducing modular rules,” International Journal of Man-Machine Studies, Vol.27, No.4, pp.349-370.
    Chan, P. and Stolfo, S. (1998) “Towards scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection,” In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp.164-168.
    Cohen, W.W. (1995) “Fast effective rule induction,” In Proceedings of the 12th International Conference on Machine Learning (ML95).
    Craven, M.W. (1996) “Extracting comprehensible models from trained neural networks,” PhD Dissertation, University of Wisconsin, Madison, Wisconsin.
    Craven, M.W. and Shavlik, J.W. (1996) “Extracting tree-structured representations of trained networks,” In Advances in Neural Information Processing Systems, Vol.8, pp.24-30.
    Domingos, P. (1999) “MetaCost: A general method for making classifiers cost-sensitive,” In Proceedings of the International Conference on Knowledge Discovery and Data Mining, pp.155-164.
    Drummond, C. and Holte, R. (2000) “Exploiting the cost (in)sensitivity of decision tree splitting criteria,” In Proceedings of the 17th International Conference on Machine Learning (ICML'2000), pp.239-246.
    Fan, W., Stolfo, S.J., Zhang, J. and Chan, P.K. (1999) “AdaCost: Misclassification cost-sensitive boosting,” In Proceedings of the Sixteenth International Conference on Machine Learning, pp.97-105, San Francisco, Morgan Kaufmann.
    Freund, Y. and Schapire, R.E. (1997) “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, Vol.55, No.1, pp.119-139.
    Fu, L. (1994a) “Rule generation from neural networks,” IEEE Transactions on Systems, Man and Cybernetics, Vol.24, No.8, August, pp.1114-1124.
    Fu, L. (1994b) “Representation of rule-based knowledge in neural networks,” IEEE International Conference on Neural Networks, Vol.3, 27 June-2 July, pp.1550-1555.
    Fu, L. (1998) “A neural-network model for learning domain rules based on its activation function characteristics,” IEEE Transactions on Neural Networks, Vol.9, No.5, September, pp.787-795.
    Fu, L. (1999) “Knowledge discovery by inductive neural networks,” IEEE Transactions on Knowledge and Data Engineering, Vol.11, No.6, November-December, pp.992-998.
    Fu, X. and Wang, L. (2001) “Rule extraction by genetic algorithms based on a simplified RBF neural network,” Proceedings of IJCNN.
    Fukumi, M. and Akamatsu, N. (1996) “A method to design a neural pattern recognition system by using a genetic algorithm with partial fitness and a deterministic mutation,” IEEE International Conference on Systems, Man, and Cybernetics, Vol.3, 14-17 October, pp.1989-1993.
    Fukumi, M. and Akamatsu, N. (1999) “An evolutionary approach to rule generation from neural networks,” Fuzzy Systems Conference Proceedings, FUZZ-IEEE '99, IEEE International, Vol.3, 22-25 August, pp.1388-1393.
    Giles, C.L. and Omlin, C.W. (1993) “Rule refinement with recurrent neural networks,” In 1993 IEEE International Conference on Neural Networks, pp.801-806, San Francisco, IEEE Neural Networks Council.
    Golea, M. (1996) “On the complexity of rule extraction from neural networks and network querying,” Proceedings of the Rule Extraction From Trained Artificial Neural Networks Workshop, Society For the Study of Artificial Intelligence and Simulation of Behavior Workshop Series (AISB’96), University of Sussex, Brighton, UK (April 1996), pp.51-59.
    Han, J. and Kamber, M. (2001) Data Mining: Concepts and Techniques, Morgan Kaufmann, San Francisco, CA.
    Hand, D., Mannila, H. and Smyth, P. (2001) Principles of Data Mining, MIT Press, Cambridge, MA.
    Hruschka, E.R. and Ebecken, N.F.F. (2000) “Applying a clustering genetic algorithm for extracting rules from a supervised neural network,” IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, Vol.3, 24-27 July, pp.407-412.
    Ikizler, N. (2002) “Benefit maximizing classification using feature intervals,” Technical Report BU-CE-0208, Bilkent University.
    Kantardzic, M. (2003) Data Mining: Concepts, Models, Methods, and Algorithms, IEEE Press.
    Kerber, R. (1992) “ChiMerge: Discretization of numeric attributes,” Proc. Ninth Int'l Conf. Artificial Intelligence, pp.123-128.
    Krishnan, R. (1996) “A systematic method for decomposition rule extraction from neural network,” Proceedings of the NIPS97 Workshop on Rule Extraction from Trained Artificial Neural Network, pp.38-45.
    Lin, F.Y. and McClean, S. (2000) “The prediction of financial distress using a cost sensitive approach and prior probabilities,” In Workshop on Cost-Sensitive Learning at the Seventeenth International Conference on Machine Learning (WCSL at ICML-2000), Stanford University, California.
    Liu, H. and Setiono, R. (1997) “Feature selection via discretization of numeric attributes,” IEEE Transactions on Knowledge and Data Engineering, Vol.9, No.4, July-August, pp.642-645.
    Michalski, R.S. (1969) “On the quasi-minimal solution of the general covering problem,” In Proceedings of the Fifth International Symposium on Information Processing, pp.125-128.
    Michalski, R.S., Mozetic, I., Hong, J. and Lavrac, N. (1986) “The Multi-Purpose Incremental Learning System AQ15 and its Testing Application to Three Medical Domains,” In Proceedings of the Fifth National Conference on Artificial Intelligence.
    Nelson, M.M. and Illingworth, W.T. (1991) A Practical Guide to Neural Nets, Addison Wesley Publishing Co.
    Norton, S.W. (1989) “Generating better decision trees,” In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI-89).
    Nunez, M. (1991) “The use of background knowledge in decision tree induction,” Machine Learning, Vol.6, pp.231-250.
    Orr, M., Hallam, J., Takezawa, K., Murray, A., Ninomiya, S., Oide, M. and Leonard, T. (1999) “Combining regression trees and radial basis function networks,” Int. J. of Neural Systems.
    Pendharkar, P.C., Rodger, J.A., Yaverbaum, G.J., Herman, N. and Benner, M. (1999) “Association, statistical, mathematical and neural approaches for mining breast cancer patterns,” Expert Systems with Applications, Vol.17, pp.223-232.
    Provost, F., Fawcett, T. and Kohavi, R. (1998) “The case against accuracy estimation for comparing induction algorithms,” In Proceedings of the 15th International Conference on Machine Learning, pp.445-453, San Mateo, Morgan Kaufmann.
    Quinlan, J.R. (1993) C4.5: Programs for Machine Learning, San Mateo, Morgan Kaufmann.
    Quinlan, J.R. (1994) “Comparing connectionist and symbolic learning methods,” In Computational Learning Theory and Natural Learning Systems: Constraints and Prospects, ed., pp.445-456, MIT Press.
    Saito, K. and Nakano, R. (1988) “Medical diagnostic expert system based on PDP model,” IEEE International Conference on Neural Networks, pp.255-262.
    Sato, M. and Tsukimoto, H. (2001) “Rule extraction from neural networks via decision tree induction,” Proceedings of IJCNN.
    Setiono, R. (1997) “Extracting rules from neural network by pruning and hidden-unit splitting,” Neural Computation, Vol.9, No.1, pp.205-225.
    Setiono, R. and Leow, W.K. (1999) “Generating rules from trained network using fast pruning,” Neural Networks, IJCNN '99, International Joint Conference on, Vol.6, 10-16 July, pp.4095-4098.
    Setiono, R. (2000) “Extracting M-of-N rules from trained neural networks,” IEEE Transactions on Neural Networks, Vol.11, No.2, March, pp.512-519.
    Setiono, R. (2001) “An effective method for generating multiple linear regression rules from artificial neural networks,” Proceedings of the 13th International Conference on Tools with Artificial Intelligence, 7-9 November, pp.171-178.
    Setiono, R., Leow, W.K. and Zurada, J.M. (2002) “Extraction of rules from artificial neural networks for nonlinear regression,” IEEE Transactions on Neural Networks, Vol.13, No.3, May, pp.564-577.
    Sun, R. (2001) “Beyond simple rule extraction: the extraction of planning knowledge from reinforcement learners,” In Proceedings of the International Joint Conference on Neural Networks.
    Tan, M. (1991) “Cost-sensitive reinforcement learning for adaptive classification and control,” In Proceedings of the Ninth National Conference on Artificial Intelligence.
    Tay, F. and Shen, L. (2002) “A modified Chi2 algorithm for discretization,” IEEE Transactions on Knowledge and Data Engineering, Vol.14, No.3, pp.666-670.
    Thrun, S.B. (1995) “Extracting rules from artificial neural networks with distributed representations,” Advances in Neural Information Processing Systems, Vol.7, pp.505-512.
    Tickle, A.B., Golea, M., Hayward, R. and Diederich, J. (1997) “The truth is in there: Current issues in extracting rules from trained feed forward artificial neural networks,” Proceedings of the International Conference on Neural Networks, Vol.4, pp.2530-2534.
    Tikk, D., Koczy, L.T. and Gedeon, T.D. (2001) “A survey on the universal approximation and its limits in soft computing techniques,” Research Working Paper RWP-IT-01-2001, School of Information Technology, Murdoch University, Perth, W.A., pp.20.
    Ting, K.M. and Zheng, Z. (1998) “Boosting trees for cost-sensitive classifications,” In ECML-98: 10th European Conference on Machine Learning, pp.190-195, Chemnitz, Germany: Springer.
    Towell, G.G. and Shavlik, J.W. (1993) “Extracting refined rules from knowledge-based neural networks,” Machine Learning, Vol.13, No.1, pp.71-101.
    Towell, G.G. and Shavlik, J.W. (1994) “Knowledge-based artificial neural networks,” Artificial Intelligence, Vol.70, No.1-2, pp.119-165.
    Tsukimoto, H. (2000) “Extracting rules from trained neural networks,” IEEE Transactions on Neural Networks, Vol.11, No.2, March.
    Turney, P. (1995) “Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm,” Journal of Artificial Intelligence Research, Vol.2, pp.369-409.
    Turney, P. (2000) “Types of cost in inductive concept learning,” In Workshop on Cost-Sensitive Learning at the Seventeenth International Conference on Machine Learning (WCSL at ICML-2000), pp.15-21.
    Witten, I.H. and Frank, E. (2000) Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, San Francisco, CA.
    Wolberg, W.H. and Mangasarian, O.L. (1990) “Multisurface method of pattern separation for medical diagnosis applied to breast cytology,” Proceedings of the National Academy of Sciences, Vol.87, pp.9193-9196.
    Zhou, Z.H. (2003) “Three perspectives of data mining,” Artificial Intelligence, Vol.143, pp.139-146.
    Zhou, Z.H., Jiang, Y. and Chen, S.F. (2003) “Extracting symbolic rules from trained neural network ensembles,” AI Communications, Vol.16, No.1, pp.3-15.
    Zubek, V.B. and Dietterich, T.G. (2002) “Pruning improves heuristic search for cost-sensitive learning,” In Proceedings of the International Conference on Machine Learning (ICML2002).

    Full-text availability: on campus, open from 2005-06-23
    Off campus, open from 2005-06-23