| Graduate student: | 歐嘉瑜 Ao, Ka-U | 
|---|---|
| Thesis title: | 高維度資料中交互作用的探討-以多階段製程資料為例 A Study of Interaction Effect for High Dimensional Data with Application to Manufacturing Data of Multistage Process | 
| Advisor: | 鄭順林 Jeng, Shuen-Lin | 
| Degree: | Master | 
| Department: | College of Management - Department of Statistics | 
| Year of publication: | 2016 | 
| Academic year of graduation: | 104 | 
| Language: | English | 
| Number of pages: | 49 | 
| Chinese keywords: | 多階段製程生產資料、動態貝氏網路、高階交互作用、協同作用、類別型時間數列、迴歸樹 | 
| English keywords: | Multistage manufacturing process data, Dynamic Bayesian Network, High-order interactions, Synergy factors, Categorical-valued time series, Regression tree | 
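In statistical analysis, main effects and interaction effects are equally important to the response variable. However, for various reasons, such as variable screening carried out because the number of variables exceeds the sample size, the influence of main effects can mask that of interaction effects (synergy effects), so that interactions are overlooked. This problem is encountered in both industrial statistics and bioinformatics.
In the semiconductor industry, factory investment is enormous and consumes considerable manpower and money, yet after a multistage production sequence the finished products are often defective or of poor quality. To reduce the resulting losses and improve product quality, collecting and analyzing historical data on products during the work-in-process period has become a trend. The main purpose of this thesis is to use several statistical and machine learning methods to find the tools or tool combinations that cause poor yield in finished products, and to provide suspected factors behind poor quality in order to help locate root causes.
During manufacturing, the tools used for each product at the different stages are recorded. To find the problem tools that may cause poor yield, this study uses Dynamic Bayesian Network (Dean and Kanazawa, 1989) and Learned Pattern Similarity (Baydogan and Runger, 2015) to look for the factors behind poor yield (especially interaction and synergy effects), and compares the results of the two methods with those obtained from traditional data mining methods.
The main contribution of this thesis is a proposed procedure that overcomes the first-order Markov chain restriction of Dynamic Bayesian Network and is therefore more flexible in application. In addition, the idea of Learned Pattern Similarity is taken and modified, and the modified method is used to find suspected tool combinations that lower yield. Finally, the results of this study can assist in analyzing the suspected factors behind poor yield.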
In statistical analysis, both main effects and interaction effects matter for the response variable. However, for reasons such as variable screening performed to limit the computational burden, interaction terms (sometimes called synergy factors) can be masked by main effects and overlooked. The same problem arises in both industrial statistics and bioinformatics.
Semiconductor manufacturing requires huge capital investment and consumes substantial human and financial resources. Yet after a large number of manufacturing stages, the final products often have defects or perform poorly. To reduce the resulting losses and improve product quality, collecting and analyzing historical work-in-process (WIP) data has become a trend. The aim of this thesis is to use several statistical and data mining methods to find the tool or combination of tools that affects yield; the results can help identify possible root causes of defective products.
The tools used at each stage are recorded during the manufacturing process. To find the suspected tools (especially interactions or synergy factors), this thesis considers Dynamic Bayesian Network (Dean and Kanazawa, 1989) and Learned Pattern Similarity (Baydogan and Runger, 2015). Finally, the results of these methods are compared with those of traditional data mining strategies.
One of the major contributions of this research is a framework for finding suspected tools with Dynamic Bayesian Network. The framework relaxes the assumption of a first-order Markov chain with fixed transition probabilities. In addition, an approach based on the concept of Learned Pattern Similarity is proposed. The results of these approaches identify the tools in use that could reduce yield rates.
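To make the notion of a synergy factor concrete, the following minimal Python sketch computes it as it is commonly defined following Cortina-Borja et al. (2009): the odds ratio for two factors acting together, divided by the product of their individual odds ratios. The lot counts, the function names, and the choice of "neither tool used" as the reference group are assumptions made for illustration only; the abstract does not state how the thesis computes or applies the statistic.

```python
def odds_ratio(group_bad, group_good, ref_bad, ref_good):
    """Odds of a low-yield lot in a group relative to the reference group."""
    return (group_bad * ref_good) / (group_good * ref_bad)


def synergy_factor(counts):
    """Synergy factor for two binary factors A and B.

    counts maps (uses_a, uses_b) -> (low_yield_lots, normal_lots).
    The reference group is (False, False), i.e. neither tool was used.
    A value near 1 suggests no interaction on the odds scale; a value
    well above 1 suggests the combination is worse than the individual
    effects would predict.
    """
    ref_bad, ref_good = counts[(False, False)]
    or_a = odds_ratio(*counts[(True, False)], ref_bad, ref_good)
    or_b = odds_ratio(*counts[(False, True)], ref_bad, ref_good)
    or_ab = odds_ratio(*counts[(True, True)], ref_bad, ref_good)
    return or_ab / (or_a * or_b)


# Hypothetical lot counts (not taken from the thesis): each tool alone looks
# harmless relative to the reference, but the combination has many bad lots.
example_counts = {
    (False, False): (10, 90),
    (True, False): (10, 90),
    (False, True): (10, 90),
    (True, True): (50, 50),
}

print(synergy_factor(example_counts))  # 9.0
```

In this made-up example each tool has an odds ratio of 1 on its own, so a screen based only on those individual odds ratios would flag neither, while the combined odds ratio of 9 shows up only when the pair is examined jointly.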
Baydogan, M. G. (2013). Learned pattern similarity (LPS). homepage: www.mustafabaydogan.com/learned-pattern-similarity-lps.html/.
Baydogan, M. G. and Runger, G. (2015). Time series representation and similarity based on local autopatterns. Data Mining and Knowledge Discovery, pages 1–34.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
Borgelt, C. (2003). Efficient implementations of apriori and eclat. In FIMI’03: Proceedings of the IEEE ICDM workshop on frequent itemset mining implementations.
Cortina-Borja, M., Smith, A. D., Combarros, O., and Lehmann, D. J. (2009). The synergy factor: a statistic to measure interactions in complex diseases. BMC Research Notes, 2(1): 1.
Dean, T. and Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(2):142–150.
Ghahramani, Z. (1998). Learning dynamic bayesian networks. In Adaptive processing of sequences and data structures, pages 168–197. Springer.
Hahsler, M., Buchta, C., Gruen, B., and Hornik, K. (2016). arules: Mining Association Rules and Frequent Itemsets. R package version 1.4-1.
Hahsler, M., Grün, B., and Hornik, K. (2007). Introduction to arules–mining association rules and frequent item sets. SIGKDD Explor, 2(4).
Han, K. and Wang, K. (2013). Coordination and control of batch-based multistage processes. Journal of Manufacturing Systems, 32(2):372–381.
Lèbre, S. (2009). Inferring dynamic genetic networks with low order independencies. Statistical Applications in Genetics and Molecular Biology, 8(1):1–38.
Lèbre, S., Becq, J., Devaux, F., Stumpf, M. P., and Lelandais, G. (2010). Statistical inference of the time-varying structure of gene-regulation networks. BMC Systems Biology, 4(1):130.
Lebre, S., original version 1.0 by Sophie Lebre, and contribution of Julien Chiquet to version 2.0 (2013). G1DBN: A package performing Dynamic Bayesian Network inference. R package version 3.1.1.
Liu, S., Chen, F., and Lu, W. (2002). Wafer bin map recognition using a neural network approach. International Journal of production research, 40(10):2207–2223.
Murphy, K. P. (2002). Dynamic Bayesian networks: representation, inference and learning. PhD thesis, University of California, Berkeley.
Nagarajan, R., Scutari, M., and Lèbre, S. (2013). Bayesian Networks in R. Springer, 122:125–127.
R Core Team (2015). R: A language and environment for statistical computing.
Robinson, J. W. and Hartemink, A. J. (2010). Learning non-stationary dynamic bayesian networks. Journal of Machine Learning Research, 11(Dec):3647–3680.
Russell, S. J., Norvig, P., Canny, J. F., Malik, J. M., and Edwards, D. D. (2003). Artificial intelligence: a modern approach, volume 2. Prentice hall Upper Saddle River.
Strobl, C., Malley, J., and Tutz, G. (2009). An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological Methods, 14(4):323.
Therneau, T., Atkinson, B., and Ripley, B. (2015). rpart: Recursive Partitioning and Regression Trees. R package version 4.1-10.
Therneau, T. M. and Atkinson, E. J. (1997). An introduction to recursive partitioning using the rpart routines.
Verron, S., Li, J., and Tiplica, T. (2010). Fault detection and isolation of faults in a multivariate process with Bayesian network. Journal of Process Control, 20(8):902–911.
Yang, L. and Lee, J. (2012). Bayesian belief network-based approach for diagnostics and prognostics of semiconductor manufacturing systems. Robotics and Computer-Integrated Manufacturing, 28(1):66–74.
Zhang, Z. and Dong, F. (2014). Fault detection and diagnosis for missing data systems with a three time-slice dynamic Bayesian network approach. Chemometrics and Intelligent Laboratory Systems, 138:30–40.
On campus: open access from 2021-07-25