| Graduate student: | 劉家瑄 Liu, Jia-Xuan |
|---|---|
| Thesis title: | 研發適用於高維度巨量資料之快速良率改善機制 Development of Fast Yield Improvement Scheme for High-Dimension Big Data |
| Advisor: | 陳朝鈞 Chen, Chao-Chun |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Manufacturing Information and Systems |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 (ROC academic year) |
| Language: | Chinese |
| Number of pages: | 50 |
| Chinese keywords (translated): | High-dimensional big data, manufacturing analysis, yield management, fast key-variable search for big data |
| English keywords: | High-Dimensional Big Data, Manufacturing Analysis, Yield Analysis, Fast Key-variable Search for Big Data |
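In today's Industry 4.0 era, advanced manufacturing plants often have very large and very complex production lines. Data accumulate rapidly and are high-dimensional, which creates problems of big data storage, processing, and analysis. Yield analysis, a critical problem in manufacturing analysis, requires finding the key factors that affect yield among a large volume of process data. When faced with big data, traditional high-dimensional variable search algorithms need a great deal of time to complete the search, which lowers the efficiency of yield analysis. This thesis therefore builds on the concept of the triple phase orthogonal greedy algorithm (TPOGA) and proposes the Fast Key-Variable Search Algorithm for Big Data Scheme (FKSABD Scheme) to solve the key-variable search problem under big data. The FKSABD Scheme performs a two-phase analysis: it uses a data digest technique to convert most of the data into feature digests, then applies the TPOGA analysis concept to search for key variables, with Apache Spark as the computing framework. On a data set of about 380 GB, the original TPOGA method took about 62 hours of computation, whereas the FKSABD Scheme finished in about 8 hours, roughly an 8-fold reduction in computing time. FKSABD can therefore effectively improve the computational performance of yield analysis.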
In the Industry 4.0 era, the production lines of advanced manufacturing factories are large and highly complex. Because production data accumulate rapidly and are high-dimensional, manufacturing analysis runs into big data storage and processing problems. Yield analysis is one of the most important manufacturing analysis problems: it must identify, among diverse production data, the key factors that cause yield loss. In a big data scenario, traditional high-dimensional key-variable search algorithms take a long time to complete, which makes yield improvement inefficient. Based on the triple phase orthogonal greedy algorithm (TPOGA), this thesis proposes a fast key-variable search algorithm for big data, called the FKSABD scheme, to solve the key-variable search problem for big production data. The FKSABD scheme is a two-phase search scheme. In phase 1, it uses a data digesting technique to extract features from a large portion of the production data. In phase 2, the feature data, together with the remaining production data, are used to search for key variables. The scheme is implemented on Apache Spark to process big production data. Case-study results show that the search results of the FKSABD scheme are close to those of TPOGA. On 380 GB of data, TPOGA takes 62 hours whereas the FKSABD scheme takes only 8 hours, about 8 times faster (62/8 ≈ 7.75). The FKSABD scheme is therefore promising for greatly speeding up yield analysis on big production data.
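This record does not include the thesis's code. As a rough illustration of the greedy search idea that TPOGA builds on, the following Python sketch implements a plain orthogonal greedy algorithm (OGA) forward selection with NumPy: at each step it picks the variable most correlated with the current residual, then refits least squares on all selected variables. It is a minimal sketch under stated assumptions, not the FKSABD scheme: the function name `oga_select`, the synthetic data, and the fixed step count are hypothetical, and the sketch omits TPOGA's later phases, the data digest step, and any Apache Spark distribution.

```python
import numpy as np


def oga_select(X, y, n_steps):
    """Plain OGA forward selection (illustrative sketch only).

    Returns the column indices of X in the order they were selected.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the data so inner products behave like covariances.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0.0] = np.inf  # constant columns get a score of zero

    residual = yc.copy()
    selected = []
    for _ in range(min(n_steps, Xc.shape[1])):
        # Score every variable by its absolute correlation with the residual.
        scores = np.abs(Xc.T @ residual) / norms
        if selected:
            scores[selected] = -np.inf  # never pick a variable twice
        j = int(np.argmax(scores))
        selected.append(j)

        # Orthogonal step: refit on all selected columns, update the residual.
        coef, *_ = np.linalg.lstsq(Xc[:, selected], yc, rcond=None)
        residual = yc - Xc[:, selected] @ coef
    return selected


# Hypothetical synthetic data: the response depends on columns 4 and 17 only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 40))
y_demo = 3.0 * X_demo[:, 4] - 2.0 * X_demo[:, 17] + 0.1 * rng.normal(size=300)
print(oga_select(X_demo, y_demo, n_steps=2))
```

In a yield analysis setting, the columns of `X` would be process variables and `y` the yield measure. A practical search would replace the fixed `n_steps` with a data-driven stopping rule and would need a distributed implementation at the data sizes the abstract reports.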