| Graduate student: | 王富琳 Wang, Fu-Lin |
|---|---|
| Thesis title: | 結合外部資訊與特徵生成於伺服器製造業需求預測之研究 Combining external information and feature generation for demand forecasting in the server manufacturing industry |
| Advisor: | 王惠嘉 Wang, Hei-Chia |
| Degree: | Master |
| Department: | College of Management - Department of Industrial and Information Management (on the job class) |
| Year of publication: | 2023 |
| Graduation academic year: | 111 |
| Language: | Chinese |
| Number of pages: | 46 |
| Keywords: | external information, demand forecasting, feature generation, machine learning |
During the pandemic, people greatly increased their use of online applications, and Taiwan's server manufacturing industry posted strong results. Taiwan now plays an important role in the global server supply chain, and server manufacturing is one of its high-value industries. To help companies make sound business decisions in a competitive environment, this study first reviews published research on the industry and then examines the importance of manufacturing demand. Because demand underlies a company's competitiveness, it can indicate the company's future development and growth potential.
This study draws on published forecasting research, selects external information factors and internal company factors as the data set, and combines them with automated feature engineering to produce features that are more meaningful for the forecasting problem. After the internal and external data are collected and preprocessed, Random Forest is used to build an initial model, and Featuretools, Autofeat, and Tsfresh are used for automated feature generation. The factors under study are clustered to separate high-demand and low-demand data. An XGBoost classifier then uses the internal and external features to predict whether revenue falls in a high or low period, and finally multiple regression is used to adjust the demand forecast.
The experiments compare features drawn from different time windows and different numbers of features. The combination of 13 features and a 6-month time window gave the best model performance. The Random Forest model's mean absolute percentage error (MAPE) was 0.125; adding the multiple regression model and automated feature generation reduced the error to 0.08, a 36% improvement.
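The 36% figure is consistent with the two reported error values when read as a relative reduction in error. The following is a minimal sketch of that arithmetic, assuming the error is computed as MAPE in its usual form and expressed as a fraction; the abstract does not give the formula. The function names `mape` and `relative_reduction` and the sample demand values are hypothetical and are not taken from the thesis.

```python
def mape(actual, forecast):
    """Mean absolute percentage error as a fraction (0.125 means 12.5%)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def relative_reduction(baseline_error, new_error):
    """Share of the baseline error removed by the new model."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical monthly demand values, used only to exercise mape().
actual = [100.0, 200.0, 400.0]
forecast = [90.0, 220.0, 380.0]
example_mape = mape(actual, forecast)  # (0.10 + 0.10 + 0.05) / 3

# Error values reported in the abstract: 0.125 (Random Forest) and 0.08 (final model).
improvement = relative_reduction(0.125, 0.08)

print(f"example MAPE: {example_mape:.4f}")
print(f"relative error reduction: {improvement:.0%}")
```

Under these assumptions, (0.125 - 0.08) / 0.125 = 0.36, which matches the reported 36% improvement.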