| Author: | 黃之柔 Huang, Chih-Jou |
|---|---|
| Thesis title: | 用LSTM神經網路預測銷售額—以小型企業為例 (Forecasting Sales by an LSTM Neural Network: Taking a Small Business as an Example) |
| Advisor: | 周榮華 Chou, Jung-Hua |
| Degree: | Master |
| Department: | Department of Engineering Science (on-the-job master's program), College of Engineering |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | Chinese |
| Pages: | 70 |
| Chinese keywords: | 電子商務, 網路零售業, LSTM, 銷售額預測 |
| English keywords: | E-commerce, E-retailing, LSTM, Sales Forecasting |
| Usage statistics: | Views: 103, Downloads: 25 |
In recent years, e-commerce has become a major market trend, and over the past decade online retailing has developed rapidly with the spread of mobile devices of all kinds. This study builds a long short-term memory (LSTM) model to forecast and analyze the retail sales of a store on Shopee (蝦皮購物), one of the most widely used e-commerce platforms in Taiwan. The study proposes forecasting the sales trend from historical sales and marketing data, so that online retailers can use the model and its forecasts to understand their own sales trends, better position their stores, support related business decisions and marketing activities, and understand the trend of their costs and profits.
Based on the subject store's sales data, simple moving averages of the daily total sales were computed over different numbers of days, dividing the data into datasets 1 through 8, with 80% of each used for training and 20% for testing. After training, forecasts were generated on the test set and the model was evaluated with the MAE, MSE, and RMSE metrics. Dataset 5 yielded the best values on all three metrics; all three showed a clear downward trend during training, and the loss function converged to 0.0032, below 0.01, with no gradient explosion or vanishing, indicating that the model did not overfit.
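The smoothing and splitting procedure described above can be sketched as follows. This is a minimal illustration only, not the thesis code: the window length, sample figures, and function names are assumptions for demonstration.

```python
# Illustrative sketch: smooth daily sales with a simple moving average,
# then take a chronological 80/20 train/test split.

def simple_moving_average(values, window):
    """Simple moving average of `values` over a fixed-length window."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

# Made-up daily total sales figures (the real data is not public).
daily_sales = [120.0, 90.0, 150.0, 130.0, 110.0,
               160.0, 140.0, 100.0, 170.0, 150.0]

# A 5-day window, corresponding to the dataset the thesis found best.
smoothed = simple_moving_average(daily_sales, window=5)

# Chronological split: first 80% for training, last 20% for testing.
split = int(len(smoothed) * 0.8)
train, test = smoothed[:split], smoothed[split:]
```

Splitting chronologically (rather than randomly) keeps the test set strictly in the future relative to the training set, which matches how a sales forecast would be used in practice.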
In summary, the retail sales forecasts in this study can be refined by adjusting the number of moving-average days and the model hyperparameters to obtain better forecasting results. E-retailers can use this model to forecast future sales from historical sales data, as a reference for purchasing and inventory decisions, and as a sales-forecasting tool to raise sales or guide business operations.
In recent years, e-commerce has become popular, and online retailing has grown rapidly over the past decade with the ready availability of various mobile devices. Consumers are increasingly accustomed to using the Internet for their daily purchases, covering food, clothing, housing, transportation, education, and entertainment. In addition, the worldwide impact of the COVID-19 epidemic since 2019 has further accelerated the growth of the global e-commerce economy.
The scope of this study is to build a model using a long short-term memory (LSTM) neural network to forecast the e-retailing sales of a small business on Shopee, one of the most widely used e-commerce platforms in Taiwan. The goal is to help the small business better position its store, support related business decisions and marketing activities, and understand the trend of its costs and profits.
The results show that a five-day simple moving average is a proper method to reduce data fluctuation and reveal the sales trend for good forecasting. Advertising is also important to the sales of a small business. The developed LSTM forecasts the retail sales of a small business reasonably accurately, and thus can be used for sales forecasting to increase profits.
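The three evaluation metrics named in the abstract can be stated compactly in code. This is a generic sketch of the standard MAE, MSE, and RMSE definitions; the sample actual/predicted values are made-up numbers, not results from the thesis.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

actual    = [100.0, 120.0, 130.0]   # hypothetical test-set sales
predicted = [ 98.0, 125.0, 128.0]   # hypothetical model forecasts

print(mae(actual, predicted))
print(mse(actual, predicted))
print(rmse(actual, predicted))
```

MSE penalizes large errors more heavily than MAE, while RMSE rescales the MSE back to the original sales units, which is why the three are typically reported together.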