
Graduate student: 郭憲儒 (Guo, Xian-Ru)
Thesis title: Deep Learning Stock Price Prediction Model Based on Attention Mechanism (基於注意力機制之深度學習股價預測模型)
Advisor: 楊竹星 (Yang, Chu-Sing)
Co-advisor: 謝錫堃 (Shieh, Ce-Kuen)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 50
Chinese keywords: 股價預測, 深度學習, 長短期記憶模型, 注意力機制
English keywords: stock price prediction, deep learning, long short-term memory models, attention mechanism
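  • Abstract (translated from the Chinese): Financial markets have long been a domain that people want to beat, because beating them can bring enormous wealth; in practice, however, the investors who truly manage to do so are few. Over the past decade, with the rapid development of deep learning, many researchers have trained deep learning models on historical stock price data in the hope of obtaining more objective investment judgments, and this thesis continues in that direction.

    A review of prior work shows that recurrent neural networks perform particularly well in stock price prediction, mainly because stock price data are sequential. During training, however, the inputs are not given corresponding weights, so the model cannot judge the relative importance of each input. This problem is shared by traditional recurrent neural networks and even by long short-term memory models. This thesis uses the weighting property of the attention mechanism to mitigate the problem and compares the result with existing recurrent neural network and long short-term memory stock price prediction models. To avoid results that fit only a single dataset, the models are trained separately on three different datasets, and the experimental results chapter compares each one with the model before the improvement to check that the findings hold.

    The experimental results show that, because the attention mechanism is weight-based, it can assign a different weight to each time period by adjusting those weights, which improves the training results.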

    The financial market has always been a field that people aspire to conquer, as doing so can bring substantial wealth. In reality, however, only a few investors have truly managed it. Over the past decade, with the rapid development of deep learning, many scholars have used historical stock price data to train deep learning models in the hope of obtaining more objective investment judgments. This thesis continues in that direction.

    A review of prior work shows that recurrent neural networks (RNNs) perform particularly well in stock price prediction, primarily because stock price data are sequential. During training, however, traditional RNNs and even long short-term memory (LSTM) models do not assign relative weights to their input values, so they cannot determine the importance of each input.

    This thesis addresses the issue by applying the weighting property of the attention mechanism to RNN-based and LSTM-based stock price prediction models, and compares the proposed approach with the existing RNN and LSTM models. To avoid results that hold for only a single dataset, three different datasets were used to train the models. The experimental results chapter compares each improved model with its counterpart before the enhancement to check the validity of the findings.

    The experimental results indicate that the attention mechanism, by adjusting its weights, can assign a different weight to each time period, which improves the training outcomes.
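    The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the model from the thesis, whose details are not given in this record. The function names, the dot-product scoring rule, the fixed query vector, and all numbers below are assumptions made for illustration. Each time step's hidden state receives a score, a softmax turns the scores into weights that sum to 1, and the weighted sum of the states becomes the summary passed on for prediction.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot-product score against `query`.

    hidden_states: list of T vectors, one per time step (hypothetical
        recurrent-layer outputs).
    query: vector of the same length; learned in a real model, fixed here.
    Returns (weights, context), where context is the weighted sum of states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Made-up hidden states for three time steps (assumption, not thesis data).
states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.5]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

    In this toy input the last time step has the largest score, so it receives the largest weight. A plain recurrent model, by contrast, has no explicit per-step weights of this kind to inspect or adjust.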

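    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1-1. Research Background
        1-2. Research Motivation
    Chapter 2 Background and Related Work
        2-1. Deep Learning
            2-1.1 How Deep Learning Works
            2-1.2 Data Preprocessing
        2-2. Forward Propagation and Backpropagation
        2-3. Recurrent Neural Networks
            2-3.1 Long Short-Term Memory Networks
            2-3.2 Gated Recurrent Units
        2-4. Attention Mechanism
        2-5. Related Research on Deep Learning and Stock Price Prediction
    Chapter 3 System Design and Implementation
        3-1. Model Architecture
            3-1.1 Data Preprocessing
        3-2. Resnet Architecture
            3-2.1 ResidualBlock
        3-3. Attention Mechanism
        3-4. GRU Model Prediction
        3-5. Experimental Details
            3-5.1 Optimizer
            3-5.2 LossFunction
    Chapter 4 Experimental Results and Analysis
        4-1. Experimental Environment
        4-2. Evaluation Metrics
        4-3. Training Datasets
        4-4. Training Price Selection
            4-4.1 Price Experiment Configuration
            4-4.2 Price Experiment Results
        4-5. Prediction Time Period Experiment
        4-6. Comparison with Traditional Single Models
            4-6.1 Time Series Models
            4-6.2 Convolutional Neural Network Model (LeNet-5)
            4-6.3 Recurrent Neural Network Models
        4-7. Comparison of Combined Models
    Chapter 5 Conclusion and Future Directions
        5-1. Research Contributions
        5-2. Future Research Directions
    References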


    Full text on campus: publicly available immediately
    Full text off campus: publicly available immediately