| 研究生: | 吳家齊 Wu, Chia-Chi |
|---|---|
| 論文名稱: | 結合財務比率與新聞情緒之多模態深度學習股價預測模式 A Multimodal Deep Learning Model for Stock Price Prediction Integrating Financial Ratios and News Sentiment |
| 指導教授: | 蔡雅雯 Tsai, Ya-Wen |
| 學位類別: | 碩士 Master |
| 系所名稱: | 工學院 - 工程管理碩士在職專班 Engineering Management Graduate Program |
| 論文出版年: | 2025 |
| 畢業學年度: | 114 |
| 語文別: | 中文 |
| 論文頁數: | 147 |
| 中文關鍵詞: | 多模態、深度學習、股價預測、新聞情緒、財務比率、半導體產業 |
| 外文關鍵詞: | Multimodal, Deep Learning, Stock Price Prediction, News Sentiment, Financial Ratios, Semiconductor Industry |
本研究建立一套具實務應用潛力的多模態股價預測與投資決策模式,並以台灣半導體供應鏈上、中、下游六家具有代表性的上市公司為實證對象。研究資料涵蓋2015年至2025年,針對每個交易日彙整技術指標、財務比率與新聞情緒三大類特徵:技術面納入移動平均、布林通道等十餘項指標;財務面由公開財報計算十二項財務比率;情緒面利用中文自然語言處理模型分析新聞標題內容,轉換為正負向新聞情緒分數。
資訊系統設計方面,本研究將三類特徵整理為以公司與交易日為單位的多模態時間序列,使每一交易日均對應一組同時涵蓋技術面、財務面與新聞情緒的特徵向量,作為深度學習模型的輸入;其中財報資料係逐家公司輸入系統,而非採用市場整體彙總值,讓預測結果能同時反映價格走勢、企業體質與市場訊息。
模型設計方面,本研究選用四種深度學習架構,並設計四種資料情境,形成16組模型與情境組合;另引入生成對抗網路(GAN)產生合成股價序列,進行資料增強與情境模擬。研究結果顯示,加入財務比率與新聞情緒資料可有效提升預測準確度,平均絕對百分比誤差(Mean Absolute Percentage Error,MAPE)下降約2.85%;且不同企業因規模與產業角色差異,其最佳預測模型亦有所不同。
回顧相關文獻可知,多數研究仍以技術指標為主要輸入變數,僅少數額外納入財務比率或文本情緒;而在同一預測模型架構中同時整合技術指標、財務比率與新聞情緒三類資訊,並進一步透過模擬投資實測驗證者,更屬罕見。本研究據此建構多模態預測架構,同步納入技術指標、財務比率與新聞情緒,並以四種資料情境、16組模型組合分析預測誤差與模擬投資績效,以補足既有文獻之缺口。研究結果實證,多模態特徵結合GAN資料增強與深度學習架構,可提升股價預測之準確性與穩定性,並具備應用於投資決策與量化交易之參考價值。
This study develops a multimodal stock price prediction and investment decision-making framework with practical applicability, using six representative listed companies in Taiwan’s semiconductor supply chain as empirical subjects: upstream IC design (MediaTek, Realtek), midstream wafer fabrication (TSMC, PSMC), and downstream packaging and testing (ASE, ChipMOS). The research dataset covers the period from 2015 to 2025. For each trading day, three categories of features are compiled: technical indicators, financial ratios, and news sentiment. On the technical side, more than ten indicators such as moving averages and Bollinger Bands are included; on the financial side, twelve key financial ratios are computed from publicly disclosed financial statements; on the sentiment side, Chinese natural language processing models are used to analyze news headlines and convert them into positive and negative sentiment scores.
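As a concrete illustration of the technical-feature layer, the sketch below derives a simple moving average and Bollinger Bands from a daily closing-price series with pandas. It is illustrative only: the thesis uses more than ten indicators, and the 20-day window and 2-standard-deviation band width are assumed defaults, not the study's actual settings.

```python
# Illustrative sketch of two of the technical indicators named above (moving average,
# Bollinger Bands); window length and band width are assumptions, not the thesis's settings.
import pandas as pd

def technical_features(close: pd.Series, window: int = 20, n_std: float = 2.0) -> pd.DataFrame:
    """Per-trading-day technical features derived from closing prices."""
    ma = close.rolling(window).mean()    # simple moving average
    sd = close.rolling(window).std()     # rolling standard deviation
    return pd.DataFrame({
        "close": close,
        f"ma_{window}": ma,              # moving average
        "boll_upper": ma + n_std * sd,   # upper Bollinger Band
        "boll_lower": ma - n_std * sd,   # lower Bollinger Band
    })
```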
In terms of information system design, these three types of features are organized into a multimodal time-series panel indexed by firm and trading day, so that each trading day corresponds to a unified feature vector simultaneously encompassing technical, financial, and news-sentiment information. This vector serves as the direct input to deep learning models. Financial statement data are ingested on a firm-by-firm basis rather than via market-level aggregates, enabling the forecasts to jointly reflect price dynamics, firm fundamentals, and market information within a single integrated system.
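A minimal sketch of this firm-by-trading-day panel construction is given below, assuming three hypothetical per-firm inputs (daily technical features, quarterly financial ratios, and daily sentiment scores); the function and column names are illustrative rather than the thesis's actual schema. Quarterly ratios are forward-filled to the daily trading calendar, in line with the pandas.DataFrame.ffill documentation cited in the references.

```python
# Sketch only: merge daily technical features, quarterly financial ratios (forward-filled
# to trading days), and daily news sentiment into one feature vector per (firm, trading day).
import pandas as pd

def build_firm_panel(tech: pd.DataFrame,      # indexed by trading date
                     ratios_q: pd.DataFrame,  # indexed by financial-statement date
                     sentiment: pd.DataFrame, # indexed by trading date
                     firm: str) -> pd.DataFrame:
    # Align quarterly ratios to the daily trading calendar (forward fill).
    ratios_daily = ratios_q.reindex(tech.index, method="ffill")
    panel = tech.join(ratios_daily).join(sentiment)
    panel["firm"] = firm
    return panel.set_index("firm", append=True).swaplevel()  # index: (firm, trading day)

# Hypothetical usage: concatenate the six firms into one multimodal panel.
# panel = pd.concat(build_firm_panel(t, r, s, f) for f, (t, r, s) in firm_data.items())
```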
At the model level, four deep learning architectures are employed together with four data scenarios, yielding 16 model–scenario combinations. In addition, a Generative Adversarial Network (GAN) is used to generate synthetic stock price series for data augmentation and stress-scenario simulation. Empirical results show that incorporating financial ratios and news sentiment effectively improves forecasting accuracy: the Mean Absolute Percentage Error (MAPE) decreases by about 2.85%. Owing to differences in firm size and supply-chain position, however, the best-performing model is not the same for every company.
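The evaluation setup can be made concrete with the short sketch below: MAPE follows its standard definition, mean(|actual − predicted| / |actual|) × 100, and crossing four architectures with four data scenarios yields the 16 combinations. The architecture and scenario labels and the toy numbers are assumptions for illustration, not the thesis's exact configuration or results.

```python
# Sketch of the error metric and the 4-architecture x 4-scenario evaluation grid.
# Labels and toy values are illustrative assumptions.
from itertools import product
import numpy as np

def mape(y_true, y_pred) -> float:
    """Mean Absolute Percentage Error: mean(|y_true - y_pred| / |y_true|) * 100."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

print(mape([100.0, 102.0, 101.0], [99.0, 103.5, 100.0]))        # ~1.15 (%)

architectures = ["LSTM", "BiLSTM", "GRU", "CNN-LSTM"]           # assumed labels
scenarios = ["tech", "tech+fin", "tech+news", "tech+fin+news"]  # assumed labels
combos = list(product(architectures, scenarios))
assert len(combos) == 16  # 16 model-scenario combinations, as in the study design
```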
A review of related studies indicates that most prior work still relies primarily on technical indicators as input variables, while only a minority additionally incorporates either financial ratios or textual sentiment. It is rare for a single forecasting framework to simultaneously integrate technical indicators, firm-level financial ratios, and news sentiment, and to further validate the impact of different multimodal information configurations through simulated trading. To address this gap, this study constructs a multimodal forecasting framework that jointly incorporates technical, financial, and sentiment features, and uses four data scenarios and 16 model combinations to analyze their effects on forecasting errors and simulated investment performance.
Using the semiconductor industry as an example, the study empirically demonstrates that multimodal features, when combined with GAN-based data augmentation and deep learning architectures, can enhance both the accuracy and stability of stock price forecasts. The proposed framework thus provides a useful reference for practical applications in investment decision-making and quantitative trading.
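As an indication of what such a simulated-trading validation might look like, the following is a minimal long-only backtest sketch under an assumed rule (hold the stock only on days when the model's predicted next close exceeds the current close, ignoring transaction costs); it is not the thesis's actual trading strategy.

```python
# Minimal simulated-investment sketch: long when the forecast says "up", cash otherwise.
# The trading rule and the toy data are assumptions for illustration.
import pandas as pd

def simulate_long_only(close: pd.Series, pred_next_close: pd.Series) -> pd.Series:
    """Cumulative return of holding only when the predicted next close exceeds today's."""
    signal = (pred_next_close > close).astype(float)  # 1 = hold over the next day
    next_day_ret = close.pct_change().shift(-1)       # next day's realized return
    strategy_ret = (signal * next_day_ret).fillna(0.0)
    return (1.0 + strategy_ret).cumprod() - 1.0       # cumulative simulated return

# Toy usage with synthetic prices and a dummy "always up" forecast.
dates = pd.date_range("2024-01-02", periods=5, freq="B")
close = pd.Series([100.0, 101.0, 100.0, 102.0, 103.0], index=dates)
pred = close * 1.01
print(simulate_long_only(close, pred).iloc[-1])       # total return over the toy window
```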
中文參考文獻
[1]吳海韜(2023),分布式強化學習應用於投資組合管理,國立臺灣大學資訊工程學系碩士學位論文。
[2]呂雅芳(2023),結合 LSTM 股價預測與基因模糊交易策略—以台灣50為例,國立臺灣大學工程科學及海洋工程學系學位論文。
[3]林予捷(2023),以可解釋AI模型優化HAD-GNN模型之預測績效-以台灣股票市場為例,國立成功大學會計學系碩士學位論文。
[4]劉馨文(2020),基於混合注意力機制與長短期記憶之股票趨勢預測,國立臺北科技大學資訊工程系碩士學位論文。
[5]鐘毅(2020),以深度學習 LSTM 方法進行台灣加權股價指數預測,國立交通大學科技管理研究所碩士學位論文。
英文參考文獻
[1]Altman, E. I. (1968). “Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy.” The Journal of Finance, 23(4), 589-609.
[2]Arjovsky, M., Chintala, S., & Bottou, L. (2017). “Wasserstein Generative Adversarial Networks.” In International Conference on Machine Learning (pp. 214-223). PMLR.
[3]Aroussi, R. (2025). yfinance (Version 0.2.65) [Computer software]. GitHub. https://github.com/ranaroussi/yfinance, Accessed on: November 1, 2024.
[4]Bahdanau, D., Cho, K., & Bengio, Y. (2014). “Neural Machine Translation by Jointly Learning to Align and Translate.” arXiv preprint arXiv:1409.0473.
[5]Bengio, Y., Simard, P., & Frasconi, P. (1994). “Learning Long-Term Dependencies with Gradient Descent is Difficult.” IEEE Transactions on Neural Networks, 5(2), 157-166.
[6]Bhardwaj, M., Roy, A., & Bilgaiyan, S. (2024). “StockGAN: Enhancing Stock Price Prediction with GAN and Sentiment Analysis.” In 2024 Second International Conference on Networks, Multimedia and Information Technology (NMITCON) (pp. 1-6). IEEE.
[7]Bishop, C. M. & Nasrabadi, N. M. (2006). Pattern Recognition and Machine Learning (Vol. 4, No. 4, p. 738). New York: Springer.
[8]Borovykh, A., Bohte, S., & Oosterlee, C. W. (2017). “Conditional Time Series Forecasting with Convolutional Neural Networks.” arXiv preprint arXiv:1703.04691.
[9]Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time Series Analysis: Forecasting and Control. (5th ed.), John Wiley & Sons, Hoboken, NJ.
[10]Breiman, L. (2001). “Random Forests.” Machine Learning, 45(1), 5-32.
[11]Brigham, E. O. (1988). The Fast Fourier Transform and Its Applications. Prentice-Hall, Inc.
[12]Chai, T. & Draxler, R. R. (2014). “Root Mean Square Error (RMSE) or Mean Absolute Error (MAE)?–Arguments Against Avoiding RMSE in the Literature.” Geoscientific Model Development, 7(3), 1247-1250.
[13]Chapelle, O., Scholkopf, B., & Zien, A. (2009). “Semi-Supervised Learning (Chapelle, O. et al., Eds.; 2006) [Book reviews].” IEEE Transactions on Neural Networks, 20(3), 542-542.
[14]Chekhlov, A., Uryasev, S., & Zabarankin, M. (2005). “Drawdown Measure in Portfolio Optimization.” International Journal of Theoretical and Applied Finance, 8(01), 13-58.
[15]Chen, T. & Guestrin, C. (2016). “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794).
[16]Chetlur, S., Woolley, C., Vandermersch, P., Cohen, J., Tran, J., Catanzaro, B., & Shelhamer, E. (2014). “cuDNN: Efficient Primitives for Deep Learning.” arXiv preprint arXiv:1410.0759.
[17]Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). “Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.” arXiv preprint arXiv:1412.3555.
[18]Cortes, C. & Vapnik, V. (1995). “Support-Vector Networks.” Machine Learning, 20(3), 273-297.
[19]Ding, Q., Wu, S., Sun, H., Guo, J., & Guo, J. (2020). “Hierarchical Multi-Scale Gaussian Transformer for Stock Movement Prediction.” In IJCAI (pp. 4640-4646).
[20]Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., & Darrell, T. (2015). “Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2625-2634).
[21]Elman, J. L. (1990). “Finding Structure in Time.” Cognitive Science, 14(2), 179-211.
[22]Fama, E. F. & French, K. R. (1992). “The Cross-Section of Expected Stock Returns.” The Journal of Finance, 47(2), 427-465.
[23]Fama, E. F. (1970). “Efficient Capital Markets: A Review of Theory and Empirical Work.” The Journal of Finance, 25(2), 383-417.
[24]Fataliyev, K. & Liu, W. (2023). “MCASP: Multi-Modal Cross Attention Network for Stock Market Prediction.” In Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association (pp. 67-77).
[25]Fischer, T. & Krauss, C. (2018). “Deep Learning with Long Short-Term Memory Networks for Financial Market Predictions.” European Journal of Operational Research, 270(2), 654-669.
[26]Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). “Learning to Forget: Continual Prediction with LSTM.” Neural Computation, 12(10), 2451-2471.
[27]Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). “Generative Adversarial Nets.” Advances in Neural Information Processing Systems, 27.
[28]Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge: MIT Press.
[29]Graves, A., Fernández, S., & Schmidhuber, J. (2005, September). “Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition.” In International Conference on Artificial Neural Networks (pp. 799-804). Berlin, Heidelberg: Springer Berlin Heidelberg.
[30]Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). “Improved Training of Wasserstein GANs.” Advances in Neural Information Processing Systems, 30.
[31]Heaton, J. B., Polson, N. G., & Witte, J. H. (2016). “Deep Learning in Finance.” arXiv preprint arXiv:1602.06561.
[32]Hochreiter, S. & Schmidhuber, J. (1997). “Long Short-Term Memory.” Neural Computation, 9(8), 1735-1780.
[33]Huang, Z., Xu, W., & Yu, K. (2015). “Bidirectional LSTM-CRF Models for Sequence Tagging.” arXiv preprint arXiv:1508.01991.
[34]Hutto, C. & Gilbert, E. (2014). “VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text.” In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 8, No. 1, pp. 216-225).
[35]Hyndman, R. J. & Athanasopoulos, G. (2018). Forecasting: Principles and Practice. OTexts.
[36]Hyndman, R. J. & Koehler, A. B. (2006). “Another Look at Measures of Forecast Accuracy.” International Journal of Forecasting, 22(4), 679-688.
[37]Hyndman, R. J., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O’Hara-Wild, M., Petropoulos, F., Razbash, S., & Wang, E. (2020). Package ‘forecast’. [Online] https://cran.r-project.org/web/packages/forecast/forecast.pdf, Accessed on: February 10, 2025.
[38]Jiang, Z., Xu, D., & Liang, J. (2017). “A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem.” arXiv preprint arXiv:1706.10059.
[39]Jordan, M. I. & Mitchell, T. M. (2015). “Machine Learning: Trends, Perspectives, and Prospects.” Science, 349(6245), 255-260.
[40]Kieso, D. E., Weygandt, J. J., & Warfield, T. D. (2016). Intermediate Accounting. John Wiley & Sons.
[41]Kim, K. J. (2003). “Financial Time Series Forecasting using Support Vector Machines.” Neurocomputing, 55(1-2), 307-319.
[42]Kim, T. & Kim, H. Y. (2019). “Forecasting Stock Prices with a Feature Fusion LSTM-CNN Model using Different Representations of the Same Data.” PLoS ONE, 14(2), e0212320.
[43]Krollner, B., Vanstone, B., & Finnie, G. (2010). “Financial Time Series Forecasting with Machine Learning Techniques: A Survey.” In European Symposium on Artificial Neural Networks: Computational Intelligence and Machine Learning (pp. 25-30).
[44]Kvålseth, T. O. (1985). “Cautionary Note about R².” The American Statistician, 39(4), 279-285.
[45]LeCun, Y., Bengio, Y., & Hinton, G. (2015). “Deep Learning.” Nature, 521(7553), 436-444.
[46]LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). “Gradient-Based Learning Applied to Document Recognition.” Proceedings of the IEEE, 86(11), 2278-2324.
[47]Li, S. & Xu, S. (2025). “Enhancing Stock Price Prediction using GANs and Transformer-Based Attention Mechanisms.” Empirical Economics, 68(1), 373-403.
[48]Li, X., Xie, H., Chen, L., Wang, J., & Deng, X. (2014). “News Impact on Stock Price Return Via Sentiment Analysis.” Knowledge-Based Systems, 69, 14-23.
[49]Liu, Z. (2025). “Improving Stock Price Forecasting with MA-BiLSTM: A Novel Approach.” Frontiers in Applied Mathematics and Statistics, 11, 1588202.
[50]Loughran, T. & McDonald, B. (2011). “When is A Liability Not A Liability? Textual Analysis, Dictionaries, and 10‐Ks.” The Journal of Finance, 66(1), 35-65.
[51]Lu, W., Li, J., Li, Y., Sun, A., & Wang, J. (2020). “A CNN‐LSTM‐Based Model to Forecast Stock Prices.” Complexity, 2020(1), 6622927.
[52]Malkiel, B. G. (2003). “The Efficient Market Hypothesis and Its Critics.” Journal of Economic Perspectives, 17(1), 59-82.
[53]Mane, O. & Kandasamy, S. (2022). “Stock Market Prediction Using Natural Language Processing-A Survey.” arXiv preprint arXiv:2208.13564.
[54]Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge, UK: Cambridge University Press. ISBN: 978-0-521-86571-5.
[55]Metz, L., Poole, B., Pfau, D., & Sohl-Dickstein, J. (2016). “Unrolled Generative Adversarial Networks.” arXiv preprint arXiv:1611.02163.
[56]Mitchell, T. M. (1997). Machine Learning, McGraw-Hill Science, New York.
[57]Mittermayer, M. A. & Knolmayer, G. (2006). Text Mining Systems for Market Response to News: A Survey. (Working Paper No. 184). Institut für Wirtschaftsinformatik der Universität Bern.
[58]Nair, B. B., Mohandas, V. P., & Sakthivel, N. R. (2010). “A Decision Tree–Rough Set Hybrid System for Stock Market Trend Prediction.” International Journal of Computer Applications, 6(9), 1-6.
[59]Needles, B. E., Powers, M., & Crosson, S. V. (2011). Principles of Accounting (12th ed.), Cengage Learning, Mason, OH.
[60]Nelson, D. M., Pereira, A. C., & De Oliveira, R. A. (2017). “Stock Market's Price Movement Prediction with LSTM Neural Networks.” In 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 1419-1426). IEEE.
[61]Ouf, S., El Hawary, M., Aboutabl, A., & Adel, S. (2024). “A Deep Learning-Based LSTM for Stock Price Prediction Using Twitter Sentiment Analysis.” International Journal of Advanced Computer Science & Applications, 15(12).
[62]Pandas Development Team. (2025). pandas.DataFrame.ffill — pandas documentation (Version 2.2.x). [Online] https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.ffill.html, Accessed on: August 30, 2025.
[63]Pascanu, R., Mikolov, T., & Bengio, Y. (2013). “On the Difficulty of Training Recurrent Neural Networks.” In International Conference on Machine Learning (pp. 1310-1318). PMLR.
[64]Penman, S. H. (2013). Financial Statement Analysis and Security Valuation (5th ed.). New York: McGraw-Hill/Irwin.
[65]Piotroski, J. D. (2000). “Value Investing: The Use of Historical Financial Statement Information to Separate Winners from Losers.” Journal of Accounting Research, 1-41.
[66]Poon, S. H. & Granger, C. W. J. (2003). “Forecasting Volatility in Financial Markets: A Review.” Journal of Economic Literature, 41(2), 478-539.
[67]Quinlan, J. R. (2014). C4.5: Programs for Machine Learning. Elsevier.
[68]Ravanelli, M., Brakel, P., Omologo, M., & Bengio, Y. (2018). “Light Gated Recurrent Units for Speech Recognition.” IEEE Transactions on Emerging Topics in Computational Intelligence, 2(2), 92-102.
[69]Ren, X., Xu, W., & Duan, K. (2022). “Fourier Transform Based LSTM Stock Prediction Model under Oil Shocks.” Quantitative Finance and Economics, 6(2), 342-358.
[70]Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). “Learning Representations by Back-Propagating Errors.” Nature, 323(6088), 533-536.
[71]Russell, S. & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, New Jersey.
[72]Schuster, M. & Paliwal, K. K. (1997). “Bidirectional Recurrent Neural Networks.” IEEE Transactions on Signal Processing, 45(11), 2673-2681.
[73]Sezer, O. B., Gudelek, M. U., & Ozbayoglu, A. M. (2020). “Financial Time Series Forecasting with Deep Learning: A Systematic Literature Review: 2005–2019.” Applied Soft Computing, 90, 106181.
[74]Sha, X. (2024). “Time Series Stock Price Forecasting Based on Genetic Algorithm (GA)-Long Short-Term Memory Network (LSTM) Optimization.” arXiv preprint arXiv:2405.03151.
[75]Shah, D., Isah, H., & Zulkernine, F. (2019). “Stock Market Analysis: A Review and Taxonomy of Prediction Techniques.” International Journal of Financial Studies, 7(2), 26.
[76]Sharpe, W. F. (1998). “The Sharpe Ratio.” Streetwise–the Best of the Journal of Portfolio Management, 3(3), 169-85.
[77]Siami-Namini, S., Tavakoli, N., & Namin, A. S. (2019, December). “The Performance of LSTM and BiLSTM in Forecasting Time Series.” In 2019 IEEE International Conference on Big Data (Big Data) (pp. 3285-3292). IEEE.
[78]Song, D., Baek, A. M. C., & Kim, N. (2021). “Forecasting Stock Market Indices Using Padding-Based Fourier Transform Denoising and Time Series Deep Learning Models.” IEEE Access, 9, 83786-83796.
[79]Spiess, A. N. & Neumeyer, N. (2010). “An Evaluation of R2 as an Inadequate Measure for Nonlinear Models in Pharmacological and Biochemical Research: A Monte Carlo Approach.” BMC Pharmacology, 10(1), 6.
[80]Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction (Vol. 1, No. 1, pp. 9-11). Cambridge: MIT Press.
[81]Tetlock, P. C. (2007). “Giving Content to Investor Sentiment: The Role of Media in the Stock Market.” The Journal of Finance, 62(3), 1139-1168.
[82]Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). “Attention is All You Need.” Advances in Neural Information Processing Systems, 30.
[83]Wang, X. & Liu, L. (2025). “Risk-Sensitive Deep Reinforcement Learning for Portfolio Optimization.” Journal of Risk and Financial Management, 18(7), 347.
[84]Werbos, P. J. (1990). “Backpropagation Through Time: What It Does and How to Do It.” Proceedings of the IEEE, 78(10), 1550-1560.
[85]Wiese, M., Knobloch, R., Korn, R., & Kretschmer, P. (2020). “Quant GANs: Deep Generation of Financial Time Series.” Quantitative Finance, 20(9), 1419-1440.
[86]Willmott, C. J. & Matsuura, K. (2005). “Advantages of the Mean Absolute Error (MAE) Over the Root Mean Square Error (RMSE) in Assessing Average Model Performance.” Climate Research, 30(1), 79-82.
[87]Xu, Y. & Cohen, S. B. (2018). “Stock Movement Prediction from Tweets and Historical Prices.” In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1970-1979).
[88]Yang, H., Liu, X. Y., Zhong, S., & Walid, A. (2020). “Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy.” In Proceedings of The First ACM International Conference on AI in Finance (pp. 1-8).
[89]Yoon, J., Jarrett, D., & Van der Schaar, M. (2019). “Time-Series Generative Adversarial Networks.” Advances in Neural Information Processing Systems, 32.
[90]Zhang, G. P. (2003). “Time Series Forecasting Using a Hybrid ARIMA and Neural Network Model.” Neurocomputing, 50, 159-175.
[91]Zhang, J., Ye, L., & Lai, Y. (2023). “Stock Price Prediction Using CNN-BiLSTM-Attention Model.” Mathematics, 11(9), 1985.
[92]Zhong, X. & Enke, D. (2017). “Forecasting Daily Stock Market Return Using Dimensionality Reduction.” Expert Systems with Applications, 67, 126-139.