| Graduate student: | 劉巧萱 Liu, Chiao-Hsuan |
|---|---|
| Thesis title: | 通過對金融新聞進行自動定義極性分數和情緒分析預測國際金融指標 International Financial Indices Prediction through Automatically Defined Polarity Scores and Sentiment Analysis of Financial News |
| Advisor: | 鄭順林 Jeng, Shuen-Lin |
| Degree: | Master (碩士) |
| Department: | College of Management - Department of Statistics (管理學院 - 統計學系) |
| Year of publication: | 2022 |
| Academic year of graduation: | 110 (ROC calendar) |
| Language: | English |
| Number of pages: | 69 |
| Chinese keywords: | 金融指標預測、情緒分析、金融新聞、自然語言處理 |
| English keywords: | Financial Indicator Price Prediction, Sentiment Analysis, Financial News, Natural Language Processing |
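Financial technology (FinTech) aims to apply existing technologies to bring innovative applications to traditional financial services, thereby opening up more possibilities for the financial industry. In recent years FinTech has drawn growing interest in finance, especially for predicting stock price fluctuations. The aim of our study is to predict the rise and fall of international financial indices in four regions: the Standard & Poor's 500 Index (S&P 500) for the United States, the Shanghai Stock Exchange Index (SSE) for China, the Hang Seng Index (HSI) for Hong Kong, and the Taiwan Stock Exchange Weighted Index (TWII) for Taiwan. Besides these indices, we also select 100 US stocks for prediction. In addition to the common basic and technical features, this study applies natural language processing (NLP) techniques to perform sentiment analysis on financial news from the Wall Street Journal and Anue (鉅亨網) from 2019 to 2021. Specifically, we propose word-level and sentence-level features and further extract semantic features and financial sentiment polarity scores; these features cover different aspects of the news and carry rich information, which can improve the performance of stock price prediction. Among them, the novel financial sentiment polarity score is generated automatically by combining stock prices with news word frequencies. This polarity score captures the link between word sentiment and market trends without manual labeling of the news. In the subsequent analysis, the multivariate adaptive regression splines (MARS) model is used to build local autoregressive models that account for interactions between features and lag effects. The MARS model can also guard against the curse of dimensionality by selecting important features. Finally, six months of data are used to evaluate the prediction accuracy of the models and to further identify the features that are valuable for predicting price movements of the international financial indices.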
Financial technology (FinTech) aims to use current technologies to bring innovative applications to traditional financial services, thereby opening up more possibilities for the financial industry. In recent years, FinTech has attracted growing interest in finance, especially for forecasting stock price movements. The purpose of our study is to predict the rise and fall of four composite stock indices: the Standard and Poor's 500 Index (S&P 500), the Shanghai Stock Exchange Index (SSE), the Hang Seng Index (HSI), and the Taiwan Stock Exchange Weighted Index (TWII), for the United States, China, Hong Kong, and Taiwan, respectively. Besides these indices, 100 selected US stocks are also included. In addition to basic and technical features, we use natural language processing (NLP) to perform sentiment analysis on financial news from the Wall Street Journal and Anue (鉅亨網) from 2019 to 2021. We propose word-level, sentence-level, and semantic-level features of the news, together with financial sentiment polarity scores; these features capture different aspects of the news and carry rich information that can improve stock price prediction. Among them, the novel financial sentiment polarity score is calculated automatically by combining stock prices with news word frequencies. This polarity score captures the link between word sentiment and market trend without human labeling of the news. The multivariate adaptive regression splines (MARS) model is adopted to build local autoregressive models that account for interactions between features and for time lags. The MARS model can also select an important subset of features, which mitigates the curse of dimensionality. Prediction accuracy is evaluated on six months of data. We further identify the features that are valuable for predicting the price movements of each financial indicator.
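The abstract does not give the formula for the polarity score, so the following is only a minimal sketch of one way a score could be derived from price moves and word frequencies without human labels. The scoring rule (up-day count minus down-day count, divided by total count), the function names `polarity_scores` and `article_score`, the `min_count` parameter, and the toy data are all assumptions made for illustration, not the thesis's method.

```python
from collections import Counter


def polarity_scores(daily_tokens, daily_moves, min_count=1):
    """Hypothetical word polarity in [-1, 1].

    daily_tokens: list of token lists, one per trading day.
    daily_moves:  list of numbers; > 0 means the index rose that day,
                  < 0 means it fell, 0 means the day is ignored.
    Score = (count on up days - count on down days) / total count.
    """
    if len(daily_tokens) != len(daily_moves):
        raise ValueError("daily_tokens and daily_moves must have equal length")

    up, down = Counter(), Counter()
    for tokens, move in zip(daily_tokens, daily_moves):
        if move > 0:
            up.update(tokens)
        elif move < 0:
            down.update(tokens)

    scores = {}
    for word in set(up) | set(down):
        total = up[word] + down[word]
        if total >= min_count:  # drop words that are too rare to score
            scores[word] = (up[word] - down[word]) / total
    return scores


def article_score(tokens, scores):
    """Mean polarity of the scored words in one article; 0.0 if none are scored."""
    known = [scores[t] for t in tokens if t in scores]
    return sum(known) / len(known) if known else 0.0


# Toy data, invented for illustration only.
news = [["rally", "gains", "tariff"], ["tariff", "selloff"], ["rally", "record"]]
moves = [+1, -1, +1]
scores = polarity_scores(news, moves)
```

In this sketch a word seen only on rising days scores 1, a word seen only on falling days scores -1, and a word split evenly scores 0. An article-level value such as `article_score(tokens, scores)` could then serve as one input feature alongside the basic and technical features.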