
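Author: Lin, Ping-Lei (林品磊)
Thesis title: 基於LGBM機器學習之台灣加權股價指數預測
The forecasting of Taiwan Capitalization Weighted Stock Index based on the LGBM machine learning model
Advisor: Lee, Chiang (李強)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2020
Graduation academic year: 108 (ROC calendar, i.e. 2019-2020)
Language: English
Number of pages: 100
Keywords (Chinese): 台灣加權股價指數 (Taiwan Capitalization Weighted Stock Index), 機器學習 (machine learning), 指數預測 (index prediction), 遺傳演算法 (genetic algorithm), 模擬投資 (simulated investment), 技術指標 (technical indicators)
Keywords (English): machine learning, index prediction, genetic algorithm, simulated investment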
    The Taiwan Capitalization Weighted Stock Index has long been a subject of research, discussion, and prediction. It reflects Taiwan's overall economic condition and the trend of its stock market, and investors can profit from it directly by trading index futures. Accurate prediction of the index's future movement would therefore benefit individual investors, and it would also help the government evaluate economic policy and guard against sharp market swings in advance.

    The stock market is complex, and even minor events can move prices. This is especially true of Taiwan, a "shallow-dish" market that is easily swayed by outside shocks: any international news or stock market volatility in a major economy may affect it. Previous studies, however, considered only a partial set of factors, and none offered an analysis and prediction that integrates many factors. As a result, the accuracy of predicting the index has remained below 60%.

    To address this, this study considers a large and varied set of factors that may affect the index: numerous technical indicators, national stock indexes from around the world (broad-base indexes), international indexes, the three major institutional investors, and Taiwan's daily trading information. A genetic algorithm divides these features into groups, so that almost every feature can be learned by the model while interference among features is reduced. Finally, a two-stage merging method controls the accuracy obtained at different prediction ratios, giving investors more flexible investment choices. The prediction accuracy reported in this thesis is above 60% throughout.

    For the simulated investment, we designed three investment methods that use the predictions of this thesis to trade Taiwan Capitalization Weighted Stock Index futures, and we calculated the profit, the handling fees, and the average profit per trade obtained at different prediction rates and accuracy rates. The best investment method achieves a rate of return as high as 523%.
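    The trade-off between prediction ratio and accuracy mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's two-stage merging method, which the abstract does not describe in detail. It is a generic confidence-margin rule, assumed here only for illustration: the model abstains on days when its predicted probability of a rise is too close to 0.5. The function name, the `margin` parameter, and the sample data are all hypothetical.

```python
from typing import List, Tuple


def prediction_rate_and_accuracy(
    up_probabilities: List[float],
    actual_up: List[bool],
    margin: float,
) -> Tuple[float, float]:
    """Return (prediction rate, accuracy) for a given confidence margin.

    A day is predicted only when the model's probability of "up" is at
    least `margin` away from 0.5; all other days are skipped (no trade).
    """
    predicted = 0
    correct = 0
    for p_up, went_up in zip(up_probabilities, actual_up):
        if abs(p_up - 0.5) < margin:
            continue  # not confident enough: abstain on this day
        predicted += 1
        if (p_up > 0.5) == went_up:
            correct += 1
    rate = predicted / len(up_probabilities)
    accuracy = correct / predicted if predicted else 0.0
    return rate, accuracy


# Made-up probabilities and outcomes, for illustration only.
probs = [0.9, 0.55, 0.2, 0.48, 0.7, 0.35, 0.52, 0.1]
truth = [True, False, False, True, True, True, True, False]

for margin in (0.0, 0.1, 0.25):
    rate, acc = prediction_rate_and_accuracy(probs, truth, margin)
    print(f"margin={margin:.2f}  prediction rate={rate:.3f}  accuracy={acc:.3f}")
```

    On this toy data, widening the margin lowers the prediction rate and raises the accuracy, which is the kind of choice the abstract says is left to the investor. Real model outputs need not behave so cleanly.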

    Chinese Abstract
    Abstract
    Acknowledgements
    List of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
        1.1 Background and Motivation
        1.2 Purpose
        1.3 Forecast target
        1.4 Differences from previous papers
            1.4.1 Considering many factors that affect stock prices
            1.4.2 Classification and pre-processing of technical indicators
            1.4.3 Various prediction targets
            1.4.4 Prediction rate and accuracy rate
        1.5 Challenges
            1.5.1 Numerous features
            1.5.2 Classification of technical indicators
            1.5.3 Different forecast targets
        1.6 Research Framework
    Chapter 2. Related work
        2.1 Time series
            2.1.1 Hidden Markov Model [18]
            2.1.2 Autoregressive moving average model (ARMA) [20]
            2.1.3 Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) [21] [22]
        2.2 Text analysis
            2.2.1 Word2Vec
            2.2.2 Text sentiment analysis
        2.3 Machine Learning
            2.3.1 Single learner
            2.3.2 Ensemble learning
        2.4 Feature Selection
            2.4.1 Filter
            2.4.2 Wrapper
    Chapter 3. Methods
        3.1 Process architecture diagram
        3.2 Data Source
            3.2.1 Broad-base Indexes
            3.2.2 International Index
            3.2.3 Three Institutional Investors
            3.2.4 Daily Trading Information
        3.3 Technical Indexes & Classification
            3.3.1 Signal type
            3.3.2 Dependent type
            3.3.3 Independent type
        3.4 Data pre-processing
            3.4.1 Ups and downs signals
            3.4.2 Differential percentage
        3.5 Feature selection
        3.6 Multi-racial Genetic Algorithm
            3.6.1 Coding
            3.6.2 Generation of parent groups
            3.6.3 Calculation of fitness
            3.6.4 Selection and Copy
            3.6.5 Crossover
            3.6.6 Mutation
    Chapter 4. Experiments
        4.1 Environment
            4.1.1 Computer environment
            4.1.2 LGBM
        4.2 Paper comparison
        4.3 Experimental results
            4.3.1 Impact of data on accuracy
            4.3.2 Technical Index Classification and Pre-processing
            4.3.3 Genetic Algorithm
            4.3.4 First-order merge
            4.3.5 Second-order merge
        4.4 Simulated investment
            4.4.1 Taiwan Capitalization Weighted Stock Index Futures
            4.4.2 Trading hours
            4.4.3 Investment Strategy 1
            4.4.4 Investment Strategy 2
            4.4.5 Investment Strategy 3
    Chapter 5. Conclusions and future work
        5.1 Conclusions
        5.2 Future Work
    References
    Appendix

    [1] 鍾詠翔 and 林奕榮, 機器學習選股要顛覆投資圈 [Machine-learning stock picking is set to upend the investment world], 2019.
    [2] M. López de Prado. [Online]. Available: http://www.quantresearch.org/.
    [3] "TAROBO 基金盡職調查的專家 - 大拇哥投顧" [TAROBO, fund due-diligence experts], 23 December 2019. [Online]. Available: https://www.taroboadvisors.com/.
    [4] "BlackRock," 3 July 2020. [Online]. Available: https://www.blackrock.com/hk/en/investment-ideas/systematic-active-equity.
    [5] "道氏理論" [Dow theory], [Online]. Available: https://zh.wikipedia.org/wiki/%E9%81%93%E6%B0%8F%E7%90%86%E8%AE%BA.
    [6] M. R. Vargas, C. E. M. d. Anjos, G. L. G. Bichara and A. G. Evsukoff, "Deep Learning for Stock Market Prediction Using Technical Indicators and Financial News Articles," Proceedings of the International Joint Conference on Neural Networks, July 2018.
    [7] C.-H. Chen and P. Shih, "A Stock Trend Prediction Approach based on Chinese News and Technical Indicator Using Genetic Algorithms," 2019 IEEE Congress on Evolutionary Computation, CEC 2019 - Proceedings, pp. 1468-1472, 2019.
    [8] H. Yang, Y. Zhu and Q. Huang, "A multi-indicator feature selection for CNN-driven stock index prediction," Neural Information Processing - 25th International Conference, ICONIP, pp. 35-46, 2018.
    [9] Y. Hao and Y. Chen, "A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices prediction," Expert Systems with Applications, pp. 340-355, 1 September 2017.
    [10] M. Hagenau, M. Hauser, M. Liebmann and D. Neumann, "Reading all the news at the same time: Predicting mid-term stock price developments based on news momentum," Proceedings of the Annual Hawaii International Conference on System Sciences, pp. 1279-1288, 2013.
    [11] N. Rekabsaz, M. Lupu, A. Baklanov, A. Dür, L. Andersson and A. Hanbury, "Volatility Prediction using Financial Disclosures Sentiments with Word Embedding-based IR Models," Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp. 1712-1721, 2017.
    [12] Z. Zhao, R. Rao, S. Tu and J. Shi, "Time-weighted LSTM model with redefined labeling for stock trend prediction," International Conference on Tools with Artificial Intelligence, ICTAI, pp. 1210-1217, November 2017.
    [13] "TA-lib," [Online]. Available: https://mrjbq7.github.io/ta-lib/doc_index.html. [Accessed 3 July 2020].
    [14] "MoneyDJ理財網," [Online]. Available: https://www.moneydj.com/KMDJ/Blog/BlogArticleViewer.aspx?a=09a03094-02b4-4eb1-8982-000000058181. [Accessed 3 July 2020].
    [15] C. Zhang, C. Du, Y. Wang, H. Yin, C. Chen and H. Wang, "Stock assistant: A stock AI assistant for reliability modeling of stock comments," Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2710-2719, 19 July 2018.
    [16] J. H. Holland, "Genetic Algorithms and Adaptation," 1975.
    [17] "LightGBM," [Online]. Available: https://github.com/microsoft/LightGBM. [Accessed 3 July 2020].
    [18] E. L. Sonnhammer, G. von Heijne and A. Krogh, "A hidden Markov model for predicting transmembrane helices in protein sequences," Intelligent Systems for Molecular Biology, pp. 175-182, 1998.
    [19] M. Zhang, X. Jiang, Z. Fang, Y. Zeng and K. Xu, "High-order Hidden Markov Model for trend prediction in financial time series," Physica A: Statistical Mechanics and its Applications, pp. 1-12, May 2019.
    [20] M. F. Anaghi and Y. Norouz, "A model for stock price forecasting based on ARMA systems," 2012 2nd International Conference on Advances in Computational Tools for Engineering Applications (ACTEA), pp. 265-268, 12-15 December 2012.
    [21] T. Bollerslev, "Generalized autoregressive conditional heteroskedasticity," Journal of Econometrics, vol. 31, no. 3, pp. 307-327, 1986.
    [22] D.-A. Trottier and D. Ardia, "Moments of standardized Fernandez-Steel skewed distributions: Applications to the estimation of GARCH-type models," Finance Research Letters, pp. 311-316, August 2016.
    [23] Y. Wang, Z. Pan and C. Wu, "Volatility spillover from the US to international stock markets: A heterogeneous volatility spillover GARCH model," Journal of Forecasting, vol. 37, no. 3, pp. 385-400, April 2018.
    [24] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," 1997.
    [25] T. Xia, Q. Sun, A. Zhou, S. Wang, S. Xiong, S. Gao, J. Li and Q. Yuan, "Improving the performance of stock trend prediction by applying GA to feature selection," Proceedings - 8th IEEE International Symposium on Cloud and Services Computing, pp. 122-126, 2018.
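    [26] H.-Y. Fang, "國際股市對台灣股市影響之分析" [An analysis of the influence of international stock markets on the Taiwan stock market], 2005.
    [27] S.-T. Hsieh, "國際油價對塑化產業經營績效之影響" [The influence of international oil prices on the operating performance of the petrochemical industry], 2019.
    [28] Neo4j, "The Jaccard Similarity algorithm," [Online]. Available: https://neo4j.com/docs/graph-algorithms/current/labs-algorithms/jaccard/. [Accessed 22 July 2020].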

    Full text available on campus from: 2021-07-30
    Full text available off campus from: 2021-07-30