| Graduate student: | 吳翌暄 Wu, Yi-Hsuan |
|---|---|
| Thesis title: | 使用深度強化學習以投資人風險個性進行投資組合管理 Portfolio Management with Investor Risk Personality Using Deep Reinforcement Learning |
| Advisor: | 王惠嘉 Wang, Hei-Chia |
| Degree: | Master |
| Department: | 管理學院 - 工業與資訊管理學系碩士在職專班 Department of Industrial and Information Management (on-the-job class), College of Management |
| Year of publication: | 2022 |
| Graduation academic year: | 110 |
| Language: | Chinese |
| Number of pages: | 38 |
| Chinese keywords: | 深度強化學習、投資組合、A2C、PPO、AutoEncoder |
| English keywords: | Deep Reinforcement Learning, Portfolio, A2C, PPO, AutoEncoder |
In this era of high inflation and low interest rates, many people turn to financial investment, primarily in stocks, to increase their passive income. Since investment involves risk, and each investor's risk personality (risk tolerance) differs, some prefer to take on risk to pursue maximum returns, while others avoid risk in favor of stable returns. In short, investment is a trade-off between return and risk, and how much risk an investment should carry depends on the investor's capacity to bear it. This study therefore designs a portfolio-management approach based on investor risk personality. Stocks are first classified by Beta to identify those matching each investor personality, then scored using financial indicators and an AutoEncoder. Assets are then allocated using technical indicators, a covariance matrix, and the deep reinforcement learning algorithms A2C (Advantage Actor-Critic) and PPO (Proximal Policy Optimization), with the goal of finding well-performing portfolios best suited to investors of each risk type. In training, the A2C model achieved cumulative returns of 49.61% for the conservative type, 82.04% for the stable type, and 99.69% for the aggressive type; the PPO model achieved 39.92% for the conservative type, 89.89% for the stable type, and 85.61% for the aggressive type.
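The first step of the pipeline above, classifying stocks by Beta so they can be matched to a risk personality, can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the bucket names mirror the three investor types in the abstract, but the threshold values (0.8 and 1.2) are hypothetical assumptions chosen for illustration.

```python
import numpy as np

def beta(stock_returns, market_returns):
    """CAPM beta: Cov(r_stock, r_market) / Var(r_market)."""
    cov = np.cov(stock_returns, market_returns, ddof=1)
    return cov[0, 1] / cov[1, 1]

def risk_bucket(b, low=0.8, high=1.2):
    """Map a beta value to a risk-personality bucket.

    The cutoffs `low` and `high` are illustrative assumptions,
    not values taken from the thesis.
    """
    if b < low:
        return "conservative"   # moves less than the market
    if b <= high:
        return "stable"         # moves roughly with the market
    return "aggressive"         # amplifies market moves

# Example: a stock whose returns are twice the market's has beta 2.0
market = np.array([0.01, -0.02, 0.015, 0.005, -0.01])
print(risk_bucket(beta(2 * market, market)))  # aggressive
```

A low-beta stock would be offered to a conservative investor, a high-beta stock to an aggressive one; the scoring (AutoEncoder) and allocation (A2C/PPO) stages then operate within each bucket.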
English references:
Brim, A. (2020). Deep Reinforcement Learning Pairs Trading with a Double Deep Q-Network. In 2020 10th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0222-0227). IEEE, Las Vegas, NV, USA.
Chang, Y. H. & Lee, M. S. (2017). Incorporating Markov decision process on genetic algorithms to formulate trading strategies for stock markets. Applied Soft Computing, 52, 1143-1153.
Dai, Y. & Qin, Z. (2021). Multi-period uncertain portfolio optimization model with minimum transaction lots and dynamic risk preference. Applied Soft Computing, 109, 107519.
Fernandez, E., Navarro, J., Solares, E., & Coello, C. C. (2019). A novel approach to select the best portfolio considering the preferences of the decision maker. Swarm and Evolutionary Computation, 46, 140-153.
Fischer, T. G. (2018). Reinforcement learning in financial markets - a survey.
Fu, X., Du, J., Guo, Y., Liu, M., Dong, T., & Duan, X. (2018). A machine learning framework for stock selection. arXiv preprint arXiv:1806.01743.
Gunawan, A. A., Ashifa, S. B., Rumagit, R. Y., & Ngarianto, H. (2021, October). Development of Stock Market Price Application to Predict Purchase and Sales Decisions Using Proximal Policy Optimization Method. In 2021 1st International Conference on Computer Science and Artificial Intelligence (ICCSAI) (Vol. 1, pp. 431-437). IEEE, Jakarta, Indonesia.
Hajjami, M. & Amin, G. R. (2018). Modelling stock selection using ordered weighted averaging operator. International Journal of Intelligent Systems, 33(11), 2283-2292.
Harnpadungkij, T., Chaisangmongkon, W., & Phunchongharn, P. (2019). Risk-Sensitive Portfolio Management by using Distributional Reinforcement Learning. In 2019 IEEE 10th International Conference on Awareness Science and Technology (iCAST) (pp. 1-6). IEEE, Morioka, Japan.
Jin, O. & El-Saawy, H. (2016). Portfolio management using reinforcement learning. Stanford University.
Katongo, M., & Bhattacharyya, R. (2021). The Use of Deep Reinforcement Learning in Tactical Asset Allocation. Available at SSRN 3812609.
Li, Y. M., Lin, L. F., Hsieh, C. Y. & Huang, B. S. (2021). A social investing approach for portfolio recommendation. Information & Management, 58(8), 103536.
Murphy, J. J. (1999). Technical analysis of the financial markets: A comprehensive guide to trading methods and applications. Penguin.
Park, H., Sim, M. K., & Choi, D. G. (2020). An intelligent financial portfolio trading strategy using deep Q-learning. Expert Systems with Applications, 158, 113573.
Silva, A., Neves, R., & Horta, N. (2015). A hybrid approach to portfolio composition based on fundamental and technical indicators. Expert Systems with Applications, 42(4), 2036-2048.
Soleymani, F. & Paquet, E. (2020). Financial portfolio optimization with online deep reinforcement learning and restricted stacked autoencoder—DeepBreath. Expert Systems with Applications, 156, 113456.
Vishal, M., Satija, Y., & Babu, B. S. (2021, December). Trading Agent for the Indian Stock Market scenario using Actor-Critic based Reinforcement Learning. In 2021 IEEE International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS) (pp. 1-5). IEEE, Bangalore, India.
Wei, D. (2019). Prediction of stock price based on LSTM neural network. In 2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM) (pp. 544-547). IEEE, Dublin, Ireland.
Yang, F., Chen, Z., Li, J., & Tang, L. (2019). A novel hybrid stock selection method with stock prediction. Applied Soft Computing, 80, 820-831.
Yu, L., Hu, L., & Tang, L. (2016). Stock selection with a novel sigmoid-based mixed discrete-continuous differential evolution algorithm. IEEE Transactions on Knowledge and Data Engineering, 28(7), 1891-1904.
Yu, J. R., Chiou, W. P., Hung, C. H., Dong, W. K., & Chang, Y. H. (2022). Dynamic rebalancing portfolio models with analyses of investor sentiment. International Review of Economics & Finance, 77, 1-13.
Zai, A. & Brown, B. (2020). Deep reinforcement learning in action. Manning Publications.
Zhang, Y. J. & Ma, S. J. (2019). How to effectively estimate the time-varying risk spillover between crude oil and stock markets? Evidence from the expectile perspective. Energy Economics, 84, 104562.
Chinese references:
楊晴穎 (2020). 深度強化學習於投資組合管理交易策略 [Deep reinforcement learning trading strategies for portfolio management]. Master's thesis, National Taipei University.
Web resources:
DailyView網路溫度計 (2021). Retrieved 2021, from https://www.chinatimes.com/realtimenews/20211222004141-260410?chdtv
Haring, M. (2020). Retrieved 2020, from http://desres18.netornot.at/id/reinforcement-learning-lost-chapters-proximal-policy-optimization/
中時新聞網 (2022). Retrieved 2022, from https://www.chinatimes.com/cn/realtimenews/20210609000096-261502?chdtv
台部落 (2020). Retrieved 2020, from https://www.twblogs.net/a/5eeba20a33cbe858769e2493
台灣股市交易網 (2022). 財務評分表 [Financial score table]. Retrieved from https://goodinfo.tw/tw/StockFinGrade.asp?STOCK_ID=2330
經濟日報 (2021). Retrieved from https://money.udn.com/money/story/5613/5301504
遠見 (2021). Retrieved 2021, from https://www.gvm.com.tw/article/83734
On-campus access: not publicly available