| Graduate student: | 洪偉閔 Hung, Wei-Ming |
|---|---|
| Thesis title: | 基於機率分群之模糊時間序列預測模型 A Probabilistic Clustering Based Forecasting Model for Fuzzy Time Series |
| Advisor: | 李昇暾 Li, Sheng-Tun |
| Degree: | Master |
| Department: | College of Management - Department of Industrial and Information Management |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 |
| Language: | Chinese |
| Number of pages: | 67 |
| Keywords (Chinese, translated): | fuzzy time series, SAX, LDA, probabilistic clustering, stock price forecasting |
| Keywords (English): | time series, SAX, LDA, Clustering, Stock |
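With the arrival of the big data era, data volumes keep surging over time, and more and more methods have been developed to process data streams. Time series classification has been developing for years, and as various fields pay it increasing attention, time series methods have gradually matured. Effectively mining useful information from time series data, or improving forecasting accuracy, has therefore become increasingly important. Traditional time series methods, however, cannot effectively handle linguistically ambiguous data. To handle linguistic data in uncertain environments, this study proposes a method that combines the LDA (Latent Dirichlet Allocation) model with fuzzy time series, so that the fuzzification step rests on a statistical basis. Moreover, in today's era of exploding data, speed and timeliness matter as well as accuracy, so the data processing stage incorporates SAX (Symbolic Aggregate approXimation), a feature extraction method that maps the original high-dimensional data to a low-dimensional space while preserving the data's features, which facilitates subsequent analysis. This step allows model training to process large datasets more quickly and effectively.
The main goal of this study is to find a way of processing data that balances accuracy and efficiency. Given that the LDA model has been applied successfully to the analysis of massive document collections and is capable of mining topics from time series data, we extend LDA's probabilistic clustering approach to fuzzy time series forecasting and propose a fuzzy time series forecasting method based on a probabilistic clustering model. The method is intended to serve the differing needs of various fields for fuzzy time series and to improve on conventional fuzzy time series forecasting models.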
The age of big data is arriving, and it is important to obtain useful information from big data effectively and with good accuracy. Time series analysis has been studied for years to address this problem. However, traditional time series models handle fuzzy or incomplete data inefficiently. This study proposes combining LDA (Latent Dirichlet Allocation) with a fuzzy time series model to handle uncertain data well. For data preprocessing, this study introduces SAX (Symbolic Aggregate approXimation) to extract the features of the data. SAX reduces dimensionality while preserving the data's characteristics. This preprocessing lets the model handle huge datasets efficiently and makes analysis more convenient.
The purpose of this study is to develop a model that can perform data mining with high accuracy and efficiency. In view of the successful application of the LDA model to text mining, this study extends it to fuzzy time series forecasting, providing a decision model for a range of fields and improving on the traditional time series model.
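As a rough illustration of the SAX preprocessing step mentioned in the abstract, the sketch below follows the usual SAX recipe: z-normalize the series, average it over equal-width segments, and map each segment mean to a symbol using equiprobable breakpoints of the standard normal distribution. The function name `sax`, the segment count, and the four-letter alphabet are assumptions chosen for illustration; the abstract does not state the parameters the thesis actually used.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series to a SAX word (illustrative sketch only)."""
    n = len(series)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")

    # Step 1: z-normalize; a constant series is mapped to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: Piecewise Aggregate Approximation (mean of each segment).
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # Step 3: breakpoints splitting N(0, 1) into equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Step 4: each segment mean becomes the symbol of its region.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# Hypothetical input: eight increasing values reduced to a 4-symbol word.
word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
print(word)
```

The resulting word is much shorter than the original series, which is the dimensionality reduction the abstract refers to; symbol sequences of this kind could then be treated as "words" by a topic model such as LDA, though how the thesis connects the two steps is not described here.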
On-campus access: publicly available from 2022-12-31