
Author: Lin, Chia-Hsuan (林佳萱)
Thesis Title: An Attention-Based Extracting Multivariate Features and Time Method for Time Series Forecasting Problem (一套基於注意力可解決時間序列預測問題的多元特徵與時間萃取方法)
Advisor: Cheng, Sheng-Tzong (鄭憲宗)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2021
Graduation Academic Year: 109 (2020-2021)
Language: English
Number of Pages: 34
Keywords: time series forecasting problems, deep learning, attention mechanism (時間序列預測問題、深度學習、注意力機制)
Views: 104; Downloads: 4
Abstract (Chinese): In recent years, many deep learning architectures for time series forecasting have adopted attention mechanisms. The Dual-Stage Attention-Based Recurrent Neural Network embeds an attention mechanism into a Long Short-Term Memory model, while the Attention-Based SeriesNet combines dilated depthwise separable temporal convolutional networks with an attention mechanism and pairs an LSTM with a hidden state attention module. Both reduce forecasting error by incorporating attention. However, both attend to all feature series at all time points simultaneously, so only broad correlations can be captured during feature extraction. We therefore propose attending to the time axis and to the non-target feature series separately, which refines feature extraction. This thesis presents three model architectures built on the Long-term Recurrent Convolutional Network, the self-attention mechanism, and the dilated causal convolutional neural network. Using weather and stock forecasting datasets, we show that our method achieves lower error than other attention-based deep learning models.

Abstract (English): Many deep learning architectures for time series forecasting have adopted attention mechanisms in recent years. The Dual-Stage Attention-Based Recurrent Neural Network (DA-RNN) embeds an attention mechanism into a Long Short-Term Memory (LSTM) network, while the Attention-Based SeriesNet (A-SeriesNet) combines Dilated Depthwise Separable Temporal Convolutional Networks (DDSTCNs) with an attention mechanism and pairs an LSTM with a Hidden State Attention Module (HSAM). Both reduce prediction error by adopting attention. However, both attend to all feature series at all time points simultaneously, so only coarse correlations can be captured while extracting features. We therefore propose focusing on the time axis and on the non-target feature series separately to achieve finer-grained feature extraction. This thesis proposes three model architectures using the Long-term Recurrent Convolutional Network (LRCN), the self-attention mechanism, and the Dilated Causal Convolutional Neural Network (DC-CNN). On weather and stock forecasting datasets, our method achieves lower errors than other attention-based deep learning models.
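To make the "type & time" idea concrete, the sketch below applies standard scaled dot-product self-attention once along the time axis and once along the feature (type) axis of a toy multivariate series. This is only a minimal illustration of attending to the two axes separately, not the thesis's actual architecture: identity Q/K/V projections are assumed, and the names (self_attention, time_context, type_context) are hypothetical.

```python
# Minimal NumPy sketch of "type & time" attention: attend along the time
# axis and along the feature (type) axis separately. Illustrative only;
# shapes and names are assumptions, not the thesis's implementation.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Scaled dot-product self-attention over the rows of x (shape (n, d)).

    Identity Q/K/V projections keep the sketch short; a real model
    would learn these projections.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)          # (n, n) pairwise similarities
    return softmax(scores, axis=-1) @ x    # attention-weighted sum of rows

T, F = 10, 4                               # T time steps, F feature series
series = np.random.randn(T, F)

time_context = self_attention(series)      # attends across the T time steps
type_context = self_attention(series.T).T  # attends across the F feature series

# Both context tensors keep the original (T, F) shape, so they can be
# combined (e.g., summed) before a downstream forecasting layer.
combined = time_context + type_context
print(combined.shape)                      # (10, 4)
```

Attending along each axis separately yields one attention map over time steps and another over feature series, rather than a single large map that mixes both, which is the intuition behind the refined feature extraction described above.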

Table of Contents:
Abstract (Chinese)
Abstract (English)
Acknowledgment
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction and Motivation
Chapter 2. Background and Related Work
    2.1 Time Series Forecasting
    2.2 Attention Mechanism
Chapter 3. Approach
    3.1 Problem Description and Definition
    3.2 Type & Time Extracting Method
        3.2.1 Type Convolution and Time Convolution
        3.2.2 Self-Attention
    3.3 Type & Time CNN-LSTM
        3.3.1 LSTM-Based Encoder-Decoder
    3.4 Type & Time Position-Encoding
        3.4.1 Position-Encoding
    3.5 Type & Time DC-CNN
        3.5.1 Dilated Causal Convolution
Chapter 4. Implementation and Experiments
    4.1 Datasets and Environment
        4.1.1 Datasets
        4.1.2 Environment
    4.2 Training Procedure and Evaluation Metrics
    4.3 Experimental Results
        4.3.1 Forecasting One Time Point
        4.3.2 Forecasting Five Time Points
Chapter 5. Conclusions and Future Work
    5.1 Conclusions
    5.2 Future Work
References

References:
    [1] Vaswani, Ashish, et al. "Attention Is All You Need." arXiv preprint arXiv:1706.03762 (2017).
    [2] Qin, Yao, et al. "A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction." Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI'17). AAAI Press, 2017. 2627-2633.
    [3] Cheng, Yepeng, et al. "Attention-Based SeriesNet: An Attention-Based Hybrid Neural Network Model for Conditional Time Series Forecasting." Information 11.6 (2020): 305. https://doi.org/10.3390/info11060305
    [4] Woo, Sanghyun, et al. "CBAM: Convolutional Block Attention Module." Proceedings of the European Conference on Computer Vision (ECCV). 2018. 3-19.
    [5] Ioffe, Sergey, and Christian Szegedy. "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift." International Conference on Machine Learning 37 (2015): 448-456.
    [6] Borovykh, Anastasia, Sander Bohte, and Cornelis W. Oosterlee. "Dilated Convolutional Neural Networks for Time Series Forecasting." Journal of Computational Finance (2018).
    [7] Shen, Zhipeng, et al. "SeriesNet: A Generative Time Series Forecasting Model." 2018 International Joint Conference on Neural Networks (IJCNN). IEEE, 2018.
    [8] Cho, Kyunghyun, et al. "On the Properties of Neural Machine Translation: Encoder-Decoder Approaches." arXiv preprint arXiv:1409.1259 (2014).
    [9] Alsharif, M. H., M. K. Younes, and J. Kim. "Time Series ARIMA Model for Prediction of Daily and Monthly Average Global Solar Radiation: The Case Study of Seoul, South Korea." Symmetry 11.2 (2019): 240. https://doi.org/10.3390/sym11020240
    [10] Wangdi, Kinley, et al. "Development of Temporal Modelling for Forecasting and Prediction of Malaria Infections Using Time-Series and ARIMAX Analyses: A Case Study in Endemic Districts of Bhutan." Malaria Journal 9.1 (2010): 1-9.
    [11] Wang, Ping, et al. "A Novel Hybrid-Garch Model Based on ARIMA and SVM for PM2.5 Concentrations Forecasting." Atmospheric Pollution Research 8.5 (2017): 850-860.
    [12] Xu, Shuojiang, Hing Kai Chan, and Tiantian Zhang. "Forecasting the Demand of the Aviation Industry Using Hybrid Time Series SARIMA-SVR Approach." Transportation Research Part E: Logistics and Transportation Review 122 (2019): 169-180.
    [13] Büyükşahin, Ümit Çavuş, and Şeyda Ertekin. "Improving Forecasting Accuracy of Time Series Data Using a New ARIMA-ANN Hybrid Method and Empirical Mode Decomposition." Neurocomputing 361 (2019): 151-163.
    [14] Sadek, Ramzi M., et al. "Parkinson's Disease Prediction Using Artificial Neural Network." (2019).
    [15] Ren, Lei, et al. "Prediction of Bearing Remaining Useful Life with Deep Convolution Neural Network." IEEE Access 6 (2018): 13041-13049.
    [16] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long Short-Term Memory." Neural Computation 9.8 (1997): 1735-1780.
    [17] Chung, Junyoung, et al. "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling." arXiv preprint arXiv:1412.3555 (2014).
    [18] Xue, Hao, Du Q. Huynh, and Mark Reynolds. "SS-LSTM: A Hierarchical LSTM Model for Pedestrian Trajectory Prediction." 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.
    [19] Li, Chaoshun, et al. "Short-Term Wind Speed Interval Prediction Based on Ensemble GRU Model." IEEE Transactions on Sustainable Energy 11.3 (2019): 1370-1380.
    [20] Kim, Tae-Young, and Sung-Bae Cho. "Predicting Residential Energy Consumption Using CNN-LSTM Neural Networks." Energy 182 (2019): 72-81.
    [21] Xu, Kelvin, et al. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." International Conference on Machine Learning. PMLR, 2015.
    [22] Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014).
    [23] Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural Turing Machines." arXiv preprint arXiv:1410.5401 (2014).
    [24] Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective Approaches to Attention-Based Neural Machine Translation." arXiv preprint arXiv:1508.04025 (2015).
    [25] Cheng, Yepeng, and Yasuhiko Morimoto. "Triple-Stage Attention-Based Multiple Parallel Connection Hybrid Neural Network Model for Conditional Time Series Forecasting." IEEE Access 9 (2021): 29165-29179.

Full-text access: on campus from 2022-08-31; off campus from 2022-08-31.