
Author: Cheng, Yi-Hong (鄭翊宏)
Thesis Title: DPFormer: Dynamic Patching for Time Series Forecasting
Chinese Title: DPFormer:基於動態批次的時間序列預測
Advisor: Li, Cheng-Te (李政德)
Degree: Master's
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2025
Graduation Academic Year: 113
Language: English
Number of Pages: 47
Keywords: Time-series forecasting, Dynamic patching, Transformer, Cross-attention, Adaptive temporal modeling

    Time-series forecasting has seen significant improvements through Transformer-based models. Recent advances, such as patching strategies, allow models to better capture structured information from temporal data. However, most existing methods rely on a fixed patch size, which can be suboptimal across diverse datasets due to variations in temporal patterns and event granularity. In this paper, we propose DPFormer, a novel dynamic patching framework that adapts patch sizes based on the characteristics of input data. Our method leverages a learnable selection mechanism to assign dynamic importance scores to patches of varying sizes, enabling more flexible and effective temporal representation learning. Furthermore, we introduce a cross-attention block and a mixing module to integrate patch-level and global contextual features, improving the model’s forecasting capacity. Extensive experiments on seven real-world datasets demonstrate that DPFormer achieves robust and stable performance across multiple forecasting scenarios, outperforming or matching strong baselines without manual tuning of patch sizes.
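To make the dynamic-patching idea concrete, below is a minimal sketch of how a learnable selection mechanism could score a pool of candidate patch sizes and weight their embeddings. The module name `DynamicPatchEmbedding`, the candidate pool `(8, 16, 32)`, and the mean-pooling of patch tokens are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicPatchEmbedding(nn.Module):
    """Embed a lookback window at several candidate patch sizes and
    weight each size by a learned importance score (illustrative sketch)."""

    def __init__(self, lookback: int, d_model: int, patch_sizes=(8, 16, 32)):
        super().__init__()
        self.patch_sizes = patch_sizes
        # One linear patch embedder per candidate patch size.
        self.embedders = nn.ModuleList([nn.Linear(p, d_model) for p in patch_sizes])
        # Learnable scorer: one importance logit per candidate patch size,
        # computed from the raw lookback window.
        self.scorer = nn.Linear(lookback, len(patch_sizes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, lookback) -- a single univariate channel for simplicity.
        B, L = x.shape
        scores = F.softmax(self.scorer(x), dim=-1)        # (B, num_sizes)
        per_size = []
        for i, p in enumerate(self.patch_sizes):
            pad = (-L) % p                                 # right-pad so L splits evenly
            patches = F.pad(x, (0, pad)).unfold(1, p, p)   # (B, num_patches, p)
            tokens = self.embedders[i](patches)            # (B, num_patches, d_model)
            pooled = tokens.mean(dim=1)                    # (B, d_model) summary per size
            per_size.append(scores[:, i:i + 1] * pooled)   # weight by importance score
        return torch.stack(per_size, dim=1).sum(dim=1)     # (B, d_model)


# Example: a 96-step lookback window embedded into a 64-dimensional representation.
emb = DynamicPatchEmbedding(lookback=96, d_model=64)
print(emb(torch.randn(32, 96)).shape)  # torch.Size([32, 64])
```

In a full forecaster, the size-specific token sequences would typically be passed on to the cross-attention and mixing stages described in the abstract rather than collapsed into a single vector as done here for brevity.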

    Table of Contents:
    Chinese Abstract
    Abstract
    Contents
    List of Tables
    List of Figures
    1 Introduction
      1.1 Background
        1.1.1 MLP-based models
        1.1.2 RNN-based models
        1.1.3 CNN-based models
        1.1.4 Transformer-based models
      1.2 Motivation and Challenges
      1.3 Contributions
    2 Related Works
      2.1 Patchify in Time Series Forecasting
      2.2 Multi-Patch in Time Series Forecasting
    3 Methodology
      3.1 Notation
      3.2 Problem Definition
      3.3 Structure Overview
      3.4 Dynamic Patch Embedding Block
        3.4.1 Patch Selection
        3.4.2 Patch and Global Embedding
      3.5 Cross-Attention Block
        3.5.1 Cross-Attention Layer
        3.5.2 Patch Size and Number Mixer
      3.6 Forecasting Output and Loss
    4 Experiments
      4.1 Datasets
      4.2 Experimental Details
      4.3 Main Results
      4.4 Ablation Study
        4.4.1 Lookback Window
        4.4.2 Patching Pool
        4.4.3 Static/Dynamic Ablation
        4.4.4 Mixer Ablation
        4.4.5 Alpha Ablation
        4.4.6 Case Study
        4.4.7 Data Visualization
    5 Conclusions
    References
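Sections 3.4-3.5 above refer to a cross-attention block that integrates patch-level features with global context. The sketch below shows one plausible form of that fusion, assuming a single global summary token that attends over the patch tokens; the head count, residual placement, and tensor shapes are assumptions for illustration, not the thesis's exact design.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse patch-level tokens with a global context token via cross-attention
    (illustrative sketch)."""

    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, patch_tokens: torch.Tensor, global_token: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, num_patches, d_model) -- local, patch-level features.
        # global_token: (batch, 1, d_model) -- a summary of the whole lookback window.
        # The global token acts as the query, so local detail from the patches
        # is aggregated into the global representation.
        fused, _ = self.attn(query=global_token, key=patch_tokens, value=patch_tokens)
        return self.norm(global_token + fused)  # residual connection + layer norm


# Example: fuse 12 patch tokens with one global token in a 64-dimensional space.
fusion = CrossAttentionFusion(d_model=64)
out = fusion(torch.randn(8, 12, 64), torch.randn(8, 1, 64))
print(out.shape)  # torch.Size([8, 1, 64])
```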


    Full-text availability: immediately available on-campus and off-campus.