
Author: 呂濬琟 (Lu, Chun-Wei)
Title: 應用Transformer基礎架構下以k-Shape改善自監督式學習之地下水位的補遺與預測
(Application of the k-Shape clustering with the Transformer-based model to improve the self-supervised learning of groundwater level imputation and prediction)
Advisor: 羅偉誠 (Lo, Wei-Cheng)
Degree: Master
Department: Department of Hydraulic & Ocean Engineering, College of Engineering
Year of Publication: 2021
Graduation Academic Year: 109 (ROC calendar)
Language: Chinese
Pages: 92
Keywords (Chinese): k-Shape, Transformer, groundwater imputation, groundwater prediction
Keywords (English): k-Shape, Transformer, Imputation, Groundwater prediction
Abstract:
    In response to the increasing frequency of extreme weather events, the management of groundwater resources has become an important issue. Groundwater level prediction can serve as a preliminary basis for assessing groundwater resources; however, missing values in groundwater level records make such downstream tasks difficult. Traditional imputation methods must survey wells around the gauged site whose groundwater levels vary similarly, and use them as the basis for reconstructing the missing data. Machine learning has advanced considerably in recent years: unsupervised learning generalizes by automatically identifying features in the data and can capture patterns that are hard for humans to recognize, while the Transformer departs from the traditional recurrent neural network architecture and has demonstrated strong capability on sequential data such as natural language. This thesis therefore applies k-Shape time series clustering to observation records of the first aquifer of the Pingtung Plain, uses self-supervised learning to alleviate the shortage of labeled data, and builds both a data imputation model and a water level prediction model on the Transformer architecture. The results show that k-Shape effectively identifies long-term similarity among the series, and that the imputation model built on the Transformer encoder is able to capture the sequence features of different observation wells. Finally, the prediction results based on the original Transformer architecture show that it memorizes long-range data better than traditional recurrent neural networks and effectively captures the relationship between rainfall and groundwater levels.
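
    As a concrete illustration of the clustering step, the sketch below is a minimal, hypothetical Python example built on the tslearn library's KShape implementation rather than the thesis's own code: the random `levels` array merely stands in for the first-aquifer observation records, and n_clusters=3 is an arbitrary choice, not the k selected in Chapter 4.

        import numpy as np
        from tslearn.clustering import KShape
        from tslearn.preprocessing import TimeSeriesScalerMeanVariance

        # Hypothetical stand-in for the observation-well records:
        # 20 wells, one year of daily groundwater levels each.
        rng = np.random.default_rng(0)
        levels = rng.standard_normal((20, 365))

        # k-Shape compares series by a normalized cross-correlation
        # distance, so each well's record is z-normalized first.
        X = TimeSeriesScalerMeanVariance().fit_transform(levels)

        # Cluster the wells by shape; in practice k would be chosen
        # with a cluster-validity procedure, not fixed in advance.
        ks = KShape(n_clusters=3, random_state=0)
        labels = ks.fit_predict(X)
        print(labels)                      # cluster assignment of each well
        print(ks.cluster_centers_.shape)   # centroid series, one per cluster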

Extended Abstract in English:
    With the increasing frequency of extreme weather events, effective water resource management has become one of the most crucial issues. The prediction of groundwater level change can serve as a preliminary assessment for groundwater resources management. However, missing values in groundwater level records complicate downstream applications. Traditional time series imputation methods require investigating the similarity of the time series that are used to reconstruct the missing data. Recently, machine learning has made considerable progress in data mining, especially in unsupervised learning techniques. Furthermore, unlike the Recurrent Neural Network (RNN), a novel architecture called the Transformer has become the basis of multiple state-of-the-art models for sequential data. In this thesis, we first adopt a shape-based time series clustering method, k-Shape, an efficient unsupervised clustering algorithm, to cluster the spatiotemporal groundwater levels. Second, based on the clustering results, we build a self-supervised Transformer encoder as the imputation model. The results show that the k-Shape algorithm improves imputation ability when the inputs are drawn from different clusters. After reconstructing the missing groundwater levels, the Transformer is used to predict future groundwater levels. Finally, we compare the Transformer with the Gated Recurrent Unit (GRU) and find that the Transformer captures more features and remains more accurate as the number of prediction time steps increases.
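
    The self-supervised imputation idea (mask part of a series, train an encoder to reconstruct it) can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed sizes (sequence length 128, model width 64, roughly 15% masking), not the hyperparameters reported in Section 4.2.4, and the toy batch stands in for real well records.

        import torch
        import torch.nn as nn

        class MaskedImputer(nn.Module):
            def __init__(self, seq_len=128, d_model=64, nhead=4, num_layers=2):
                super().__init__()
                self.input_proj = nn.Linear(1, d_model)   # scalar level -> d_model
                self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))
                layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers)
                self.head = nn.Linear(d_model, 1)         # d_model -> scalar level

            def forward(self, x):                         # x: (batch, seq_len, 1)
                h = self.input_proj(x) + self.pos_emb     # add learned positions
                return self.head(self.encoder(h))

        model = MaskedImputer()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)

        series = torch.randn(8, 128, 1)                   # toy batch of well records
        mask = torch.rand(8, 128, 1) < 0.15               # hide ~15% of timesteps
        corrupted = series.masked_fill(mask, 0.0)         # zero out "missing" values

        opt.zero_grad()
        recon = model(corrupted)
        loss = ((recon - series)[mask] ** 2).mean()       # MSE on masked positions only
        loss.backward()
        opt.step()

    Training on such masked copies needs no labels beyond the series itself, which is the self-supervised trick the abstracts describe; at inference time, the genuinely missing entries take the place of the artificial mask.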

Table of Contents:
    Abstract i
    Extended Abstract in English ii
    Acknowledgements vii
    Table of Contents viii
    List of Tables x
    List of Figures xi
    Chapter 1 Introduction 1
    1.1 Motivation and Objectives 1
    1.2 Literature Review 3
    1.2.1 Groundwater Data Imputation 3
    1.2.2 Groundwater Level Prediction Methods 4
    1.3 Research Procedure 5
    Chapter 2 Methodology 7
    2.1 Time Series Clustering 7
    2.1.1 Dynamic Time Warping (DTW) 9
    2.1.2 k-Shape 11
    2.2 Artificial Neural Network (ANN) 18
    2.2.1 Loss Function 20
    2.2.2 Optimization 21
    2.2.3 Backpropagation 23
    2.2.4 Activation Function 24
    2.3 Recurrent Neural Network (RNN) 26
    2.3.1 Long Short-Term Memory (LSTM) 27
    2.3.2 Gated Recurrent Unit (GRU) 28
    2.4 Seq2seq Model 30
    2.4.1 Attention Mechanism 31
    2.4.2 Self-Attention 35
    2.4.3 Transformer 38
    2.4.4 Self-Supervised Learning (SSL) 41
    2.4.5 Environment Setup 41
    Chapter 3 Study Area 42
    3.1 Overview of the Pingtung Plain 42
    3.2 Geological Distribution 44
    3.3 Hydrogeological Conditions 47
    3.4 Distribution of Observation Wells 51
    Chapter 4 Model Analysis and Discussion of Results 52
    4.1 k-Shape Clustering Analysis 52
    4.1.1 Data Preprocessing for k-Shape 52
    4.1.2 Procedure for Determining k 53
    4.1.3 Optimal k-Shape Clustering of Stations without Missing Data 60
    4.2 Groundwater Data Imputation 61
    4.2.1 Data Preprocessing for Imputation 61
    4.2.2 Self-Supervised Application 62
    4.2.3 Transformer Encoder 63
    4.2.4 Transformer Encoder Hyperparameter Settings 65
    4.2.5 Validation of Imputation Results 65
    4.2.6 Case Studies of Actual Imputation 69
    4.2.7 Discussion of k-Shape Results after Imputation 74
    4.3 Groundwater Prediction 77
    4.3.1 Data Preprocessing for Prediction 77
    4.3.2 Transformer Hyperparameter Settings 79
    4.3.3 Discussion of Prediction Results 80
    Chapter 5 Conclusions and Suggestions 86
    References 88


Full-text availability: open access immediately (on campus and off campus)