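Graduate student: 蔡沛宏
Tsai, PeiHung
Thesis title: 判別分析與SMOTE應用於建構乾旱預警模式之研究
A Study on Constructing Drought Early Warning Models by Using Discriminant Analysis and SMOTE
Advisor: 游保杉
Yu, Pao-Shan
Degree: Master
Department: College of Engineering - Department of Hydraulic & Ocean Engineering
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: Chinese
Number of pages: 104
Keywords (translated from Chinese): drought early warning model, linear discriminant analysis, kernel Fisher discriminant analysis, SMOTE
Keywords (English): drought early warning models, standardized drought indexes, water resources statuses, LDA, KFDA, SMOTE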
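  • This study combines discriminant analysis with the Synthetic Minority Oversampling Technique (SMOTE) to develop reservoir drought early warning models that predict the water resources status ("normal" or "watch") one month and three months ahead, as a decision-support tool for drought response. The study sites are Shimen Reservoir, Techi Reservoir, and Nanhua Reservoir, located in northern, central, and southern Taiwan, respectively.
    The workflow first forms combinations of input variables from standardized indices on different time scales (the standardized precipitation index, the standardized streamflow index, and the standardized reservoir storage index); the output variable is the water resources status ("normal" or "watch"). Linear discriminant analysis (LDA) and kernel Fisher discriminant analysis (KFDA) are then applied to relate the input variables to the output variable. The best input combination and discriminant method are selected by their performance on the training and test sets under two criteria: the overall classification accuracy averaged over all months, and the "watch" classification accuracy averaged over dry-season months. SMOTE is then introduced to mitigate the imbalance in class sample sizes and to test whether it further raises classification accuracy. Finally, the best drought early warning model is determined and recommended for each reservoir.
    The results show that the average "watch" classification accuracy in dry-season months is good on both the training and test sets. On the test set, the accuracy and best model for predicting the status one month ahead are 0.94 for Shimen Reservoir (KFDA with SMOTE), 0.92 for Techi Reservoir (LDA with SMOTE), and 0.9 for Nanhua Reservoir (KFDA). For three months ahead, they are 0.89 for Shimen Reservoir (KFDA with SMOTE), 0.93 for Techi Reservoir (KFDA with SMOTE), and 0.85 for Nanhua Reservoir (LDA).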
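The oversampling step can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the sample values are assumptions made for illustration; the record does not describe the thesis's actual implementation or settings.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical standardized-index vectors for "watch" months (illustrative values only).
watch = [[-1.2, -0.8], [-1.5, -1.1], [-0.9, -1.4], [-1.8, -0.6]]
new_samples = smote(watch, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real "watch" samples, the new samples stay inside the range already covered by the minority class.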

    This study aims to develop drought early warning models based on linear discriminant analysis (LDA) and kernel Fisher discriminant analysis (KFDA), combined with the Synthetic Minority Oversampling Technique (SMOTE), for the Shimen, Techi, and Nanhua Reservoirs in Taiwan. The proposed models predict the water resources status (i.e., “normal” or “watch”) of each reservoir 1 month and 3 months ahead.
    Various standardized drought indexes (SDIs) on different time scales, including the standardized precipitation index (SPI), standardized streamflow index (SSI), and standardized reservoir storage index (SRSI), were used as the model input variables. The water resources status (i.e., “normal” or “watch”) was used as the model output variable. An exhaustive search was used to find the optimal combination of input variables.
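As a rough illustration of how an exhaustive search over input combinations can be paired with a discriminant classifier, the sketch below scores every non-empty subset of candidate inputs with a two-class Fisher LDA. The helper names, the small ridge term, the use of in-sample accuracy as the score, and all data values are assumptions for illustration only; the thesis evaluated combinations on separate training and test sets with its own criteria.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_lda(x, y, ridge=1e-6):
    """Two-class Fisher LDA: w = Sw^-1 (m1 - m0), cut at the projected midpoint."""
    d = len(x[0])
    groups = {c: [r for r, t in zip(x, y) if t == c] for c in (0, 1)}
    mean = {c: [sum(col) / len(g) for col in zip(*g)] for c, g in groups.items()}
    # Within-class scatter matrix, with a small ridge (assumed) for stability.
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for c, g in groups.items():
        for r in g:
            dev = [v - mu for v, mu in zip(r, mean[c])]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += dev[i] * dev[j]
    w = solve(sw, [a - b for a, b in zip(mean[1], mean[0])])
    cut = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mean[0], mean[1]))
    return w, cut

def predict(model, x):
    w, cut = model
    return [int(sum(wi * v for wi, v in zip(w, r)) > cut) for r in x]

def exhaustive_search(x, y, names):
    """Score every non-empty subset of input columns; return (accuracy, subset)."""
    best = (-1.0, ())
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            sub = [[r[c] for c in cols] for r in x]
            pred = predict(fit_lda(sub, y), sub)
            acc = sum(p == t for p, t in zip(pred, y)) / len(y)
            if acc > best[0]:  # strict ">" keeps the smallest subset on ties
                best = (acc, tuple(names[c] for c in cols))
    return best

# Hypothetical index values; label 1 = "watch", 0 = "normal".
names = ["SPI3", "SSI3", "SRSI1"]
x = [[0.2, 0.4, 0.5], [-0.9, 0.1, 0.8], [0.6, -0.3, 1.0], [0.1, 0.7, 0.3],
     [-0.4, -0.6, -1.0], [0.3, 0.2, -0.7], [-0.8, -0.1, -1.2], [-0.2, -0.9, -0.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
result = exhaustive_search(x, y, names)
```

With n candidate inputs the search evaluates 2^n - 1 subsets, so it is only practical when the pool of candidate indices and time scales is small.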
    The SDIs were derived from hydrological data for Shimen Reservoir from 1959 to 2016, Techi Reservoir from 1976 to 2016, and Nanhua Reservoir from 1979 to 2016. For each reservoir, the first 80% of the data was used for model calibration and the remaining 20% for validation. For the 1-month-ahead prediction, the dry-period “watch” status accuracies for calibration and validation, respectively, are 91% and 94% (KFDA+SMOTE) for Shimen Reservoir, 93% and 92% (LDA+SMOTE) for Techi Reservoir, and 92% and 90% (KFDA) for Nanhua Reservoir. For the 3-month-ahead prediction, they are 88% and 89% (KFDA+SMOTE) for Shimen Reservoir, 91% and 93% (LDA+SMOTE) for Techi Reservoir, and 88% and 85% (LDA) for Nanhua Reservoir. The results suggest that the proposed drought early warning models perform satisfactorily in predicting water resources status both 1 month and 3 months ahead.
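The calibration/validation split and the “watch” accuracy can be sketched as follows. This assumes the records are in time order and that “watch” accuracy means the share of observed “watch” months that the model also labels “watch”; the record does not give the exact formula, and the restriction to dry-period months is omitted here. Function names and data are hypothetical.

```python
def chronological_split(records, calib_fraction=0.8):
    """Split time-ordered records without shuffling: earliest part for calibration."""
    cut = int(len(records) * calib_fraction)
    return records[:cut], records[cut:]

def watch_accuracy(observed, predicted, watch_label=1):
    """Share of observed "watch" months that were also predicted as "watch"."""
    hits = [p == watch_label for o, p in zip(observed, predicted) if o == watch_label]
    if not hits:
        raise ValueError("no observed watch months")
    return sum(hits) / len(hits)

months = list(range(1, 21))  # 20 hypothetical time-ordered months
calibration, validation = chronological_split(months)

observed = [0, 1, 1, 0, 1, 1]   # illustrative labels, 1 = "watch"
predicted = [0, 1, 0, 0, 1, 1]
acc = watch_accuracy(observed, predicted)  # 3 of 4 watch months caught
```

Splitting by time rather than at random keeps later months out of calibration, which matches the forecasting setting described above.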

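    Abstract
    Extended Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1-1 Research Motivation and Objectives
      1-2 Literature Review
      1-3 Organization of the Thesis
    Chapter 2 Study Area and Data Overview
      2-1 Overview of the Study Area
      2-2 Data for Analysis
      2-3 Standardized Indices
    Chapter 3 Methodology
      3-1 Linear Discriminant Analysis (LDA)
      3-2 Kernel Fisher Discriminant Analysis (KFDA)
      3-3 Synthetic Minority Oversampling Technique (SMOTE)
    Chapter 4 Building Drought Early Warning Models with LDA and KFDA
      4-1 Overview of Water Status Signals at Each Reservoir
      4-2 Preliminary Work for Building the Drought Early Warning Models
      4-3 LDA Results
      4-4 KFDA Results
      4-5 Comparison of LDA and KFDA
    Chapter 5 Improving Model Discrimination with SMOTE
      5-1 Changes in Class Sample Sizes after SMOTE
      5-2 Comparison of Results before and after SMOTE
      5-3 Discussion of Results
    Chapter 6 Conclusions and Recommendations
      6-1 Conclusions
      6-2 Recommendations
    References
    Appendix 1 Shimen Reservoir System Model
      Appendix 1-1 Model Theory
      Appendix 1-2 Data Required by the Model
      Appendix 1-3 Model Operating Rules
    Appendix 2 Current Water Resources System Operation Model for the Da'an and Dajia Rivers
      Appendix 2-1 Model System
      Appendix 2-2 Data Required by the Model
    Appendix 3 Current Water Resources System Operation Model for the Gaoping River and Nanhua Reservoir
      Appendix 3-1 Model System
      Appendix 3-2 Data Required by the Model
    Appendix 4 Model Evaluation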

