Author: 林廷威 (Lin, Tien-Wei)
Thesis title: Estimating the Spatiotemporal Concentration Variations of Ambient Air Dioxin in Taiwan using an Ensemble Mixed Spatial Model
Advisor: 吳治達 (Wu, Chih-Da)
Degree: Master
Department: College of Engineering - Department of Geomatics
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: Chinese
Pages: 60
Keywords: Air Pollution, Dioxin, Machine Learning, Ensemble Learning, Spatiotemporal Estimation Model
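  • The Industrial Revolution, technological progress, and increasingly frequent human activity brought material abundance and prosperity, but also released exhaust gases of many kinds into the air, so that ambient air has gradually filled with a wide range of pollutants. Now that ample material resources meet basic physiological needs, people increasingly value their living environment and quality of life, and environmental health has become a prominent issue. Many previous studies have shown that exposure to air pollution is strongly associated with human health, and among all pollutants that can spread through ambient air, dioxin is the most toxic. Assessing the risk of exposure to ambient air dioxin is therefore very important. Measuring ambient air dioxin concentrations is difficult, however: accurate data require analysis in specialized laboratories with specialized equipment and consume considerable time, labor, and money. As a result, long-term, wide-area monitoring data for ambient air dioxin are lacking both in Taiwan and worldwide. Under this constraint, building an estimation model for ambient air dioxin is an effective alternative.
    This study integrates several spatial estimation methods in building the model: spatial interpolation, land use regression, machine learning algorithms, and ensemble stacking. Spatial interpolation was first applied to process a number of variables and build the database. Land use regression was then used to select the important variables for fitting ambient air dioxin concentrations: fine particulate matter, X coordinate, air temperature, sulfur dioxide, air pressure, paddy fields, major roads, and rainfall. The resulting conventional land use regression model had an R² of 61%. Next, several machine learning algorithms were applied to build multiple heterogeneous single-algorithm models, which improved the fit and gave R² values ranging from 61% to 79%. Finally, the predictions of these heterogeneous models were used to refit the ambient air dioxin concentrations, producing an estimation model with an ensemble stacking architecture and an R² of 85%. This is the final model of the study, named the "Ensemble Mixed Spatial Model". To examine the feasibility and practical performance of this methodology, the model was compared in several respects with the conventional land use regression model and the single machine learning models, and multiple validations were carried out. The results show that the model built with ensemble stacking clearly outperforms the others, with an R² at least 5% higher than any other method. Overall, its robustness and its predictive ability across parts of space and time were also the best, and it gave stable results in most validations, which indicates that its estimates are reliable. In summary, the estimation maps produced by the Ensemble Mixed Spatial Model are accurate and credible. With daily temporal updates and a spatial resolution of 50 m grid cells, they are fine enough to reveal very subtle spatiotemporal variability in pollutant concentration. Based on the estimates, the study observed a slight year-on-year decline in dioxin concentration. A closer look at the six special municipalities of Taiwan showed that concentration levels and distributions differ among them, with the highest concentrations found mainly in the urban areas of Taichung, Tainan, and Kaohsiung.
    This study demonstrates the advantages and feasibility, in the field of air pollution estimation, of the Ensemble Mixed Spatial Model and its methodology, which integrate spatial interpolation, land use regression, machine learning algorithms, and ensemble stacking.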

    Keywords: Air Pollution, Dioxin, Machine Learning, Ensemble Learning, Spatiotemporal Estimation Model
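
    The stacking step described in the abstract (heterogeneous base models whose predictions are refitted to the observed concentrations) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the thesis's implementation: the data are synthetic, the two base learners (a one-variable linear fit and a k-nearest-neighbour mean) merely stand in for the land use regression and machine learning models, and the meta-learner is a single least-squares blending weight.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour mean, standing in for a nonlinear learner."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def out_of_fold(fit, xs, ys, folds=5):
    """Predict each sample with a model that did not see it during fitting."""
    preds = [0.0] * len(xs)
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(len(xs)):
            if i % folds == f:
                preds[i] = model(xs[i])
    return preds

def r2(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Synthetic data (hypothetical): a linear trend plus a nonlinear component.
xs = [i / 10 for i in range(60)]
ys = [0.5 * x + math.sin(2 * x) for x in xs]

# Level 0: out-of-fold predictions from each base learner.
oof_lin = out_of_fold(fit_linear, xs, ys)
oof_knn = out_of_fold(fit_knn, xs, ys)

# Level 1 (meta-learner): y ~ w * lin + (1 - w) * knn, with w by least squares.
num = sum((a - b) * (y - b) for a, b, y in zip(oof_lin, oof_knn, ys))
den = sum((a - b) ** 2 for a, b in zip(oof_lin, oof_knn))
w = num / den
stacked = [w * a + (1 - w) * b for a, b in zip(oof_lin, oof_knn)]

r2_lin, r2_knn, r2_stack = r2(ys, oof_lin), r2(ys, oof_knn), r2(ys, stacked)
print(f"linear R2={r2_lin:.3f}  knn R2={r2_knn:.3f}  stacked R2={r2_stack:.3f}")
```

    Because the blending weight is fitted on the same out-of-fold predictions it is scored on, the stacked R² in this sketch cannot fall below either base model's R² and is somewhat optimistic; an independent hold-out set would be needed for an honest comparison.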

    SUMMARY

    Air pollutants are closely related to human health, and dioxin, the most toxic man-made chemical known so far, can be transmitted through the air. However, studies of the spatiotemporal concentration variations of ambient air dioxin in Taiwan are currently lacking. This study aims to integrate multiple spatial estimation techniques into a new methodology for estimating air pollution concentrations, and to establish an estimation model of ambient air dioxin for observing its daily concentration variations in Taiwan from 2006 to 2016. We integrate spatial interpolation, land use regression, machine learning algorithms, and ensemble stacking. The approach considers the rationality and importance of the predictor variables, the change in influence weight with distance, and both linear and nonlinear relationships through the machine learning algorithms. The final model uses an ensemble stacking architecture and attains high explanatory power (R² = 0.85). Several model validations confirm its robustness, indicating that its estimates are stable and reliable. The results demonstrate that the proposed model-building methodology has clear advantages, and the resulting model reveals a fine-scale distribution of ambient air dioxin in Taiwan.

    Key words: Air Pollution, Dioxin, Machine Learning, Ensemble Learning, Spatiotemporal Estimation Model
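
    The summary mentions spatial interpolation and an influence weight that changes with distance, but does not say which interpolator was used. As a loosely related illustration only, the sketch below uses inverse distance weighting (IDW), a common choice that is assumed here rather than taken from the thesis; the station coordinates, values, and the `power` parameter are hypothetical.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) stations."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a station: use its value directly
        w = 1.0 / d ** power  # nearer stations receive larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical monitoring stations: (x, y, measured value).
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]

at_station = idw(stations, (0.0, 0.0))   # exactly on the first station
near_first = idw(stations, (1.0, 1.0))   # dominated by the first station
equidistant = idw(stations, (5.0, 5.0))  # equal weights, so the plain mean
```

    A larger `power` makes the estimate follow the nearest station more closely, while a smaller one smooths it toward the overall mean.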

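    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    I. Introduction
    II. Literature Review
      i. Dioxin
      ii. Methodological Improvements in Air Pollution Models
      iii. Ensemble Concepts and Stacking Models
      iv. Data Transformation
      v. Model Validation
    III. Materials
      i. Study Area
      ii. Ambient Air Dioxin Data
      iii. Variable Database
        1. Environmental Protection Administration resource database
        2. Central Weather Bureau (Ministry of Transportation and Communications) database
        3. National land use survey database
        4. Institute of Transportation (Ministry of Transportation and Communications) road network database
        5. Satellite vegetation index monitoring database
        6. Industrial Development Bureau (Ministry of Economic Affairs) industrial park database
        7. Government open data platform
        8. Others
    IV. Methods
      i. Data Processing and Database Construction
      ii. Model Construction
        1. Land use regression model construction
        2. Machine learning model construction
        3. Ensemble stacking model construction
      iii. Model Validation
    V. Results
      i. Descriptive Statistics of Ambient Air Dioxin Sampling Data
        1. Concentration distribution
        2. Data characteristics
      ii. Land Use Regression Model Construction
      iii. Machine Learning Model Construction
      iv. Ensemble Stacking Model Construction
      v. Model Validation
        1. Performance evaluation of the models at each stage
        2. Validation of the Ensemble Mixed Spatial Model
      vi. Estimation Maps from the Ensemble Mixed Spatial Model
    VI. Discussion
    VII. Conclusion
    VIII. References
    Appendix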

    Full text, on campus: public from 2024-12-31
    Full text, off campus: public from 2024-12-31