| Graduate student: | 翁佩詒 Wong, Pei-Yi |
|---|---|
| Thesis title: | 結合土地利用迴歸與機械學習演算法發展二氧化氮之高時空解析度推估模型 Development of an integrated model for NO2 variation prediction using land-use regression and machine learning algorithms |
| Advisor: | 蘇慧貞 Su, Huey-Jen |
| Co-advisor: | 吳治達 Wu, Chih-Da |
| Degree: | Master |
| Department: | College of Medicine - Department of Environmental and Occupational Health |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 |
| Language: | English |
| Number of pages: | 78 |
| Keywords (Chinese): | 二氧化氮、土地利用迴歸、克利金空間內插、機械學習、推估模型 |
| Keywords (English): | NO2, Land-use regression, Kriging interpolation, Machine learning, Predictive model |
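Nitrogen dioxide (NO2) is a highly reactive gaseous pollutant with adverse effects on the respiratory system. Because the number of air quality monitoring stations is limited, air pollution concentration gradients over large areas are poorly represented. Personal samplers give more accurate exposure concentrations, but their cost in labor and materials is too high. A prediction model with high spatial and temporal resolution is therefore needed to assess exposure over large areas. Accordingly, this study used conventional and hybrid land-use regression models, further combined with machine learning algorithms for model fitting, to develop a model that estimates daily average NO2 concentrations at a 50 m grid resolution and, from it, the spatiotemporal distribution of NO2 across the main island of Taiwan.
About 410,000 daily air pollutant observations for 2000-2016 were collected from the Environmental Resource Database of the Environmental Protection Administration (EPA). Information on land-based emission sources was obtained from geographic information system databases, including the satellite-derived greenness database, the national land-use survey, the landmark database, and the digital road network map. A land-use regression (LUR) model and a hybrid Kriging/LUR model were then built, and each was combined with three machine learning algorithms for model fitting: a deep neural network, a random forest, and eXtreme gradient boosting (XGBoost). This gave eight exposure assessment models in total (the two base models plus six machine learning variants). After data splitting, 10-fold cross-validation, external data validation, and validation by season and by region, the model with the best predictive performance was used to estimate changes in NO2 concentration in Taiwan.
Of the eight models, the hybrid Kriging/LUR model combined with XGBoost predicted best, with an R2 of 0.91. Compared with the conventional LUR model (R2 = 0.65), this is a gain of 0.26 in R2, that is, 26 percentage points of explained variance. Its root-mean-square error (RMSE) was 3.01 ppb, and the R2 from 10-fold cross-validation was 0.90, indicating that the model was not overfitted. These results confirm that integrating Kriging spatial interpolation, land-use regression, and machine learning improves air pollution prediction. Finally, estimates from this model showed that NO2 concentrations declined over the years and that NO2 was concentrated mainly in western Taiwan.
The method developed in this study can accurately estimate NO2 concentrations in areas without monitoring stations.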
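The overfitting check described above compares the R2 of the fitted model with the R2 from 10-fold cross-validation, alongside the RMSE. The following is a minimal sketch of that comparison, not the thesis's model: it assumes a single hypothetical predictor (`road_density`), synthetic NO2 values, and a plain least-squares line in place of LUR or XGBoost.

```python
import random
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def r2_score(obs, pred):
    """Share of the variance in obs explained by pred."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rmse(obs, pred):
    """Root-mean-square error, in the units of obs."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def k_fold_predictions(xs, ys, k=10, seed=0):
    """Predict every point with a model fitted on the other k-1 folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = a + b * xs[i]
    return preds

# Synthetic data (assumption): NO2 in ppb rises with a made-up road density.
rng = random.Random(42)
road_density = [rng.uniform(0, 10) for _ in range(200)]
no2_ppb = [5.0 + 2.0 * x + rng.gauss(0, 3.0) for x in road_density]

a, b = fit_line(road_density, no2_ppb)
in_sample = [a + b * x for x in road_density]
cv_pred = k_fold_predictions(road_density, no2_ppb, k=10)

train_r2 = r2_score(no2_ppb, in_sample)
cv_r2 = r2_score(no2_ppb, cv_pred)
cv_rmse = rmse(no2_ppb, cv_pred)
print(f"in-sample R2 = {train_r2:.3f}, 10-fold CV R2 = {cv_r2:.3f}, CV RMSE = {cv_rmse:.2f} ppb")
```

A small gap between the two R2 values, like the 0.91 versus 0.90 reported above, is the pattern read as the absence of overfitting; a large drop under cross-validation would suggest the opposite.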
Nitrogen dioxide (NO2) is a highly reactive gas and a secondary pollutant from the burning of fossil fuels. It is emitted predominantly in vehicle exhaust. Taiwan has a high traffic density and a large number of temples and restaurants, so NO2 pollution is significant. High concentrations of NO2 have an adverse effect on the respiratory system. To estimate NO2 concentrations more accurately, this study combined land-use regression models with machine learning to assess spatial-temporal variability. Daily average NO2 data were collected from the 70 fixed air quality monitoring stations on the main island of Taiwan belonging to the Taiwan Environmental Protection Administration (EPA). Around 0.41 million observations were used in the analysis. Several datasets were collected to derive spatial predictor variables, including the EPA environmental resources dataset, the meteorological dataset, the land-use inventory, the landmark dataset, the digital road network map, the digital terrain model, the Normalized Difference Vegetation Index (NDVI) obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS), and the power plant distribution dataset. A conventional land-use regression (LUR) model and a hybrid Kriging-LUR model were first used to identify the important predictor variables. Deep neural network, random forest, and XGBoost algorithms were then used to fit the prediction models. Data splitting, 10-fold cross-validation, external data verification, and season-based and county-based validation were used to verify the robustness of the models. The results showed that the conventional LUR and hybrid Kriging-LUR models captured 65% and 78% of NO2 variation, respectively. When a machine learning algorithm was used, the explanatory power of the models increased to 84% and 91%, respectively. The hybrid Kriging-LUR model with the XGBoost algorithm outperformed the other integrated methods.
This study demonstrated the value of combining the hybrid Kriging-LUR model and an XGBoost algorithm to estimate the spatial-temporal variability of NO2 exposure.
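The abstract does not spell out how the Kriging estimate enters the hybrid model. One construction, assumed here, is to interpolate each station's NO2 from the other stations (leave-one-out) and pass that estimate to the regression or machine learning model as an extra predictor next to the land-use variables. The sketch below illustrates only that idea. It swaps in inverse-distance weighting for Kriging to stay short, and the station coordinates, NO2 values, and `land_use` column are all made up.

```python
from math import hypot

def loo_idw(stations, power=2.0):
    """Leave-one-out inverse-distance-weighted estimate for each station.

    stations: list of (x, y, value) with distinct coordinates.
    Inverse-distance weighting is a simple stand-in for Kriging here.
    """
    estimates = []
    for i, (xi, yi, _) in enumerate(stations):
        num = den = 0.0
        for j, (xj, yj, vj) in enumerate(stations):
            if i == j:
                continue  # never use a station's own observation
            w = 1.0 / hypot(xi - xj, yi - yj) ** power
            num += w * vj
            den += w
        estimates.append(num / den)
    return estimates

# Hypothetical stations: (x_km, y_km, NO2 in ppb).
stations = [
    (0.0, 0.0, 20.0),
    (1.0, 0.0, 22.0),
    (0.0, 1.0, 18.0),
    (5.0, 5.0, 8.0),
    (6.0, 5.0, 9.0),
]
spatial_feature = loo_idw(stations)

# Hypothetical land-use predictor, e.g. a built-up fraction around each station.
land_use = [0.8, 0.9, 0.7, 0.2, 0.3]

# Each row is one station's predictor vector for a downstream model.
feature_rows = [[lu, sp] for lu, sp in zip(land_use, spatial_feature)]
for row in feature_rows:
    print([round(v, 2) for v in row])
```

The leave-one-out step matters because a station's own measurement would otherwise leak into its predictor and inflate the apparent fit.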