Graduate student: 曾敏航 (Tseng, Min-Hang)
Thesis title: 基於Transformer模型與生成式AI模型之漂綠行為偵測與情緒分析比較:以TCFD框架為基礎的實證研究
An Empirical Study on Greenwashing Detection and Sentiment Analysis Using Transformer-Based and Generative AI Models: A Comparative Approach Based on the TCFD Framework
Advisor: 陳牧言 (Chen, Mu-Yen)
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2025
Academic year of graduation: 113 (ROC calendar, i.e. 2024-2025)
Language: Chinese
Number of pages: 106
Keywords (Chinese): ESG, 漂綠偵測, 生成式AI, Transformer模型, TCFD框架, 氣候情緒分析
Keywords (English): ESG, Greenwashing Detection, Generative AI, Transformer Models, TCFD Framework, Climate Sentiment Analysis
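  • As climate change and sustainable development receive growing attention, greenwashing in corporate ESG (Environmental, Social, and Governance) disclosures has become an increasingly visible problem with significant consequences for investors and the public. Traditional greenwashing detection relies mostly on manual review and static indicators, which makes it hard to capture the semantic and sentiment gaps between what companies state and what they actually do. This study combines Generative AI with Transformer-based neural network architectures to design an automated, quantitative, and scalable system for ESG greenwashing detection and sentiment analysis.
    The data cover the ESG reports (internal corporate disclosures) and external media texts of the DAX 40 listed companies in Germany for 2021 to 2022. The four TCFD (Task Force on Climate-related Financial Disclosures) categories, namely Governance, Strategy, Risk Management, and Metrics & Targets, serve as the basis for text classification. Preprocessing includes sentence segmentation and normalization, sentence vectorization with SBERT, semantic matching, and computation of a dynamic similarity threshold. Together these steps define the logic for pairing internal and external texts, which then supports sentiment classification and semantic consistency evaluation.
    For the experimental design, the thesis compares models on a climate sentiment classification task, including a BERT-based Transformer model and a generative model built on OpenAI GPT-3.5. After 10-fold cross-validation on expert-annotated samples, the results show:
    • The BERT model reaches F1-score = 0.9599 after fine-tuning and sentiment weighting, and F1 = 0.9994 once the dynamic threshold is also integrated;
    • The fine-tuned GPT-3.5 API model also performs well (F1 = 0.9774), whereas using the API directly reaches only F1 = 0.7582, which indicates that semantic labeling still requires task-specific fine-tuning and human verification.
    To further evaluate the semantic consistency between corporate ESG disclosures and external viewpoints, as well as the completeness of disclosure, the thesis defines six quantitative indicators: semantic similarity (cosine similarity ratio, CR), sentiment deviation (sentiment drift, SD), distribution consistency (disclosure consistency, DC), commitment gap (action-gap index, AG), positive-to-negative word ratio (polarity ratio, PNR), and risk disclosure (risk disclosure scarcity, RDS). These are combined into a final composite Greenwashing Index (GWI) that quantifies potential greenwashing risk from multiple angles. The analysis shows that some companies, such as Beiersdorf AG, exhibit inconsistent sentiment between internal and external texts: internal disclosures lean neutral or toward risk, while external media coverage leans toward positive opportunity, which points to a potential greenwashing tendency at the semantic level.
    Overall, this study proposes a systematic method for greenwashing detection and empirically examines how different language models differ in application and accuracy on ESG climate texts. The framework is highly reproducible and extensible, and could be applied to green finance risk monitoring, quality rating of corporate sustainability reports, and AI-assisted automated review of sustainability disclosures, giving it both theoretical value and practical relevance.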

    With the increasing global attention on climate change and sustainable development, the issue of greenwashing in corporate ESG (Environmental, Social, and Governance) disclosures has drawn significant scrutiny, posing risks to investors and public trust. Traditional methods for detecting greenwashing often rely on manual review or static indicators, making it difficult to systematically identify the semantic and emotional gaps between corporate claims and actual practices. To address these limitations, this study integrates Generative AI with Transformer-based neural architectures to develop an automated, scalable, and quantitative ESG greenwashing detection and sentiment analysis framework.
    The study analyzes ESG reports (internal disclosures) and external media texts from Germany's DAX 40 listed companies between 2021 and 2022. The classification framework is based on the four categories of the TCFD (Task Force on Climate-related Financial Disclosures): Governance, Strategy, Risk Management, and Metrics & Targets. Preprocessing steps include sentence segmentation, vectorization using SBERT, semantic matching, and a dynamic similarity threshold mechanism to pair internal and external texts for sentiment classification and consistency evaluation.
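    The pairing step can be illustrated with a minimal sketch. The abstract does not state how the dynamic threshold is computed, so the rule used here (mean plus k times the standard deviation of the best-match similarities) is an assumption, as are the function names and the toy vectors that stand in for SBERT sentence embeddings.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_sentences(internal, external, k=0.5):
    """Pair each internal vector with its most similar external vector.

    A pair is kept only if its similarity reaches a data-driven threshold,
    here mean + k * std of the best-match scores. This rule is an
    illustrative assumption, not the formula used in the thesis.
    """
    best = []
    for i, u in enumerate(internal):
        scores = [cosine(u, v) for v in external]
        j = max(range(len(scores)), key=scores.__getitem__)
        best.append((i, j, scores[j]))
    sims = [s for _, _, s in best]
    threshold = mean(sims) + k * (stdev(sims) if len(sims) > 1 else 0.0)
    kept = [(i, j, s) for i, j, s in best if s >= threshold]
    return kept, threshold


# Toy 3-dimensional vectors standing in for SBERT sentence embeddings.
internal_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.7]]
external_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 1.0]]
pairs, threshold = pair_sentences(internal_vecs, external_vecs)
```

    With these toy inputs only the first internal sentence clears the threshold. A threshold derived from the score distribution adapts to each corpus, whereas a fixed cutoff would need manual retuning per company or per TCFD category.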
    In terms of model evaluation, this study compares the performance of BERT-based Transformer models and OpenAI’s GPT-3.5 generative model in climate-related sentiment classification. Using 10-fold cross-validation on expert-annotated samples, the results show that:
    • The BERT model, after fine-tuning and applying sentiment-based class weighting, achieved an F1-score of 0.9599; when further combined with dynamic thresholding, its performance improved to F1 = 0.9994.
    • The GPT-3.5 API model also performed well after fine-tuning (F1 = 0.9774), but its direct API output without adjustment yielded only F1 = 0.7582, indicating that semantic labeling still requires task-specific fine-tuning and expert verification.
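    As a reading aid for the scores above, the sketch below shows one common way to compute class weights and an F1-score for a three-class sentiment task. The abstract does not say whether the reported F1 is macro-averaged or weighted, nor how the class weights were derived, so the macro average, the inverse-frequency weighting, and the toy labels are all assumptions.

```python
from collections import Counter

# Three sentiment classes, following the risk / neutral / opportunity scheme.
LABELS = ["risk", "neutral", "opportunity"]


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * counts[c]) for c in counts}


def macro_f1(y_true, y_pred, classes=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only; not data from the thesis.
y_true = ["risk", "risk", "neutral", "neutral", "neutral", "opportunity"]
y_pred = ["risk", "neutral", "neutral", "neutral", "neutral", "opportunity"]
weights = class_weights(y_true)
score = macro_f1(y_true, y_pred)
```

    Rare classes receive larger weights, so errors on them count more during training. In a 10-fold setup, a score like this would be computed on each held-out fold and then averaged.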
    To quantitatively evaluate the alignment between corporate ESG disclosures and external perspectives, this study proposes six multidimensional indicators: Content Resemblance (CR), Sentiment Deviation (SD), Distribution Consistency (DC), Action-Gap (AG), Positive-Negative Ratio (PNR), and Risk Disclosure Score (RDS). These are integrated into a final composite score, the Greenwashing Index (GWI), designed to capture latent greenwashing risks. The analysis reveals that companies such as Beiersdorf AG show inconsistent sentiment between internal reports (neutral or risk-focused) and external narratives (more opportunity-focused), indicating potential semantic-level greenwashing behaviors.
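    The abstract does not give the GWI formula, so the following is only a sketch of how six indicators could be combined into one composite score: each indicator is min-max normalized across companies and the results are summed with weights. Equal weights, the assumption that every indicator is oriented so that higher means riskier, the company names, and all numbers are hypothetical.

```python
INDICATORS = ["CR", "SD", "DC", "AG", "PNR", "RDS"]


def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def greenwashing_index(scores, weights=None):
    """Weighted sum of min-max-normalized indicators for each company.

    `scores` maps company -> {indicator: raw value}. Every indicator is
    assumed to be oriented so that a higher value means higher risk;
    equal weights are used unless `weights` is supplied.
    """
    weights = weights or {k: 1 / len(INDICATORS) for k in INDICATORS}
    companies = list(scores)
    normalized = {
        k: min_max([scores[c][k] for c in companies]) for k in INDICATORS
    }
    return {
        c: sum(weights[k] * normalized[k][i] for k in INDICATORS)
        for i, c in enumerate(companies)
    }


# Made-up values for three hypothetical companies.
raw = {
    "Company A": {"CR": 0.2, "SD": 0.1, "DC": 0.3, "AG": 0.0, "PNR": 1.0, "RDS": 0.2},
    "Company B": {"CR": 0.6, "SD": 0.5, "DC": 0.6, "AG": 0.5, "PNR": 2.0, "RDS": 0.4},
    "Company C": {"CR": 1.0, "SD": 0.9, "DC": 0.9, "AG": 1.0, "PNR": 3.0, "RDS": 0.6},
}
gwi = greenwashing_index(raw)
```

    The `weights` argument is where data-driven weights would go, for example weights estimated from a principal component analysis, in place of the equal weights used here.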
    In conclusion, this study presents a novel, systematic method for greenwashing detection and empirically validates the comparative performance of different AI models on ESG-related climate texts. The proposed framework is reproducible, adaptable, and holds substantial potential for practical applications in green finance risk monitoring, sustainability disclosure auditing, ESG reporting assessment, and regulatory AI-assisted screening. It also offers meaningful contributions to the academic field of computational sustainability and climate-related NLP research.

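    Abstract
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Thesis Structure
    Chapter 2 Literature Review
        2.1 The Concept and Impact of Greenwashing
        2.2 ESG Disclosure and the Asymmetry Problem
        2.3 Evolution of Greenwashing Detection Techniques
        2.4 Applications of Natural Language Processing (NLP) to Greenwashing Detection
        2.5 Development of Generative AI and Transformer Models in Sentiment Analysis
        2.6 Application of the TCFD Framework to ESG Text Classification
    Chapter 3 Research Methods
        3.1 Overview of the Research Design
        3.2 Research Process Framework
        3.3 Data Collection and Preprocessing
            3.3.1 Data Sources
            3.3.2 Text Preprocessing Workflow
            3.3.3 Dataset Summary Statistics
        3.4 Two-Layer Classification Design
            3.4.1 First Layer: Classification into the Four TCFD Categories
            3.4.2 Second Layer: Three-Class Sentiment Classification (Risk / Neutral / Opportunity)
            3.4.3 Confidence Threshold Setting and Strategy
        3.5 Backtesting and Model Adjustment
            3.5.1 Backtesting Design Process
            3.5.2 Model Adjustment Strategy
            3.5.3 Summary of the Backtesting and Model Adjustment Process
        3.6 Quantifying Greenwashing and Designing the GWI Indicator
            3.6.1 Structure and Calculation of the Individual Indicators
            3.6.2 Overview of the Indicator Design
            3.6.3 Calculation of the Composite GWI
            3.6.4 GWI Weight Adjustment: PCA and Potential Use of Clustering
    Chapter 4 Empirical Analysis
        4.1 Dataset Structure and Purpose
            4.1.1 Corporate ESG Text Data
            4.1.2 TCFD Semantic Annotation Data
            4.1.3 Climate Sentiment Annotation Data
        4.2 Experimental Environment and Parameter Settings
            4.2.1 Model Workflow and Experimental Stages
            4.2.2 Comparison of Model Settings and Experimental Parameters
        4.3 TCFD and Sentiment Multi-Layer Classification Results
        4.4 Greenwashing Index (GWI) and Semantic Greenwashing Analysis Methods
            4.4.1 GWI Structure and Method
            4.4.2 PCA Weights and Weight Adjustment Method
            4.4.3 Semantic Greenwashing Cluster Analysis
        4.5 Experimental Results
            4.5.1 ESG Text Annotation and TCFD Category Pairing
            4.5.2 Dynamic Similarity Threshold: Method and Analysis
            4.5.3 TCFD Classification Consistency and Distribution Deviation Detection
            4.5.4 Climate Sentiment Classifier Training and Cross-Validation
            4.5.5 Error Analysis of the Climate Sentiment Classifier
            4.5.6 GWI Construction and Evaluation of Each Dimension
            4.5.7 PCA and Weight Estimation
            4.5.8 KMeans Semantic Greenwashing Clustering and Sentence Feature Analysis
            4.5.9 Beiersdorf AG Case Study: Applying ESG Greenwashing Risk Indicators and Empirically Testing Semantic Heterogeneity
            4.5.10 Empirical Case Analysis of Greenwashing Risk Indicators (Top Five Companies by GWI)
    Chapter 5 Conclusions and Future Directions
        5.1 Research Conclusions
        5.2 Research Contributions
        5.3 Research Limitations
        5.4 Future Directions
    References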

    Full text availability: on campus: open immediately
    Off campus: open immediately