ESG-DocQA: A Three-Annotator Validated Dataset for Evidence-Grounded Question Answering over Corporate ESG Reports

Huajian Jiang

Zenodo (CERN European Organization for Nuclear Research) · Dataset · 2026-05-30 · #AI×ESG · Origin: Global

DOI: 10.5281/zenodo.20461703

Source: https://doi.org/10.5281/zenodo.20461703

🤖 gxceed AI Summary

This paper presents ESG-DocQA, a 300-sample benchmark dataset for evidence-grounded question answering over corporate ESG reports. It contains three question types: verification, comparison, and inference. Validation by three annotators achieved substantial inter-annotator reliability (Fleiss' kappa = 0.644). The domain distribution is Environmental (E) = 145, Social (S) = 91, Governance (G) = 64. The repository includes full metadata and reproducibility scripts.

Unofficial AI-generated summary based on the public title and abstract. Not an official translation.
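
For readers unfamiliar with the agreement statistic quoted in the summary, the following is a minimal sketch of how Fleiss' kappa is computed from per-item category counts. The function name `fleiss_kappa` and the toy count tables are illustrative assumptions. They are not taken from the dataset's validation scripts, and they do not reproduce the reported value of 0.644.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Every item must be rated by the same number of annotators (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two annotators per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 4 items, 3 annotators, 2 categories (e.g. accept / reject).
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
mixed = [[3, 0], [2, 1], [1, 2], [0, 3]]

kappa_perfect = fleiss_kappa(perfect)  # full agreement on every item
kappa_mixed = fleiss_kappa(mixed)      # two items with a 2-vs-1 split
```

On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are labelled "substantial", which matches the wording used for the reported 0.644.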

📝 gxceed Editorial Commentary: Why this matters

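In the Japanese GX context

In the Japanese GX context, automated extraction and analysis of ESG information is becoming more important as SSBJ disclosure advances. The dataset does not directly support Japanese, but its methods and annotation guidelines could be applied to Japanese disclosure data and may serve as a foundation for AI-assisted disclosure analysis.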

In the global GX context

In the global GX context, this dataset addresses the need for structured and verifiable extraction of ESG information from corporate reports. It supports automation in climate disclosure analysis, relevant for frameworks like TCFD, ISSB, and CSRD. The documented three-annotator validation, with published guidelines and adjudication logs, offers a reference point for future benchmarks.

👥 Implications by reader

🔬 Researchers: NLP and ESG analytics researchers can use ESG-DocQA as a benchmark for developing and evaluating question-answering systems on corporate sustainability disclosures.

🏢 Practitioners: Corporate sustainability teams can leverage the dataset's methodology to improve automated extraction of ESG metrics from their own reports.

🏛 Policymakers: The annotation guidelines may be useful for standardizing how ESG information is structured and queried in regulatory filings.

📄 Abstract (original)

ESG-DocQA is a 300-sample benchmark for evidence-grounded question answering over corporate environmental, social, and governance (ESG) reports. The dataset was constructed from page-level ESG report evidence and contains multi-step verification, comparison, and inference questions. The benchmark was produced through iterative human review and three-annotator validation, achieving substantial inter-annotator reliability (Fleiss' kappa = 0.644; Krippendorff's alpha = 0.647). The three trained graduate-student peer annotators independently used Annotation Guidelines v2 and did not participate in candidate question-answer generation. The final dataset contains 300 validated samples with domain distribution E=145, S=91, G=64 and answer-type distribution verification=127, comparison=106, inference=67. The repository includes benchmark JSONL records, data dictionary and metadata, annotation guidelines, validation reports, adjudication and replacement logs, source-report manifests, prompts, reproducibility scripts, and the current SCIE/IJDAR submission package. Original ESG report source PDFs and rendered page images are not redistributed due to copyright considerations. Users can locate source reports using the provided source-report manifest and page-level metadata.
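
As a quick consistency check on the figures in the abstract, the sketch below confirms that both reported distributions sum to 300 and shows one way to tally a field across JSONL records. The field names `domain` and `answer_type`, the sample records, and the file name in the final comment are hypothetical. The actual schema is defined by the repository's data dictionary.

```python
import json
from collections import Counter
from typing import Iterable

# Counts as reported in the abstract.
EXPECTED_DOMAIN = {"E": 145, "S": 91, "G": 64}
EXPECTED_ANSWER_TYPE = {"verification": 127, "comparison": 106, "inference": 67}


def tally(lines: Iterable[str], field: str) -> Counter:
    """Count the values of `field` across JSONL lines, skipping blank lines."""
    return Counter(json.loads(line)[field] for line in lines if line.strip())


# Both reported distributions should cover all 300 validated samples.
domain_total = sum(EXPECTED_DOMAIN.values())
answer_type_total = sum(EXPECTED_ANSWER_TYPE.values())

# Hypothetical in-memory records standing in for the benchmark file.
sample = [
    '{"id": "q1", "domain": "E", "answer_type": "verification"}',
    '{"id": "q2", "domain": "S", "answer_type": "comparison"}',
    '{"id": "q3", "domain": "E", "answer_type": "inference"}',
    "",
]
sample_domains = tally(sample, "domain")
sample_answer_types = tally(sample, "answer_type")

# Against a downloaded copy (hypothetical file name), the same helper applies:
#   with open("benchmark.jsonl", encoding="utf-8") as f:
#       assert tally(f, "domain") == Counter(EXPECTED_DOMAIN)
```

Comparing a tally of the downloaded records against the counts stated in the abstract is a cheap way to confirm that the file in hand is the validated 300-sample release.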

🔗 Provenance: sources where this record was discovered

openalex · https://doi.org/10.5281/zenodo.20461703 · first seen 2026-06-02 04:53:39 · last seen 2026-06-16 04:49:20


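gxceed is a research-support dataset based on public metadata. Summaries, translations, and commentary are generated with AI assistance. Users are expected to base their final interpretation and verification on the original source material.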