ESG-DocQA: A Three-Reviewer Validated Dataset for Evidence-Grounded Question Answering over Corporate ESG Reports

Huajian Jiang

Zenodo (CERN European Organization for Nuclear Research) · Dataset · 2026-05-30 · #AI×ESG · Origin: Global

DOI: 10.5281/zenodo.20459641

Source: https://doi.org/10.5281/zenodo.20459641

🤖 gxceed AI Summary

This paper introduces ESG-DocQA, a benchmark dataset for evidence-grounded question answering over corporate ESG reports. It contains 300 samples spanning verification, comparison, and inference questions, validated by three reviewers with substantial inter-rater reliability. The dataset provides JSONL records, annotation guidelines, and reproducibility scripts. It does not redistribute the source PDFs because of copyright considerations.

Unofficial AI-generated summary based on the public title and abstract. Not an official translation.

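📝 gxceed Editorial Commentary: Why this matters

In the Japanese GX context

In Japan, development of the SSBJ sustainability disclosure standards is under way, and the quality and analyzability of ESG reports are receiving growing attention. This dataset could serve as a foundation for AI-based automated analysis of ESG reports and may help promote the use of disclosure data.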

In the global GX context

Globally, the demand for automated analysis of ESG reports is rising with regulations like CSRD and SEC climate rules. This dataset provides a validated benchmark for developing NLP systems that can extract evidence from ESG reports, supporting both disclosure and research communities.

👥 Implications by reader

🔬 Researchers: Useful for NLP researchers working on ESG document understanding and question answering.

🏢 Practitioners: Can guide the development of internal tools for analyzing ESG reports, but requires adaptation to specific disclosure frameworks.

🏛 Policymakers: Illustrates the potential for automated verification of ESG disclosures, which could inform future assurance technologies.

📄 Abstract (original)

ESG-DocQA is a 300-sample benchmark for evidence-grounded question answering over corporate environmental, social, and governance (ESG) reports. The dataset was constructed from page-level ESG report evidence and contains multi-step verification, comparison, and inference questions. The benchmark was produced through iterative human review and three-reviewer validation, achieving substantial inter-reviewer reliability (Fleiss' kappa = 0.644; Krippendorff's alpha = 0.647). The final dataset contains 300 validated samples with domain distribution E=145, S=91, G=64 and answer-type distribution verification=127, comparison=106, inference=67. The repository includes benchmark JSONL records, data dictionary and metadata, annotation guidelines, validation reports, adjudication and replacement logs, and reproducibility scripts. Original ESG report source PDFs and rendered page images are not redistributed due to copyright considerations. Users can locate source reports using the provided source-report manifest and page-level metadata.
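The figures in the abstract can be sanity-checked with a few lines of Python. The sketch below is illustrative only: it tallies a categorical field across JSONL lines and computes Fleiss' kappa from a table of per-item category counts. The field names `domain` and `answer_type` are assumptions, not taken from the dataset (the real names should be looked up in the data dictionary shipped with the repository), and the kappa function is a generic textbook implementation, not the authors' validation script.

```python
import json
from collections import Counter

# Distributions as stated in the abstract.
PUBLISHED_DOMAINS = {"E": 145, "S": 91, "G": 64}
PUBLISHED_ANSWER_TYPES = {"verification": 127, "comparison": 106, "inference": 67}


def tally(jsonl_lines, field):
    """Count the values of `field` across JSONL records (one JSON object per line).

    `field` is a hypothetical key name; check the dataset's data dictionary.
    """
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if line:  # skip blank lines
            counts[json.loads(line)[field]] += 1
    return counts


def fleiss_kappa(ratings):
    """Fleiss' kappa for a count table.

    ratings[i][j] is the number of reviewers who assigned item i to
    category j. Every item must have the same number of reviewers (>= 2).
    Undefined (division by zero) if all ratings fall into one category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_cats = len(ratings[0])

    # Observed agreement: mean over items of the per-item agreement P_i.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions.
    p_cats = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_e = sum(p * p for p in p_cats)

    return (p_bar - p_e) / (1 - p_e)
```

In use, `tally(open(path, encoding="utf-8"), "domain")` could be compared against `PUBLISHED_DOMAINS` once the actual file path and field name are known; with three reviewers, each row passed to `fleiss_kappa` sums to 3.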

🔗 Provenance: sources where this record was discovered

openalex · https://doi.org/10.5281/zenodo.20459641 · first seen 2026-06-02 04:52:56 · last seen 2026-06-16 04:49:20

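gxceed is a research-support dataset built on public metadata. Its summaries, translations, and commentary are generated with AI assistance. Users are expected to carry out final interpretation and verification against the original source materials.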