gxceed

Public Utility Data Liberation Project (PUDL) Data Release

Selvans, Zane A., Gosnell, Christina M., Sharpe, Austen, Schira, Zach, Xia, Dazhong, Belfer, Ella, Mazaitis, Kathryn

Zenodo · Dataset · 2026-06-19 · #Energy transition · Origin: US · Target sector: power
DOI: 10.5281/zenodo.20766264
Source: https://zenodo.org/records/20766264

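🤖 gxceed AI summary

Japanese (translated)

This regular PUDL data release updates US electricity data, including EIA-860M, EPA CEMS, and FERC Form 1. It adds 2026 Q1 data and a new LNG inventory table, and upgrades the data package specification to v2. The data is open and available to anyone.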

English

This monthly PUDL data release updates US electricity data (EIA-860M, EPA CEMS, FERC Form 1, etc.) through early 2026, adds new EIA-176 gas supply and LNG inventory tables, and upgrades its data package output to conform to the Frictionless Data Package v2 spec. It provides open, standardized data for energy system analysis.

Unofficial AI-generated summary based on the public title and abstract. Not an official translation.

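📝 gxceed editorial commentary: Why this matters

In the Japanese GX context

In Japan, detailed electricity data is needed for SSBJ compliance and supply-chain emissions accounting, but an open electricity data platform comparable to PUDL has not yet been established. PUDL is a useful model for data publication as Japan advances its GX agenda.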

In the global GX context

PUDL is a key open-data platform for the US electricity system, enabling transparent carbon accounting and energy transition analysis. Its data release model and Frictionless Data Package improvements are relevant for global disclosure infrastructure development (e.g., ISSB, CSRD).

👥 Implications by reader

🔬 Researchers: Access to granular, cleaned US utility data for energy system modeling and emission analysis.

🏢 Practitioners: Reliable, standardized data for carbon accounting and renewable energy procurement analysis.

🏛 Policymakers: Model for open data infrastructure that enhances transparency and supports climate policy design.

📄 Abstract (original text)

v2026.6.1 (2026-06-19)

This is a monthly PUDL data release, primarily motivated by updating the EIA-860M monthly data through April 2026. As usual, it also includes all of the other changes that have accumulated on main since our last release. This month, we have the belated EPA CEMS update for 2026Q1, the annual update for FERC 1, some great community contributions for RUS7 and EIA-176, and an assortment of datapackage, Dagster, and deployment notification improvements.

Enhancements

Overhauled PUDL’s Frictionless Data Package output to conform to the v2 spec. The pudl_datapackage Dagster asset now generates datapackage.json directly during the ETL, including full column types, constraints, and foreign key relationships for every Parquet table. The descriptor is distributed as pudl_parquet_datapackage.json at the top level of the S3 bucket and on Zenodo, allowing potential users to browse the PUDL schema without downloading any data. The pudl_parquet.zip archive also contains a datapackage.json descriptor so it can be used as a self-describing Frictionless package after extraction. A reusable valid_datapackage_check() factory is now available in pudl.dagster.asset_checks to add Frictionless v2 validation as an asset check on any datapackage output. See issues #5122, #5237 and PRs #5270, #5343. Also makes progress towards catalyst-cooperative/agent-skills#14.

Added a bare-bones datapackage for DBF SQLite outputs. See issue #5200 and PR #5275.

New Data

EIA-176

Added core_eia176__yearly_gas_supply, which contains cleaned company-level natural and supplemental gas supply data from Part 4 of the EIA-176 survey. See #4711 and #5227.

Added core_eia176__yearly_liquefied_natural_gas_inventory, a new table containing annual LNG storage volume and capacity reported by operators on EIA Form 176 Part 5. Data covers 2002-2024 and includes LNG terminal and marine terminal records. See issue #4695 and PR #5219.
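
Because the descriptor carries column types, constraints, and foreign keys, a schema can be inspected with nothing more than a JSON parser. The following is a minimal sketch of that idea, not PUDL's own tooling. It assumes the descriptor follows the usual Frictionless layout (resources, each with a schema holding fields and foreignKeys). The inline sample is hypothetical: the LNG table name comes from the release notes, but the example_operators resource and every column name, type, and key are invented for illustration.

```python
import json

# Hypothetical, heavily trimmed descriptor. Only the LNG table name is taken
# from the release notes; the other resource, all column names, types, and
# keys are invented for illustration and are NOT the real PUDL schema.
SAMPLE_DESCRIPTOR = json.loads("""
{
  "name": "pudl-example",
  "resources": [
    {
      "name": "example_operators",
      "path": "example_operators.parquet",
      "schema": {
        "fields": [
          {"name": "operator_id", "type": "integer",
           "constraints": {"required": true}},
          {"name": "operator_name", "type": "string"}
        ],
        "primaryKey": ["operator_id"]
      }
    },
    {
      "name": "core_eia176__yearly_liquefied_natural_gas_inventory",
      "path": "core_eia176__yearly_liquefied_natural_gas_inventory.parquet",
      "schema": {
        "fields": [
          {"name": "report_year", "type": "integer",
           "constraints": {"required": true}},
          {"name": "operator_id", "type": "integer",
           "constraints": {"required": true}},
          {"name": "lng_volume", "type": "number"}
        ],
        "primaryKey": ["report_year", "operator_id"],
        "foreignKeys": [
          {"fields": ["operator_id"],
           "reference": {"resource": "example_operators",
                         "fields": ["operator_id"]}}
        ]
      }
    }
  ]
}
""")


def list_columns(descriptor: dict, table: str) -> list:
    """Return (name, type, required) tuples for one resource's fields."""
    for resource in descriptor.get("resources", []):
        if resource.get("name") == table:
            fields = resource.get("schema", {}).get("fields", [])
            return [
                (
                    field["name"],
                    field.get("type", "any"),
                    bool(field.get("constraints", {}).get("required", False)),
                )
                for field in fields
            ]
    raise KeyError(f"no resource named {table!r} in descriptor")


def list_foreign_keys(descriptor: dict) -> list:
    """Return (table, columns, referenced_table, referenced_columns) tuples."""
    links = []
    for resource in descriptor.get("resources", []):
        for fk in resource.get("schema", {}).get("foreignKeys", []):
            ref = fk["reference"]
            links.append(
                (
                    resource["name"],
                    tuple(fk["fields"]),
                    ref["resource"],
                    tuple(ref["fields"]),
                )
            )
    return links
```

To try this against a real descriptor, replace SAMPLE_DESCRIPTOR with the result of json.load() on a downloaded copy of pudl_parquet_datapackage.json. The real file may contain keys this sketch does not handle.
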

Expanded Data Coverage

EIA-191

Updated EIA-191 data to include additional 2026 data. See PR #5292.

EIA-860M

Added EIA-860M data through April 2026. See issue #5277 and PR #5284.

FERC 1

Added 2025 data from FERC Form 1. This update includes new renewable and energy storage fields in several tables. See issue #5214 and PRs #5236, #5325.

EIA Electricity API

Updated the bulk EIA Electricity API data used to fill in redacted fuel prices. See PR #5292.

EPA CEMS

Updated the EPA CEMS data to include 2026Q1. See PR #5292.

FERC Forms 2 & 6

Updated the raw FERC Form 2 and 6 archives to include 2025 data. This data is converted to SQLite, but not deeply integrated into PUDL. See PR #5292.

Documentation

Added a data source page for EIA-191. See PR #5267 and issue #4756.

Updated the EIA-930 column descriptions to note that starting in 2024Q3 EIA began reporting more granular renewable energy source categories, differentiating wind and solar plants with and without energy storage, splitting pumped hydro from conventional hydro, and adding new battery storage and geothermal categories. See issue #5335 and PR #5336.

New Data Tests & Validations

Added validations to RUS7 service interruption tables to ensure subcomponents sum to the total for annual observation periods. See issue #5285 and PR #5286.

Bug Fixes & Data Cleaning

Renamed the fuel_consumed_mmbtu column in the out_eia923__fuel_receipts_costs, out_eia923__monthly_fuel_receipts_costs, and out_eia923__yearly_fuel_receipts_costs tables. This column is the result of multiplying fuel_received_units by fuel_mmbtu_per_unit. The name fuel_consumed_mmbtu was misleading because the fuel received in these tables is not necessarily consumed in the same month, and the fuel cost is not necessarily associated with fuel received in the same month. The new name, fuel_received_mmbtu, more accurately reflects what the column actually contains. See PR #5294.
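
Downstream code that reads these tables across releases may see either column name. The sketch below shows one way to normalize records to the new name. It is a hypothetical helper, not part of PUDL, and it works on plain dictionaries so it runs without extra dependencies. The two column names come from the release notes; the function name and the sample row are assumptions for illustration.

```python
# Column names taken from the release notes.
OLD_NAME = "fuel_consumed_mmbtu"
NEW_NAME = "fuel_received_mmbtu"


def normalize_frc_row(row: dict) -> dict:
    """Return a copy of `row` that uses the new column name.

    Hypothetical helper: rows from older releases carry OLD_NAME, rows from
    this release onward carry NEW_NAME. The input dictionary is not modified.
    """
    out = dict(row)
    if OLD_NAME in out:
        old_value = out.pop(OLD_NAME)
        # Refuse to guess if both names are present with different values.
        if NEW_NAME in out and out[NEW_NAME] != old_value:
            raise ValueError(
                f"row has both {OLD_NAME!r} and {NEW_NAME!r} with different values"
            )
        out[NEW_NAME] = old_value
    return out


# Illustrative row only; the values are invented, not PUDL data.
legacy_row = {"report_date": "2025-01-01", "fuel_consumed_mmbtu": 1250.0}
current_row = normalize_frc_row(legacy_row)
```

With a pandas DataFrame the equivalent step would be a single rename of that column, applied only when the old name is present.
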

Fixed a bug in the Zenodo data release script, which was not actually skipping top-level directories when deciding what to upload to Zenodo. This caused release failures once we started leaving the ferc*_xbrl directories on the filesystem. See PR #5254.

Quality of Life Improvements

Refactored Dagster-managed path handling to use a dedicated pudl_paths resource instead of constructing pudl.workspace.setup.PudlPaths directly throughout assets, IO managers, and tests. This makes path resolution more explicit in Dagster contexts and allows interactive definitions to override pudl_input and pudl_output directly when calling pudl.dagster.build.build_interactive_defs(). See PRs #5261, #5288.

Added a PUDL devcontainer configuration to make it easier for contributors to get up and running, and to enable the safe use of coding agents in YOLO mode. See PRs #5260, #5287.

Cleaned up PUDL’s default Dagster wiring by separating default resources from IO managers, giving shared data-config resources clearer defaults, and simplifying the FERC SQLite IO manager and provenance stack. Consolidated the FERC EQR deployment helper assets with the rest of the Dagster package layout. Created a new Dagster definition builder for use in notebooks and other interactive environments outside of a dg-spawned environment: pudl.dagster.build.build_interactive_defs(). See issue #5118 and PR #5242.

Migrated build and deployment notifications from Slack to Zulip. All GitHub Actions workflows that previously posted to Slack now send notifications to the Catalyst Cooperative Zulip instance via the zulip/github-actions-zulip action. A new ZulipNotificationResource Dagster resource was added to send Zulip stream messages from within assets, with best-effort error handling. The FERC EQR deployment helpers in pudl.dagster.assets.deploy.ferceqr were updated to use it.
Notification coverage was also expanded to include community activity (issues, discussions, comments, and pull requests from non-Catalyst contributors). See PRs #5298, #5328, #5331.

FERC provenance metadata (Zenodo DOIs, data years, XBRL extractor version) is now stored in the FERC SQLite datapackage files rather than only in Dagster asset metadata. The ferc_to_sqlite asset can now optionally download and reuse pre-built FERC SQLite outputs from the most recent nightly build, skipping expensive re-extraction when the inputs haven’t changed. Set PUDL_FERC_FORCE_EXTRACT=true to force re-extraction regardless. See issue #5220 and PR #5264.

Migrated hashtag-prefixed comments from soon-to-be-machine-generated dbt schema files into their corresponding human-editable schema input files (dbt/schema_inputs/**/schema.human.yml) to preserve their content, since any regenerated schemas will forcibly strip out hashtag comments. See PR #5310.

Other PUDL v2026.6.1 Resources

PUDL v2026.6.1 Data Dictionary
PUDL v2026.6.1 Documentation
PUDL in the AWS Open Data Registry
PUDL v2026.6.1 in a free, public AWS S3 bucket: s3://pudl.catalyst.coop/v2026.6.1/
PUDL v2026.6.1 in a requester-pays GCS bucket: gs://pudl.catalyst.coop/v2026.6.1/
Zenodo archive of the PUDL GitHub repo for this release
PUDL v2026.6.1 release on GitHub

Contact Us

If you're using PUDL, we would love to hear from you! Even if it's just a note to let us know that you exist, and how you're using the software or data. Here's a bunch of different ways to get in touch:

Follow us on GitHub
Use the PUDL GitHub issue tracker to let us know about any bugs or data issues you encounter
GitHub Discussions is where we provide user support
Watch our GitHub Project to see what we're working on
Email us at [email protected] for private communications
On Mastodon: @[email protected]
On BlueSky: @catalyst.coop
Connect with us on LinkedIn
Play with our data and notebooks on Kaggle
Combine our data with ML models on HuggingFace
Learn more about us on our website: https://catalyst.coop
Subscribe to our announcements list for email updates
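
The release resources listed in the abstract give versioned bucket prefixes on S3 and GCS. The sketch below builds those prefixes and a per-table URI from a version string. The bucket name and prefix layout come from the abstract. The one-file-per-table naming (table name plus .parquet) is an assumption, and the function names and version pattern are hypothetical.

```python
import re

# Bucket name and versioned prefix layout as listed in the abstract.
BUCKET = "pudl.catalyst.coop"

# Hypothetical check for version strings shaped like "v2026.6.1".
_VERSION_RE = re.compile(r"^v\d{4}\.\d{1,2}\.\d+$")


def release_prefix(version: str, scheme: str = "s3") -> str:
    """Build the versioned bucket prefix, e.g. s3://pudl.catalyst.coop/v2026.6.1/."""
    if scheme not in ("s3", "gs"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if not _VERSION_RE.match(version):
        raise ValueError(f"unexpected version format: {version!r}")
    return f"{scheme}://{BUCKET}/{version}/"


def table_uri(version: str, table: str, scheme: str = "s3") -> str:
    """Build a URI for one table's Parquet file.

    ASSUMPTION: each table is stored as <table>.parquet directly under the
    versioned prefix. Check the bucket listing before relying on this.
    """
    return f"{release_prefix(version, scheme)}{table}.parquet"


uri = table_uri("v2026.6.1", "core_eia176__yearly_gas_supply")
```

The abstract describes the GCS bucket as requester-pays, so reads through the gs scheme would be billed to the reader's project.
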

🔗 Provenance: sources where this record was discovered

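gxceed is a research-support dataset based on public metadata. Its summaries, translations, and commentary are generated with AI assistance. Final interpretation and verification are left to the user, based on the original source material.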