Description of augmented Energy Information Administration generation unit dataset
The data file available here is the 2011 Energy Information Administration (EIA) generation unit dataset (GeneratorsY2011.xls, available at http://www.eia.gov/electricity/data/eia860/index.html), augmented with the additional columns described below. All other EIA datasets mentioned below are available from the same website and are also for 2011.
- Augmented EIA generator dataset
- Additional description of columns in augmented EIA dataset
- EPA descriptions of EPA data columns added to the dataset
- Questionnaire used to collect EIA data
- Guide to data matchup procedure
- Data matchup code
Flue ID. This column specifies the flue(s) associated with the boiler(s) associated with the generation unit, according to the EIA’s “EnviroAssoc” dataset. It excludes flues for which the association is only “theoretical.” The EIA does not report an associated boiler for some generation units.
Various characteristics of the boiler(s) associated with the generation unit, from the EIA’s “EnviroEquip” dataset. Most of the characteristics we report relate to the emissions of the generation unit.
Various characteristics of the generation unit from the Environmental Protection Agency’s (EPA’s) Air Markets Program annual summary data for each unit, available at http://ampd.epa.gov/ampd/. Through experimentation, we developed computer code to identify which unit in the EPA’s Air Markets Program Data corresponds to each unit in the EIA’s generation unit dataset. Plants can be matched easily because the EIA and EPA use the same plant numbering system, but matching units within plants was quite challenging, for several reasons: the unit names in the two datasets often bear no resemblance to each other; the EPA dataset omits many of the units in the EIA list; many units are combined differently in the two datasets; the EPA does not report generation capacity; and the two datasets sometimes report different fuel types for the same unit.
Our algorithm therefore matches units based on fuel type, unit type, and similarity scores that are based on maximum generation and on unit names. We calculate the maximum generation of most EPA units from a separate dataset that reports each unit’s fuel use and gross generation in each hour of the year. We found that capacity estimates based on the Air Markets Program annual summary data are extremely inaccurate in some cases, perhaps in part because some units are capable of much higher heat input while ramping up than they would use in steady-state operation. Based on extensive checking, we judge that the resulting matchup of EIA and EPA generation units is not perfect but is highly accurate, on the order of 95%.
In most cases in which the matchups seem to be incorrect, the EIA unit was matched with an EPA unit that is of the same vintage, type, and approximate capacity as the correct match, or else is the best available proxy when the correct unit does not appear in the EPA dataset.
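The general shape of such a matchup can be illustrated with a minimal Python sketch. This is not the matchup code linked above. The field names ("name", "fuel", "unit_type", "capacity_mw", "hourly_gross_mwh"), the equal weighting of the two similarity scores, the greedy one-to-one pairing, and the treatment of fuel type and unit type as hard filters are all assumptions made for illustration. In particular, because the two datasets sometimes report different fuel types for the same unit, a strict fuel filter is a simplification.

```python
from difflib import SequenceMatcher


def max_hourly_generation(hourly_gross_mwh):
    """Largest gross generation observed in any hour (a proxy for capacity)."""
    return max(hourly_gross_mwh, default=0.0)


def name_similarity(a, b):
    """Similarity of two unit names in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def size_similarity(capacity_mw, max_gen_mw):
    """Ratio of the smaller to the larger value, in [0, 1]."""
    if capacity_mw <= 0 or max_gen_mw <= 0:
        return 0.0
    return min(capacity_mw, max_gen_mw) / max(capacity_mw, max_gen_mw)


def match_units(eia_units, epa_units, name_weight=0.5):
    """Greedily pair EIA and EPA units within one plant.

    Returns a dict mapping an index in eia_units to an index in epa_units.
    EIA units with no compatible EPA unit are left out of the result.
    """
    candidates = []
    for i, eia in enumerate(eia_units):
        for j, epa in enumerate(epa_units):
            # Assumption: fuel type and unit type must agree exactly.
            if eia["fuel"] != epa["fuel"] or eia["unit_type"] != epa["unit_type"]:
                continue
            max_gen = max_hourly_generation(epa["hourly_gross_mwh"])
            score = (name_weight * name_similarity(eia["name"], epa["name"])
                     + (1 - name_weight) * size_similarity(eia["capacity_mw"], max_gen))
            candidates.append((score, i, j))

    # Accept the highest-scoring pairs first, using each unit at most once.
    candidates.sort(reverse=True)
    matches, used_epa = {}, set()
    for _score, i, j in candidates:
        if i not in matches and j not in used_epa:
            matches[i] = j
            used_epa.add(j)
    return matches


# Hypothetical plant with two EIA units and two EPA units.
eia_units = [
    {"name": "1", "fuel": "coal", "unit_type": "ST", "capacity_mw": 500.0},
    {"name": "GT1", "fuel": "gas", "unit_type": "GT", "capacity_mw": 50.0},
]
epa_units = [
    {"name": "CT1", "fuel": "gas", "unit_type": "GT",
     "hourly_gross_mwh": [0.0, 30.0, 48.0]},
    {"name": "B1", "fuel": "coal", "unit_type": "ST",
     "hourly_gross_mwh": [400.0, 510.0, 450.0]},
]
matches = match_units(eia_units, epa_units)
```

In this hypothetical plant, the coal steam unit and the gas turbine each have exactly one compatible EPA unit, so the pairing is determined by the filters alone; the similarity scores matter only when several EPA units pass the filters for the same EIA unit.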
“nox,” “so2,” and “hr.” These are the estimated nitrogen oxide emission rate, sulfur dioxide emission rate, and heat rate of the generation unit, respectively. For EIA generation units that we were able to match with EPA units, we calculate these values from EPA annual or hourly data. For other EIA generation units, we estimate them using regression analysis, based on the unit’s known characteristics and on the characteristics and rates of units whose rates are known. We used a separate set of regressions for each combination of fuel type and generator (“prime mover”) type, except for combinations containing too few units to support separate regressions. For those combinations, we used one regression in which the combination of fuel type and generator type was an explanatory variable. We constrained all estimated rates to be no higher than the highest, and no lower than the lowest, rate reported by EPA for a unit with that combination of fuel type and generator type. For some combinations with no rates at all available from EPA data, we used rates from the literature. For a small number of generators with unusual combinations of fuel type and generator type, we have not included rates.
The “Read me.xls” file provides additional information about these columns.
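The regress-then-constrain step described above for the “nox,” “so2,” and “hr” columns can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually used: it fits a one-variable least-squares line for a single combination of fuel type and generator type, with capacity as a hypothetical explanatory variable, and all function names and numbers are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_rate(known_x, known_rates, x_new):
    """Predict a rate for an unmatched unit, constrained to the observed range.

    known_x and known_rates describe units in the same fuel type and
    generator type combination whose rates are known.
    """
    slope, intercept = fit_line(known_x, known_rates)
    raw = intercept + slope * x_new
    lowest, highest = min(known_rates), max(known_rates)
    # No higher than the highest known rate, no lower than the lowest.
    return max(lowest, min(highest, raw))


# Hypothetical capacities (MW) and heat rates for three matched units.
capacity_mw = [100.0, 200.0, 300.0]
heat_rate = [11.0, 10.5, 10.0]

inside = estimate_rate(capacity_mw, heat_rate, 250.0)       # within the range
large_unit = estimate_rate(capacity_mw, heat_rate, 1000.0)  # limited to the lowest rate
small_unit = estimate_rate(capacity_mw, heat_rate, 10.0)    # limited to the highest rate
```

The constraint matters mainly when an unmatched unit lies far outside the range of the matched units, where an unconstrained linear prediction could yield an implausibly low or high rate.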
If you use or refer to these data, please cite the source. For now, please cite this website. We will describe in more detail how we produced the estimated emission and heat rates in a forthcoming paper. Once that paper is available, please cite it. Please check back later for its citation information.
We gratefully acknowledge the US Department of Energy's Center for Electric Reliability Technology Solutions for funding this work.