UNITED STATES ENVIRONMENTAL PROTECTION AGENCY

COMMENTS OF THE UTILITY AIR REGULATORY GROUP on NATIONAL AMBIENT AIR QUALITY STANDARDS FOR PARTICULATE MATTER; PROPOSED RULE 71 Fed. Reg. 2620 (January 17, 2006) (Docket No. EPA-HQ-OAR-2001-0017) and REVISIONS TO AMBIENT AIR MONITORING REQUIREMENTS 71 Fed. Reg. 2710 (January 17, 2006) (Docket No. EPA-HQ-OAR-2004-0018)

HUNTON & WILLIAMS Lucinda Minton Langworthy 1900 K Street, N.W. Suite 1200 Washington, D.C. 20006 (202) 955-1500 Counsel for the Utility Air Regulatory Group

April 17, 2006

I. Executive Summary

The Utility Air Regulatory Group (“UARG”) is a voluntary, nonprofit group of electric generating companies and organizations and four national trade associations (the Edison Electric Institute, the National Rural Electric Cooperative Association, the American Public Power Association, and the National Mining Association). UARG’s purpose is to participate on behalf of its members collectively in rulemakings by the U.S. Environmental Protection Agency (“EPA” or “Agency”) and other Clean Air Act (“CAA” or “Act”) proceedings that affect the interests of electric generators, and in related litigation. A list of UARG members joining in these comments is Attachment 1 to these comments.

UARG has a substantial interest in the national ambient air quality standards (“NAAQS” or “standards”) for particulate matter (“PM”). Electric generating companies, including UARG members, and their customers are expected to bear most of the burden of implementation of the current PM standards.
In fact, EPA has established an interstate program for reducing emissions from electric generating units substantially based on the purported contributions of these emissions to nonattainment of the PM NAAQS.1 UARG has participated extensively in this – and previous – reviews of the PM NAAQS,2 and is pleased to offer the following comments on the present proposal to revise those NAAQS.3 In addition, UARG offers brief comments on a related proposal regarding ambient air quality monitoring, insofar as that proposal concerns PM2.5.4

1 See EPA, Rule to Reduce Interstate Transport of Fine Particulate Matter and Ozone (Clean Air Interstate Rule); Revisions to the Acid Rain Program; Revisions to the NOx SIP Call, 70 Fed. Reg. 25,162 (May 12, 2005).
2 See, e.g., Comments of the Utility Air Regulatory Group on the Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information – OAQPS Staff Paper (Second Draft January 2005) and the Particulate Matter Health Risk Assessment for Selected Urban Areas: Second Draft Report (January 2005) (March 31, 2005); Comments of the Utility Air Regulatory Group on the Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information – OAQPS Staff Paper (First Draft August 2003) and the Particulate Matter Health Risk Assessment (August 2003) (October 28, 2003); Comments of the Utility Air Regulatory Group on EPA’s Proposed Methodology for Particulate Matter Risk Analyses for Selected Urban Areas (February 27, 2002); Comments of the Utility Air Regulatory Group on the Air Quality Criteria for Particulate Matter (Second External Review Draft March 2001) and the Review of the National Ambient Air Quality Standards: Policy Assessment of Scientific and Technical Information – OAQPS Staff Paper (Preliminary Draft June 2001) (July 12, 2001); Supplemental Comments of the Utility Air Regulatory Group on the
National Ambient Air Quality Standards for Particulate Matter: Proposed Decision, 61 Fed. Reg. 65,638 (December 13, 1996) (Docket No. A-95-54) and Proposed Requirements for Designation of Reference and Equivalent Methods for PM2.5 and Ambient Air Quality Surveillance for Particulate Matter: Proposed Rule, 61 Fed. Reg. 65,780 (December 13, 1996) (Docket No. A-96-51) (March 12, 1997, corrected March 25, 1997).
3 National Ambient Air Quality Standards for Particulate Matter, Proposed Rule, 71 Fed. Reg. 2620 (Jan. 17, 2006).
4 Revisions to Ambient Air Monitoring Requirements, 71 Fed. Reg. 2710 (Jan. 17, 2006).

In brief, UARG believes that no revision of the existing NAAQS for fine PM, measured as PM2.5, is appropriate at this time. The scientific record, supported by a new human health risk assessment, provides no basis for abandoning the Agency’s 1997 determination that the present NAAQS provide the requisite level of protection of public health and welfare. Moreover, the record does not provide an adequate basis for setting a new NAAQS for urban coarse PM, measured as PM10-2.5.

II. The NAAQS Review and Revision Process

Section 108(a)(1) of the CAA authorizes EPA to develop a list of air pollutants emitted by diverse mobile and stationary sources that the EPA Administrator (“Administrator”) determines “cause or contribute to air pollution which may reasonably be anticipated to endanger public health or welfare.” 5 Once a pollutant is added to this list, CAA § 108(a)(2) requires the Administrator to issue “air quality criteria” for the pollutant that “accurately reflect the latest scientific knowledge useful in indicating the kind and extent of all identifiable effects on public health or welfare which may be expected from the presence of such pollutant in the ambient air, in varying quantities.” This information is compiled in a document commonly referred to as the Criteria Document.
Thereafter, the Administrator must establish primary and secondary NAAQS for the listed pollutant. 6 Primary NAAQS must protect the public health, allowing an adequate margin of safety. 7

5 42 U.S.C. § 7408(a)(1). Hereinafter, citations are given only to the Act.
6 CAA § 109(a).

Secondary NAAQS must protect the public welfare from known or anticipated adverse effects. 8 While basing the NAAQS on the Criteria Document, the Administrator must exercise policy judgment to select the level for the NAAQS “attainment and maintenance of which” is requisite to provide this protection. 9 The Administrator receives advice from a panel of independent experts – the Clean Air Scientific Advisory Committee (“CASAC”) – to assist him in formulating his judgment. 10 Nothing requires the Administrator to adopt CASAC’s advice, however; he need only explain why his proposal differs in any important respect from CASAC’s recommendations. 11 In interpreting the statutory standard for NAAQS decisions, the Supreme Court has held that the CAA requires the Administrator to exercise his judgment to set both primary and secondary NAAQS “at the level that is ‘requisite’ – that is, not lower or higher than is necessary . . . .” 12 EPA recognizes that NAAQS need not be set at a level

7 CAA § 109(b)(1).
8 CAA § 109(b)(2). The Act defines welfare effects to include “effects on soils, water, crops, vegetation, manmade materials, animals, wildlife, weather, visibility, and climate, damage to and deterioration of property, and hazards to transportation, as well as effects on economic values and on personal comfort and well-being, whether caused by transformation, conversion, or combination with other air pollutants.” CAA § 302(h).
9 CAA § 109(b)(1)-(2).
10 CAA § 109(d)(2).
11 CAA § 307(d)(3)(c) (“[I]f the proposal differs in any important respect from any of [CASAC’s] recommendations, [the statement of basis and purpose for the proposed rule] must provide an explanation of the reasons for such differences.”).
12 Whitman v. American Trucking Ass’ns, 531 U.S. 457, 475-76 (2001).

that eliminates all risk. 13 Rather, NAAQS are set at the level that the Administrator deems “safe,” a determination that, as Justice Breyer has pointed out, requires putting the estimated risk posed by the pollutant into the context of background circumstances.14 Certainly, a standard that is “likely to cause more harm to health than it prevents is not a rule that is ‘requisite to protect the public health.’”15

Under § 109(d) of the CAA, once the Administrator has set NAAQS, he must review them, and the Criteria Document on which they are based, at least every five years. At the completion of that review, he may revise them only if, and to the extent, those revisions are “appropriate” under section 109(b) of the Act and the “not lower or higher than is necessary” test.16 Moreover, because there is “at least a presumption” that an existing rule best carries out the policies committed to an Agency by Congress, 17

13 71 Fed. Reg. at 2622/2. See also Whitman, 531 U.S. at 494 (Breyer, J., concurring) (“The statute, by its express terms, does not compel the elimination of all risk.”).
14 Whitman, 531 U.S. at 494-95 (Breyer, J., concurring).
15 Whitman, 531 U.S. at 495 (Breyer, J., concurring); cf. American Trucking Ass’ns v. EPA, 175 F.3d 1027, 1052 (D.C. Cir.), aff’d and modified in part on other grounds, 195 F.3d 4 (D.C. Cir. 1999), rev’d on other grounds sub nom. Whitman v. American Trucking Ass’ns, 531 U.S. 457 (2001) (characterizing as “bizarre” EPA’s position that in setting NAAQS it was required to ignore potential beneficent effects for health of the presence of a substance in the ambient air) (“ATA I”).
16 Whitman, 531 U.S. at 476.
17 See Atchison, Topeka & Santa Fe Ry. Co. v. Wichita Bd. of Trade, 412 U.S. 800, 808 (1973).

the Administrator must supply a reasoned analysis before changing the NAAQS based on a change of policy or judgment.18

III. In 1997, EPA Determined that the Present Suite of PM2.5 NAAQS Provided the Requisite Protection of Public Health and Welfare.

EPA adopted the present PM2.5 NAAQS in 1997.19 They included identical primary and secondary NAAQS with a 3-year annual average of 15 µg/m3, 20 and identical primary and secondary NAAQS with a 24-hour average of 65 µg/m3. 21

A. The Primary NAAQS

In 1997, the Criteria Document contained discussion of:

an extensive PM epidemiological data base [that] provide[s] evidence of serious health effects (e.g., mortality, exacerbation of chronic disease, increased hospital admissions) in sensitive populations (e.g., the elderly, individuals with cardiopulmonary disease), as well as significant adverse health effects (e.g. increased respiratory symptoms, school absences, and lung function decrements) in children. 22

18 See Motor Vehicle Mfrs. Ass’n v. State Farm Mut. Auto. Ins. Co., 463 U.S. 29, 42 (1983).
19 National Ambient Air Quality Standards for Particulate Matter, 62 Fed. Reg. 38,652 (July 18, 1997). At that time, the Agency also made minor changes to its earlier NAAQS for PM10 and recharacterized the revised PM10 NAAQS as standards for coarse particulate matter. 62 Fed. Reg. at 38,677/3 to 38,678/1-2. These new PM10 NAAQS were subsequently vacated by the D.C. Circuit, ATA I, 175 F.3d at 1055, leading to the reinstatement of the earlier PM10 NAAQS. See 40 C.F.R. § 50.6.
20 See 40 C.F.R. § 50.7(a). Averaging of monitored values across sites in a “community monitoring zone” (“spatial averaging”) was permitted at the discretion of the state. 40 C.F.R. Pt. 50, App. N at 2.1; 40 C.F.R. Pt. 58, App. D at 2.8.1.6.
21 See 40 C.F.R. § 50.7(a).
Compliance with these standards was judged by comparing the 98th percentile PM2.5 concentration in ambient air with the 65 µg/m3 level of the standards. 40 C.F.R. § 50.7(c).
22 62 Fed. Reg. at 38,657/1.

Despite this body of scientific studies, the Agency recognized “uncertainty in the characterization of health effects attributable to exposure to ambient PM.”23 For example, the Agency noted the “lack of demonstrated mechanisms” as an uncertainty, but explained that “a number of potential mechanisms have been hypothesized in the recent literature.” 24 Thus, for purposes of setting standards, EPA concluded that the effects observed in the studies were “at least plausibly related to inhalation of PM.”25 The Agency also described questions about whether a threshold exists below which associations between ambient PM levels and health endpoints would no longer be observed as “the most significant uncertainty,”26 and pointed to it, among other uncertainties, as justification for not adopting more stringent standards. 27 While acknowledging these uncertainties, EPA nevertheless concluded that it was appropriate to revise the NAAQS to “increase the public health protection.”28 Recognizing that not all components of PM are likely to produce the same effects, 29 EPA determined that components of the “fine fraction” of PM (particles of approximately 2.5 microns in diameter or less) were more likely to be “linked” to the effects described above.30 The Agency therefore decided to regulate fine particles through new primary NAAQS for PM2.5. 31

23 62 Fed. Reg. at 38,655/2.
24 Id. at 38,657/1.
25 Id.
26 Id. at 38,665/1.
27 Id. at 38,675/2.
28 Id. at 38,666/3.
29 Id. at 38,666/1.

Having found that the health effects science, although uncertain, warranted the promulgation of PM2.5 NAAQS, the Agency decided to consider the protection that could be provided by a “generally controlling” annual standard and a complementary 24-hour standard.32 Explaining that its goal was to “reduce risk sufficiently to protect public health with an adequate margin of safety” and not to set standards that were “risk-free,” 33 EPA supplemented its review of the uncertain science with a risk assessment to “aid [the Administrator] in judging which alternative PM NAAQS would reduce risks sufficiently” to provide the level of protection called for by the CAA.34 This risk assessment indicated that the combination of an annual PM2.5 NAAQS of 15 µg/m3 and a 24-hour PM2.5 NAAQS of 65 µg/m3 would reduce – but not eliminate – health risk from exposure to ambient PM2.5. 35

Based on the science and risk information before it, the Agency decided to promulgate an annual PM2.5 NAAQS of 15 µg/m3, in combination with a 24-hour PM2.5 standard of 65 µg/m3. 36 EPA concluded that these standards would “protect public health from adverse health effects associated with short- and long-term exposures to ambient fine particles,” and provide “an adequate margin of safety.” In other words, the suite of standards that the Agency adopted was “not lower or higher than necessary.” 37

30 Id. at 38,666/2.
31 See id. at 38,668/1.
32 Id. at 38,670/3.
33 Id. at 38,674/3.
34 Id. at 38,656/2.
35 See EPA, Review of the National Ambient Air Quality Standards for Particulate Matter: OAQPS Staff Paper, VI-49, VI-51 (1996) (“1996 Staff Paper”); Memorandum from E. Post & J. Voyzey, Abt Associates, Inc., Exhibits 4-1 & 4.2 (June 5, 1997). Other possible NAAQS combinations that would have reduced risks further were also considered in the risk assessment, but not adopted by the Agency.

The D.C. Circuit upheld the Agency’s conclusion.
Faced with claims that the suite of PM2.5 standards was, on the one hand, too stringent and, on the other hand, insufficiently stringent, the court found that EPA had provided “evidence of the old PM standard’s inadequacy.” 38 The court also held that, in establishing a revised standard, EPA had properly “reject[ed] . . . lower [PM2.5] standards” because lower standards “might result in regulatory programs that go beyond those that are needed to effectively reduce risks to public health.” 39 The court concluded that the Agency had fulfilled “its statutory obligation to set the primary NAAQS at levels no lower than necessary to reduce public health risks.” 40 Finally, noting that, as Justice Breyer explained in his concurring opinion in Whitman, section 109 of the Act must be construed to “permit EPA to ‘consider whether a proposed rule promotes safety overall’”, 41 the court upheld EPA’s decision, based on considerations related to development of effective control strategies, to set the annual PM2.5 NAAQS at a level that would be generally controlling.42

36 62 Fed. Reg. at 38,677/3.
37 Whitman, 531 U.S. at 475-76.
38 American Trucking Associations v. EPA, 283 F.3d 355, 365 (D.C. Cir. 2002) (“ATA II”).
39 Id. at 369.
40 Id.
41 Id. at 395, quoting Whitman, 531 U.S. at 494 (Breyer, J., concurring).

B. The Secondary NAAQS

At the same time that EPA concluded that it was appropriate to revise the PM NAAQS by adding primary PM2.5 standards, it concluded that PM “can and does produce adverse effects on visibility in various locations.” 43 The Agency also concluded that fine particles contributed to these effects. 44 Nevertheless, because background visibility levels differ between the eastern and western United States, EPA found that “addressing visibility solely through setting more stringent national secondary standards would not be an appropriate means to protect the public welfare from adverse impacts of PM on visibility in all parts of the country.” 45

Next, EPA considered how other CAA programs would affect visibility. With regard to the primary PM2.5 NAAQS, the Agency noted that “[t]he spatially averaged form of the annual standard is well suited to the protection of visibility, which involves effects of PM throughout an extended viewing distance,”46 and that many urban areas would “see perceptible improvement in visibility” with attainment of a 15 µg/m3 annual PM2.5 NAAQS.47 The Agency also concluded that the 24-hour primary PM2.5 NAAQS “would be expected to reduce” the visibility impairment on the worst days.48 EPA noted that other programs “such as those to reduce acid rain or mobile source emissions” would also improve visibility, especially in the East. 49 Finally, the Agency explained that the regional haze program required by sections 169A and 169B of the Act – which had not yet been established – would also be expected to improve visibility in urban as well as rural areas. 50 For these reasons, EPA established secondary NAAQS identical to the primary NAAQS.51

42 Id. at 375.
43 62 Fed. Reg. at 38,680/1.
44 Id.
45 Id. at 38,680/3.
46 Id.
47 Id. at 38,681/1.

Once again, the D.C. Circuit upheld the Agency’s actions. The court explained that “Congress did not intend the secondary NAAQS to eliminate all adverse visibility effects.” 52 EPA therefore acted within its authority in deciding to rely on another program – the regional haze program – to mitigate some of the adverse effects of PM2.5 on visibility. 53

IV. Revision of the Primary PM2.5 NAAQS Is Not Appropriate at this Time.

The Agency has now proposed revisions to the primary PM2.5 NAAQS.
EPA has proposed reducing the level of the primary 24-hour standard from 65 µg/m3 to 35 µg/m3. 54 In addition, the Agency has proposed to change the way in which compliance with the 15 µg/m3 annual standard is determined.55 This change will make demonstrating compliance with the annual standard more difficult, making the standard itself more stringent. 56 These revisions are appropriate only if EPA’s 1997 conclusion, upheld by the D.C. Circuit, that the present NAAQS reflect the level that is requisite to protect public health is no longer correct. 57 The Agency has “provisionally conclude[d]” that it is not. 58 For the following reasons, this “provisional” conclusion is unwarranted.

48 Id. at 38,681/2.
49 Id.
50 Id. at 38,681/3 to 38,682/1. EPA explained that the regional haze program is “more protective” than the NAAQS because it addresses all man-made impairment of visibility, not just that deemed “adverse.” Id. at 38,681/3.
51 Id. at 38,683/1.
52 ATA II, 283 F.3d at 375, quoting ATA I, 175 F.3d at 1056.
53 ATA I, 175 F.3d at 1056-57.

A. The Present Suite of Primary PM2.5 NAAQS Continue To Provide the Requisite Public Health Protection.

In 1997, EPA determined that the present suite of primary PM2.5 NAAQS provided the level of public health protection that was requisite under Section 109 of the CAA. There is, therefore, “at least a presumption” that these standards best carry out the policies that Congress committed to the Agency under the Act.59 Before the Agency changes its determination that these standards provide the requisite level of public

54 71 Fed. Reg. at 2649/3.
55 Id. at 2647/3.
56 See Memorandum from M. Schmidt, et al., to PM NAAQS Review Docket, Output A.7 at 2, 13 (June 30, 2005).
57 The Agency uses the same rationale that it used in 1997 – that the new NAAQS would provide “increased public health protection.” Compare 62 Fed. Reg. at 38,666/3 with 71 Fed. Reg. at 2643/3.
Increased public health protection is justified now, however, only if EPA provides a reasoned explanation of why the Agency’s determination in 1997 that the present standards – not more stringent ones – provide the requisite level of protection is no longer valid.
58 71 Fed. Reg. at 2643/3.
59 See Atchison, 412 U.S. at 808.

health protection, it has an obligation to explain fully the basis for that change.60 This obligation is particularly great here, given the extensive record on which that determination was made and its affirmance by the D.C. Circuit. 61

In this regard, if new evidence established more certain risk or a risk of effects that are significantly different in character than those that provided the basis for the present NAAQS, a case might be made that the present NAAQS do not reflect a level that is “requisite” to protect public health. Similarly, if the evidence demonstrated that the health risk upon attainment of the present PM2.5 NAAQS would be greater than was understood in 1997, there might be reason to question whether the level of protection provided by those standards is less than necessary, and therefore not “requisite” to protect public health.

The evidence in the record, however, demonstrates that, while there are additional health studies, those studies address the same types of possible health effects, are subject to both the same and new uncertainties, and ultimately show that the risks are no greater than they were estimated to be in 1997. Based on this record, there is no basis for EPA now to conclude that it was wrong in 1997. Moreover, a desire to provide an even greater level of health protection cannot be reconciled with the “not lower or higher than is necessary” standard articulated by the Supreme Court in Whitman.

60 See Motor Vehicle Mfrs. Ass’n, 463 U.S.
at 42 (explaining that an agency that changes an existing rule “is obligated to supply a reasoned analysis beyond that which may be required when an agency does not act in the first instance.”).
61 See Louisiana Pub. Serv. Comm’n v. FERC, 184 F.3d 892, 897 (D.C. Cir. 1999) (“For an agency to reverse its position in the face of a precedent it has not persuasively distinguished is quintessentially arbitrary and capricious.”).

1. The Effects of Concern Have Not Changed Significantly Since 1997.

In 1997, EPA cited studies showing associations between levels of particles in the ambient air and rates of “premature mortality, aggravation of respiratory and cardiovascular disease (as indicated by increased hospital admissions and emergency room visits, school absences, work loss days, and restricted activity days), changes in lung function and increased respiratory symptoms, changes to lung tissues and structure, and altered respiratory defense mechanisms” as justification for the new PM2.5 NAAQS.62 The newer studies cited in the proposal address essentially these same effects as being of potential concern. Specifically, the current proposal indicates that the newer studies address premature mortality, 63 hospital admissions or emergency department visits for respiratory disease,64 respiratory symptoms and lung function changes, 65 and hospital admissions and emergency department visits for cardiovascular diseases.66

62 62 Fed. Reg. at 38,656.
63 71 Fed. Reg. at 2629/1-3.
64 Id. at 2629/3 to 2630/1.
65 Id. at 2630/1-2.
66 Id. at 2630/2.

The proposal cites studies of only one type of effect that was not specifically considered in 1997: “more subtle physiological changes in the cardiovascular system,” including measures of changes in cardiac function and changes in blood components.67 The current Criteria Document (“CD”), however, urges caution in accepting these studies as evidence of effects of ambient PM2.5. 68 For example, it urges caution “in drawing any conclusions yet regarding ambient PM effects on heart rate variability or other ECG [electrocardiogram] measures of cardiovascular parameters.”69 Similarly, the CD explains that “[m]uch more research will be needed [in] order to both confirm [. . .] associations [with changes in blood components] and to better understand which specific ambient PM species may contribute to them.” 70 Moreover, these subtle effects relate to the more serious cardiovascular effects, such as aggravation of cardiovascular disease, that were identified in 1997 as possible risks associated with exposure to PM2.5 in the ambient air.

In other words, the types of health risks at issue now are essentially those that provided the basis in 1997 for the current NAAQS. Even the uncertain new evidence of associations between PM2.5 levels and physiological changes in the cardiovascular system is related to the cardiovascular risks that were considered in 1997. Indeed, those changes are far less serious than the cardiovascular effects for which evidence of a possible association with ambient PM2.5 existed in 1997.

67 71 Fed. Reg. at 2630/3.
68 EPA, Air Quality Criteria for Particulate Matter, EPA 600/P-99/002aF-002bF (October 2004).
69 Id. at 8-166.
70 CD at 8-171.

2. Uncertainties in the Underlying Health Science Are as Great or Greater Than in 1997.

a. Unknown Mechanisms

In 1997, the Agency recognized the lack of a demonstrated mechanism for PM2.5 to produce the serious adverse health effects, including premature mortality, with which it had been associated in epidemiology studies.71 The mechanism or mechanisms by which PM2.5 at ambient levels could produce the effects with which it has been associated in some studies remain uncertain.
EPA indicates that “much new evidence is now available on potential mechanisms” that was unavailable in 1997.72 Nevertheless, the Agency identifies only “potential mechanisms” and “plausible biological pathways.”73 Thus, questions about how PM2.5 at levels found in ambient air could produce the effects in question remain an important uncertainty in the scientific literature.

Moreover, when EPA established the present NAAQS, the Agency assumed that plausible mechanisms existed by which PM2.5 could cause the effects in question. 74 Thus, even if more recent information has “advanced our understanding” of those mechanisms, 75 this would not justify revising the standards. The existing standards are based on the assumption that such mechanisms exist.

71 62 Fed. Reg. at 38,657/1.
72 71 Fed. Reg. at 2626/3.
73 Id. at 2636/1-2.
74 See 62 Fed. Reg. at 38,657/1.

b. Uncertain Concentration/Response Function

The Agency continues to acknowledge that the shape of the concentration/response function is an important remaining uncertainty. Indeed, EPA describes uncertainty about the concentration/response function as “[t]he single most important factor influencing the quantitative estimates of risk.” 76 While EPA highlights studies that did not find an air quality threshold for associations between PM and various health endpoints, 77 the Agency also acknowledges that some analyses have “provided suggestions of some potential threshold levels.” 78 In fact, as CASAC members have pointed out, the expectation from a biological perspective is that a threshold exists, even if we lack the tools to detect it.79 Because the shape of the concentration/response function is critical to understanding the potential health risk associated with different PM2.5 air quality, it remains a key uncertainty. Moreover, because the actual shape of this function remains unknown, this uncertainty has not been reduced since 1997.

75 71 Fed. Reg. at 2626/3.
76 Id. at 2641/1.
77 Id. at 2635/1-2.
78 71 Fed. Reg.
at 2635/1.
79 See Letter from Dr. Phil Hopke, Chair, CASAC, to the Hon. Michael O. Leavitt, Administrator, EPA (Oct. 4, 2004) (CASAC commentary on the Fourth External Review Draft of Air Quality Criteria for Particulate Matter), App. B-6 (Dr. James Crapo); App. B-7 (Dr. Fred Miller) (“Hopke (2004)”).

c. Importance of Confounding Factors or Effect Modifiers

EPA is dismissive of the potential for co-pollutants to confound reported associations between PM and mortality and morbidity endpoints. 80 Citing the CD for authority, 81 EPA asserts, “the effect estimates for associations between mortality and morbidity and various PM indices are generally robust to confounding.” 82 In fact, however, inclusion of co-pollutants in a model often results in a PM2.5 estimate becoming statistically insignificant. As a result, potential confounding of any apparent PM2.5/health associations by co-pollutants is a key issue in evaluating whether or not there is a significant association between ambient PM2.5 and the health effects at issue in this rulemaking.

Dr. Anne Smith has examined for UARG the effect of including gaseous co-pollutants in models of associations between PM2.5 and various health endpoints. 83 Dr. Smith reports that in ten of twelve studies that she identified as (1) reporting a statistically significant association between short-term PM2.5 levels and a health endpoint in a one-pollutant model and (2) also including a two-pollutant model formulation, the PM2.5 had

80 71 Fed. Reg. at 2634/1-2.
81 UARG is concerned that significant biases that CASAC (and UARG) identified in drafts of the CD have not been resolved. This is an example of such bias. Cf. Hopke (2004), App. B-29 (Dr. Roger McClellan) (“The section on confounding is written with a bias toward emphasizing a PM effect and downplaying the effects of other pollutants.”).
82 71 Fed. Reg. at 2634/2.
83 A.E.
Smith, Ph.D., Technical Comments on the Proposed Rule for National Ambient Air Quality Standards for Particulate Matter (April 17, 2006) (“Smith”) (Attachment 2 to these comments).

no statistically significant association with health in the two-pollutant model. 84 Dr. Smith also notes that when the co-pollutant sulfur dioxide (“SO2”) was used with PM2.5 in a two-pollutant model in a study by Krewski et al. (2000) of possible associations between long-term PM2.5 exposure and premature mortality, the statistically significant association with PM2.5 that had been reported for a one-pollutant model disappeared.85 Thus, Dr. Smith concludes that EPA should acknowledge that the statistical significance of PM2.5-health associations is not robust when one or more gaseous pollutants are included.86

Dr. Smith also points out that, even in one-pollutant models, statistically significant associations between long-term PM2.5 exposure and premature mortality are limited to those with no more than a high school education. 87 The Agency acknowledges this finding,88 but does not reflect it quantitatively in its analysis of risk. As Dr. Smith points out, this could mean that the Agency’s risk estimates are biased

84 Id. at 35-36. Furthermore, Dr. Smith explains that in nine of these ten cases, multicollinearity cannot explain the loss of the statistically significant association with PM2.5. Id. at 35, 39.
85 Smith at 40.
86 Smith at 2. Courts commonly disregard scientific studies that do not report statistically significant results. See, e.g., American Home Prods. v. Johnson & Johnson, 577 F.2d 160, 169 n.19 (2d Cir. 1978); Dunn v. Sandoz Pharmaceuticals, 275 F. Supp. 2d 672, 681 (M.D.N.C. 2003); Soldo v. Sandoz Pharmaceuticals, 244 F. Supp. 2d 434, 455 (W.D. Pa. 2003).
Even in a rulemaking context, studies that do not find a statistically significant association have been given “diminished importance.” See, Disease Not Associated with Exposure to Certain Herbicide Agents, 59 Fed. Reg. 346 (Jan. 4,1994). 87 Smith at 12-13. 88 71 Fed. Reg. at 2631/2. 19 high (because they assume the risk applies to the entire population) or it could mean that the reported associations are actually due to some factor other than PM2.5 that affects primarily those with less education.89 d. Role of Individual Constituents EPA noted in 1997 that it was unlikely that all of the chemical components of ambient PM would produce identical effects. 90 The Agency at that time concluded that “the available scientific information [did] not rule out” any PM2.5 component as a contributor to associations between PM2.5 and health effects. 91 In the current proposal, EPA acknowledges continued uncertainty about the specific components of PM2.5 that may be causally linked to health endpoints. 92 This continued uncertainty means provides not grounds for reconsidering the Agency’s 1997 conclusion that the level of the present NAAQS provides the necessary protection of public health. Which PM2.5 component or components may cause which health effects is a key uncertainty both for understanding the nature of the health risk and for developing standards that will effectively redress that risk. Indeed, EPA continues to identify this as an “important research need.” 93 Both the federal Office of Management and Budget and the National Research Council have also pointed to the importance of increasing 89 Smith at 12. 90 62 Fed. Reg. at 38,666/1. 91 Id. at 38,666/2. 92 71 Fed. Reg. at 2644/3. 93 EPA, Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information (June 2005), at 5-73. 20 information on the importance to possible health effects of individual PM2.5 constituents.94 e. 
Sensitivity to Model Specification In 1997, the Agency concluded that there was “no evidence that different plausible model specifications could lead to markedly different conclusions.”95 EPA now recognizes that new analyses have demonstrated sensitivity in the time series studies of daily PM2.5 and health endpoints to “the modeling approach used to account for temporal effects and weather variables.”96 EPA nevertheless concludes that the associations between short-term PM2.5 levels and health endpoints are “generally robust to the inclusion of alternative modeling strategies.”97 The Health Effects Institute (“HEI”), an independent research institute that receives approximately half of its funding from EPA, however, takes a different view of the significance of the discovery that the short-term time-series study findings are sensitive to the modeling approach that is used. HEI has cautioned: “Neither the appropriate degree of control for time, nor the appropriate specification of the effects of weather has been determined for time-series analyses [used to study the association between daily variations in air pollution and health endpoints]. In the absence of adequate biological 94 National Research Council, Research Priorities for Airborne Particulate Matter IV: Continuing Research Progress 130-32 (National Academy Press 2005); Memorandum from John Graham, Office of Management and Budget, to Christine Todd Whitman, Administrator, EPA (Dec. 4, 2001), available at http://www.reginfo.gov/public/prompt/epa_pm_research_prompt120401.html. 95 62 Fed. Reg. at 38,664/1. 96 71 Fed. Reg. at 2633/2-3. 97 Id. at 2633/3.
understanding of the time course of PM and weather effects, and their interactions, the [HEI] Panel recommends exploration of the sensitivity of future time-series studies to a wider range of alternative degrees of smoothing and to alternative specifications for weather.”98 Sensitivity of the results to the model that is used for analysis has also been reported in the studies looking for associations between long-term PM2.5 exposure and health endpoints. As EPA recognizes,99 Krewski et al. (2000) reported finding spatial autocorrelation between particulate matter and mortality data. Correcting for that autocorrelation and other important potential explanatory variables reduced the estimated health risk associated with PM2.5 in Krewski et al. to insignificance.100 Similarly, the association between PM2.5 and premature mortality was no longer statistically significant when spatial controls were used.101 f. Summary In short, the substantial uncertainties about the potential relationship between PM2.5 and various health endpoints that EPA recognized in 1997 when it established the present primary PM2.5 NAAQS have not been resolved. Moreover, previously unrecognized sensitivity of the modeling results to the model formulation that is used has been identified. Thus, it is clear that the uncertainty about the possible health risk associated with exposure to ambient PM2.5 has not diminished. 98 Health Effects Institute, Revised Analyses of Time-Series Studies of Air Pollution and Health (May 2003), at iv. 99 71 Fed. Reg. at 2631/3. 100 See Smith at 10-11. 101 See id. at 11. Indeed, the previously-recognized and new uncertainties provide no basis for revising the Agency’s 1997 conclusion that the present NAAQS provide the requisite level of public health protection. 3. The Estimated Risk Upon Attainment of the NAAQS Has Decreased Since 1997.
As discussed above,102 in 1997, EPA prepared an analysis of the health risks estimated to be associated with attainment of the current PM2.5 NAAQS. As part of the present review of those NAAQS, the Agency prepared a new assessment of the health risk posed by PM2.5 upon attainment of the current PM2.5 NAAQS, taking into account the newer studies.103 The health risks now predicted to remain upon attainment of the present PM2.5 NAAQS are, in fact, lower than was predicted in 1997 when those standards were deemed to provide the requisite public health protection. Thus, the risk assessment reaffirms that the present NAAQS provide the requisite level of public health protection. Specifically, for the two cities that were addressed in both the 1997 risk assessment and the 2005 risk assessment (Los Angeles and Philadelphia), “the magnitude of the estimates [of short-term mortality or morbidity risk] associated with just meeting the current annual standard, in terms of the percentage of total incidence, is similar [in the 1997 and 2005 risk assessments] in one of the locations (Philadelphia) and the current estimate is lower in the other location (Los Angeles).”104 102 See supra, page 8. 103 E. Post, et al., Particulate Matter Health Risk Assessment for Selected Urban Areas (June 2005) (“Post (2005)”). 104 71 Fed. Reg. at 2640/2. Furthermore, “[i]n terms of the magnitude of the risk estimates [associated with long-term exposure to PM2.5], the estimates in terms of percentage of total incidence are very similar for the two specific locations included in both the prior and current assessments.”105 Moreover, although not mentioned in the proposal, the estimated mortality incidence rates in the 2005 risk assessment for each of the other cities considered in that risk assessment (Detroit, Pittsburgh, St.
Louis, Boston, Phoenix, San Jose, and Seattle) are lower than those for Philadelphia and/or Los Angeles upon attainment of the present standards when a threshold of 10 µg/m3 for possible PM2.5 effects is assumed,106 as recommended by EPA’s science advisors.107 Given that the predicted incidence rates in those cities are lower than those in Philadelphia and Los Angeles – where risk estimates have decreased since 1997 – the risks in these additional cities do not appear to be greater than the risks deemed acceptable by EPA in 1997.108 Finally, the approaches used in the risk assessment have virtually guaranteed that the risks estimated by EPA are overstated. 105 Id. at 2640/3. 106 EPA, Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information (OAQPS Staff Paper December 2005) at 5-12, 5-13. 107 Letter from Rogene Henderson, Chair, Clean Air Scientific Advisory Committee to Administrator Stephen L. Johnson (June 6, 2005) (CASAC commentary on the second draft of EPA’s Review of the National Ambient Air Quality Standards for Particulate Matter) at 6 (“Henderson (2006)”). 108 See Smith at 6-7, explaining that current estimates of risk associated with short-term PM2.5 exposures are, with one exception, lower than the estimate of short-term mortality risk that EPA used in its previous risk assessment. Dr. Anne Smith points out that EPA’s method of selection of the concentration/response function used for its base case short-term risk estimate has, for each city considered, led the Agency to overlook other concentration/response functions applicable to the city that found no statistically significant association with PM2.5.109 If those functions that showed no statistically significant association between PM2.5 and health had been used for the risk assessment, the health risks associated with short-term exposure to ambient PM2.5 could actually have included zero.
The same conclusion applies to the selection of the concentration/response function used for base case estimates of health risk associated with long-term PM2.5 exposure; alternative concentration/response models that reported no statistically significant association between PM2.5 and health endpoints were not used.110 If they had been, the risk assessment would also have shown the possibility of zero health risk associated with ambient PM2.5.111 As a result of this conservatism in the risk assessment, the risks upon attainment of the present NAAQS are almost surely far below those that were predicted in 1997. B. The Recent Science Does Not Support EPA’s Proposal for a 24-hour Primary PM2.5 NAAQS of 35 µg/m3. An examination of EPA’s asserted basis for the proposed level of a revised 24-hour NAAQS confirms that the Agency’s proposed finding that the current PM2.5 NAAQS are not at the level requisite to protect the public health is erroneous. 109 See id. at 7-8. 110 See id. at 9-12. 111 Indeed, Dr. Smith has shown, using Philadelphia and Los Angeles as examples, that in a risk analysis that considered simultaneously the multiple uncertainties in the science associating PM2.5 and health endpoints – an integrated uncertainty analysis – the probability of zero health risk from ambient PM2.5 may be as much as 50%. See id. at 16-17. The Agency has proposed a 24-hour primary NAAQS level of 35 µg/m3 to “require improvements in air quality generally in areas in which short-term exposure to PM2.5 in an area can reasonably be expected to be associated with serious health effects.”112 Several problems exist with this purported justification for the proposed NAAQS. First, the relevant question that the Agency must answer is whether the present NAAQS are less stringent than necessary to provide the level of air quality that is requisite to protect public health.
113 Because EPA’s current risk assessment finds that risks upon attainment of the current NAAQS are lower than those estimated in 1997 when those NAAQS were judged to provide the requisite level of public health protection, no more stringent NAAQS are now requisite. Second, the increased uncertainties about any possible association between daily PM2.5 exposures and health effects mean that no basis exists to conclude that “serious health effects” can reasonably be expected to occur in areas that attain the present PM2.5 NAAQS. Although the proposal asserts that “there is a strong predominance of studies with 98th percentile values down to about 39 µg/m3 . . . reporting statistically significant associations with mortality, hospital admissions, and respiratory symptoms,” and not below that range,114 the full picture of recent studies examining possible associations between daily PM2.5 levels and a variety of health endpoints does not support that assertion. 112 71 Fed. Reg. at 2648/3 to 2649/1. 113 Whitman, 531 U.S. at 475-76. 114 71 Fed. Reg. at 2649/1-2. Instead, the full picture shows no consistent association at levels up to (and above) the 65 µg/m3 98th percentile level of the present 24-hour standard. Dr. Anne Smith developed a list of relevant studies of PM2.5 and health endpoints (including several studies that are not cited by EPA in making its judgments about the predominance of statistically significant effects) and ranked them by the overall significance of each study’s results.115 She found: [T]here is no evidence of a ‘predominance of statistically significant findings’ above the 39 µg/m3 line. Nor is there any higher level of PM2.5 98th percentile above which statistical significance begin to become more common than below. While it is true that evidence of significance is mixed for studies in the 30-35 µg/m3 range, it is just as mixed for studies above the current standard of 65 µg/m3.116 In short, Dr.
Smith’s analysis shows that EPA’s assertion of “a strong predominance of studies with 98th percentile values down to about 39 µg/m3” is unsupported by the evidence.117 115 Smith at 25-27. 116 Id. at 30. 117 Compare 71 Fed. Reg. at 2649/1 with Smith at 27-29. Dr. Smith goes further and reviews the methodological merits of the studies that reported robust statistical associations with 98th percentile values for daily PM2.5 at the lowest levels. Smith at 30-33. She identifies the studies analyzing the exact set of Boston air quality data that provided the basis in 1997 for the controlling 15 µg/m3 annual NAAQS as reporting generally robust associations between PM2.5 and health endpoints, Smith at 30, and she notes that the primary methodological concern with these studies is their failure to use any two-pollutant model formulations. Id. at 32. She also explains that in the more recent analyses of this data set, alternative approaches to controlling for weather and time have produced “a progressive decline in the original size of effect, and in the most controlled case the PM2.5 association was statistically insignificant.” Id. at 32. Having examined the studies – including those of Boston – that reported low 98th percentile PM2.5 levels, Dr. Smith finds “one could reasonably conclude not to tighten the current standard at all.” Id. at 34. Thus, not only do the decreasing estimates of health risk associated with attainment of the present standards and the increasing uncertainty about reported associations between daily PM2.5 levels and health endpoints fail to establish that the Agency’s 1997 conclusion that the current PM2.5 NAAQS provide the requisite level of protection is no longer valid, but they establish that the Agency’s logic cannot support its proposed revision to the PM2.5 NAAQS. C. The Recent Science Supports EPA’s Proposal To Retain the Level of the Annual PM2.5 NAAQS at 15 µg/m3.
EPA proposes to retain the level of the annual standard at 15 µg/m3 because it “provisionally concludes” that the newer mortality and morbidity studies “do not provide a clear basis for selecting a level lower than the current standard of 15 µg/m3.”118 This conclusion is appropriate. The newer studies serve only to increase uncertainty about possible health risks associated with exposure to PM2.5 at levels permitted by the present annual standard. This conclusion is reflected in the finding of EPA’s risk assessment of no increase in health risk from that previously anticipated to remain once the current standards are attained. Thus, the newer studies provide no basis for the Agency to reconsider its 1997 conclusion that the present standards provide the requisite level of public health protection. 1. Increased Uncertainty About Any Health Risk From Long-Term PM2.5 Exposure In 1997, EPA relied on studies by Dockery et al. (1993) and Pope et al. (1995) to characterize the uncertain relationship between long-term exposure to ambient PM2.5 and premature mortality. 118 71 Fed. Reg. at 2651/2. Reanalyses of both of these studies have since been conducted and enhancements of the Pope study using additional data have been reported.119 As EPA recognizes, these reanalyses replicated the analyses reported in the original papers.120 The enhancements of the work done by Pope et al. (1995), however, also identified previously unrecognized sensitivity of analyses of the Pope data set to inclusion of SO2 in the model and to methods of controlling for spatial patterns.121 Furthermore, the reanalyses found that the associations were statistically significant only within the fraction of the study population that lacks education above the high school level.122 Thus, the findings of the extended analyses necessarily increased the uncertainty about the dimensions of the reported association between PM2.5 and mortality in that data set.
EPA also acknowledges that new studies using two additional data sets have examined possible associations between long-term PM2.5 exposure and premature mortality. 119 Id. at 2631/1. 120 Id. 121 Id. at 2631/1-2. Dr. Anne Smith notes, “[t]he . . . reduction in risk level and statistical significance [when these factors are controlled for in analyses of data set used by Pope et al.] is so dramatic that it calls into question the causal interpretation of any other . . . relative risk estimates [based on that data set].” Smith at 11. Insufficient data were available from the Dockery et al., 1993 paper to conduct similar extended analysis. 122 71 Fed. Reg. at 2631/2; see also Smith at 12-13 (noting this pattern has been found by studies of both the data set used by Pope et al. and that used by Dockery et al.). As the Agency recognizes, studies of a cohort of Seventh Day Adventists in Southern California reported positive associations between PM2.5 and mortality only in males, and even those associations were not generally statistically significant.123 In addition, a study of hypertensive male veterans reported “inconsistent and largely nonsignificant associations” between PM exposure and mortality.124 Although the Agency has chosen to “place greatest weight” on the reanalyses and extensions of the Pope et al. (1995) and Dockery et al. (1993) studies,125 the fact that these other recent studies of different populations find no consistent association between long-term ambient PM2.5 levels and mortality increases the uncertainty about the association between ambient PM2.5 and that health endpoint.126 In addition, the Agency discusses a new study of morbidity effects in children in Southern California. As EPA notes, the findings of this study were also mixed.127 Statistically significant positive associations between long-term PM2.5 exposure and decreases in lung function growth were reported in one studied cohort, but generally not in others.
Thus, the findings of this study are not robust and do not call into question the Agency’s 1997 conclusion that the level of the present NAAQS provides the requisite protection of public health. Finally, the Agency has solicited comment on “relevant new published research.”128 123 71 Fed. Reg. at 2632/1. 124 Id. 125 Id. at 2632/1-2. 126 See Smith at 12. 127 71 Fed. Reg. at 2632/3. 128 Id. at 2645/2. The most recent research on possible health effects associated with long-term exposure to PM2.5 continues to have mixed results. For example, Enstrom (2005) reports finding no support for an association between PM2.5 and mortality in elderly Californians.129 Jerrett et al. (2005) find no statistically significant association between PM2.5 and mortality when individual-level variables and all ecological variables that both (1) had associations with mortality and (2) reduced the estimated risk of mortality associated with PM2.5 were controlled.130 Lipfert et al. (2006) report no statistically significant association with PM2.5 in analyses that controlled for traffic density in the county in which the subject resided, an indicator of congestion and a variety of sociological variables.131 Thus, the most recent research continues to increase uncertainty about possible health risks associated with long-term exposure to PM2.5 and provides no reason to question the Agency’s 1997 determination that the present NAAQS are set at a level that provides the requisite health protection. 2. Health Risk from Short-term Exposures to PM2.5 When the Annual Standard Is Met As a basis for recommending an annual PM2.5 standard in the range of 13 µg/m3 to 14 µg/m3, CASAC identified three studies (Burnett and Goldberg (2003), Mar et al. (2003) and Lipsett et al. (1997)) that reported associations between daily PM2.5 levels 129 J.E. Enstrom, Fine Particulate Air Pollution and Total Mortality Among Elderly Californians, 1973-2002, Inhalation Tox.
17:803-816 (2005) (Table 10) (RR 1.00 (0.98-1.02) for mortality between 1983 and 2002 using PM2.5 data from 1979-1983). 130 M. Jerrett, et al., Spatial Analysis of Air Pollution and Mortality in Los Angeles, Epidemiology 16:1-10 (2005) (Table 1) (RR of 1.11 (0.99-1.25) for “all cause” mortality). 131 F.W. Lipfert, et al., Traffic Density as a Surrogate Measure of Environmental Exposures in Studies of Air Pollution Health Effects: Long-term Mortality in a Cohort of US Veterans, Atmospheric Environment 40:154-169 (2006) (Table 4) (RR of 1.033 (0.923, 1.081)). and various health endpoints in cities where the long-term average PM2.5 level was below 15 µg/m3.132 The EPA staff similarly pointed to a study by Fairley (2003) in an area where the annual standard is attained as a possible justification for a more stringent annual primary NAAQS.133 None of these studies, however, shows a consistent association between daily ambient PM2.5 levels and health endpoints. Lipsett et al. (1997) did not examine PM2.5 data.134 Rather, they investigated possible associations between PM10 data and emergency room visits for asthma. Such an investigation is of little value in understanding PM2.5 effects at specific air quality concentrations. Burnett and Goldberg (2003) found that their results “were sensitive to the method of statistical analysis” and concluded that “strategies need to be developed for selecting the appropriate amount of smoothing of time in the conduct of time-series studies.”135 Moreover, it is unclear how to relate the results of this study – which examined daily PM2.5 concentrations in eight cities in a single analysis – to annual average PM2.5 levels in any single city. The Mar et al. (2003) paper examining effects in Phoenix, Arizona, reported the results of ten analyses using daily PM2.5 air data, of 132 Letter from Rogene Henderson, Chair, CASAC, to Administrator Stephen L. Johnson (March 21, 2006), at 3-4.
133 See EPA, Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information (Dec. 2005) at 5-32. 134 M. Lipsett, et al., Air Pollution and Emergency Room Visits for Asthma in Santa Clara County, California, 105 Environ. Health Perspect. 216 (1997). 135 R.T. Burnett & M.S. Goldberg, Size Fractionated Particulate Mass and Daily Mortality in Eight Canadian Cities, in Health Effects Institute, Revised Analyses of Time-Series Studies of Air Pollution and Health (May 2003) at 85, 88. which only three reported statistically significant associations with health endpoints.136 Furthermore, two other papers (Clyde et al. and Smith et al.) reported no statistically significant associations between daily PM2.5 and health effects in Phoenix.137 In short, these studies do not provide a sound basis to question EPA’s 1997 determination that the level of the present annual standard provides the requisite protection of public health. D. No Change to the Form of the Annual NAAQS Is Warranted. Although EPA has appropriately proposed to retain the 15 µg/m3 level for the annual NAAQS, the Agency has proposed to change the form of that standard (i.e., how attainment of that standard is to be demonstrated) in a manner that makes that standard more stringent. Specifically, EPA has proposed to increase from 0.6 to 0.9 the required correlation coefficient between properly sited monitor pairs that may be spatially averaged and to require that this correlation coefficient be attained on a seasonal basis.138 This change to the form of the annual standard would make demonstration of attainment through spatial averaging more difficult,139 effectively increasing the 136 T.F. Mar, Air Pollution and Cardiovascular Mortality in Phoenix 1995-1997, in Health Effects Institute, Revised Analyses of Time-Series Studies of Air Pollution and Health (May 2003) at 178. 137 M.A.
Clyde, et al., Effects of ambient fine and coarse particles on mortality in Phoenix, Arizona (University of Washington, National Research Center for Statistics and the Environment; NRCSE technical report series, NRCSE-TRS no. 040) (available at http://www.nrcse.washington.edu/pdf/trs40_pm.pdf); R.L. Smith, et al., Threshold Dependence of Mortality Effects for Fine and Coarse Particles in Phoenix, Arizona, 50 J. Air Waste Manage. Assoc. 1367 (2000). 138 71 Fed. Reg. at 2647/3. 139 See Memorandum from M. Schmidt, et al., to PM NAAQS Review Docket, Output A.7 at 2, 13 (June 2005). stringency of the standard. The Agency has not demonstrated, however, that its 1997 determination that the present NAAQS provide the requisite level of health protection is no longer valid. Thus, the proposed change to the form of the annual standard is unwarranted. EPA adopted spatial averaging for the annual standard in 1997 because it was “most directly related to the epidemiological studies used as the basis for [the annual PM2.5] NAAQS.”140 Specifically, the studies examined either concentrations at a single centrally-located monitor or averaged concentrations from multiple monitoring sites to examine associations with health endpoints.141 That approach is still used for the vast majority of the newer studies. In fact, the studies do not generally impose any restrictions concerning the degree of correlation that is required before an averaged value is used or even report the degree of correlation between the monitors whose values are being averaged. Thus, even the requirement in 1997 for a 0.6 correlation coefficient between monitors being averaged is more stringent than is justified by the approach used in the underlying health studies. The Agency gives two arguments for changing the form of the standard.
140 62 Fed. Reg. at 38,671/2. 141 Id. First, it asserts: [E]stimated risks associated with long-term exposures remaining upon just meeting the current annual standard are greater when spatial averaging is used than when the highest monitor is used [or, presumably, when more stringent criteria must be met before spatial averaging can be used] (i.e., the estimated reductions in risk associated with just attaining the current . . . annual standards are less when spatial averaging is used).142 But, as discussed above, the present annual average standard, with the present provision for spatial averaging, has been determined to ensure the level of air quality that is requisite to protect public health. Changing the spatial averaging provisions of the annual standard with the result that the standard becomes more stringent would produce a NAAQS that is lower than necessary, and therefore not requisite.143 The Agency also expresses concern that the present form of the annual standard may have a “potential disproportionate impact on potentially vulnerable subpopulations.”144 Presumably, however, any such impact would have been captured by the studies that used spatial averaging. Indeed, if the spatial averaging used in these studies did not reasonably represent the exposure of the population in the area, including sensitive individuals, the studies themselves would have limited value. For example, if all of the associations reported in the epidemiological studies were the result of disproportionately high exposures of sensitive individuals, then the PM levels associated with health risk are higher (by some unknown and likely variable amount) than those reported by the investigators and estimates of risks in the general population are unwarranted. In short, the hypothetical concern that sensitive individuals may experience disproportionately high PM exposures does not justify revisiting the 142 71 Fed. Reg. at 2647/1. 143 Whitman, 531 U.S. at 475-76. 144 71 Fed. Reg.
at 2647/3. Agency’s 1997 conclusion that the present annual level NAAQS provides the requisite protection of public health. V. Revision of the Secondary PM2.5 NAAQS Is Not Appropriate at this Time. Similar to its obligations when reviewing the primary NAAQS, the Agency has an obligation to justify any change from its 1997 determination, upheld by the D.C. Circuit, that the level of the present PM2.5 NAAQS provides the requisite protection of public welfare. Because the record does not establish that the risks to public welfare from ambient PM2.5 are greater, different in character, or more certain than was understood when the present standards were established, the Agency lacks a basis for revising its conclusion that those standards provide the requisite protection of public welfare. EPA has proposed to revise the PM2.5 NAAQS to protect visibility “principally in urban areas,”145 a concern that EPA also considered in 1997.146 The Agency recognizes that “the fundamental characterization of the role of PM, especially fine particles, in visibility impairment” has not changed since the present secondary PM2.5 standards were found to provide the requisite level of visibility protection.147 The only new information that the proposal identifies concerns “visibility trends and current levels.”148 Although characterizing air quality (including visual air quality) is important in understanding present conditions, it does not call into question the Agency’s 1997 conclusion that the present standards provide the requisite level of visibility protection. 145 Id. at 2679/3. 146 62 Fed. Reg. at 38,681/1-2. 147 71 Fed. Reg. at 2678/3. 148 Id. at 2679/1. Other information on urban visibility discussed in the proposal is also insufficient to call that conclusion into question.
For example, the proposal notes that some states and localities have adopted visibility protection programs that are more protective than the current secondary NAAQS.149 The CAA, however, contemplates that, to the extent authorized by state law, states and localities may adopt standards more stringent than the federal ones.150 Thus, the existence of such standards does not imply that the present NAAQS are less stringent than necessary on a national scale. Indeed, in 1997, the Agency noted that in cases where residents of an area “place a high value on unique scenic resources in or near” the area, a local visibility standard “would be more appropriate . . . because of the localized and unique characteristics of the problems involved.”151 In fact, “Congress did not intend the secondary NAAQS to eliminate all adverse visibility effects.”152 EPA also cites studies (most of which have seemingly not been formally peer-reviewed) involving photographic representations (simulated and actual) of visibility when the present 65 µg/m3 daily secondary NAAQS is met to conclude that at that level of air quality, scenic views in urban areas “are significantly obscured from view.”153 It is questionable whether these images reflect typical urban vistas and, thus, whether they provide meaningful insight into the general acceptability of urban visibility. 149 Id. at 2677/3-2678/1. 150 CAA § 116. 151 62 Fed. Reg. at 38,682/3 to 38,683/1. 152 ATA II, 283 F.3d at 375, quoting ATA I, 175 F.3d at 1056. 153 71 Fed. Reg. at 2679/3. It is clear, however, that for most areas, daily PM2.5 levels of 65 µg/m3 will not occur if the annual 15 µg/m3 secondary NAAQS is met. Because these studies ignore the protection provided by the present annual secondary standard, they cannot call into question the Agency’s conclusion in 1997 that the present secondary NAAQS provide the requisite protection of public welfare.
Although no basis exists for the Agency to change its 1997 determination that the level of the present secondary NAAQS provides the requisite protection of public welfare, that does not mean that visibility impairment is not a legitimate concern under the CAA. Visibility improvement in both urban and rural areas is expected as a result of implementation of the present secondary NAAQS and EPA’s regional haze program.154 Indeed, both EPA and the D.C. Circuit cited the regional haze program in rejecting a claim that the present secondary NAAQS are not sufficiently stringent to protect visibility.155 Additional visibility improvement is expected from a wide range of other existing CAA programs as well, including motor vehicle and fuel standards under Title II, the acid rain program under Title IV, the NOx SIP Call program, and the Clean Air Interstate Rule. 154 40 C.F.R. §§ 51.300-309. 155 See ATA I, 175 F.3d at 1056-57; 62 Fed. Reg. at 38,681/3. VI. EPA Lacks a Scientific Basis for Its Proposed Urban Coarse Particulate Matter NAAQS. In addition to proposing to make the PM2.5 NAAQS more stringent, EPA proposes to add a daily NAAQS of 70 µg/m3 for urban coarse PM, measured as PM10-2.5.156 EPA purports to base this standard on “several epidemiologic studies that report statistically significant associations between short-term (24-hour) exposure to PM10-2.5 and various morbidity effects and mortality.”157 In an interesting innovation, the Agency proposes to define the PM10-2.5 indicator to exclude those components of PM falling in the 2.5-to-10 micron size range that the Agency concludes have not been shown to be associated with mortality or morbidity effects.158 Although it is encouraging that EPA is seeking to refine the PM NAAQS to focus only on toxic components in the complex mixture that constitutes ambient PM, the available health effects evidence simply does not provide a scientific basis for any NAAQS for PM10-2.5.
As the proposal itself illustrates, the vast majority of epidemiological studies using a PM10-2.5 indicator found no statistically significant association with either mortality or morbidity.159 Moreover, none of the few reported statistically significant associations provide a sound basis for concluding that ambient PM10-2.5 is associated with the health endpoints that were studied. As explained in the proposal itself, it is especially important in light of the limited evidence to consider whether the PM10-2.5 associations remain statistically significant if co-pollutants are considered in the analysis.160

156 71 Fed. Reg. at 2674/1.
157 Id. at 2668/3.
158 See id. at 2667/1-3.
159 See id. at 2656, Figure 2 (only two of twenty analyses found a statistically significant association of PM10-2.5 with mortality and only four of twelve analyses of various morbidity endpoints found a statistically significant association with PM10-2.5).
160 Id. at 2672/1.

Three of the five associations identified in the proposal as statistically significant for PM10-2.5 – two from Burnett et al. (1997) and one from Ito (2003) – were sensitive to inclusion of other pollutants.161 A fourth statistically significant association with PM10-2.5 was reported by Mar et al. (2003). That study did not model co-pollutants in conjunction with PM10-2.5, although it reported statistically significant associations with other pollutants, including PM2.5, carbon monoxide and nitrogen dioxide. Thus, it is impossible to determine whether the PM10-2.5 finding would remain statistically significant in a multi-pollutant model. Another of the associations reported as statistically significant came from Ostro et al. (2003), which estimated PM10-2.5 levels from PM10 data. Given this approach to PM10-2.5 measurements, it is difficult to determine whether there was actually a statistically significant association with PM10-2.5.
162 Finally, Schwartz & Neas (2000) reported mixed findings with regard to respiratory symptoms.163 PM10-2.5 was significantly associated only with cough and only in the absence of any other symptoms. The paper itself concludes that ambient PM2.5 is more important to health than is ambient PM10-2.5.

In short, the evidence of statistically significant associations between PM10-2.5 and either mortality or morbidity is extremely limited and subject to serious questions and uncertainties. The majority of studies show no such association, and the few statistically significant associations that have been reported have not been shown to remain so when co-pollutants are also considered. Thus, the record provides no adequate basis on which to set a NAAQS for PM10-2.5.164

161 Id. at 2672/1.
162 See id. at 2672/2; see also CD at 9-30 (pointing out the difficulty of evaluating the potential association of health effects with PM10-2.5 from PM10 data).
163 J. Schwartz & L.M. Neas, Fine Particles Are More Strongly Associated than Coarse Particles with Acute Respiratory Health Effects in Schoolchildren, Epidemiology 11: 6-10 (Jan. 2000).

VII. EPA Should Apply the Generally Accepted Practice of Blank Correction to FRM Measurements of PM2.5

Standard laboratory practice involves collecting field blanks to determine how contamination and artifacts may affect measurement. These blanks are “filters which are handled in the field as much as possible like actual filters except that ambient air is not pumped through them.”165 Although good laboratory practice calls for correcting sample measurements for such contamination, EPA does not accept blank correction of PM2.5 measurements with an FRM. The consequence is that measurements of ambient PM2.5 are too high. This, in turn, may lead to some instances of an area being incorrectly determined not to attain the NAAQS. The attached report by Eric S. Edgerton of Atmospheric Research & Analysis, Inc.
provides evidence of the extent of contamination on field blanks from ambient PM2.5 monitoring networks.166 Mr. Edgerton summarized field blank mass data from the Southeastern Aerosol Research and Characterization (SEARCH) network and from various state and local agency networks in the states of Alabama, Florida, and Mississippi. He found that the PM2.5 contamination of filter blanks varied by about a factor of two among the networks he considered even when the same FRM and sampling protocols were used.167 The average mass of that contamination was as much as 0.45 µg/m3, or 3.0% of the 15 µg/m3 annual PM2.5 NAAQS.168 Although this is a small quantity, it “can be significant for sites near, or slightly above, the NAAQS.”169

164 See American Petroleum Institute v. Costle, 665 F.2d 1130, 1187 (D.C. Cir. 1981) (The Administrator may not engage in “sheer guesswork” when setting NAAQS).
165 Revisions to Ambient Air Monitoring Regulations; Proposed Rule, 71 Fed. Reg. 2710, 2749/1 (Jan. 17, 2006).
166 E.S. Edgerton, Comment on Proposal to Require Submission of Data on PM2.5 Field Blank Mass in Addition to PM2.5 Filter-Based Measurements (“Edgerton”) (Attachment 3). Mr. Edgerton serves as a member of CASAC’s Ambient Air Monitoring and Methods Subcommittee.

EPA has now proposed to require the submission of data from PM2.5 field blanks.170 The Agency acknowledges that data from field blanks “will help EPA and other researchers better understand the relationship between the mass of PM that is sampled and weighed on a regular PM filter and the PM that is actually present in the ambient air.”171 Indeed, that is the purpose of collecting data with field blanks. EPA’s proposal is a step in the right direction. The Agency should, however, go further and require – or at least permit – correction of ambient air measurements for the contamination found on field blanks.
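The arithmetic behind the blank-correction point can be illustrated with a minimal sketch. The 0.45 µg/m3 blank value and the 15 µg/m3 annual standard come from the text above; the site mean of 15.3 µg/m3, the function names, and the simple greater-than comparison are hypothetical and chosen only for illustration. The sketch does not reproduce EPA's actual data-handling or rounding conventions.

```python
# Illustrative sketch only: simple subtraction of an average field blank
# from a filter-based PM2.5 concentration. Not EPA's data-handling procedure.

ANNUAL_NAAQS = 15.0   # ug/m3, annual PM2.5 standard cited in the text
MAX_AVG_BLANK = 0.45  # ug/m3, largest network-average blank cited from Edgerton


def blank_share_of_standard(blank, standard=ANNUAL_NAAQS):
    """Return the blank mass as a percentage of the standard."""
    return 100.0 * blank / standard


def blank_corrected(measured, blank):
    """Subtract the network-average field blank from a measured value."""
    return measured - blank


def exceeds(value, standard=ANNUAL_NAAQS):
    """Simplified comparison; real attainment tests apply rounding rules."""
    return value > standard


# Hypothetical site whose uncorrected annual mean sits slightly above the standard.
hypothetical_site_mean = 15.3
corrected_mean = blank_corrected(hypothetical_site_mean, MAX_AVG_BLANK)

print(f"Blank as share of standard: {blank_share_of_standard(MAX_AVG_BLANK):.1f}%")
print(f"Uncorrected mean exceeds standard: {exceeds(hypothetical_site_mean)}")
print(f"Corrected mean ({corrected_mean:.2f}) exceeds standard: {exceeds(corrected_mean)}")
```

Under these assumed numbers, a blank of this size is enough to move a site from just above the standard to just below it, which is the situation Mr. Edgerton describes for sites near, or slightly above, the NAAQS.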
This requirement should be applicable, on a network-by-network basis, to data from chemical speciation monitor networks and from PM10-2.5 networks,172 as well as to PM2.5 networks.

167 Id. at 2.
168 Id.
169 Id. at 2-3.
170 71 Fed. Reg. at 2749/1.
171 Id.
172 As Mr. Edgerton notes, the effect of blank correction can be particularly significant for organic carbon measurements, for which the mass on the blank filter may be as much as 18 percent of that reported by the ambient monitor. Edgerton at 3.

VIII. Conclusion

In summary, EPA should not adopt the proposed revisions to the PM2.5 NAAQS. The current evidence and related assessments of risk provide no basis for the Agency to reconsider its determination in 1997 that the levels of the current NAAQS are neither more nor less stringent than necessary to protect the public health and welfare. The Agency also should not promulgate new NAAQS for urban particulate matter measured as PM10-2.5. Scientific evidence to support any such standard is lacking. On the other hand, the Agency should amend its monitoring regulations to provide for correction of ambient PM2.5 measurements for mass that is collected on a field blank. Such blank correction will provide for more accurate measurements of PM2.5 concentrations in the ambient air.

Attachment 1

Utility Air Regulatory Group

Appalachian Power Company
Carolina Power & Light Company
Central Illinois Public Service Company
Central Power and Light Company
CINergy Corp.
The Cincinnati Gas & Electric Company
PSI Energy
Columbus Southern Power Company
Constellation Power Source Generation, Inc.
Consumers Energy Company
Dayton Power and Light Company, The
DTE Energy
Dominion Energy
Dominion Generation
Dominion Virginia Power
Dominion North Carolina Power
Duke Energy Corporation
Dynegy Marketing and Trade
E.ON U.S. LLC
LG&E Energy Corp.
Kentucky Utilities Company
Louisville Gas & Electric Company
Western Kentucky Energy
FirstEnergy Corp.
Florida Power Corporation
Indiana Michigan Power Company
Kansas City Power & Light Company
Kentucky Power Company
Los Angeles Department of Water & Power
Madison Gas and Electric Company
Minnesota Power/ALLETE
Mirant Corporation
Monongahela Power Company, dba Allegheny Power
NiSource, Inc.
Oglethorpe Power Corporation
Ohio Power Company
Ohio Valley Electric Corporation
Otter Tail Power Company
PacifiCorp Electric Operations
Potomac Edison Company, The dba Allegheny Power
Public Service Company of New Mexico
Public Service Company of Oklahoma
Salt River Project
South Carolina Electric & Gas Company
Southern Company
Alabama Power Company
Georgia Power Company
Gulf Power Company
Mississippi Power Company
Savannah Electric and Power Company
Southwestern Electric Power Company
Texas Utilities
Tucson Electric Power Company
Union Electric Company
West Penn Power Company, dba Allegheny Power
West Texas Utilities Company
We Energies

and

Edison Electric Institute
National Rural Electric Cooperative Association
American Public Power Association
National Mining Association

Attachment 2

Technical Comments on the Proposed Rule for National Ambient Air Quality Standards for Particulate Matter

Prepared on behalf of the Utility Air Regulatory Group

Anne E. Smith, Ph.D.
CRA International
April 17, 2006

I. Introduction and Summary of Key Points

EPA published its Proposed Rule (“PR”) for the National Ambient Air Quality Standards for Particulate Matter on January 17, 2006.1 This document contains my comments on the technical basis in the PR for the levels that EPA proposes for the PM2.5 daily standard (i.e., 35 μg/m3 98th percentile 24-hour average PM2.5) and for the PM2.5 annual standard (i.e., 15 μg/m3 annual average PM2.5).

EPA uses both a risk-based approach and an evidence-based approach to determine whether to tighten the PM2.5 standards. It uses an evidence-based approach to support the specific proposed levels for the PM2.5 standards.
When one considers a more complete summary of the available epidemiological evidence than EPA offers in the PR, neither the risk analysis nor the evidence-based approach provides support for tightening either PM2.5 standard. Furthermore, even if one were to accept the view that the daily standard must be tightened below current levels, a more complete summary of the available evidence than provided in the PR reveals that its evidence-based approach provides no technically-based guidance for where to set a standard.

The reasons that the quantitative risk analysis does not support a tightening of the PM2.5 standards are: (1) the risk estimates are generally lower than they were in 1997, when the standards were first set; and (2) the overall robustness of statistical significance associated with the quantitative estimates has fallen dramatically since 1997, driven mainly by greater evidence and acknowledgment of the importance of modeling uncertainties. This eroding confidence directly reflects an eroding basis in the epidemiological evidence itself; therefore, one cannot argue that overall evidence supports a tightened standard even if the quantitative risk estimates do not. Additionally, the quantitative risk analysis that is provided is biased upwards because it does not incorporate the larger body of quantitative findings regarding model uncertainty. The majority of the evidence that EPA’s risk analysis ignores would produce lower risk estimates, and imply greater likelihood that there is no PM2.5 effect at all. I substantiate these statements in Part II of this document.

1 Federal Register, Vol. 71 (10), pp. 2620-2708.

EPA uses an evidence-based approach to attempt to identify a justifiable level below 65 μg/m3 for setting a daily standard.
EPA’s application of this approach, however, is incomplete in that it relies on only a subset of the actual available literature, and it does not provide evaluation of the quality of evidence in each study that it does cite. When these gaps are filled, and the results are presented in a more structured manner, it becomes clear that EPA’s line of reasoning for setting the standard at 35 μg/m3 is not supportable. Part III of this document provides the detailed analysis demonstrating these points.

Finally, the PR argues that the recent literature has found that estimated associations between PM2.5 and health endpoints are generally robust to inclusion of gaseous pollutants in the epidemiological model. This statement is a linchpin in the PR’s case that a causal role for PM2.5 has been more strongly established since 1997. Unfortunately, it is also an incorrect conclusion based on only a subset of the full body of PM2.5 epidemiological literature. In Part IV, I review and summarize the evidence on the impact of considering gaseous pollutants simultaneously with PM2.5 in epidemiological modeling. My review reveals that statistical significance of PM2.5-health associations is not robust when one or more gaseous pollutants are included. My review also finds that this sensitivity does not appear to be caused by problems of multicollinearity. Thus, EPA should eliminate this conclusion, and revise its analysis concerning the public health risk from PM2.5 without reference to such a conclusion.

II. Quantitative Risk Analysis Does Not Support Tightening the Standards

EPA has decided that its quantitative risk analysis cannot be used to decide where to set a standard.
2 This is a reasonable conclusion because, in the absence of any specific knowledge of where an effects threshold may lie, the risk analysis will always produce linearly declining amounts of risk for each incremental reduction in the level of the standard until background concentrations are attained. However, the risk analysis also fails to provide a case for tightening the current standards, both in its explicit results and because EPA’s risk analysis incompletely represents the relevant quantitative evidence. The PR does not fully recognize these two points, and thus makes a technically unsound case that the risk analysis does provide support for tightening the daily standard. Although EPA has not proposed to tighten the annual standard, it is also the case that current evidence would not support such a reduction.

Quantitative Risk Estimates Are Lower Now Than When the Current Standards Were Set, and Confidence in the Risk Associations Is Also Reduced

A key insight from EPA’s risk analysis is that current quantitative estimates of risk levels that remain at exact attainment of the current PM2.5 standards are lower now than they were when those standards were set.3 This result occurs even under EPA’s “base case” assumptions, which overstate the current evidence on risk levels. Risk estimates would be lower still under most of the alternative sets of assumptions that EPA does not use in its “base case” calculations. If risk levels are estimated to be lower than they were at the time that standards were originally set then, ceteris paribus, one cannot make a quantitative case for tightening the PM2.5 daily standard.
4 This outcome of the risk analysis is mentioned briefly in the PR,5 but the extent of the change in point estimates of risk since 1997 is understated:

“With respect to short-term exposure mortality and morbidity … [c]omparing the risk estimates for the only two specific locations that were included in both the prior and current assessments, the magnitude of the estimates associated with just meeting the current annual standard, in terms of percentage of total incidence, is similar in one of the locations (Philadelphia) and the current estimate is lower in the other location (Los Angeles). … With respect to long-term exposure mortality risk estimates, the estimates in terms of percentage of total incidence are very similar for the two specific locations included in both the prior and current assessments.”6

2 PR, p. 2648.
3 These lower risk estimates are due to lower epidemiological estimates of the relative risk of PM2.5, and not to reduced air pollution since 1997. (The air quality used to make these risk estimates is effectively the same now as in 1997 for the simulation of exact attainment of the current standards on which they are based.)
4 As EPA notes, even if risk estimates are lower, a case might still be made to tighten the standards if the confidence in the risk estimates is increased. As I will explain next, the technical evidence now available does not engender this greater confidence.
5 It was not mentioned at all in the Staff Paper (EPA (2005)), or in the risk analysis (Post et al. (2005)).
6 PR, p. 2640.

EPA did not report these comparative results of the risk analysis in any of the documents leading up to the PR,7 and even the PR does not clearly explain the policy-relevant facts.
The following sections provide the facts behind the above quote, and reveal that the degree of reduction in quantitative risk estimates is more pronounced than a mere comparison of EPA’s base case risk estimates would imply. It is also possible to infer that reduced risks have occurred for all but one of the new cities that have been included in the current risk analysis.

Short-term Mortality Risk. In 1997, EPA performed a risk analysis for two cities, Los Angeles and Philadelphia. EPA based its estimates of short-term mortality risk for both cities on a statistically significant relative risk estimate from Schwartz, Dockery and Neas (1996), which I will refer to hereafter as “SDN”. Neither of the two cities had been represented in SDN, but this was the only study of short-term mortality risk associated with PM2.5 available at the time. Since then, part of the newly available body of evidence includes short-term mortality studies specific to these two cities: Moolgavkar (2003) for Los Angeles and Lipfert et al. (2000a) for Philadelphia. EPA has relied on these two new studies for its current risk analysis for these two cities.

Figure 1 (for Los Angeles) and Figure 2 (for Philadelphia) compare the 1997 short-term mortality risk estimate (and associated 95% statistical confidence ranges) to risk estimates supported by the two new studies. These are risk estimates for ambient PM2.5 levels that are just attaining the current standards. In both figures, the leftmost vertical bar (shown in red) is the short-term percent mortality incidence estimated in 1997, based on SDN (as noted on the graph under that bar). One can see that the percent incidence attributed to PM2.5 in 1997 was about 1.7% for Los Angeles and 1.5% for Philadelphia. All of the blue bars (i.e., all of the bars except for the leftmost one) are based on relative risk estimates from the newly available study for each city.
There are multiple blue bars because these new studies reported results of PM2.5 relative risk estimates using many different model formulations. The variations in models included use of GAM versus GLM methods of estimation, varying degrees of smoothing for time and weather, and use of single-pollutant (“1-P”) versus two-pollutant (“2-P”) formulations that included one of the gaseous co-pollutants. Of all the alternative estimates in each study, EPA’s base case risk estimates use only a single one, which is reflected in the leftmost blue bar, positioned to the left of a grey dashed divider line, next to the red bar reflecting the 1997 estimate. The two bars to the left of the dashed divider line thus provide a comparison of EPA’s base case risk estimates from the 1997 and current risk analyses. They imply a large decrease in Los Angeles (from 1.7% to 0.5%) and a modest increase in Philadelphia (from 1.5% to 2.2%).

7 It was, however, documented in my comments to EPA on the August 2003 draft Risk Analysis, and on the January 2005 second drafts of the Staff Paper and Risk Analysis (Smith (2003, 2005)). The exact levels of risk reported here have changed slightly since these drafts, and correspond to those in the final risk analysis and Staff Papers of June 2005.

Figure 1.
Comparison of Short-term Mortality Risk Estimates in 1997 and Now When Ambient PM2.5 is Just Attaining the Current PM2.5 Standards – Los Angeles

[Figure 1: bar chart. Vertical axis: % of total mortality incidence due to PM2.5 (0.0% to 2.5%). Red bar: 1997 EPA risk estimate, Schwartz, Dockery, Neas (1996), Non-acc 1-P. Blue bars: risk estimates from current evidence, Moolgavkar (2003), grouped as “Used in Staff Paper” and “Not Used in EPA’s Analysis of Alternative Standards”. Bar labels: Non-acc GAM 30df 1-P 0-day lag; Non-acc GAM 30df 1-P 1-day lag; Non-acc GAM 30df 2-P 1-day lag; Non-acc GLM 30df 1-P 1-day lag; Non-acc GAM 100df 1-P 1-day lag; Non-acc GLM 100df 1-P 1-day lag; Non-acc GAM 100df 2-P 1-day lag; Non-acc GLM 100df 2-P 1-day lag.]

Figure 2. Comparison of Short-term Mortality Risk Estimates in 1997 and Now When Ambient PM2.5 is Just Attaining the Current PM2.5 Standards – Philadelphia

[Figure 2: bar chart. Vertical axis: % of total mortality incidence due to PM2.5 (0.0% to 2.5%). Red bar: 1997 EPA risk estimate, Schwartz, Dockery, Neas (1996), Non-acc 1-P. Blue bars: risk estimates from current evidence, Lipfert et al. (2000a), grouped as “Used in Staff Paper” and “Not Used in EPA’s Risk Analysis or Staff Paper”, with bars annotated as “Stat. significant” or “Not stat. significant”. Bar labels: Cardio#2 1-Pollutant; Cardio#1 1-P; Cardio#1 2-P; Non-acc 1-Pollutant; Non-acc 2-P.]

The additional lines to the right of the dashed divider line reflect the full range of new information that has been excluded from EPA’s current base case risk analysis results. As can be seen, most of the alternative available risk estimates are much lower than the single one that EPA chose to use for its base case. Further, all of the alternative estimates using 2-P models are not only much lower in risk, but statistically insignificant. (In each of these cases, the gaseous second pollutant in the model was statistically significant.)
In both papers, the authors concluded that the 2-P results indicated that the culprit pollutant appeared to be a gaseous pollutant rather than PM2.5, and that use of any of their 1-P model results would overstate the case for a PM2.5-mortality association. Thus, a full comparison of the newly available risk information to that which was available in 1997 indicates a substantial reduction in estimated risk levels. Further, this review reveals that EPA’s current base case risk estimates substantially overstate the short-term PM2.5 mortality risks that the newly available studies imply, because they rely solely on 1-P model results that the authors themselves discredit.

Los Angeles and Philadelphia are given special focus in the comparison of risk analysis findings because those were the only two cities for which risk estimates were developed in the 1997 analysis. However, it is quite feasible to determine what the percent mortality incidence would have been for other cities in the current risk analysis if they had been included in a 1997 risk analysis. This is because risk estimates in all of those cities would have also been based on SDN, just as they were for Los Angeles and Philadelphia.

Even if current risk estimates for the other cities were still to be based on the database used by SDN, they would be lower now than in 1997 as a result of reanalyses performed when correcting the GAM statistical error in SDN. The newly available studies replacing SDN are Schwartz (2003a) and Klemm and Mason (2003). Both new studies found that the relative risks in the original SDN paper fell as linear methods of estimation (GLM) were used to replace the GAM method, and as temporal controls were enhanced. Indeed, Klemm and Mason reported that the relative risk estimates became statistically insignificant in their most highly controlled formulations.
Figure 3 shows how much risk estimates and confidence intervals for this dataset have declined as a result of the newly available results for the database used in SDN.8

St. Louis and Boston are two cities in the current risk analysis for which Schwartz (2003a) and Klemm and Mason (2003) are used, because these cities are among the “Six Cities” that are analyzed in these studies. Their individual city-specific estimates were affected in a similar manner to that depicted in Figure 3, although St. Louis was affected more, with all of the GLM findings in Klemm and Mason (2003) producing statistically insignificant associations.

8 The figure shows results for Los Angeles (again, showing percent mortality incidence when just attaining the current standards), using the combined cities estimates. Although absolute levels of incidence will vary slightly from city to city, the relative change in estimated risk levels would be the same in any city to which these study results might be applied.

Figure 3. Comparison of Risk Estimates Based on SDN in 1997 to Those Based on Reanalyses of SDN Available Since 1997. (The figure reflects the estimated mortality incidence just attaining the PM2.5 standards in Los Angeles, but the relative pattern in the risk estimates would be the same for any city for which risk estimates might be based on these studies. The estimates are based on the combined cities results, but the patterns are very similar for each of the six individual cities in the studies as well.)
[Figure 3: bar chart. Vertical axis: % of total mortality incidence due to PM2.5 (0.0% to 2.5%). Red bar: 1997 EPA risk estimate, Schwartz, Dockery, Neas (1996), Non-acc 1-P. Blue bars: risk estimates from current evidence, grouped as “Used in Staff Paper” and “Not Used in EPA’s Analysis of Alternative Standards”. Bar labels, all Non-acc 1-P: Converged GAM and GLM “thin-plate” (Schwartz (2003)); GLM “df match”, GLM 4 knots/year and GLM 12 knots/year (Klemm and Mason (2003)).]

The remaining cities for which short-term mortality risks are provided in the current risk analysis are Phoenix, Detroit, Pittsburgh, and San Jose. There is a newly available city-specific study for each of these four cities.9 The single base case relative risk estimate that EPA uses for three of these cities in its current risk analysis is lower than the relative risk estimate it would have used in 1997 (i.e., the SDN combined cities result). The sole exception is San Jose, based on a risk estimate from Fairley (2003). Importantly, even the base case risk estimate for two of those three cities is now statistically insignificant, whereas the estimate that would have been produced in 1997 based on SDN would have been statistically significant.

9 These are Mar et al. (2003), Ito (2003), Chock et al. (2000) and Fairley (2003), respectively.

Table 1 summarizes the overall state of newly available evidence used in the current risk analysis, as compared to the statistically significant estimate that was available for the 1997 risk analysis. For short-term PM2.5 mortality risk, EPA’s base case point estimates of risk are lower in six of the eight cities in the current risk analysis, and many of those base case estimates are themselves statistically insignificant. For the remaining two cities (Philadelphia and San Jose), other model results in the source studies would, however, produce lower risk estimates, if used.
Furthermore, none of the newly available studies for these eight cities finds a PM2.5-mortality association that is statistically significant under all of the formulations that are reported in those studies. This is reflected in the last column of Table 1, which shows that none of the cities’ risk estimates remain statistically significant under all the model outcomes that are reported for each city.

Table 1. Summary of Declining PM2.5 Risk Estimates and Reduced Confidence in Statistical Significance of Current Risk Estimates Compared to 1997.

                                  Did EPA’s risk       Is EPA’s risk     Is significance of
                                  estimate rise or     estimate          estimate robust to
                                  fall since           statistically     alternative model
                                  1997?(*)             significant?      choices?
Short-term risk   Philadelphia    Up                   Significant       Not robust
                  Los Angeles     Down                 Insignificant     Not robust
                  Phoenix         Down                 Significant       Not robust
                  St. Louis       Down                 Significant       Not robust
                  Boston          Down                 Significant       Not robust
                  Detroit         Down                 Insignificant     Not robust
                  Pittsburgh      Down                 Insignificant     Not robust
                  San Jose        Up                   Significant       Not robust
Long-term risk    All Cities      Down                 Significant       Not robust

(*) For the short-term risks, the 1997 estimate is assumed to be the SDN “combined” for all cities except for Boston and St. Louis, for which it is the SDN city-specific estimate. Current estimate is the risk coefficient used in Staff Paper to estimate short-term mortality risk reduction under alternative standards (e.g., Figure 5-2).

Thus, the evidence reflects a very strong case that short-term mortality risk estimates are lower now than they were in 1997. EPA’s statement on this point, quoted above, substantially understates this trend in risk estimates since the current standards were set. The quantitative risk analysis for short-term mortality from PM2.5 does not support a tightening of the PM2.5 standards.
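The counts cited from Table 1 can be checked with a short tally. The sketch below is illustrative only: the rows are transcribed from the short-term portion of Table 1 as read from the flattened source layout (an assumption about how the cells pair up), and the variable and function names are hypothetical.

```python
# Illustrative tally of the short-term rows of Table 1.
# Each row: (city, direction of EPA's base case estimate since 1997,
#            significance of EPA's base case estimate,
#            robustness of significance to alternative model choices)
SHORT_TERM_ROWS = [
    ("Philadelphia", "Up", "Significant", "Not robust"),
    ("Los Angeles", "Down", "Insignificant", "Not robust"),
    ("Phoenix", "Down", "Significant", "Not robust"),
    ("St. Louis", "Down", "Significant", "Not robust"),
    ("Boston", "Down", "Significant", "Not robust"),
    ("Detroit", "Down", "Insignificant", "Not robust"),
    ("Pittsburgh", "Down", "Insignificant", "Not robust"),
    ("San Jose", "Up", "Significant", "Not robust"),
]


def tally(rows):
    """Group city names by direction, significance, and robustness."""
    return {
        "down": [city for city, direction, _, _ in rows if direction == "Down"],
        "up": [city for city, direction, _, _ in rows if direction == "Up"],
        "insignificant": [city for city, _, sig, _ in rows if sig == "Insignificant"],
        "robust": [city for city, _, _, rob in rows if rob == "Robust"],
    }


summary = tally(SHORT_TERM_ROWS)
print(f"Lower than 1997: {len(summary['down'])} of {len(SHORT_TERM_ROWS)} cities")
print(f"Higher than 1997: {', '.join(summary['up'])}")
print(f"Base case statistically insignificant: {', '.join(summary['insignificant'])}")
print(f"Significance robust to model choice: {len(summary['robust'])} cities")
```

On this reading of the table, the tally matches the statements in the text: the base case estimate is lower in six of the eight cities, the two exceptions are Philadelphia and San Jose, and no city's estimate is robust to alternative model choices.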
Instead, the quantitative risk analysis reveals a consistent pattern of decreased risk even when using the “highest” and “most significant” estimates from each of the newly available epidemiological studies. Additionally, these same new studies provide strong evidence that the PM2.5-mortality associations are not robustly statistically significant (i.e., statistical significance is eliminated in the face of a variety of reasonable alternative statistical modeling methods and formulations).

Long-term Mortality Risk. Lower risk levels and eroding confidence in the underlying statistical evidence also apply to long-term mortality risk estimates, as Table 1 also shows. EPA’s statement, quoted above, that the long-term risk estimates are “very similar” to those in 1997 is not consistent with the facts. EPA’s base case long-term risk estimates rely only on evidence in a dataset that was available in 1997, the American Cancer Society (“ACS”) cohort. Although parts of the PR recognize that reanalyses of the ACS dataset have revealed important uncertainties in its statistical associations for PM2.5, EPA’s base case risk analysis ignores the major impact these sensitivities have on the level of risk and associated confidence levels. Additionally, even the sensitivity analyses in the risk analysis completely ignore one newly available study, Lipfert et al. (2000b), that reports PM2.5 health associations in a cohort that had not even been studied as of 1997. EPA chooses to give this study no weight in its risk analysis, yet it provides some important new evidence that should at least be considered in an evaluation of trends in size of effect and confidence levels that can be assigned to the long-term mortality associations.

Figure 4 provides a graphical summary of the past and newly available evidence on quantitative long-term mortality risks.
The percent risk incidence data in this figure also pertain to PM2.5 levels at exact attainment of the current standards, as for the short-term mortality figures above. Risk estimates in Figure 4 are based on Los Angeles, but the patterns would be identical for any city in the U.S. because long-term mortality risk estimates are based on a single relative risk estimate applicable to all cities.10

As with the previous figures for short-term mortality risk, the leftmost vertical bar (in red) reflects the level of risk that was estimated for just-attaining the current PM2.5 standards in the 1997 risk analysis: about 1.5% of total mortality incidence. As noted below that bar, the 1997 estimate was based on the relative risk in Pope et al. (1995), which studied the ACS cohort. The first blue bar, coupled in the left segment of the figure with the 1997 estimate, reflects the current risk analysis’s base case estimate of mortality risk: about 1.3%. As noted on the figure, this base case risk estimate comes from Pope et al. (2002), which also used the ACS cohort, but with data extended since the 1995 study. (It is, specifically, the relative risk based on “averaged PM” concentrations from that paper.) These two risk analysis estimates reveal a slight decline in long-term risk estimates since 1997.

However, the remainder of the estimates presented in Figure 4 show the broader body of evidence. The broader body of evidence supports a conclusion that long-term risk estimates are substantially lower today than in 1997. It also shows that much less confidence can be assigned to long-term PM2.5-mortality associations now than in 1997. The basis for these conclusions is discussed in detail below.

10 This is because long-term risk studies are performed in a cross-sectional manner, by comparing mortality risks to pollution levels across cities.
The resulting estimate is a single relative risk estimate across the cities in the study, rather than a different relative risk estimate for each city, as one obtains in the timeseries type of study that produces a short-term mortality risk estimate. 9 CRA International Figure 4. Summary of Long-Term Risk Estimates Used in 1997 and Current Risk Analyses, and Comparison with Alternative Newly Available Estimates (red) 1997 EPA risk estimate (blue) Risk estimates from current evidence Used in Staff Paper Mentioned in Staff Paper Not Used in EPA’s Analysis of Alternative Standards 2.5% 2.0% % of total mortality 1.5% incidence due to 1.0% PM2.5 CO NO2 O3 SO2 0.5% 0.0% Pope (1995) Pope (2002) “Averaged PM” Krewski (2000) 1-P Krewski (2000) Sensitivities 2P Krewski (2000) Spatial autocorr fixed & best controls Pope (2002) “79-83 PM” Pope (2002) “’79-’83 PM” HighestP Sp.Smooth Lipfert (2000b) All results w/ ecol. Controls (-3.5% result off chart & NOT VISIBLE) Current long-term risk incidences are based on an LML value adjusted for comparability to 1997 estimates; All bars labeled “Krewski” use ACS-based risk estimates in Krewski et al. (2000); All examples are “nonaccidental” mortality regressions. These results are based on Los Angeles, but the pattern is the same for Philadelphia, and all other cities. The risk estimates shown in the middle segment of Figure 4 (between the two grey dashed divider lines) show the implications of including controls for of gaseous copollutants when analyzing the PM2.5-mortality associations in the ACS cohort data (i.e., “2-P” model results). This 2-P sensitivity analysis was performed in the Krewski et al. (2000) reanalyses of Pope et al. (1995). The first bar in this segment is the 1-P result in the reanalysis, and the next set of 4 bars reflects the comparable PM2.5 risk estimate from 2-P formulations, which included CO, NO2, O3, or SO2, respectively. 
These sensitivity analyses, which are mentioned in the PR, reveal that the PM2.5 association is not robust when SO2 is included in the regression. 11 The size of the PM2.5 risk estimate falls dramatically relative to any of the 1-P risk estimates, and it becomes statistically insignificant. Remarkably, despite this important finding in the reanalyses of 2000, Pope et al. (2002) still did not report any 2-P results using SO2. This is especially remarkable because that paper does report that SO2 has a statistically significant association with long-term mortality in its own 1-P regression, yet the authors simply do not present any results where PM2.5 and SO2 had been considered simultaneously. Even EPA notes this as an unusual omission, “[b]ecause the correlation coefficient between PM2.5 and SO2 was 0.50 in the ACS data, in this view it is plausible to believe that the independent effects of the two pollutants could be disentangled with additional study.” 12 Thus, the evidence of this non-robustness is observable only in the older study of Krewski et al. (2000), which EPA chooses to ignore in its base case risk analysis in favor of the more recent, but less comprehensive, results in Pope et al. (2002).

11 The PR notes at p. 2631 that inclusion of SO2 in the analysis “decreased the size of the effect estimates for PM2.5 to one-sixth of its original value and for sulfates to less than one-third of its original value.” The PR notes this sensitivity again at p. 2634 and p. 2652.

Despite this dramatic sensitivity to SO2, EPA’s base case risk estimate relies on the 1-P formulation from Pope et al. (2002) and, elsewhere in the PR, EPA contends that risk estimates are “generally robust” to inclusion of gaseous pollutants. 13 The result is that both the risk analysis and the PR overstate the evidence in favor of a long-term mortality association for PM2.5.
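The mechanism behind the 1-P versus 2-P contrast discussed above can be illustrated with a small simulation. The sketch below uses purely synthetic data and is not a reanalysis of the ACS cohort. The only value taken from the text is the 0.50 correlation between PM2.5 and SO2. The outcome model (an effect of 0.3 driven by SO2 alone), the sample size, and the helper name `ols` are hypothetical choices made for illustration. Under those assumptions, a single-pollutant regression attributes part of the SO2 effect to PM2.5, while the two-pollutant regression returns a PM2.5 coefficient near zero.

```python
import random


def ols(columns, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b1, b2, ...] for the regressors in `columns`.
    """
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[c])]
    return [aug[a][k] / aug[a][a] for a in range(k)]


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 5000                 # hypothetical sample size
rho = 0.50               # PM2.5-SO2 correlation quoted in the text

# Standardized synthetic "pollutant" series with correlation rho
so2 = [rng.gauss(0, 1) for _ in range(n)]
pm25 = [rho * s + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for s in so2]

# Hypothetical assumption: the outcome responds to SO2 only, not to PM2.5
y = [0.3 * s + rng.gauss(0, 1) for s in so2]

b_1p = ols([pm25], y)[1]        # 1-P model: PM2.5 alone
b_2p = ols([pm25, so2], y)[1]   # 2-P model: PM2.5 with SO2 included

print(f"PM2.5 coefficient, 1-P model: {b_1p:.3f}")
print(f"PM2.5 coefficient, 2-P model: {b_2p:.3f}")
```

The simulation shows only that a 1-P association can arise from a correlated co-pollutant. It does not establish which pollutant, if either, is responsible in the actual cohort data.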
The rightmost segment of Figure 4 provides several additional risk estimates that are never mentioned in EPA’s quantitative risk analysis, even as sensitivity analyses. Aspects of these other results are only briefly noted in the PR’s discussion of the overall evidence. When they are included in this figure, enabling a direct quantitative comparison to the assumptions used in the quantitative risk analysis, their implications for an eroding level of confidence (and yet lower risk estimates) become far clearer than one can ascertain from the text of the PR.

The first vertical bar in the rightmost segment of Figure 4 reflects the PM2.5 risk estimates in the Krewski et al. (2000) reanalyses after removing undesirable spatial autocorrelation in the base case estimates from the ACS cohort, and simultaneously controlling for SO2 and other important explanatory variables that were not included in the relative risk estimates used in the risk analysis. The resulting reduction in risk level and statistical significance is so dramatic that it calls into question the causal interpretation of any of the other ACS-based relative risk estimates. Although Pope et al. (2002) did not provide such a 2-P formulation, it did report a 1-P example of the impacts of adding spatial controls. The effect, which can be seen by comparing the next two vertical bars to the right, was again to make the PM2.5 relative risk estimate statistically insignificant. (The PR obscures this finding when it states that “Pope et al. (2002) reported that effect estimates were not highly sensitive to spatial smoothing approaches intended to address spatial autocorrelation.” 14 ) One can only surmise what the impact would have been on the Pope et al. (2002) PM2.5 risk estimate if both spatial controls and SO2 had been simultaneously included in that paper.

12 PR, p. 2652.
13 PR, p. 2660.
14 PR, p. 2652.
Finally, the far right portion of Figure 4 presents the PM2.5 risk results from Lipfert et al. (2000b), which relies on an entirely different cohort, the “Veterans Cohort.” This study was formally excluded from any part of the quantitative risk analysis, and is largely downplayed in the PR. The PR says only the following about this new long-term study:

“In addition, one new set of analyses was done using subsets of PM exposure and mortality time periods and data from a Veterans Administration (VA) cohort of hypertensive men. The investigators report inconsistent and largely nonsignificant associations between PM exposure (including, depending on availability, TSP, PM10, PM2.5, PM15 and PM15-2.5) and mortality.” 15

One can see from Figure 4, however, that the “inconsistent” findings EPA attributes to this study are not inconsistent with one another; rather, they are completely inconsistent with the findings that EPA has chosen to rely on. In fact, the Veterans Cohort dataset contains consistently negative associations between PM2.5 and long-term mortality. Furthermore, the PR is incorrect in stating that those associations are “largely nonsignificant.” All but one of the associations between mortality and PM2.5 in this paper are significant in the negative direction (reflected by a solid point estimate in Figure 4, as the numerical confidence ranges are not provided in the paper). EPA feels it should rely primarily on cohorts that had already been studied in 1997, and which have been reanalyzed since 1997. Regardless, EPA should still acknowledge that evidence from study of the newly available cohort casts greater uncertainty on the long-term PM2.5-mortality association.

Figure 5 illustrates a final aspect of the newly available evidence that further clouds confidence in relative risk findings reported for PM2.5-mortality associations based on the ACS and Six-Cities cohorts.
This is the fact that the association between PM2.5 and mortality in those two cohorts is entirely attributable to the individuals within those cohorts who have a high school education or less. The PR does describe this finding, but it is not reflected in any way in the quantitative risk analysis. There is no statistical significance in the association for individuals within these two cohorts who have more than a high school education. The point estimate of relative risk for these individuals is effectively unity. This pattern, first identified by Krewski et al. (2000) in their reanalysis report, appears in both of the cohorts that EPA has chosen to emphasize and to rely on for its quantitative risk estimates. The effect has persisted into the extended dataset of Pope et al. (2002). A number of hypotheses can be offered for what this means, but all of them lead to the conclusion either that the relative risk estimates being used in the risk analysis are biased, or that the association with PM2.5 is actually due to some other confounder and is not causal.

15 PR, p. 2632.

Figure 5. Persistent Pattern of Long-Term Mortality Associations Being Applicable Only to Individuals in Cohorts Who Have Lower Educational Levels
[Chart not reproduced. Axis: estimated relative risk per 10 μg/m3 (all cause), with a “No Effect” reference line. Groups: ACS/HEI and 6-Cities/HEI, each shown by education level (bars labeled HS and >HS).]
Notes: “ACS/HEI” refers to the reanalysis of the ACS cohort in Krewski et al. (2000). The relative risks shown were taken from Table ES-3 therein, and converted from units “per 24.5 μg/m3” to relative risk per 10 μg/m3. The label “6-Cities/HEI” refers to the reanalysis of the Six-Cities cohort in Krewski et al. (2000). The relative risks were taken from Table ES-3 therein, and converted from units “per 18.6 μg/m3” to per 10 μg/m3. The results from Pope et al. (2002) were taken from Figure 4A of that paper.

Summary.
For both long-term and short-term mortality risks, the newly available evidence reveals decreased risk levels, and heightened uncertainty about the nature of the associations, compared to the body of evidence that was available in 1997. This is revealed even in EPA’s quantitative risk analysis results, even though that analysis is biased upwards by selective use of model results that have the largest and most significant findings in each source study. This bias is enabled by EPA’s decisions to rely only on 1-P results, its use of GAM-based analyses rather than GLM-based estimates, and its use of the model formulations with the fewest controls among those reported in each paper. EPA has elected not to tighten the current annual standard in light of its qualitative acknowledgment of these types of uncertainties, but the PR provides a weaker case for this reasonable conclusion than it actually could make. The quantitative risk analysis and related new epidemiological evidence similarly fail to provide a technical basis for tightening the daily PM2.5 standard, yet the PR does not make this case even weakly. Instead, the PR obscures this case by not providing a balanced discussion of the sensitivities in the new evidence on short-term mortality associations with PM2.5. The PR also does not fully reveal the extent of the decline in EPA’s quantitative risk estimates since 1997.

An Integrated Uncertainty Analysis in the Risk Analysis Would Reveal the Eroding Confidence Levels

EPA is clearly aware that there is enormous uncertainty in the newly available evidence. The PR solicits comments on its methodology for evaluating the uncertainty and significance of risks to public health, and specifically on “methods and approaches for conducting a more formalized uncertainty analysis.” 16 EPA’s methodology has been to rely on single “base case” estimates in a quantitative risk analysis, combined with occasional references to sensitivities in these results.
The figures provided above tell a quite different story, revealing that EPA’s current methodology has resulted in overstatement of the risk levels and overstatement of confidence in PM2.5-health associations. Lack of a clear comparison to earlier risk estimates also obscures what is a pronounced trend towards lower risk estimates and eroded confidence levels. The net effect is that EPA’s case for tightening the daily standard is not supportable with the full evidence.

This tendency toward overstatement of risks in EPA’s approaches to handling uncertainty can be averted merely by more complete and clear representation of the evidence, without any formal uncertainty analysis. As the section above has demonstrated, merely providing complete and quantitative information from the full body of epidemiological evidence can provide a far clearer synopsis of the evolution of confidence in the associations. At the same time, a carefully conducted synthesis of the evidence into an integrated uncertainty analysis of the quantitative risk estimates could also help in standard-setting deliberations. I have provided detailed comments on such an approach, with examples using the currently available evidence, in previous written comments to EPA (Smith 2003, 2005), which I incorporate here by reference.

Figure 6 provides an example of how different the information resulting from an integrated uncertainty analysis can be from the “base case” approach that is the hallmark of EPA’s analysis. This figure is taken from my earlier written comments to EPA, and its derivation is documented there. In this set of comments, I will only discuss its interpretation and implications. The histograms in Figure 6 represent full probability distributions from an integrated uncertainty analysis that weights EPA base case models as well as others from the full body of relevant literature. 17 They can be interpreted as follows.
The x-axis is the percent incidence of long-term mortality attributed to PM2.5 in risk calculations (the same metric as the y-axis in Figure 4). 18 Each bar of the histogram reports the probability that the true risk falls in the range of percent incidence that the bar sits over. The y-axis is the probability associated with each bar. The first bar is colored differently from the others (in red) because it reflects the likelihood that there is no PM2.5 risk at all. That is, its height reflects the probability that risk is exactly 0. 19 The yellow bars show the probability of various levels of positive PM2.5 risk, expressed for ranges of risk levels. For example, the leftmost yellow bar shows the probability that the percent incidence lies in the range greater than 0% and less than or equal to 2%, and is positioned in the center of that range. The next bar reflects risks greater than 2% and less than or equal to 4%.

16 PR, p. 2653.
17 Although the details of which models are included are documented in Smith (2003), briefly, they include alternative models shown in Figure 4 as well as others, such as those from the Six-Cities long-term cohort.
18 In this example, however, the percent incidence is that associated with as-is PM2.5 rather than with PM2.5 at exact attainment of the current PM2.5 standards. Hence the incidence levels are somewhat higher than they were in Figure 4.

The blue horizontal lines in Figure 6 show the ranges of the 95% confidence intervals that EPA reports for its base case estimates of as-is risk for these two cities. 20 These are based on just the one model from Pope et al. (2002) that EPA has selected for its base case. 21 EPA’s base case estimates show a significant effect in both cities at as-is PM2.5 levels (i.e., no part of the EPA confidence intervals includes 0 percent incidence, where the red bar is located).
These intervals reflect only the statistical variance associated with the underlying relative risk estimate selected from Pope et al. (2002), which was statistically significant.

As this section has demonstrated, many other relative risk estimates do exist that are not statistically significant, and the integrated uncertainty analysis incorporates their effect as well as the effect of the statistically significant ones such as EPA uses for its base case estimates. These other estimates account for the non-zero probability of “no effect” (the red bar of the integrated uncertainty analysis probability distribution).

19 Technically, this bar reflects a discrete pulse of probability associated with the single risk value of 0, whereas all other parts of the probability distribution are continuous, with zero probability for any single risk value.
20 Technically, there is also a probability distribution over these ranges, which is a normal (bell-shaped) distribution centered over the point estimate that is shown as the circle on each bar. Only the ranges are shown here, for simplicity.
21 EPA reports these two as-is risk ranges in Post et al. (2005), Exhibit 7.2, p. 85.

Figure 6. Examples of Probability Distributions on Percent Incidence of Mortality Due to Long-Term Exposure to As-Is PM2.5 Resulting from an Integrated Uncertainty Analysis, Compared to “Confidence Intervals” Resulting from EPA’s Approach of Relying on Base Case Estimates Only.
The blue horizontal lines reflect the EPA base case as-is risk estimate’s 95% statistical confidence interval, based on Pope et al. (2002). The circle reflects EPA’s base case point estimate of as-is mortality risk. The histogram provides the full probability distribution from an integrated uncertainty analysis that weights the EPA-selected model with others from the full body of relevant literature. The red bar reflects the likelihood of “no PM2.5 risk” and yellow bars reflect relative likelihood of various levels of positive PM2.5 risk. Documentation of analysis provided in Smith (2003).
[Histograms not reproduced. Panels: Los Angeles; Philadelphia. X-axis: ranges of percent of long-term mortality incidence attributed to PM2.5 in risk calculations (a “No PM2.5 risk at all” bar, followed by 0-2% through 24-26% in 2% steps). Y-axis: probability estimated for each range of PM2.5 risk on the x-axis, from 0% to 60%. Each panel carries the annotation “EPA base case 95% confidence interval based on Pope et al. (2000)”.]

One benefit of a more explicit, probabilistic approach is therefore that it provides a more complete and unbiased summary of the overall evidence than can be obtained with a deterministic approach that emphasizes results from a single selected base case model. The most dramatic effect of the integrated uncertainty analysis approach is how much probability is attributed to the outcome that there is no increased mortality risk at all from PM2.5. It is over 35% for as-is conditions in Los Angeles, and over 55% for as-is conditions in Philadelphia. 22 In contrast, the EPA method of reliance on base case model results suggests that there is zero chance of no effect at all. This comparison could, of course, be reversed if EPA were to select one of the available statistically insignificant model results as its base case. However, that result also would be biased. The key point is that any time a single base case model is adopted, the risk estimate that is produced using it will be biased.
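The kind of calculation that produces a histogram like those in Figure 6 can be sketched as a weighted mixture over alternative models. The code below is only a minimal illustration and is not the procedure documented in Smith (2003). Every number in `MODELS` is a hypothetical placeholder, and two simplifying assumptions are made and marked in the comments: a statistically insignificant model contributes its entire weight to the “no effect” pulse at zero, and a significant model contributes a normal distribution whose mass at or below zero is also counted as “no effect.” The 2% bins running up to 26% follow the axis of Figure 6.

```python
from statistics import NormalDist

# Hypothetical inputs, one tuple per model:
# (weight, point estimate in % incidence, standard error, statistically significant?)
MODELS = [
    (0.4, 3.0, 1.0, True),    # stands in for a "base case"-style model
    (0.3, 1.0, 0.9, False),   # stands in for an insignificant alternative
    (0.3, 0.5, 1.2, False),   # stands in for another insignificant alternative
]


def integrated_distribution(models, bin_width=2.0, n_bins=13):
    """Return (P(no risk), [P(lo < risk <= lo + bin_width) for each bin])."""
    p_zero = 0.0
    bins = [0.0] * n_bins
    for weight, estimate, std_err, significant in models:
        if not significant:
            # Assumption: an insignificant model puts all its weight on "no effect"
            p_zero += weight
            continue
        dist = NormalDist(estimate, std_err)
        # Assumption: mass at or below zero is also treated as "no effect"
        p_zero += weight * dist.cdf(0.0)
        for i in range(n_bins):
            lo, hi = i * bin_width, (i + 1) * bin_width
            bins[i] += weight * (dist.cdf(hi) - dist.cdf(lo))
    return p_zero, bins


p_zero, bins = integrated_distribution(MODELS)

# For comparison: the 95% interval from the single "base case"-style model alone
_, estimate, std_err, _ = MODELS[0]
ci_low, ci_high = estimate - 1.96 * std_err, estimate + 1.96 * std_err

print(f"Single-model 95% interval: {ci_low:.2f}% to {ci_high:.2f}%")
print(f"Mixture probability of no risk: {p_zero:.3f}")
for i, p in enumerate(bins[:4]):
    print(f"Mixture probability of {2 * i}-{2 * i + 2}% incidence: {p:.3f}")
```

With these placeholder inputs, the single-model interval excludes zero while the mixture places most of its probability on “no effect,” which is the qualitative contrast the figure is meant to convey. Different weights or a different treatment of insignificant models would change the numbers.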
Even if the base case model is selected such that its relative risk estimate lies near the middle of the range of all model results, its confidence interval will not be wide enough to reflect the true range of modeling uncertainty.

Another important point that emerges from this illustrative integrated uncertainty analysis is that the true uncertainty is highly asymmetric around EPA’s base case point estimates. That is, the true probability distribution has much greater probability of risks below EPA’s base case estimate than it has probability of risks above EPA’s base case estimate. This directly contradicts the implication of EPA’s “confidence intervals,” which imply that the chances that risks are higher or lower than the point estimate are equal (i.e., they follow the normal distribution associated with variances of statistically-estimated relative risks). This asymmetry arises because many of the modeling uncertainties that are ignored in EPA’s base case approach would reduce the estimated risks resulting from that simplistic approach. These include: (a) possibilities of thresholds that the models have not been able to identify and (b) model results that are lower than the ones that EPA has selected. To the extent that the models not selected for the base case are statistically insignificant, a larger and larger pulse of probability becomes associated with the “no effect” outcome when these model results are incorporated into an integrated uncertainty analysis. This overstatement of risks from EPA’s deterministic risk analysis approach was recognized by CASAC in its comments to EPA on the draft Staff Paper:

“It is unfortunate that a more comprehensive, quantitative characterization of uncertainty has not been undertaken, even if it only took into account several sources of uncertainty simultaneously. …There is also likely to be directionality to the degree of uncertainty, with greater uncertainty around effects at lower, compared with higher PM levels. Overall, the chapter tends to understate uncertainty, both through style, (e.g., inclusion of numerically specific estimates, e.g., “403” deaths rather than “400” or “about 400”, and by not bringing together the individual sensitivity analyses.” 23

22 The difference is due to higher as-is pollution levels in Los Angeles, as the same set of relative risk coefficients and weights are applied identically in both cities for long-term mortality risks.
23 US EPA (2005), p. C-9.

One benefit of a more explicit, probabilistic approach is therefore that it provides a more complete and unbiased summary of the overall evidence than can be obtained with EPA’s deterministic approach. Another benefit of an integrated uncertainty analysis, which is not possible to demonstrate with just these figures for as-is risk, is that it can provide direct evidence of how the uncertainty in further health benefits starts to expand as one has to decide among lower and lower potential NAAQS levels. This concept of expanding uncertainty appears to be a rationale underlying some of EPA’s discussions of where to set the PM2.5 standards. It has obvious merits in a situation where the standard cannot be “risk-free,” yet also cannot be set by balancing incremental costs against the incremental risk reduction. A natural step would be for EPA to adopt a probabilistic risk methodology that can directly provide estimates of the range of uncertainty in incremental risk reductions associated with incremental tightening of the standard.

III. EPA’s “Evidence-Based” Approach Does Not Support Tightening the Daily Standard to 35 μg/m3.

EPA makes a case to tighten the daily standard based, in part, on a risk analysis that overstates current evidence of PM2.5 risks, and an incomplete summary of the trends in the risk analysis’ estimates.
In Part II of these comments, I presented my reasons why that case to tighten the PM2.5 NAAQS is not supported by the full evidence. However, even if one accepts that the daily standard should be tightened, EPA then uses an evidence-based approach in the PR to argue that the right level for a revised daily standard is 35 μg/m3. I have reviewed this evidence-based argument, which appears on p. 2649 of the PR, and have a number of concerns with its technical basis. My concerns include: (a) that several of EPA’s statements about the findings in specific epidemiological studies are incorrect or overstated, (b) that the PR offers an incomplete review of the full relevant body of evidence, also resulting in overstatement, and (c) that the presentation of evidence is so unclear that readers cannot readily observe that EPA’s conclusions are not supported even by the evidence that the PR provides.

Corrections to Statements in PR’s Evidence-Based Case

The case EPA makes for a daily PM2.5 standard of 35 μg/m3 on p. 2649 is strictly verbal, with very complex and lengthy sentences. Not only is this a confusing way to present evidence, but the text also contains several factual misstatements. To start my review of the case, I have therefore reproduced the PR’s exact text of this case in column 1 of Table 2, broken into its structural segments on separate, sequential rows of the table. 24 In column 2, I have then provided annotated comments on each of the PR’s statements.

EPA’s overall argument to set the daily PM2.5 standard at 35 μg/m3 is based on three parts:

(i) That there is much evidence of an effect in studies with 98th percentile values above 35 μg/m3 (i.e., in a range down to 39 μg/m3),
(ii) That there is more mixed evidence among studies with 98th percentile PM2.5 in the range of 30-35 μg/m3, 25 and
(iii) That not much information is available for studies that had PM2.5 98th percentiles below 30 μg/m3.

Following this structure, I have broken Table 2 into parts (i), (ii), and (iii).

24 If one starts to read down column 1 from Part (i) to Part (ii) to Part (iii) of Table 2, one will have the entire text from p. 2649 of the PR’s evidence-based argument for where to set the daily standard based on daily health effects studies for PM2.5. The only changes I have made are some added punctuation and formatting to enhance clarity.
25 The gap between 39 μg/m3 and 35 μg/m3 is caused by a lack of any studies in this range. EPA could just as easily have defined the first range as being “down to 35 μg/m3”, or the second range as being “from 30-39 μg/m3” without any loss of generality in its argument for setting the standard at 35 μg/m3.

Table 2. Comments on Statements Made by EPA in Its Evidence-Based Case for a Daily PM2.5 Standard of 35 μg/m3.

Part (i): Statements about Studies with 98th Percentile Values Down to 39 μg/m3

Column 1 (PR text): Based on the information in the Staff Paper and a supporting staff memo, 42 the Administrator observes an overall pattern of statistically significant associations reported in studies of short-term exposure to PM2.5 across a wide range of 98th percentile values. More specifically, there is a strong predominance of studies with 98th percentile values down to about 39 μg/m3 (in Burnett and Goldberg, 2003) reporting statistically significant associations with mortality, hospital admissions, and respiratory symptoms.

Column 2 (comment): Burnett and Goldberg (2003) is a mortality paper for 8 Canadian cities. The 39 μg/m3 is the 98th percentile provided by the authors to EPA for all 8 cities and, according to Ross and Langstaff (2005), is based on “averaged annual values for years in study” (there were 11 years). The act of averaging the values across the years will reduce the variance, and thus will generate a 98th percentile that could be substantially lower than the actual 98th percentile of the air concentrations in the dataset. Also, no information is provided on a city-by-city basis, so it is unclear what the 98th percentile was for the individual cities in that dataset. It is also unknown whether only a couple of cities were driving the significant finding for the pooled set of cities. Thus, it seems inappropriate to use this particular study’s 98th percentile estimate for attempting to identify boundaries where significance levels drop off. If this data point is dropped because of its variance-reducing bias, the next lowest 98th percentile value in this range would be 42 μg/m3, associated with Schwartz (2003a) and Klemm and Mason (2003) – i.e., for Boston in the “Six-Cities” study.

Column 1 (PR text, footnote 42): As discussed in the Staff Paper (EPA, 2005a, p. 5–30) and supporting staff memo (Ross and Langstaff, 2005), staff focused on U.S. and Canadian short-term exposure PM2.5 studies that had been reanalyzed as appropriate to address statistical modeling issues and considered the extent to which the reported associations are robust to co-pollutant confounding and alternative modeling approaches and the extent to which the studies used relatively reliable air quality data.

Column 2 (comment on footnote 42): As will be noted below, several of the papers the PR cites in this excerpt are not mentioned on p. 5-30 of the Staff Paper, nor in the supporting memo, Ross and Langstaff (2005). Further, several of the papers cited have not been reanalyzed to address statistical modeling issues (i.e., the GAM problem). Also, as will be discussed below, most of the statistically significant associations found in the studies cited above were not robust to co-pollutant modeling, but only occurred in 1-P formulations. (Part IV of my comments also provides a detailed review of the lack of robustness of PM2.5 associations to inclusion of gaseous co-pollutants, demonstrating the incorrectness of the statement in this footnote of the PR.)
For example, within this range of air quality, statistically significant associations were reported for mortality in: the combined Six City study (and three of the individual cities within that study) (Klemm and Mason, 2003), As in the comment regarding Burnett and Goldberg (2003), there are difficulties using a 98th percentile for a group of cities, and so reference to the “combined Six City” results should be eliminated. Four of the six cities in the Six City study have 98th percentiles at or above 42 μg/m3, and only one of those four cities had a statistically significant effect in a majority of the model formulations reported (i.e., Boston – the city with the lowest 98th percentile of the group, at 42 μg/m3). A second city (St. Louis) had mixed statistical significance. The remaining 2 cities (Steubenville and Knoxville) had no statistically significant results. Thus, although this study does report some statistically significant associations in the range down to 42 μg/m3, its findings are far more mixed than EPA’s statement suggests. This study only had 1-P formulations, leaving no indication of whether they might be robust to co-pollutant modeling. the Canadian 8-City Study (Burnett and Goldberg, 2003), This study does present robustly statistically significant associations for the combined set of eight cities. However, it does not reveal any city-specific results, to help understand if the associations are related to all or just a few of the 8 cities. Further, it is very difficult to know how to characterize a 98th percentile that can be associated with these findings. The level of 39 μg/m3 is biased downwards by an unknown degree. and in studies in Santa Clara County, CA (Fairley, 2003) This study reports associations that are statistically significant for same-day PM2.5, but which is negative for PM2.5 at a one-day lag. Only the former was was subjected to co-pollutant modeling, and it was robustly significant. 
The 98th percentile for this dataset is 59 μg/m3 20 Table 2 - Continued and Philadelphia (Lipfert et al., 2000a); CRA International This study’s results were not at all robust to 2-P formulations, and it does not support a statement in favor of a “strong predominance” of statistical significance in this range of air quality. In fact, the authors concluded that the associations observed in 1-P formulations should actually be attributable to ozone, not PM2.5, based on their response to co-pollutant modeling. for hospital admissions and emergency department visits: in Seattle (Sheppard, 2003), The statistical significance found in this paper is limited to a 1-P model formulation using the GAM estimation technique. GLM and 2-P formulations are also reported in this paper, and they are a mix of borderline significant and insignificant. Toronto (Burnett et al., 1997; Thurston et al., 1994), Burnett et al. (1997) use the GAM method and has not been reanalyzed. Its inclusion here violates EPA’s policy not to rely on such studies unless they have been reanalyzed. Even so, it did not find robustly significant associations. Thurston et al. (1994) also did not find robustly significant associations for PM2.5, especially after considering the role of ozone in co-pollutant modeling. They state at p.282: “This points out the importance of considering as many pollutants as possible in such analyses, in order to diminish the chances of being misled as to which of the many ambient air pollutants is actually culpable for any noted air pollution-health effects associations.” Detroit (Ito, 2003, for ischemic heart disease and pneumonia, but not for other causes), Ito (2003) considers 4 types of cardiac admissions and 2 types of respiratory admissions in 1-P formulations only. He does find a statistically significant association for pneumonia, and a borderline significant association for heart failure. 
Ischemic heart disease was not statistically significant, contrary to EPA’s statement; nor were the three other categories considered. No gaseous co-pollutant modeling results were reported.

PR: and Montreal (Delfino et al., 1998, 1997, for some but not all age groups and years);
Comment: Delfino (1998) finds only a borderline significant PM2.5 association in a 1-P formulation that disappears in a 2-P formulation, with ozone becoming the dominant explanatory pollutant.26 Delfino (1997) has a 98th percentile of 31.2 μg/m3 and thus should not be placed in this group of studies for a range of “down to 39 μg/m3.” Furthermore, its findings are more mixed than this suggests. It does find a statistically significant association between PM2.5 and emergency room visits for >64 years age in a 1-P formulation, but its significance is utterly eliminated in a 2-P formulation (where ozone, however, remains significant).

PR: for respiratory symptoms in panel studies: in a combined Six City study (Schwartz et al., 1994)
Comment: Schwartz et al. (1994) uses the GAM method and has not been reanalyzed. Its inclusion here violates EPA’s policy not to rely on such studies unless they have been reanalyzed. Additionally, this study is for a combined set of cities, without reporting city-specific associations or city-specific 98th percentile values. Therefore, as is the case with Burnett and Goldberg (2003), it is difficult to use the 98th percentile value reported for this study to attempt to determine where to set a daily standard. The 98th percentile value is not provided in the supporting staff memo that the PR cites. A memo released only on April 5, 2006 reveals its 98th percentile level is 48 μg/m3.

PR: and in two Pennsylvania cities (Uniontown in Neas et al., 1995; State College in Neas et al., 1996);
Comment: These 2 papers also are not mentioned in the supporting staff memo that the PR cites.
A memo released only on April 5, 2006 (Ross and Langstaff (2006)) reveals these datasets have 98th percentile levels of 60 and 69 μg/m3, respectively. Despite having relatively high PM levels compared to other studies that EPA has considered, both papers find mixed results over 4 types of respiratory symptoms (colds, cough, wheeze, and changes in PEFR). Across both studies, only cough was associated with PM2.5 in a statistically significant manner under both 1-P and 2-P formulations.

26 The PR actually cites to the wrong Delfino et al. (1998) paper, referencing a paper about PM10 and asthma in Los Angeles. I use the correct citation in my comments here.

PR: and for lung function in Philadelphia (Neas et al., 1999).43
Comment: None of the associations between PEFR and PM2.5 in this paper are statistically significant, even in 1-P modeling. (EPA’s statement on p. 2630 of the PR that associations in this paper are statistically significant is not supported by a review of the original paper.) Again, this paper is not mentioned in the supporting staff memo that the PR cites. Ross and Langstaff (2006) reveals that this dataset has a 98th percentile level of 45 μg/m3.

PR: 43 Of the studies within this group that evaluated multipollutant associations, as discussed above in section II.A.3, the results reported in Fairley (2003), Sheppard (2003), and Ito (2003) were generally robust to inclusion of gaseous co-pollutants, whereas the effect estimate in Thurston et al. (1994) was substantially reduced with the inclusion of O3.
Comment: Ito (2003) did not present any 2-P formulations with gaseous co-pollutants included. The only 2-P formulations shown included coarse fraction simultaneously with PM2.5, and in those results, even the few significant associations (pneumonia and heart failure hospital admissions) became insignificant.
Part IV of my comments provides a detailed review of the evidence on inclusion of gaseous co-pollutants, and demonstrates that the results are not “generally robust” as this footnote of the PR says.

PR: Studies in this air quality range that reported positive but not statistically significant associations with mortality include studies in:
Comment: As the entire logic of this analysis is to seek a level of air quality at which the degree of statistical significance in findings starts to drop off, studies that find positive associations that do not rise to the level of statistical significance should be viewed as examples that undermine the statement that there is a predominance of statistically significant results within this range, or as indications that the range has been extended too low. For example, Schwartz and Neas (2000) state that “lower respiratory symptoms in a two-pollutant model were associated with …fine particles…but not coarse particles” and the supporting evidence for this statement was that there was an odds ratio >1 for PM2.5 that was statistically significant, and there was an odds ratio >1 for coarse particles also, but it was not statistically significant (Schwartz and Neas, 2000, p. 6).

PR: Detroit (Ito, 2003),
Comment: This study found no statistically significant association between PM2.5 and mortality at a 98th percentile of 55 μg/m3.

PR: Pittsburgh (Chock et al., 2000),
Comment: This study found no statistically significant association between PM2.5 and mortality at a 98th percentile of about 75 μg/m3.

PR: and Montreal (Goldberg and Burnett, 2003).
Comment: This study found no statistically significant association between PM2.5 and mortality at a 98th percentile of about 53 μg/m3.

Part (ii): Statements about Studies with 98th Percentile Concentrations between 30 and 35 μg/m3

PR: Within the range of 98th percentile PM2.5 concentrations of about 35 to 30 μg/m3, this strong predominance of statistically significant results is no longer observed.
Rather, within this range, some studies report statistically significant results:
Comment: The two examples cited next should not be characterized as reporting statistically significant “results”, but only as having some model formulations that are statistically significant. Neither of the two papers cited rises to the level of having found an overall statistically significant set of associations; as explained below, each reports mixed results, at best. The main distinction in the evidence cited in this clause and the following clause is that Mar et al. and Ostro et al. are mortality studies, while Delfino et al. and Peters et al. are morbidity studies, but all four of them find “mixed results.”

PR: Mar et al., 2003;
Comment: Mar et al. did report some statistically significant results, but the statistical significance was not robust in about 2/3 of their results. The data for Mar et al. (2003) were for Phoenix, and were studied by three different sets of researchers, including Smith et al. (2000) and Clyde et al. (2000). The findings for Phoenix should reflect this entire suite of modeling using the same data, and those results are far more mixed than represented by Mar et al. alone.

PR: Ostro et al., 2003.
Comment: This study finds only one statistically significant result, while the majority of the associations it reports between PM2.5 and mortality are insignificant.

PR: other studies report mixed results in which some associations reported in the study are statistically significant and others are not:

PR: Delfino et al., 1997
Comment: This study finds a statistically significant association between PM2.5 and emergency room visits for >64 years age in a 1-P formulation, but its significance is utterly eliminated in a 2-P formulation (where ozone, however, remains significant). Other associations reported were not significant even in a 1-P formulation.
PR: Peters et al., 2000
Comment: This was a study of the frequency of discharge of implanted defibrillators, effectively a “symptoms” study. For 6 of 100 patients in the study, one out of several PM2.5 associations (all 1-P) was significant, but “the strongest associations were observed for NO2….including both pollutants into one model reduced the effect estimate of PM2.5 effectively to 0, whereas the effect estimate of NO2 was unchanged.” There was no statistically significant association for PM2.5 and defibrillator discharges in the other patients in the study. This paper also was not cited in Ross and Langstaff (2005). Ross and Langstaff (2006), released only on April 5, 2006, shows its 98th percentile to be 31.7 μg/m3.

PR: and another study reports associations in two of six cities that are not statistically significant (Klemm and Mason, 2003).44
Comment: This clause moves back to a mortality study. It is misleadingly worded, and needs clarification: Within the six cities studied in Klemm and Mason (2003), there are two cities whose 98th percentiles are in the range of 30-35 μg/m3, and neither has any significant findings even in the 1-P formulations that are all that are presented. (The other four cities are in the range of >39 μg/m3, and they are mostly insignificant as well.)

PR: 44 For example, Delfino et al. (1997) report statistically significant associations between PM2.5 and respiratory emergency department visits for elderly people (>64 years old), but not children (<2 years old) in one part of the study period (summer 1993) but not the other (summer 1992). Peters et al. (2000) report new findings of associations between fine particles and cardiac arrhythmia, but the Criteria Document observes that the strongest associations were reported for a small subset of the study population that had experienced 10 or more defibrillator discharges (EPA, 2004, p. 8–164).
Comment: This footnote is consistent with my comments above. It should have been placed with the preceding clause.
Part (iii): Statements about Studies with 98th Percentile Concentrations below 30 μg/m3

PR: Further, the very limited number of studies in which the 98th percentile values are below this range [i.e., 30 to 35 μg/m3] do not provide a basis for reaching conclusions about associations at such levels:

PR: Stieb et al., 2000
Comment: Stieb et al. (2000) used the GAM method and has not been reanalyzed. Its inclusion here violates EPA’s policy not to rely on such studies unless they have been reanalyzed. It should be dropped, leaving only one study in this range at all.

PR: Peters et al., 2001
Comment: This study reports a significant association between myocardial infarction and PM2.5 in 1-P formulations. No 2-P formulations were presented. However, the study uses a procedure for controlling for PM2.5 that has since been shown to be biased (Janes, Sheppard & Lumley, 2004). This paper also was not cited in Ross and Langstaff (2005). Ross and Langstaff (2006), released only on April 5, 2006, shows its 98th percentile to be 28.2 μg/m3.

More Complete Information to Supplement Evidence-Based Case in PR

After making the corrections to the factual inaccuracies that I have identified in column 2 of Table 2, the claim of a “strong predominance” of significant associations for datasets with 98th percentiles above 39 μg/m3 appears to be unsupported. Additionally, the dividing line at 39 μg/m3 is not a location where one starts to find clearly less mixed results above than below. Even so, the evidence cited in the PR is not complete.

In order to perform a more complete review of the body of evidence on PM2.5, I developed a list of all the epidemiological studies for short-term PM2.5 that I could identify that met the following criteria:

• Is cited in the CD (papers published after the CD cut-off date are therefore not included, since these are not supposed to be a part of EPA’s current evidence-based rationale).
• Is a short-term health effects study.
• Is based on US or Canadian datasets.
• Has no GAM problem. (If a GAM problem existed in a paper, only reanalyzed results were considered.)27
• Used directly measured PM2.5. (Studies that “filled” missing values were included, but studies that estimated all the PM2.5 values from visibility or other measures were not included.)
• Considered any type of effect that could be categorized as a clear health impact. This included aggravation of asthma, changes in lung function measures (e.g., PEFR, FEV1), and detected arrhythmias.28

I found studies for 38 specific combinations of type of health effect and PM2.5 dataset, which I call “locations” and list in Table 3. I report cities’ results individually wherever possible if the study is a multi-city study. Two multi-city papers do not provide city-specific data: Burnett and Goldberg (2003) and Schwartz and Neas (2000). Of these 38 “locations,” 25 are cited on p. 2649 of the PR.

I reviewed each study to determine the general significance level that it found for the PM2.5 association specifically. I ranked each into one of three categories: “no overall significant association,” “mixed significance” and “overall significant association.” By “overall significant association,” I mean that a majority of the regressions in the paper produced statistically significant associations. If a 2-P result is provided, it must also be statistically significant to be placed in this category, unless there is evidence of multicollinearity problems in the 2-P model.29 A ranking of “no overall significant association” was assigned if the majority of the results in the paper are insignificant, even if a statistically significant result exists in the paper. If there is only one 1-P and one 2-P result reported, and the 2-P is insignificant, I assigned it to this category, unless there is evidence of a multicollinearity problem in that 2-P result.

My specific rankings are shown in Table 3. Appendix A provides my rationale for each assigned ranking. Table 3 also shows the 98th percentile PM2.5 level associated with each location. Where possible, these values are from Ross and Langstaff (2005) or Ross and Langstaff (2006). However, many of these studies are not listed in those documents, and I estimated their 98th percentile PM2.5 from other relevant distributional data provided in the cited paper(s). My estimation methods are documented in Appendix B. However, it should be noted that all the values that I had to estimate fall above the 40 μg/m3 level, and so my estimates do not affect how one might consider setting a standard in ranges from 40 μg/m3 downwards, if that is the interval of concern.

27 This requirement caused me to drop three of the studies that the PR cites in its case on p. 2649: Stieb et al. (2000), Burnett et al. (1997), and Schwartz et al. (1994). However, the last of these was replaced in my review by Schwartz and Neas (2000) which analyzes the same effects, and finds the same general associations for PM2.5, but which does not appear to have the kind of GAM usage that was subject to the convergence problem.

28 It did not include measures of heart rate variability, an association with which may be indicative of some kind of physiological response to PM2.5 exposure, but whose significance to actual health outcomes remains unknown. This criterion only excluded two studies that otherwise fit these criteria: Gold (2003) and Liao et al. (1999), neither of which EPA uses in its evidence-based case for setting the standard at 35 μg/m3.

29 Such evidence exists when both the PM and gaseous pollutant would become insignificant in a 2-P formulation even though both are significant in their respective 1-P formulations.

Table 3.
List of Locations and Associated Papers Reporting Short-Term Epidemiological Findings

Location | Impact Category | Papers Studying this Impact for this Location | 98th Percentile PM2.5 (μg/m3) | Robustness of Association for PM2.5 (*)
Newark, NJ | Mortality | Tsai et al. (2000) | 94.2 | 2
Camden, NJ | Mortality | Tsai et al. (2000) | 84.4 | 2
Elizabeth, NJ | Mortality | Tsai et al. (2000) | 84.0 | 1
Steubenville, OH | Mortality | Schwartz (2003a), Klemm and Mason (2003) | 81.5 | 2
Los Angeles, CA | Mortality | Moolgavkar (2003) | 60.4 | 1
Santa Clara County, CA | Mortality | Fairley (2003) | 59.0 | 3
Pittsburgh, PA | Mortality | Chock et al. (2000) | 56.3 | 1
Detroit, MI | Mortality | Ito (2003), Lippmann et al. (2000) | 55.2 | 1
Montreal, Canada | Mortality | Goldberg and Burnett (2003) | 53.1 | 1
Philadelphia, PA | Mortality | Lipfert et al. (2000a) | 44.2 | 2
St. Louis, MO | Mortality | Schwartz (2003a), Klemm and Mason (2003) | 43.6 | 2
Knoxville, TN | Mortality | Schwartz (2003a), Klemm and Mason (2003) | 43.5 | 1
Boston, MA | Mortality | Schwartz (2003a), Klemm and Mason (2003) | 42.0 | 3
8 Canadian Cities | Mortality | Burnett and Goldberg (2003) | 38.9 | 3
Madison, WI | Mortality | Schwartz (2003a), Klemm and Mason (2003) | 34.3 | 1
Coachella Valley, CA | Mortality | Ostro et al. (2003) | 33.4 | 1
Phoenix, AZ | Mortality | Mar et al. (2003), Smith et al. (2000) and Clyde et al. (2000) | 32.2 | 2
Topeka, KS | Mortality | Schwartz (2003a), Klemm and Mason (2003) | 32.0 | 1
Los Angeles, CA | Morbidity: Hospital Visits | Moolgavkar (2003) | 60.4 | 1
Detroit, MI | Morbidity: Hospital Visits | Ito (2003) | 55.2 | 3
Toronto, Canada | Morbidity: Hospital Visits | Thurston et al. (1994) | 51.0 | 1
Seattle, WA | Morbidity: Hospital Visits | Sheppard (2003) | 46.6 | 2
Atlanta, GA | Morbidity: Hospital Visits | Tolbert et al. (2000) | 41.5 | 1
Montreal, Canada | Morbidity: Hospital Visits | Delfino et al. (1998) | 40.7 | 1
Montreal, Canada | Morbidity: Hospital Visits | Delfino et al. (1997) | 31.2 | 1
Boston, MA | Morbidity: Hospital Visits | Peters et al. (2001) | 28.2 | 3
Los Angeles, CA | Morbidity: Symptoms | Ostro et al. (2001) | 112.0 | 3
State College, PA | Morbidity: Symptoms | Neas et al. (1996), Schwartz and Neas (2000) | 69.0 | 2
Denver, CO | Morbidity: Symptoms | Ostro et al. (1991) | 60.3 | 1
Uniontown, PA | Morbidity: Symptoms | Neas et al. (1995), Schwartz and Neas (2000) | 60.0 | 3
Los Angeles, CA | Morbidity: Symptoms | Linn et al. (1999) | 59.1 | 1
San Diego, CA | Morbidity: Symptoms | Delfino et al. (1996) | 51.1 | 1
6 US Cities | Morbidity: Symptoms | Schwartz and Neas (2000) | 48.0 | 3
Virginia | Morbidity: Symptoms | Naeher et al. (1999) | 45.1 | 2
Virginia | Morbidity: Symptoms | Zhang et al. (2000) | 45.1 | 1
Philadelphia, PA | Morbidity: Symptoms | Neas et al. (1999) | 44.9 | 1
New Hampshire | Morbidity: Symptoms | Korrick et al. (1998) | 41.2 | 2
Massachusetts | Morbidity: Symptoms | Peters et al. (2000) | 31.7 | 1

(*) 1=“no overall significant association,” 2=“mixed significance of findings” and 3=“overall significant association.” See Appendix A for my definition of criteria for these three categories and my rationale for assigned rankings.

Synthesizing and Interpreting the Full Body of Evidence

Information as extensive and complex as that in Table 3 is easier to interpret if presented in alternative formats. Figure 7 presents a graphical summary for all the relevant studies that EPA should use in making its evidence-based case, based on the data in Table 3. The blue diamonds are studies EPA cited on p. 2649, and the red diamonds are studies EPA has not cited (and for which I therefore had to estimate the 98th percentile value). The Burnett and Goldberg (2003) and the Schwartz and Neas (2000) studies are shown as unfilled diamonds to emphasize that these 98th percentile values are for a combination of individual cities, and therefore cannot be compared to the others in this analysis.
The two dotted red horizontal lines reflect the dividing lines of the three categories in EPA’s evidence-based argument.

When this complete set of evidence is organized into this internally-consistent summary format, it becomes clear that EPA’s arguments to set the standard at 35 μg/m3 do not conform with the evidence. For example, there is no evidence of a “predominance of statistically significant findings” above the 39 μg/m3 line. Nor is there any higher level of PM2.5 98th percentile above which statistical significance is more common than below. While it is true that evidence of significance is mixed for studies in the 30-35 μg/m3 range, it is just as mixed for studies above the current standard of 65 μg/m3 – and particularly so for the category of mortality.

When the Burnett and Goldberg study (the unfilled diamond) is ignored for the reasons stated above, the lowest 98th percentile level for which a statistically significant mortality effect is robust is at 42 μg/m3 – and this is for Boston from the Six Cities database on which the current standards were based in the first place. In other words, if considering just the mortality evidence, EPA is left with making a case to tighten the daily standard using the same study that was used when the present standards were set. Moreover, since that time, the overall robustness of the association in that study has been proven to be less than it was thought to be in 1997. Further, among the mortality studies, only 2 out of 10 that are in the range of 42 μg/m3 up to the current daily standard of 65 μg/m3 find robust statistical significance. The second of these is Fairley (2003) for San Jose, with a 98th percentile of 59 μg/m3.

Turning to the morbidity evidence, little further guidance appears. For hospital admissions and emergency room visits, there is no pattern of significance at all.
There is one study at the very low 98th percentile of 28.2 μg/m3 that finds a robustly significant result, which is Peters et al. (2001). The only other study in this category with a robustly significant effect has a 98th percentile of 55.2 μg/m3 (i.e., Ito (2003)), and there are five studies in the intervening PM2.5 levels that do not find a robust association.

Figure 7. Summary of Robustness of Statistical Significance as Function of PM2.5 98th Percentile, for Three Categories of Severity of Health Effects. Blue diamonds are for studies where EPA has directly reported 98th percentile values in the study. Red diamonds are for studies that EPA has not included in its evidence-based arguments and for which it has not provided 98th percentile information. For these studies, 98th percentile PM2.5 has been estimated from other relevant data in the respective publications, as documented in Appendix B. Unfilled diamonds are for studies where multiple cities’ data were combined, thus not reflecting a true population-level 98th percentile exposure. “NS”=no overall significance in the dataset; “Mixed”=mixed significance of findings for the dataset; “Signif”=overall significant associations in the dataset.

[Figure 7 plot not reproduced. Panels: Mortality; ER and Hosp. Adm.; Symptoms. Vertical axis: 98th percentile PM2.5 in study location, 0 to 120, with reference lines marked at 39 and 30. Horizontal categories in each panel: NS, Mixed, Signif.]

The third category is studies of respiratory and cardiovascular symptoms. These studies mostly consider changes in measures of pulmonary function (e.g., PEFR and FEV1), or in reported incidence of coughing or wheezing. One study in this group assessed changes in frequency of cardiac arrhythmia, using defibrillator data (it did not find a robustly significant effect, however). Once again, there are too few studies with robustly significant findings to discern any trend.

The evidence thus does not provide a coherent case for setting the daily standard at 35 μg/m3.
It also does not provide a case for tightening the daily standard, given the continued prominence of the Boston mortality association within the statistically significant results. And most of all, the evidence does not provide any apparent alternative level at which to set the standard, if one is intent on tightening it. Nevertheless, there are some studies that do find robust associations at levels below the current standard of 65 μg/m3. Thus, if one is intent on tightening the daily PM2.5 standard, one’s best approach might be to start by exploring the methodological merits of individual studies finding robust associations at the lowest PM2.5 levels, and work upwards until a study is found that has strong methodological properties. I will therefore go through this exercise next. The study with robust significant findings at the lowest PM2.5 98th percentile level is Peters et al. (2001), followed by Burnett and Goldberg (2003), then the Boston results from the Six Cities database. Peters et al. (2001). This paper is the one with the lowest 98th percentile of all those identified, yet it also is among the few that report a robustly significant association between PM2.5 and health – in this case, with likelihood of onset of myocardial infarction. The 98th percentile in the time period studied was 28.2 μg/m3. The method used is different from all of the other short-term PM2.5 mortality and morbidity papers reviewed here – a “case-crossover approach.” Most short-term health effects studies for mortality and hospital visits/admissions explore the association between daily numbers of deaths or hospital visits/admissions and daily air quality. Individuals are not tracked at all. 
The case-crossover method is quite different in that it identifies individuals who have experienced a particular health event (myocardial infarction, in this case), and then explores the differences between air quality just prior to the time of that event and at other, “referent times” when the event did not occur.

The case-crossover design is an accepted statistical approach, considered to have substantial merits for use in epidemiology. However, it is also a relatively new approach and its use in Peters et al. (2001) stands as a methodological outlier within the PM2.5 epidemiological literature reflected in Table 3 and Figure 7 above. Given the fact that this paper also appears to provide the strongest case for possibly tightening the PM2.5 daily standard, its application of this relatively new statistical approach merits some scrutiny.

A critical issue in case-crossover design is how to select the “referent times” for the statistical controls. This is a judgment that is in the hands of the researcher and there are many alternative ways that the referent times can be selected – each of which can produce different statistical findings, of course. Some referent selection methods can introduce statistical biases, and thus are less desirable. Janes et al. (2004) characterize the biases associated with alternative referent selection schemes, and report that a “restricted unidirectional” selection method is subject to intractable forms of biases. Peters et al. (2001) use this restricted unidirectional method. Of nineteen case-crossover studies of air pollution exposures between 1999 and 2004 noted by Janes et al., Peters et al. (2001) appears to be the only one that used this biased referent selection method exclusively.

In light of these statistical concerns with the way Peters et al. (2001) have applied the case-crossover approach, it would be prudent not to let this single study serve as a basis for where to set the PM2.5 daily standard.
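To make the referent selection issue concrete, the short Python sketch below generates referent days for a single hypothetical event date under two schemes. The first is a restricted unidirectional scheme as that term is generally understood (referents taken only from a fixed set of days before the event). The second is a time-stratified scheme (same weekday within the same calendar month), which is a commonly described alternative that I include only for contrast. The function names, the event date, and the lags of 7, 14 and 21 days are my own illustrative assumptions; they are not the settings used in Peters et al. (2001), nor a reproduction of the analysis in Janes et al. (2004).

```python
import calendar
from datetime import date, timedelta


def restricted_unidirectional(event_day, lags=(7, 14, 21)):
    """Referent days at fixed lags BEFORE the event day only.

    The lags are illustrative assumptions, not taken from any study.
    """
    return [event_day - timedelta(days=lag) for lag in lags]


def time_stratified(event_day):
    """Referent days sharing the event's weekday and calendar month.

    The event day itself is excluded; referents may fall before or after it.
    """
    days_in_month = calendar.monthrange(event_day.year, event_day.month)[1]
    month_days = (
        date(event_day.year, event_day.month, n)
        for n in range(1, days_in_month + 1)
    )
    return [
        d for d in month_days
        if d.weekday() == event_day.weekday() and d != event_day
    ]


# Hypothetical event date, chosen only for illustration.
event = date(1996, 3, 20)
before_only = restricted_unidirectional(event)
stratum = time_stratified(event)
```

In this sketch every referent in `before_only` precedes the event, so any time trend in PM2.5 falls entirely on one side of the comparison, whereas `stratum` contains days on both sides of the event.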
Additionally, this study did not report any PM2.5 results that also controlled for gaseous pollutants. Part IV of my comments explains why this should be another cause for caution in the weight that this study should receive in an evidence-based approach.

Burnett and Goldberg, 2003. This paper is a reanalysis of part of a much more extensive study, Burnett et al. (2000), that is affected by the GAM default setting problem. This is a study of short-term mortality in eight cities in Canada. Across a range of 1-P formulations, this paper does find mostly statistically significant associations between mortality and PM2.5. However, the only results reported are for all eight cities combined, yet alternative smoothing strategies reported in Burnett and Goldberg (2003) – particularly those that allowed the smoothing to be different for each city – substantially reduce the size of the PM2.5 relative risk estimate and also render the PM2.5 association statistically insignificant. The authors note that there is insufficient evidence from this study to conclude that the association varies across the cities. Nevertheless, their results also make it difficult to consider using the 98th percentile across all eight cities as an indicator of what level of 98th percentile in a given city might account for the associations observed in this study.

The 98th percentile of 38.9 μg/m3 reported in Ross and Langstaff (2005) is for all eight cities combined. It is not clear how the cities’ individual daily PM2.5 values were used to develop a single 98th percentile for the combined set, but air quality summary statistics in the original paper (Burnett et al. (2000)) indicate that the 98th percentiles for the eight individual cities likely range between about 27 and 48 μg/m3. If the statistically significant association is being driven by one or more of the cities with higher levels of PM, then the 98th percentile that should be assigned to this study would be higher than 39 μg/m3.
(The opposite could be true as well, but if the associations are being driven by cities with lower rather than higher PM2.5, this would raise yet other important questions for the NAAQS.)

Thus, a primary concern with relying on the data point of 39 μg/m3 from this study to consider where to set a daily standard for PM2.5 is the fact that this value is not comparable to the 98th percentiles for all of the other studies whose findings are being evaluated in this manner.30 There are, however, a number of other issues associated with use of this study that merit mention.

30 Note, for example, that my analysis breaks the results from the Six Cities studies of mortality into their individual city-specific 98th percentiles.

First, the original study did consider the role of PM2.5 in conjunction with gaseous co-pollutants, and those analyses found that PM2.5 and PM10-2.5 together appeared to have much less explanatory power than the gaseous pollutants. These multi-pollutant explorations were not repeated for the reanalysis, so we cannot know if this result would remain when applying correct statistical methods. However, at present, the results EPA is using in its evidence-based approach are strictly from 1-P models. As Part IV of my comments will explain, that in itself is a concern. However, the concern is heightened in this case because earlier analyses suggested that indeed the PM2.5 association in this study is not robust to inclusion of gaseous pollutants.

Second, Montreal is one of the eight cities in this study. The same core group of researchers (Goldberg and Burnett (2003)) could find no statistically significant associations for Montreal alone using similar methodology. Delfino et al. (1997 and 1998) were unable to find any morbidity associations for PM2.5 in Montreal. Toronto is also one of the eight cities and Thurston et al. (1994) could find no PM2.5 association with morbidity there.
These other studies raise concerns about the role of individual cities in driving the combined-city associations reported in Burnett and Goldberg (2003), yet there is no such information available to better assess and understand the implications and robustness of its findings.

Boston results from Six Cities Dataset. Boston is one of the cities for which a statistically significant short-term PM2.5 mortality association was first reported in Schwartz et al. (1996). The current PM2.5 standards were based on the association found for Boston in this dataset. Although the 98th percentile for Boston in this dataset is 42 μg/m3, the combination of an annual standard of 15 μg/m3 and a daily standard of 65 μg/m3 was found to provide the requisite level of public health protection in Boston and elsewhere.

Since 1997, the Boston results have been reanalyzed in Schwartz (2003a) and in Klemm and Mason (2003). The original GAM-based finding was not much affected by reanalysis with correct convergence settings. However, a series of alternative temporal smoothing and linear estimation methods was explored. Alternative degrees of smoothing did produce a progressive decline in the original size of effect, and in the most controlled case, the PM2.5 association was statistically insignificant.31 Nevertheless, this Boston dataset provides one of the lowest 98th percentile levels for which a robust association appears.

A final note that has heightened relevance today compared to 1997 is that the only associations reported from the Boston dataset are single-pollutant formulations. As Part IV of my comments explains, there is substantial evidence available since 1997 to know that 1-P formulations generally overstate the role of PM2.5.

Fairley (2003). This paper is often also cited as one of the reasons to lower the standard, but as can be seen, it actually has a relatively high 98th percentile of 59 μg/m3.
Nevertheless, its annual average PM2.5 level (13.6 μg/m3) is among the lowest of the studies with robustly significant associations, and a closer look at some of the features of this study also is warranted.

31 The pattern of sensitivity in these extra analyses can be observed in Figure 3 above. That figure shows the combined-city result’s sensitivity, but it is mirrored by the Boston city-specific results, which appear to be a key driver of the combined-city effects.

A key attribute of the dataset used for this study is the magnitude of the decline over time in PM2.5 levels that occurs within the period analyzed. Table 4 shows that although the 98th percentile and annual average levels in this dataset were below the current standards when averaged over the entire time period, PM2.5 levels were actually quite high in the earliest years of the study period. There is no discussion or information provided in the paper about the possibility that the PM2.5 associations found in this dataset might be driven by the higher levels in earlier years, or how they vary over time. Few other epidemiological papers address this question, but few of them have relied on data with such pronounced trends in the air quality being associated with health effects.

Table 4. PM2.5 Levels (μg/m3) in Dataset Used in Fairley (2003)
Source: Fairley (1999)

Year    98th percentile    Annual mean
1990    88                 18.4
1991    51                 15.5
1992    48                 13.8
1993    50                 12.9
1994    44                 12.6
1995    32                 10.3
1996    25                 9.5

One other concern associated with this study is the fact that the significant PM2.5 association occurs for same-day PM2.5, but the association is actually quite negative when a 1-day lag is considered. While a true causal association would likely reveal greater effects with some lags than with others, it does raise some concern when a large and robust effect for one lag is accompanied by a complete reversal of the association for another lag that differs by only a single day.
The same-day PM2.5 association is robust in various formulations including gaseous pollutants, but the negative association for a 1-day lag is never again explored. In conclusion, there are a number of significant questions remaining regarding Fairley (2003) that make it a poor candidate as a basis for a tighter daily PM2.5 standard.

Conclusions from the More Complete Evidence-Based Approach

In this section, I have reviewed each of the statements made by EPA in its evidence-based case in the PR and made a number of corrections. I have also identified and incorporated elements of the relevant literature that are missing from the case presented in the PR. Finally, I summarized this information in a graphical format more useful for interpretation and decision making. Having done this, it became apparent that EPA’s arguments for setting the daily standard at 35 μg/m3 are not supported by the evidence. I find that there is in fact no obvious level above which there is a clear “predominance” of significant studies.

As I have explained in Part II, there is no clear case in the risk analysis to tighten the daily standard at all. That finding is probably related to the problem of finding a reasonable level for a standard using an evidence-based approach. Nevertheless, a complete analysis of the current evidence leaves EPA with a quite arbitrary decision on where to draw the line.

Because the choice of where to set the standard cannot be guided by any patterns or trends in the evidence, one is forced to consider the individual merits of just a few key studies that are salient in that they do find statistically significant results at air quality levels below the current standards. My review in this manner leads me to conclude that Peters et al. (2001) poses too many significant methodological concerns to be a basis for a stringent daily standard.
The next lowest study, Burnett and Goldberg (2003), leaves some important methodological questions, but the primary difficulty in relying on this study to set a daily standard is that its reported 98th percentile level is not really a value associated with a single location that has been demonstrated to have a PM2.5-health association. The relevant PM2.5 level to associate with that study is likely above the combined-city level of 39 μg/m3, perhaps in the mid to high 40s.

This brings us to the Boston dataset from the Six Cities study with a 98th percentile of 42 μg/m3. The primary methodological concern with this study is that it relies on only a 1-P formulation (as do the other two discussed above). Ironically, this is the same dataset on which the current standards were based, and since that time, some additional uncertainties associated with its findings have been elucidated in reanalyses. Nevertheless, it remains a study that one could argue would still serve as a lower bound for setting a daily standard. Given that the current standards were set on the basis of this study originally, and given the somewhat eroded robustness of its findings since then, one could reasonably conclude not to tighten the current standard at all. This would be consistent with my overall conclusions that the estimated risks are lower now than in 1997, and that uncertainties with the epidemiological evidence have become heightened since 1997.

IV. EPA Understates the Sensitivity of PM2.5 Risk Associations to Inclusion of Gaseous Co-Pollutants

The PR repeatedly states that the PM2.5 associations with health are “generally robust” to the inclusion of gaseous pollutants in a 2-P model formulation. This statement is offered as a reason that the evidence is stronger for these associations now than in 1997. It is also used as a reason to rely strictly on 1-P model results in the risk analysis, even when 2-P results are available.
Much therefore rests on this statement, yet it is inconsistent with the actual evidence. In this part of my comments, I document how the inclusion of gaseous co-pollutants in a statistical model generally erodes the confidence in any PM2.5 association that might be supported by a 1-P model result.

Short-Term Studies

Following the selection criteria I described in Part III for identifying short-term PM2.5 health effects studies, I identified 34 papers that provided estimates of the association between PM2.5 and one or more health endpoints, ranging from mortality to subtle changes of unknown significance, such as heart rate variability (HRV). Of the 34 papers, 13 reported both 1-P results for PM2.5 and also 2-P results that included at least one gaseous co-pollutant simultaneously with PM2.5.32 Table 5 lists those papers and summarizes the outcomes of the 1-P and 2-P formulations. All but one of those studies did find a statistically significant PM2.5 association in a 1-P formulation. Table 5 shows that in all but two of those cases where the 1-P formulation found a significant association with PM2.5, the PM2.5 association became insignificant in the 2-P model. Additionally, in all but one of those cases, the gaseous co-pollutant remained significant, thus eliminating an argument that the sensitivity of the PM2.5 association must be due to multicollinearity.33 (If multicollinearity were a problem, both pollutants would become insignificant.)

Of the 13 studies that included both 1-P and 2-P results, only two studies (Fairley (2003) for mortality and Gold (2003) for heart rate variability) found a PM2.5 effect that was robust to inclusion of gaseous pollutants. This is quite strong evidence that EPA is incorrectly stating in the PR that 1-P formulations are reasonable to continue to use.
This is an important point because EPA is relying primarily on 1-P results to build its case for the need to tighten the PM2.5 standards, both in the quantitative risk analysis and in its evidence-based approach. Table 6 lists the 20 papers that report only 1-P model results. A majority of them are being used in the PR as part of the evidence in favor of tightening the daily PM2.5 standard.

32 A 14th paper (Zhang et al. 2000) also performed multi-pollutant modeling that included consideration of PM2.5 as well as the gaseous pollutant NO2. However, no results were provided in the paper for PM2.5 in 1-P form so that a comparison of 1-P and 2-P results is not possible. Instead the authors reported that PM2.5 was not significant when considered in combination with all the best predictors, based on a forward stepwise method of choosing explanatory variables.
33 Naeher et al. (1999) is the one case where a multi-pollutant formulation appears to suggest that variance inflation is the root cause of PM2.5’s lost significance, rather than the fact that the gaseous pollutant had greater statistical explanatory power.

Table 5. Summary of Sensitivity of 2-Pollutant Modeling on PM2.5 Health Associations Across All Relevant Studies in the Criteria Document (*)

Paper | City | Effect Estimated | Was any PM2.5 coefficient significant? (1-P) | Was any PM2.5 coefficient significant? (2-P) | Gaseous pollutant signif in 2-P? | Gaseous Pollutant
Delfino et al., 1997 | Montreal | ER visits | Yes | No | Yes | O3
Sheppard, 2003 | Seattle | Hosp adm | Yes | No (**) | Yes | CO
Lipfert et al., 2000a | Philadelphia | Mortality | Yes | No | Yes | O3
Delfino et al., 1996 | San Diego | Asthma symptoms | Yes | No | Yes | O3
Korrick et al., 1998 | NH Mtns | Lung function indicators | Yes | No | Yes | O3
Thurston et al., 1994 | Toronto | Hosp adm | Yes | No | Yes | O3
Moolgavkar, 2003 | Los Angeles | Hosp adm | Yes | No | Yes | CO, NO2
Moolgavkar, 2003 | Los Angeles | Mortality | Yes | No | Yes | CO
Delfino et al., 1998 | Montreal | ER visits | Yes | No | Yes | O3
Peters et al., 2000 | E. Mass | Arrhythmia symptoms | Yes | No | Yes | NO2
Naeher et al., 1999 | SW Virginia | Lung function indicators | Yes | No | No | Several
Fairley, 2003 | Santa Clara Co, CA | Mortality | Yes | Yes | Mixed | NO2, O3, CO
Gold et al., 2003 | Boston | HRV | Yes | Yes | Yes for peak O3 | O3, NO2, SO2
Chock et al., 2000 | Pittsburgh | Mortality | No | No | No | Several

(*) Note: Zhang et al. (2000) also performed multi-pollutant modeling that included consideration of PM2.5 as well as NO2. However, no results were provided in the paper for PM2.5, either alone or in combination with other pollutants; the authors reported that PM2.5 was not significant when considered in combination with all the best predictors, based on a forward stepwise method of choosing explanatory variables. It therefore cannot be included in Table 5.
(**) For Sheppard (2003), the reanalyzed GAM-based 1-P and 2-P results were both significant. However, the GAM code produces a biased standard error that overstates significance levels, and hence EPA states that GLM-based results are viewed as more reliable and should be used when available. The GLM-based 1-P result is significant, while the GLM-based 2-P result in this paper is insignificant (albeit borderline), and the relative risk level is reduced. Additionally, all four of the seasonal coefficients for the 2-P GLM models are insignificant.

Table 6. Summary of Papers in the Criteria Document Reporting PM2.5 Health Associations for 1-P Formulations Only (*)
Paper City Klemm & Mason, 2003; Schwartz, 2003a Boston, MA St.
Louis, MO Knoxville, TN Madison, WI Steubenville, OH Topeka, KS 8 Canadian cities Detroit Burnett & Goldberg, 2003 Ito, 2003 Mar et al., 2003 Smith et al., 2000 Clyde et al., 2000 Schwartz and Neas, 2000 (**) Ostro et al., 2003 Tsai et al., 2000 Peters et al., 2001 Goldberg & Burnett, 2003 Neas et al., 1995 Effect Estimated Mortality Mortality Mortality Mortality Mortality Is Paper Used in the Risk Analysis? X X Mortality Mortality Mortality Hosp adm Mortality Mortality Mortality Resp symptoms Resp symptoms Is Paper Cited in Evidence Based Arguments of Proposed Rule? X X X X X X X X X X X X X X X X X Phoenix Phoenix Phoenix 6 US cities State College, PA Uniontown, PA Coachella, CA Elizabeth, NJ Newark, NJ Camden, NJ Boston Montreal Resp symptoms Mortality Mortality Mortality Mortality Hosp adm Mortality X X X X X X X Uniontown, PA Resp symptoms X Neas et al., 1996 State College, Resp symptoms X PA Tolbert et al., 2000 Atlanta ER visits Linn et al., 1999 Los Angeles Resp symptoms Liao et al., 1999 Baltimore HRV Ostro et al., 1991 Denver Resp symptoms Neas et al., 1999 Philadelphia Resp symptoms X Ostro et al., 2001 So. Calif. Resp symptoms

(*) Note: Some of these studies did include 2-P models in an original paper that was affected by the GAM statistical errors. This summary does not account for any results that were not reanalyzed and reported in HEI (2003).
(**) Grant et al. (2002) list this paper as potentially having a GAM problem, but our review of this paper suggests this is not the case, and so it is listed here.

Flaws in the PR Case. Contrary to the evidence in Table 5, the PR states that recent short-term studies are “generally robust” to inclusion of gaseous co-pollutants.34 I will now explain the flaws in the supporting evidence that the PR offers for this statement.

The first evidence the PR cites is two studies that did not include any measures of PM2.5: Domenici et al. (2003) and Schwartz (2003b).
The PR then cites studies that did include PM2.5, but the PR makes a more equivocal statement: “Similar results are seen in some single-city studies using PM2.5 for some health outcomes in which the single-pollutant model association was statistically significant” (emphasis added).35 EPA cites four papers to support this statement:

• Fairley (2003)
• Ito (2003) for hospital admissions for heart failure and pneumonia
• Sheppard (2003)
• Gold et al. (2003)

As Table 5 shows, Fairley (2003) and Gold (2003) do indeed find robustness to 2-P formulations. However, Sheppard (2003) finds that the association for PM2.5 does become insignificant in a 2-P formulation with CO, when relying on the GLM results rather than the GAM results (as I noted in Table 5). Further, Ito (2003) does not report any 2-P results that included a gaseous pollutant simultaneously with PM2.5, which is why it does not even appear in Table 5.36

As Table 5 shows, Fairley (2003) and Gold (2003) are just two of 13 studies in which both 1-P and 2-P results were reported. While this does qualify as “some” of the studies, when these two studies are placed in context of the total body of evidence, the case for robustness to inclusion of a gaseous pollutant is clearly incorrect.

The PR’s discussion then diverts attention to the impact on the size of the PM2.5 association. First, this ignores the more important implication, which is that the 2-P modeling suggests there may be no causal role of PM2.5 at all in the face of consideration of gaseous pollutants as well. Second, the supporting evidence EPA provides is minuscule: “The size of the effects estimates were little changed in other studies as well in which the single-pollutant model associations were not statistically significant” (emphasis added)37 – and then they cite unspecified insignificant results in Ito (2003), which as already noted does not contain any 2-P model results, and Chock et al. (2000), which happens to be the single study in Table 5 that did not find a significant 1-P effect! It is unsurprising that there would be little change between the 1-P and 2-P effects if the former is insignificant to start with.

34 PR, p. 2634.
35 PR, p. 2634.
36 The original paper that Ito (2003) reanalyzes did find a robust response to 2-P modeling, but reanalyzed 2-P results are not reported in Ito (2003). Ito (2003) at p. 153 does report 2-P results that included PM10 and PM2.5 simultaneously, which is not the type of 2-P that is being discussed here. However, even so, these results reveal that the PM2.5 associations for hospital admissions for heart failure and pneumonia are insignificant in these runs (and also insignificant for all other mortality and morbidity effects reported). The information provided in Figure 7 on Ito’s p. 153 also reveals that the reanalysis with strict convergence for the GAM resulted in insignificance where it had once been significant for pneumonia admissions under the default GAM convergence criteria of the original study.
37 PR, p. 2634.

The PR’s statements regarding the robustness of PM2.5 results to 2-P modeling go on to note that “In yet other studies, however, for some combinations of pollutants in some areas, substantial reductions in the size of the effect estimates for PM2.5 were observed.”38 For this they cite only Moolgavkar (2003), Thurston et al. (1994), and Delfino et al. (1998). Strangely, they only cite the hospitalization impacts in Moolgavkar (2003) even though he reports the same effect for mortality impacts. More importantly, Table 5 reveals that EPA could have cited seven additional papers to support this statement: Delfino et al. (1997), Delfino et al. (1996), Korrick et al. (1998), Lipfert et al. (2000a), Peters et al. (2000), Naeher et al. (1999), and (arguably) Sheppard (2003).
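The counts drawn from Table 5 in this part (13 papers reporting both 1-P and 2-P results, ten in which a significant 1-P PM2.5 association lost significance, two in which it was robust) can be tallied mechanically. The sketch below is illustrative only and is not part of the original analysis: the Python names are hypothetical, the rows are a paper-level transcription of Table 5, and the gaseous-pollutant entries for Fairley (2003) and Gold et al. (2003) are left unset because they are not needed for these counts.

```python
# Paper-level transcription of Table 5 (illustrative; all names are hypothetical).
# Each entry: (paper,
#              PM2.5 significant in 1-P?,
#              PM2.5 significant in 2-P?,
#              gaseous pollutant significant in 2-P? (None = not used here))
TABLE_5 = [
    ("Delfino et al., 1997", True, False, True),
    ("Sheppard, 2003", True, False, True),   # 2-P outcome based on GLM results
    ("Lipfert et al., 2000a", True, False, True),
    ("Delfino et al., 1996", True, False, True),
    ("Korrick et al., 1998", True, False, True),
    ("Thurston et al., 1994", True, False, True),
    ("Moolgavkar, 2003", True, False, True),
    ("Delfino et al., 1998", True, False, True),
    ("Peters et al., 2000", True, False, True),
    ("Naeher et al., 1999", True, False, False),
    ("Fairley, 2003", True, True, None),
    ("Gold et al., 2003", True, True, None),
    ("Chock et al., 2000", False, False, False),
]


def tally(rows):
    """Count how PM2.5 significance changes between 1-P and 2-P models."""
    sig_1p = [r for r in rows if r[1]]
    lost = [r for r in sig_1p if not r[2]]      # significant alone, not with a gas
    robust = [r for r in sig_1p if r[2]]        # significant in both formulations
    gas_still_sig = [r for r in lost if r[3]]   # gas significant where PM2.5 was not
    return {
        "papers": len(rows),
        "significant_1p": len(sig_1p),
        "lost_in_2p": len(lost),
        "robust_in_2p": len(robust),
        "gas_significant_when_pm_lost": len(gas_still_sig),
    }


summary = tally(TABLE_5)
print(summary)
```

Under this transcription, the three papers the PR cites for reduced PM2.5 effects and the seven additional papers listed above together make up the "lost_in_2p" group.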
EPA concludes by attempting to diminish the import of any findings based on 2-P models by stating that “collinearity between co-pollutants can make interpretation of such multipollutant model results difficult.”39 This is a baseless criticism, which may be why EPA provides no supporting citations. If collinearity were the cause of the sensitivity of the model results to 2-P formulations, then both the PM2.5 and gaseous pollutant coefficients would face reduction in significance levels. However, Table 5 reveals that the gaseous pollutant remained significant in 9 of the 10 papers where the PM2.5 effect was significant in the 1-P case but rendered insignificant in a 2-P case with a gaseous pollutant. Only the Naeher et al. (1999) paper would appear to reflect a problem of collinearity rather than a case of a gaseous co-pollutant having the more important explanatory power.

Thus, EPA has made an incorrect case that PM2.5 associations are generally robust to inclusion of gaseous co-pollutants through the compounded effects of:

1. Referring to PM10-only results;
2. Incorrectly citing two of four papers as evidence in favor of robustness;
3. Citing only three of ten papers that provide evidence of non-robustness; and
4. Arguing incorrectly that a statistical difficulty of collinearity complicates the findings of non-robustness, when the actual evidence reveals that to be a potential problem for only one of the ten papers finding non-robustness.

When all of the evidence is considered, the short-term studies reveal a substantial concern that a spurious association may be found for PM2.5 if one relies on 1-P results. However, EPA’s risk analysis and most of the figures in the PR showing results from individual papers rely solely on 1-P results, even when 2-P results are available.40 Further, Table 6 reveals that the majority of PM2.5 short-term epidemiological papers provide only 1-P results for PM2.5 associations.

38 PR, p. 2634.
39 PR, p. 2634.
40 Cases where the risk analysis relies on 1-P results but 2-P results are available are Moolgavkar (2003) for both mortality and morbidity in Los Angeles, Lipfert et al. (2000a) for mortality in Philadelphia, Fairley (2003) for mortality in San Jose, Chock et al. (2000) for mortality in Pittsburgh, and Sheppard (2003) for asthma in Seattle.

Long-Term Studies

It is also noteworthy that the long-term study on which EPA has based its quantitative risk analysis, that of the ACS cohort, has a well-established sensitivity to the inclusion of SO2 in 2-P formulations with PM2.5. Krewski et al. (2000) clearly demonstrated this was an issue with the 1-P results first reported by Pope et al. (1995):

“We observed a stronger association between sulfur dioxide levels and mortality from all causes in the ACS Study than between either fine particles or sulfate and all-cause mortality….The fact that sulfur dioxide was a stronger predictor of mortality than was sulfate does not appear to be due to the larger number of sulfur dioxide measurements. …The sulfur dioxide effect on mortality risk was diminished for the best-educated subjects, a pattern we also observed with exposure to fine particles and sulfate. However, the sulfur dioxide effect, unlike the fine particle effect, was not the strongest for the least-educated subjects.”41

The inclusion of SO2 dramatically reduced the size of the estimated relative risk for PM2.5, and rendered the PM2.5 association statistically insignificant. “The inclusion of sulfur dioxide, which has a positive association with mortality (RR=1.30, 95% CI: 1.23-1.38) … reduces the relative risk [for sulfate] from 1.16 to 1.04…[and] loss of formal statistical significance.”42 “The relative risk of all-cause mortality, as with sulfate, was diminished after adjustment for …sulfur dioxide (Table 37).”43 Table 37 reveals that the PM2.5 association falls from RR = 1.20 in the 1-P case to 1.03 in the 2-P case with SO2.
Additionally it goes from strong statistical significance in the 1-P case (CI: 1.11-1.29) to insignificance in the 2-P case (CI: 0.95-1.13).44

Despite this widely discussed finding, the extended analyses of Pope et al. (2002) do not report any 2-P results for PM2.5 and SO2, even though they do report that SO2 has a significant association in its own 1-P formulation.45 Importantly, EPA’s risk analysis continues to rely on only 1-P results from ACS cohort studies, including using the 1-P rather than 2-P result that is available in Krewski et al. (2000). (The 2-P result for SO2 is discussed only in a sensitivity analysis, combined with 2-P results for other gaseous pollutants that were not significant.) While there may be reasons, as EPA notes, to question the causality of the SO2 association, that does not justify ignoring the observed sensitivity of the long-term mortality association to inclusion of this gaseous pollutant.

The other long-term study that EPA has used in its risk analysis, and uses in its evidence-based discussions, is the “Six Cities Study” of Dockery et al. (1993), as reanalyzed by Krewski et al. (2000). This study, because of its very small number of cities, would not allow 2-P formulations to be tested. Technically speaking, this “was not practical because of the limited number of degrees of freedom (at most 6 df) for further analyses. The ACS Study, which involved 154 cities with a wide range of pollutant concentration profiles, was not seriously affected by this limitation.”46

41 Krewski et al. (2000), p. 224.
42 Krewski et al. (2000), p. 179.
43 Krewski et al. (2000), p. 181.
44 Krewski et al. (2000), p. 184.
45 Pope et al. (2002), p. 1140.

There are other long-term mortality studies that EPA does not rely on to build its case for tightening the PM2.5 standard that also consider the impact of 1-P and 2-P formulations. Lipfert et al.
(2000b) found no significant effect for PM2.5 and mortality in any of a number of 1-P formulations.47 They did find significant 1-P associations for O3 and NO2, which they subjected to 2-P formulations to check for robustness. These 2-P formulations (which used other PM indicators than PM2.5) found that the gaseous pollutant associations were robust, and further, that O3 appeared to be the dominant gaseous pollutant over NO2.48

Flaws in the PR Case. I have already demonstrated significant flaws in the PR case that short-term PM2.5 associations are robust to inclusion of gaseous co-pollutants. The PR briefly extends that argument to long-term studies with the single sentence: “Further, associations between long-term exposure to PM2.5 and mortality were not generally sensitive to inclusion of co-pollutants, with the notable exception of the inclusion of SO2 in multipollutant models used in the reanalysis of the ACS study.”49

This also is an incorrect summary of the evidence. The PR does recognize the non-robustness in the ACS study, but does not acknowledge that this is the only long-term mortality paper that EPA is relying on that reports any 2-P results. Its sensitivity to 2-P formulations is therefore a major concern. The PR also fails to recognize that gaseous co-pollutants had the robust association in the only other long-term mortality study that considered any 2-P formulations.

46 Krewski et al. (2000), p. 216.
47 In fact, almost all of the results find negative associations for PM2.5 and mortality, most of which are statistically significant in the negative direction. This set of results alone should be given some weight in EPA’s risk analysis and evidence-based approach, but is not.
48 Lipfert et al. (2000b), pp. 60-61.
49 PR, p. 2634.
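The significance statements made above for the ACS relative risks rest on a simple convention: a relative risk is treated as statistically significant at the 5% level when its 95% confidence interval excludes 1.0. A minimal sketch of that check, using the Krewski et al. (2000) values quoted in the Long-Term Studies discussion, is given below. The function name is hypothetical, and the check is the standard convention rather than a procedure described in the PR.

```python
def excludes_one(ci_low, ci_high):
    """True when a 95% CI for a relative risk lies entirely above or below 1.0."""
    return ci_low > 1.0 or ci_high < 1.0


# (label, RR, CI low, CI high), as quoted from Krewski et al. (2000)
estimates = [
    ("PM2.5, 1-P", 1.20, 1.11, 1.29),
    ("PM2.5, 2-P with SO2", 1.03, 0.95, 1.13),
    ("SO2", 1.30, 1.23, 1.38),
]

for label, rr, lo, hi in estimates:
    verdict = "significant" if excludes_one(lo, hi) else "not significant"
    print(f"{label}: RR={rr:.2f} (95% CI {lo:.2f}-{hi:.2f}) -> {verdict}")
```

The PM2.5 interval moves from lying wholly above 1.0 in the 1-P case to straddling 1.0 once SO2 is included, which is the loss of significance described in the text.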
Appendix A

Explanation of Basis for Robustness Ratings of PM2.5 Studies

This appendix details the rationale used for summarizing the studies listed in Table 3 of my report in accordance with the classification system outlined as follows.

I reviewed each study to determine the general significance level that it found for the PM2.5 association specifically. I ranked them into one of three categories: “no overall significant association,” “mixed significance” and “overall significant association.” By “overall significant association,” I mean that a majority of the regressions in the paper produced statistically significant associations. If a 2-P result is provided, it must also be statistically significant to be placed in this category, unless there is evidence of multicollinearity problems in the 2-P model.1 A ranking of “no overall significant association” was assigned if the majority of the results in the paper are insignificant even if a statistically significant result exists in the paper. If there is only one 1-P and one 2-P result reported, and the 2-P is insignificant, I assigned it to this category, unless there is evidence of a multicollinearity problem in the 2-P result.

Table A.1 summarizes results in short-term mortality studies that included PM2.5. Table A.2 summarizes results in short-term morbidity studies (including hospital admissions and emergency room visits). Table A.3 summarizes results in studies of associations between symptoms and short-term PM2.5 exposures.

In the ratings that appear in these tables the following numbering system is used:

1 = No overall statistically significant association in the study.
2 = Mixed statistical significance among models reported in the study.
3 = Statistically significant associations are the overall pattern in the study.
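The multicollinearity exception used in these rankings can be written as a simple test, following the definition given in the footnote to this appendix: evidence of multicollinearity exists when the PM and gaseous pollutant are each significant in their own 1-P formulations but both become insignificant in the 2-P formulation. The sketch below is illustrative only; the function and argument names are hypothetical and do not come from the original analysis.

```python
def suggests_multicollinearity(pm_1p_sig, gas_1p_sig, pm_2p_sig, gas_2p_sig):
    """Apply the appendix footnote's rule for evidence of multicollinearity.

    Returns True only when both pollutants are significant alone (1-P)
    and both lose significance when modeled together (2-P).
    """
    both_significant_alone = pm_1p_sig and gas_1p_sig
    both_lost_together = (not pm_2p_sig) and (not gas_2p_sig)
    return both_significant_alone and both_lost_together


# Both pollutants significant alone, neither significant together:
# consistent with multicollinearity under the footnote's rule.
print(suggests_multicollinearity(True, True, False, False))

# PM2.5 loses significance but the gas keeps it: the rule is not met,
# so the loss is attributed to the gas having more explanatory power.
print(suggests_multicollinearity(True, True, False, True))
```

Only the first pattern triggers the exception in the ranking rules; the second pattern leaves an insignificant 2-P PM2.5 result standing against the study.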
1 Such evidence exists when both the PM and gaseous pollutant would become insignificant in a 2-P formulation even though both are significant in their respective 1-P formulations.

Table A.1 – Mortality Studies

Study: Ostro et al. (2003) – Coachella Valley, CA
Rating: 1 | Mean (μg/m3): 15.8 | 98th Percentile (μg/m3): 33.4
Discussion: Ostro et al. is a reanalysis of an earlier paper measuring the relationship between pollutants and mortality using a GAM model with stricter convergence and a separate GLM analysis. Authors state: “No association with cardiovascular mortality was found for PM2.5 or any of the gaseous pollutants” (p 201) and “[a]s in the original analysis, associations were detected for 0-day, 1-day, and 2-day lags for both PM10 and coarse particles, but not for PM2.5.” (p 202). Accordingly, given the lack of significance, we have rated this measured association as a “1”.

Study: Schwartz (2003), Klemm and Mason (2003) – Knoxville, TN
Rating: 1 | Mean (μg/m3): 20.8 | 98th Percentile (μg/m3): 43.5
Discussion: Schwartz provides 5 mortality risk estimates for PM2.5 in Knoxville using a single-pollutant specification and finds two to be significant. Schwartz did not conduct a multi-pollutant analysis. Klemm and Mason have made risk estimates for the identical dataset using additional smoothing methods and find no significance in the risk estimates for Knoxville using a GLM model. Accordingly, given the lack of significance, we have rated this measured association as a “1”.

Study: Schwartz (2003), Klemm and Mason (2003) – Steubenville, OH
Rating: 2 | Mean (μg/m3): 29.6 | 98th Percentile (μg/m3): 81.5
Discussion: Schwartz provides 5 mortality risk estimates for PM2.5 in Steubenville using a single-pollutant specification and none were found to be significant. Schwartz did not conduct a multi-pollutant analysis. Klemm and Mason conducted risk estimates for the identical dataset, but using additional smoothing methods and similarly find no significance in the total risk estimates for Steubenville. However, using a GLM model Klemm and Mason find that PM2.5 has a significant association with COPD (3 instances) and with pneumonia in a single instance. Accordingly, given the significant association, specifically with COPD, we have rated this measured association as a “2”.

Study: Schwartz (2003), Klemm and Mason (2003) – Madison, WI
Rating: 1 | Mean (μg/m3): 11.2 | 98th Percentile (μg/m3): 34.3
Discussion: Schwartz provides 5 mortality risk estimates for PM2.5 in Madison using a single-pollutant specification and none were found to be significant. Schwartz did not conduct a multi-pollutant analysis. Klemm and Mason conducted risk estimates for the identical dataset, but using additional smoothing methods and similarly find no significance in the total risk estimates for Madison. Accordingly, given the lack of significance, we have rated this measured association as a “1”.

Study: Schwartz (2003), Klemm and Mason (2003) – Topeka, KS
Rating: 1 | Mean (μg/m3): 12.2 | 98th Percentile (μg/m3): 32.0
Discussion: Schwartz provides 5 mortality risk estimates for PM2.5 in Topeka using a single-pollutant specification and none were found to be significant. Schwartz did not conduct a multi-pollutant analysis. Klemm and Mason conducted risk estimates for the identical dataset, but using additional smoothing methods and similarly find no significance in the risk estimates for Topeka. Accordingly, given the lack of significance, we have rated this measured association as a “1”.

Study: Ito (2003) – Detroit, MI
Rating: 1 | Mean (μg/m3): 18.0 | 98th Percentile (μg/m3): 55.2
Discussion: Ito contains six mortality risk estimates for PM2.5 in Detroit in a single-pollutant specification and all six are found to be insignificant. Although not reported in the paper, the authors estimated PM2.5 risks with many other models and a large portion of them apparently produced negative findings. Ito does not present a two-pollutant analysis. Given the lack of significance, we have rated this measured association as a “1”.

Study: Tsai et al. (2000) – Elizabeth, NJ
Rating: 1 | Mean (μg/m3): 37.1 | 98th Percentile (μg/m3): 84.0
Discussion: Authors explore the association between pollutants and total deaths and cardiovascular deaths in Elizabeth, NJ. Authors find no statistical association between FPM and the two death measures. Authors also investigated whether there was any association between deaths and FPM arising from various industries. They found no significant association between FPM and total deaths in all industries considered, though they did find significance between FPM and two of the three industrial classifications for cardiorespiratory deaths. Given the overall lack of significance, we have rated this measured association as a “1”.

Study: Moolgavkar (2003) – Los Angeles, CA
Rating: 1 | Mean (μg/m3): 22.0 | 98th Percentile (μg/m3): 60.4
Discussion: This paper contains 67 mortality risk estimates for PM2.5 in Los Angeles, plus a large number for other pollutants. Of these estimates for PM2.5, only six were found to be significant. A key point of the paper was to explore the relative roles of PM2.5 and other pollutants. Seven of the estimates were from two-pollutant models using CO and five of those seven were found to be insignificant for PM2.5. In contrast, CO was considerably more likely to be significant, where half of the 12 two-pollutant formulations that included CO were found to be significantly positive for CO. The author stated, “it is clear that CO was the best single index of air pollution associations with health endpoints, far better than … PM2.5.” (p 198) Given PM2.5’s lack of significance, we have rated this measured association as a “1”.

Study: Chock, Winkler, and Chen (2000) – Pittsburgh, PA
Rating: 1 | Mean (μg/m3): 20.5 | 98th Percentile (μg/m3): 56.3
Discussion: This paper is primarily about PM10 risks as the authors downplay the usefulness of the PM2.5 results due to data limitations. The paper contains eight risk estimates for PM2.5 and all are found to be insignificant. The authors included other pollutants to explore their relative role and all other pollutants were found to also be insignificant. Given PM2.5’s lack of significance as well as the lack of conclusions that can be drawn from these results, we have rated this measured association as a “1”.

Study: Schwartz (2003), Klemm and Mason (2003) – Boston, MA
Rating: 3 | Mean (μg/m3): 15.7 | 98th Percentile (μg/m3): 42.0
Discussion: Schwartz provides 5 mortality risk estimates for PM2.5 in Boston using a single-pollutant specification and all five were found to be significant. Schwartz did not conduct a multi-pollutant analysis. Klemm and Mason conducted risk estimates for the identical dataset, but using additional smoothing methods and find less significance in the risk estimates for Boston than Schwartz. Klemm and Mason conduct additional estimates for total mortality risk and various sub categories using increasing degrees of statistical controls and find much lower and insignificant estimates in the most controlled formulation. However, given the overall high degree of statistical significance, we have rated this measured association as a “3”.

Study: Schwartz (2003), Klemm and Mason (2003) – St. Louis, MO
Rating: 2 | Mean (μg/m3): 18.7 | 98th Percentile (μg/m3): 43.6
Discussion: Schwartz provides 5 mortality risk estimates for PM2.5 in St. Louis using a single-pollutant specification and all five were found to be significant. Schwartz did not conduct a multi-pollutant analysis. Klemm and Mason conducted risk estimates for the identical dataset, but using additional smoothing methods and find less significance in the risk estimates for St. Louis than Schwartz. Klemm and Mason conducted additional estimates for total mortality risk and various sub categories using increasing degrees of statistical controls and find much lower and insignificant estimates in the most controlled formulation. Further, the vast majority of the estimates that do not use the GAM method were found to be insignificant. Accordingly, given the uncertainty over the appropriate level of statistical controls and the findings of the models not using uncorrected GAM, we have rated this measured association as a “2”.

Study: Lipfert et al. (2000) – Philadelphia, PA
Rating: 2 | Mean (μg/m3): 17.3 | 98th Percentile (μg/m3): 44.2
Discussion: This study provides 30 risk estimates for PM2.5 mortality risk. Of these 30 estimates, 14 effects were found to be significant. A key point of the paper was to explore the relative roles of PM2.5 and other pollutants. Seven of the estimates were from two-pollutant models using ozone and all were found to be insignificant for PM2.5 while the risk estimates for ozone were all significant, causing the authors to state: “The most immediate implications of this study are the apparent variability of the PM results and the robustness of peak O3 as predictors of daily mortality.” Due to these mixed findings, we have rated this measured association as a “2”.

Study: Tsai et al. (2000) – Newark, NJ
Rating: 2 | Mean (μg/m3): 42.1 | 98th Percentile (μg/m3): 94.2
Discussion: Authors explore the association between pollutants and total deaths and cardiovascular deaths in Newark, NJ. Authors found a statistical association between FPM and the two death measures in Newark. Authors also investigated whether there was any association between deaths and FPM from various industries. They found a significant association between FPM and total deaths in 3 of the 7 industries considered, and in one industry when considering cardiorespiratory death. Given the overall level of significance combined with somewhat mixed findings at the industrial level, we have rated this measured association as a “2”.

Table A.1 – Mortality Studies continued Study Tsai et al. (2000) – Camden, NJ Mar et al.
(2003) – Phoenix, AZ Fairley (2003) – Santa Clara County, CA Goldberg and Burnett (2003) – Montreal, Canada Rating 2 2 3 1 Mean (μg/m3) 39.9 13.5 13.6 17.4 98th Percentile (μg/m3) Discussion 84.4 Authors explore the association between pollutants and total deaths and cardiovascular deaths in Camden, NJ. Authors find a statistical association between FPM and the two death measures in Camden. Authors also investigated whether there was any association between deaths and FPM from various industries. They found a significant association between FPM and totals deaths in two of the seven industries considered, and in two industries when considering cardiorespiratory death. Given the over-all level of significance combined with somewhat mixed findings at the industrial level, we have rated this measured association as a “2”. 32.2 Mar et al. provide ten risk estimates for PM2.5 under a single-pollutant specification. Of these ten estimates, three were found to be significant. The authors considered the case of other pollutants (e.g. CO, NO2, SO2) and found similar results. The paper does not provide any risk estimates for two-pollutant formulations. Accordingly, due to these mixed findings, we have rated this measured association as a “2”. 59.0 This paper provides 28 risk estimates for PM2.5 and of these estimates, 16 were found to be significant. Ten of these estimates are from single-pollutant formulations, two of which are significant. This paper also explores the relative roles of PM2.5 and other pollutants in two-pollutant models. The eighteen two-pollutant models started from a single-pollutant formulation that was significant for PM2.5. PM2.5 was found to be significant in 16 of the resulting two-pollutant specifications. The paper also explores seasonal patters in risk estimates and finds no pattern. On the strength of the findings of significance between PM2.5 and mortality, even in 2-P models, we have rated this measured association as a “3”. 
53.1 This paper of is a reanalysis of an earlier set of papers that used the GAM method, and the results in the current paper differ dramatically from the prior studies. The paper presents 16 estimates for an association between PM2.5 and non-accidental mortality using natural spline smoothing and all 16 are found to be insignificant. The paper also presents 36 estimates for an association between PM2.5 and various sub-groups of mortality, again using natural spline smoothing. Nearly all of these estimates, 35, are insignificant. The paper does not provide any multi-pollutant analyses. Given the complete lack of significance detailed in this paper, we have rated this measured association as a “1”. 46 CRA International Table A.1 – Mortality Studies continued Study Burnett and Goldberg (2003) Rating 3 Mean (μg/m3) 13.3 98th Percentile (μg/m3) 38.9 Discussion This paper is a reanalysis of an earlier paper by Burnett et al. (2000) though it is more limited in scope. The updated study describes 14 reanalyzed mortality risk estimates for PM2.5 pooled over all 8 cities. Two of these estimates used the GAM as in the original paper and were found to be significant. The remaining 12 included variations on the degree of smoothing and on the data included in the analysis, and in most of these cases PM2.5 was found to be significant. The paper does not provide any multi-pollutant analyses. On the strength of the findings of a significant association between PM2.5 and mortality, we have rated this measured association as a “3”. 47 CRA International Table A.2 – Morbidity Studies: Hospital Admissions/Emergency Room Visits Study Delfino et al. (1997) – Montreal, Canada Peters et al. (2001) – Boston, MA Sheppard (2003) – Seattle, WA Rating Mean (μg/m3) 98th Percentile (μg/m3) 1 12.1 31.2 3 2 12.1 16.7 Discussion Delfino et al. studied the association between emergency room visits for respiratory illnesses and air pollution in Montreal, Canada. 
Authors indicate that “[t]here were no significant associations between air pollutants and ER visits for patients in the age groups from 2 to 64 yr of age[.]” (p 571) PM2.5 with a one-day lag was shown to be significantly associated with ER visits for individuals over the age of 64 in a single-pollutant model, though this significance failed to be robust in a two-pollutant case with ozone included. In a separate analysis that controlled for weather and day-of-week trends, PM2.5 with a oneday lag was found to be significantly associated with an increase in the mean level of ER visit for individuals over the age of 64 in a single-pollutant case. Because PM2.5’s significance in a single-pollutant model was found not to be robust to a multi-pollutant specification, we have rated this measured association as a “1”. 28.2 Peters et al. examined the association between particulate air pollution and the onset of myocardial infarction (MI) for 772 patients in Boston, MA. Authors find that there is a mild significant association in a single-pollutant model between PM2.5 with a 1 and 2 hour lag and the probability of an onset of MI. No significance was found for four other lags considered or when using a 24 hour average PM2.5 level. PM2.5 was also found to be significant when the impact of 2 hour lag values and 24 hour average values were jointly considered. Because PM2.5 was found to be significantly associated with the onset of MI under multiple modeling assumptions, we have rated this measured association as a “3”. 46.6 This study is a reexamination of results in a prior study focusing on the association between air pollution and asthma hospital admissions in Seattle, WA. Sheppard uses a GAM analysis with a more stringent default convergence criteria and also reports the results of a GLM analysis. In a single-pollutant analysis, there was a significant association between PM2.5 and hospital admissions under a one day lag. 
Under a twopollutant analysis, PM2.5 appears to be only moderately significant and not significant under analyses of individual seasons. Because PM2.5 was found to be significant in a single-pollutant setting and moderately significant for some portion of the two-pollutant analysis, we have rated this measured association as a “2”. 48 Table A.2 – Morbidity Studies: Hospital Admissions/Emergency Room Visits continued Study Ito (2003) – Detroit, MI Thurston et al. (1994) – Toronto, Canada Tolbert et al. (2000) – Atlanta, GA 2 Rating Mean (μg/m3) 98th Percentile (μg/m3) 3 18.0 55.2 1 1 18.6 2 19.4 CRA International Discussion This study is a reexamination of results in a prior study focusing on the association between air pollution and elderly hospital admissions in Detroit, MI. Ito uses a GAM analysis with a more stringent default convergence criteria and also reports the results of a GLM analysis. Ito restricts attention to single-pollutant analyses and reports findings at their most significant lag period. Ito finds PM2.5 to be significantly associated with hospital admissions for both pneumonia and heart failure. PM2.5 was not found to be significant for admissions for stroke, dysrhythmia, ischemic heart disease, and COPD. Because PM2.5 was found to be significantly associated with certain categories of hospital visits and no evidence was presented that contradicts those significant associations, we have rated this measured association as a “3”. 51.0 Thurston et al. reports the results of an analysis of the air pollutants and daily hospital admissions in Toronto Canada. Authors find that there is a significant association between PM2.5 and total respiratory admissions in a single-pollutant analysis, but that PM2.5 loses its significance when included with O3 in a two-pollutant case, while ozone remains significant. 
Authors state “This points to the importance of considering as many pollutants as possible … in order to diminish the chances of being misled as to which of the many ambient air pollutants is actually culpable for any noted air pollution-health effects associations.” (p 282). PM2.5 was found not to be significant in either a singlepollutant or a two-pollutant case for total asthma admissions. Because PM2.5’s significance in a single-pollutant model was found not to be robust to a multi-pollutant specification, we have rated this measured association as a “1”. 41.5 The study presents the results of an GLM estimation of the impact of pollutants on adult asthma, COPD, dysrhythmia, and all-CVDs in Atlanta GA. Authors found an aggregated PM2.5 measure not to be significantly associated with any of the these four morbidity measures. Authors conducted a separate analysis of PM2.5 decomposed into five subclassifications – for asthma and COPD (ten scenarios in total) and none of these five PM2.5 definitions were found to be significant. For dysrhythmia and all-CVDs there were only three instances (out of ten total) where these PM2.5 definitions were significant. Given the limited degree of significance for PM2.5, we have rated this measured association as a “1”. 18.6 is the average of the means for three years of data reported by the authors. 49 Table A.2 – Morbidity Studies: Hospital Admissions/Emergency Room Visits continued Study Moolgavkar (2003) – Los Angeles, CA Delfino et al. (1998) – Montreal, Canada Rating Mean (μg/m3) 98th Percentile (μg/m3) 1 22.0 60.4 1 18.3 40.7 CRA International Discussion The study is a reanalysis of results in three prior papers focusing on the association between air pollution and hospital admissions in Los Angeles. Moolgavkar uses a GAM analysis with a more stringent default convergence criteria and also reports the results of a GLM analysis. The morbidity endpoints considered were the impact on CVD and COPD. 
Under single-pollutant analyses, PM2.5 has a significant impact on daily CVD admissions with lags of zero and one day for both the restricted GAM and GLM models. However, this association for both lag days is eliminated in a two-pollutant case when CO is included in the analysis. The author finds similar results for COPD admissions. Thus, the significance of PM2.5 is not robust to the inclusion of additional significant explanatory variables and we have, therefore, rated this measured association as a “1”. This study focuses on the association between respiratory illnesses and air pollutants among the elderly in Montreal Canada. The authors find that PM2.5 has no significant association with respiratory illness emergency room visits in any of the 3 cases considered (one single-pollutant case and two two-pollutant cases) causing the authors to remark: “Although there were adverse effects of estimated PM2.5 on ER visits for respiratory illnesses among the elderly, the association was unstable and completely confounded by both temperature and O3” (p. 74) Accordingly, we have rated this measured association as a “1”. 50 CRA International Table A.3 – Morbidity Studies: Symptoms Study Peters et al. (2000) – Eastern Massachusetts Korrick et al. (1998) – New Hampshire Naeher et al. (1999) – Virginia Rating 1 2 2 Mean (μg/m3) 12.7 15.0 21.3 98th Percentile (μg/m3) Discussion 31.7 Peters et al. studied the association between air pollution and defibrillator discharges among 100 patients with such devices in eastern Massachusetts. For the six patients with ten or more discharge events, PM2.5 was significantly associated with discharges under a two day lag. No significance was found for same day, prior day, or three day lags or under a five day average pollutant level. For the 33 patients with at least one discharge event, PM2.5 was found not to be significant under any lag structure. Authors found NO2 to be significant under roughly half of the lag structures modeled. 
“Including both [PM2.5 and NO2] into one model reduced the effect estimate of PM2.5 effectively to 0, whereas the effect estimate of NO2 was unchanged.” Accordingly, we have rated this measured association as a “1”. 41.2 This study measures the effects of ozone, PM2.5, and aerosol acidity on the pulmonary function of exercising adults in New Hampshire. The authors use mortality endpoints measuring expiratory volume and flow. In the single-pollutant case, the authors found PM2.5 to be significantly associated with three expiratory measures in the base case and in the adjusted case where age, sex, smoking status, etc are controlled for. However, all of these significant associations are eliminated in the two-pollutant case when ozone is included in the analysis. At the same time, separate multi-pollutant regressions of ozone, PM2.5, and acidity generate no significant association for ozone on pulmonary functions. We have rated this measured association as a “2”. 45.1 This study measures the effect of air pollution on daily changes in peak expiratory flow (PEF) in Virginia at two times during the day (morning and evening). In a single-pollutant analysis, the authors found PM2.5 to be significantly associated with morning PEF only as a same-day effect. PM2.5 was found to be marginally significant using a one day lag and using a 3 day average value. PM2.5 was found not to be significant for evening PEF. Under a multi-pollutant analysis for the morning PEF, PM2.5 was not found to be significant under any lag measure, prompting the authors to write: “When the main effects (H+ and PM2.5 or PM10) were analyzed as individual main effects in multivariate models related to morning PEF, none remained significant in the presence of any other.” (p 120) Even though the significance of PM2.5 in the single-pollutant case was not found to be robust to a multi-pollutant specification, nor were the other pollutants found to be significant when considered in a multi-pollutant analysis. 
Thus, we have rated this measured association as a “2”. 51 CRA International Table A.3 – Morbidity Studies: Symptoms continued Study Ostro et al. (1991) – Denver, CO Neas et al. (1999) – Philadelphia, PA Neas et al. (1996) Schwartz and Neas (2000) – State College, PA Neas et al. (1995) Schwartz and Neas (2000) – Uniontown, PA Rating 1 1 2 3 Mean (μg/m3) 22.0 22.2 23.5 24.5 98th Percentile (μg/m3) Discussion 60.3 Ostro et al. measures the association between acidic aerosols and asthmatic response measured by general asthma status, cough, and shortness of breath. Authors found that PM2.5 was not significantly associated with any of these health measures. PM2.5 was found to be only moderately significant with respect to general asthma status when adjustments were made to control for missing values of PM2.5 in the data, prompting authors to state: “This study suggests that hydrogen ion is the pollutant of primary concern” and that “concentrations of particulate matter less than 2.5 microns were mildly associated with overall asthma ratings, but not with either cough or shortness of breath.” (p 699) Accordingly, we have rated this measured association as a “1”. 44.9 Neas et al. measures the association between fine particulate matter and Peak Expiratory Flow Rate (PEFR) in children in Philadelphia, PA. Authors found under two standardized measures of PM2.5 (24-hour average and 5-day moving average) that PM2.5 was not significantly associated with reductions in PEFR in neither the morning or afternoon readings. Accordingly, we have rated this measured association as a “1”. 69.0 Neas et al. measures the association between summertime haze episodes and PEFR and the incidence of various respiratory symptoms (cough, wheeze, or cold) in 108 children in State College, PA. Fine particulate matter was found to be significantly associated with the probability of experiencing a same-day cold but not on the probability of experience a same-day wheeze or cough. 
Under a one day lag, fine particulate matter was found to be significantly associated with the probability of a cough, but not of a wheeze or cold. Under both the same-day or one-day lag scenarios, fine particulate matter was not found to be significantly associated with a deviation in PEFR. Schwartz and Neas found PM2.1 to be nearly significantly associated with PEFR. Given these mixed findings, we have rated this measured association as a “2”. 60.0 Neas et al. measures the association between air pollutants and PEFR in 83 children in Uniontown, PA. Authors found that fine particulate matter was significantly associated with an increase in the probability of evening cough episodes. This continued to be the case under a separate scenario in which the amount of hours spent outdoors was controlled for. Fine particulate matter was nearly significantly associated with a decrease in the mean deviation in PEFR when time spent outdoors was controlled for but not significant otherwise. Schwartz and Neas found PM2.1 to be significantly associated with PEFR. Because PM2.5 was found to be significantly associated with the onset of a cough and there was no evidence provided that contradicted this result, we have rated this measured association as a “3”. 52 CRA International Table A.3 – Morbidity Studies: Symptoms continued Study Ostro et al. (2001) – Los Angeles, CA Zhang et al. (2000) – Virginia Delfino et al. (1996) – San Diego, CA Schwartz and Neas (2000) – Six Cities 3 Rating 3 1 1 3 Mean (μg/m3) 40.8 Not Provided 24.8 18.0 3 98th Percentile (μg/m3) Discussion 112.0 Ostro et al. measures the association between air pollution and asthma exacerbation measured as shortness of breath, cough, wheeze (symptoms), and the use of extra asthma medicine among African-American children in Los Angeles. For the asthma symptoms, authors conducted two analyses for the daily probability of each symptom and the probably of a new onset of a symptom episode, respectively (6 cases in total). 
PM2.5 was found to be significantly associated with asthma symptoms in all cases with the exception of the probability of a day-with-cough. The authors found no impact for any pollution on the use of extra asthma medicine stating: “With these models, none of the pollutants, pollens, or molds was associated with the use of extra medication among the full population. These null results were robust to model specification.” (p 205) Accordingly, because PM2.5 was found to be significantly associated with the existence of asthma symptoms and there was no evidence presented that contradicted this result, we have rated this measured association as a “3”. 45.1 Zhang et al. measures the association between air pollution and respiratory symptoms, measured as a runny or stuffy nose, for a set of mothers in southwestern Virginia. Authors included PM2.5, PM10, Coarse PM, and other pollutants. The authors indicate that they ran a 2-stage estimation first “removing statistically insignificant terms (at the 0.05 level) from the model” (p 1210) prior to conducting the final model. In doing so, the final model included only Coarse PM and sulfate terms among the original set of potential explanatory pollutants, indicating that PM2.5 was found not to be significant under their methodology. Accordingly, we have rated this measured association as a “1”. 51.1 Delfino et al. measured the association between ambient ozone and fine particle concentrations on daily asthma symptoms for sixteen children living in San Diego, CA. Daily asthma symptoms used in the analysis was measured as a composite score of five individual symptoms rated by participants in six levels of severity as well as the number of uses of an asthma inhaler each day. 
Authors report that “The low atmospheric levels of PM2.5, the SO4(2-) fraction of PM2.5, predicted aerosol H+, NHO3, and pollen were not associated with either asthma symptom scores or as-needed inhaler use.” (p 638) Accordingly, we have rated this measured association as a “1”. 48.0 Schwartz and Neas measured the association between fine particles and respiratory symptoms in school children using the Harvard Six Cities Diary Study. Authors find that PM2.5 is significantly associated with lower respiratory symptoms under both single and double pollutant specifications. Given these findings, we have rated this measured association as a “3”. Entry reflects median value. Mean not provided in study. 53 CRA International Table A.3 – Morbidity Studies: Symptoms continued Study Linn et al. (1999) – Los Angeles, CA Rating 1 Mean (μg/m3) 24.8 98th Percentile (μg/m3) 59.1 Discussion Linn et al. measured the association between cardiovascular indicators and air pollutants on 30 individuals having severe COPD living in Los Angeles, CA. Measured cardiovascular indicators were arterial blood oxygen saturation, blood pressure, ECG readings and lung function. Authors find a highly significant association between peak flow and previous day’s indoor PM2.5, though indicate that this association looses significance if outlying subjects are excluded. Authors find no significant relationship between blood oxygen saturation for any of the pollutants considered, though did find a significant association between overnight saturation and indoor PM2.5, though this association was found to be positive rather than negative as expected. Authors find no significant association between ECG readings and PM2.5 levels. Accordingly, given these findings we have rated this measured association as a “1”. 
Appendix B

Estimation of 98th Percentile Values for PM2.5 Studies Cited in these Comments

The 98th percentile values for individual datasets play a central role in my comments on numerous PM2.5 studies. The individual papers reporting PM2.5-health associations rarely report this particular summary statistic for the PM2.5 data that they used. These values were reported for fifteen of the studies cited in these comments in a January 28, 2005 EPA memorandum (Ross and Langstaff (2005)). The PR commented on the 98th percentile levels of six more studies. Following a request, EPA released a supplemental memorandum on April 5, 2006 (Ross and Langstaff (2006)) that provided these data for six more studies. The EPA memos also report the annual averages for these 21 studies.

For the remaining studies covered in my analysis, I estimated the 98th percentile (“98p”) based on relationships derived from the twenty-one studies for which the EPA memoranda specify statistical characteristics. I approached this in two ways, depending on the quality of the air quality distributional data provided in the source papers.

If the source paper provided both a mean and a standard deviation for its PM2.5 data, then I estimated its 98th percentile by adding a multiple of the standard deviation to the reported mean. If PM2.5 concentrations were normally distributed, the multiplier would be approximately 2.05, the 98th percentile of the standard normal distribution. However, PM2.5 data are more skewed than a normal distribution. I therefore used the EPA-provided data to determine a reasonable multiplier to use.[1] In particular, in six of the studies detailed in the EPA memoranda, the sample mean (“u”) and standard deviation (“σ”) are reported.[2] I computed the effective multiplier, Z, corresponding to those six 98th percentiles, where:

Z = (98p – u)/σ.

This returned values ranging from 2.37 to 2.96.
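The effective-multiplier calculation can be illustrated with a minimal sketch. The function names are hypothetical, and the first set of inputs is invented purely for illustration; the six (mean, standard deviation, 98th percentile) triples from the EPA memoranda are not reproduced in these comments. The second call simply inverts the same relationship, using the Steubenville mean and standard deviation as reported in Table B.1 and the low end of the Z range.

```python
def effective_multiplier(p98: float, mean: float, sd: float) -> float:
    """Return Z = (98p - u) / sigma for a single dataset."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (p98 - mean) / sd


def estimate_p98(mean: float, sd: float, z: float) -> float:
    """Invert the same relationship: 98p = u + Z * sigma."""
    return mean + z * sd


# Hypothetical dataset, NOT taken from the EPA memoranda.
z_example = effective_multiplier(p98=40.0, mean=15.0, sd=10.0)

# Steubenville mean and standard deviation as listed in Table B.1,
# combined with Z = 2.37 (the smallest value in the reported range).
steubenville_p98 = estimate_p98(mean=29.6, sd=21.9, z=2.37)

print(round(z_example, 2), round(steubenville_p98, 1))
```

Under these inputs the inverted calculation reproduces the 81.5 μg/m3 entry shown for Steubenville; any other dataset with a reported mean and standard deviation could be passed through `estimate_p98` in the same way.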
Thus, for studies not covered by the EPA memoranda and for which the authors provided summary mean and standard deviation statistics, I estimated the 98th percentile to be equal to:

98p = u + Min Z * σ,

where Min Z is the smallest Z value found among the data provided by EPA, that is, Min Z = 2.37. I relied on the smallest Z value to avoid overstating the 98th percentile. I have applied this estimation rule to 13 additional studies for which the authors reported both annual average (mean) and standard deviation statistics for their PM2.5 data. These studies are noted in the accompanying table.

This left me with six studies in which the authors did not report even a sample standard deviation statistic. All of these studies did report the mean PM2.5, however. For these six, I relied on the average ratio of the 98th percentile statistic to the mean PM2.5 concentration (ratio = 98p/u) from the relevant studies detailed in the EPA memoranda. The average value over all 21 studies reported by EPA is 2.62. Thus, for these remaining studies lacking standard deviation statistics, the 98th percentile is estimated to be:

98p = 2.62 * u.

Tables B.1 through B.3 provide the precise data and calculations used to produce the 98th percentiles for the total of 37 datasets that I rely on in my comments.

[1] I first considered fitting a parametric lognormal to the datapoints provided, but discovered that a lognormal distribution produces much higher 98th percentiles than fit the 21 studies reported by EPA.

[2] The six studies are: Schwartz (2003) and Klemm and Mason (2003) for Madison, Topeka, and Boston; Lipfert et al. (2000); Mar et al. (2003); and Peters et al. (2001).

Table B.1 – Mortality Studies

Study: Ostro et al. (2003) – Coachella Valley, CA
Mean (μg/m3): 15.8
98th Percentile (μg/m3): 33.4
Discussion of Percentile Calculation: Value from Ross and Langstaff (2005). Attachment A reports this value as 33.8, but the source material from the author attached to that memo indicates the value is actually 33.4.

Study: Schwartz (2003), Klemm and Mason (2003) – Knoxville, TN
Mean (μg/m3): 20.8
98th Percentile (μg/m3): 43.5
Discussion of Percentile Calculation: Source paper reports standard deviation of 9.6. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 43.5 = 20.8 + 2.37(9.6).

Study: Schwartz (2003), Klemm and Mason (2003) – Steubenville, OH
Mean (μg/m3): 29.6
98th Percentile (μg/m3): 81.5
Discussion of Percentile Calculation: Source paper reports standard deviation of 21.9. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 81.5 = 29.6 + 2.37(21.9).

Study: Schwartz (2003), Klemm and Mason (2003) – Madison, WI
Mean (μg/m3): 11.2
98th Percentile (μg/m3): 34.3
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Schwartz (2003), Klemm and Mason (2003) – Topeka, KS
Mean (μg/m3): 12.2
98th Percentile (μg/m3): 32.0
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Ito (2003), Lippmann et al. (2000) – Detroit, MI
Mean (μg/m3): 18.0
98th Percentile (μg/m3): 55.2
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Tsai et al. (2000) – Elizabeth, NJ
Mean (μg/m3): 37.1
98th Percentile (μg/m3): 84.0
Discussion of Percentile Calculation: Source paper reports standard deviation of 19.8. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 84.0 = 37.1 + 2.37(19.8).

Study: Moolgavkar (2003) – Los Angeles, CA
Mean (μg/m3): 22.0
98th Percentile (μg/m3): 60.4
Discussion of Percentile Calculation: The 98th percentile is estimated as the mean PM2.5 level reported by the authors multiplied by the average ratio (2.75) of the 98th percentile to particulate mean for studies reported in the EPA memoranda. Accordingly, 60.4 = 2.75(22.0).

Study: Chock, Winkler, and Chen (2000) – Pittsburgh, PA
Mean (μg/m3): 20.5
98th Percentile (μg/m3): 56.3
Discussion of Percentile Calculation: The 98th percentile is estimated as the mean PM2.5 level reported by the authors multiplied by the average ratio (2.75) of the 98th percentile to particulate mean for studies reported in the EPA memoranda. Accordingly, 56.3 = 2.75(20.5).

Study: Schwartz (2003), Klemm and Mason (2003) – Boston, MA
Mean (μg/m3): 15.7
98th Percentile (μg/m3): 42.0
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Schwartz (2003), Klemm and Mason (2003) – St. Louis, MO
Mean (μg/m3): 18.7
98th Percentile (μg/m3): 43.6
Discussion of Percentile Calculation: Source paper reports standard deviation of 10.5. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 43.6 = 18.7 + 2.37(10.5).

Study: Lipfert et al. (2000) – Philadelphia, PA
Mean (μg/m3): 17.3
98th Percentile (μg/m3): 44.2
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Tsai et al. (2000) – Newark, NJ
Mean (μg/m3): 42.1
98th Percentile (μg/m3): 94.2
Discussion of Percentile Calculation: Source paper reports standard deviation of 22. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 94.2 = 42.1 + 2.37(22).

Study: Tsai et al. (2000) – Camden, NJ
Mean (μg/m3): 39.9
98th Percentile (μg/m3): 84.4
Discussion of Percentile Calculation: Source paper reports standard deviation of 18.8. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 84.4 = 39.9 + 2.37(18.8).

Study: Mar et al. (2003) – Phoenix, AZ
Mean (μg/m3): 13.5
98th Percentile (μg/m3): 32.2
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Fairley (2003) – Santa Clara County, CA
Mean (μg/m3): 13.6
98th Percentile (μg/m3): 59.0
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Goldberg and Burnett (2003) – Montreal, Canada
Mean (μg/m3): 17.4
98th Percentile (μg/m3): 53.1
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Study: Burnett and Goldberg (2003)
Mean (μg/m3): 13.3
98th Percentile (μg/m3): 38.9
Discussion of Percentile Calculation: Value from Attachment A in Ross and Langstaff (2005).

Table B.2 – Morbidity Studies: Hospital Admissions/Emergency Room Visits

Study: Delfino et al. (1997)
Mean (μg/m3): 12.1
98th Percentile (μg/m3): 31.2
Discussion: Value from Attachment A in Ross and Langstaff (2005).

Study: Peters et al. (2001)
Mean (μg/m3): 12.1
98th Percentile (μg/m3): 28.2
Discussion: Value from Ross and Langstaff (2006).

Study: Sheppard (2003)
Mean (μg/m3): 16.7
98th Percentile (μg/m3): 46.6
Discussion: Value from Attachment A in Ross and Langstaff (2005).

Study: Ito (2003)
Mean (μg/m3): 18.0
98th Percentile (μg/m3): 55.2
Discussion: Value from Attachment A in Ross and Langstaff (2005).

Study: Thurston et al. (1994)
Mean (μg/m3): 18.6 [3]
98th Percentile (μg/m3): 51.0
Discussion: Value from Attachment A in Ross and Langstaff (2005).

Study: Tolbert et al. (2000)
Mean (μg/m3): 19.4
98th Percentile (μg/m3): 41.5
Discussion: Source paper reports standard deviation of 9.35. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 41.5 = 19.4 + 2.37(9.35).

Study: Moolgavkar (2003)
Mean (μg/m3): 22.0
98th Percentile (μg/m3): 60.4
Discussion: The 98th percentile is estimated as the mean PM2.5 level reported by the authors multiplied by the average ratio (2.75) of the 98th percentile to particulate mean for studies reported in the EPA memoranda. Accordingly, 60.4 = 2.75(22.0).

Study: Delfino et al. (1998)
Mean (μg/m3): 18.3
98th Percentile (μg/m3): 40.7
Discussion: Source paper reports standard deviation of 9.5. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 40.7 = 18.3 + 2.37(9.5).

[3] Entry reflects median value. Mean not provided in study.

Table B.3 – Morbidity Studies: Symptoms

Study: Peters et al. (2000)
Mean (μg/m3): 12.7
98th Percentile (μg/m3): 31.7
Discussion: Value from Ross and Langstaff (2006).

Study: Korrick et al. (1998)
Mean (μg/m3): 15.0
98th Percentile (μg/m3): 41.2
Discussion: The 98th percentile is estimated as the mean PM2.5 level reported by the authors multiplied by the average ratio (2.75) of the 98th percentile to particulate mean for studies reported in the EPA memoranda. Accordingly, 41.2 = 2.75(15.0).

Study: Naeher et al. (1999)
Mean (μg/m3): 21.3
98th Percentile (μg/m3): 45.1
Discussion: Source paper reports standard deviation of 10.1. The 98th percentile value is computed assuming the 98th percentile is 2.37 standard deviations above the mean, based on other data provided in the EPA memoranda. Accordingly, 45.1 = 21.2 + 2.37(10.1).

Study: Ostro et al. (1991)
Mean (μg/m3): 22.0
98th Percentile (μg/m3): 60.3
Discussion: The 98th percentile is estimated as the mean PM2.5 level reported by the authors multiplied by the average ratio (2.75) of the 98th percentile to particulate mean for studies reported in the EPA memoranda. Accordingly, 60.3 = 2.75(22.0).
Neas et al. (1999) 22.2 44.9 Value from Ross and Langstaff (2006). Neas et al. (1996) Schwartz and Neas (2000) 23.5 69.0 Value from Ross and Langstaff (2006). Neas et al. (1995) Schwartz and Neas (2000) 24.5 60.0 Value from Ross and Langstaff (2006). Schwartz and Neas (2000) 18 48.0 No summary statistic data is provided in this study. I have relied on summary statistics from Schwartz et al. 1994 the precursor study to Schwartz and Neas (2000) which are reported in Ross and Langstaff (2006). 112.0 The 98th percentile is estimated as the mean PM2.5 level reported by authors multiplied by the average ratio (2.75) of the 98th percentile to particulate mean for studies reported in the EPA memoranda. Accordingly, 112.0 = 2.75(40.8). 45.1 No mean or standard deviation estimates provided. 98th percentile value assumed equal to Naeher et al. (1999) since both studies cover the same area (Southwest Virginia) and have the same reported minimum and maximums. 51.1 th Source paper reports standard deviation of 11.1. 98 percentile value is computed th assuming 98 percentile is 2.37 standard deviations above mean based on other data provided in EPA memoranda. Accordingly, 51.1 = 24.8 + 2.37(11.1). Ostro et al. (2001) 40.8 Zhang et al. (2000) Delfino et al. (1996) 24.8 60 Table B.3 – Morbidity Studies: Symptoms of Morbidity continued Study Linn et al. (1999) Mean (μg/m3) 24.8 98th Percentile (μg/m3) 59.1 CRA International Discussion th Source paper reports standard deviation of 14.5. 98 percentile value is computed th assuming 98 percentile is 2.37 standard deviations above mean based on other data provided in EPA memoranda. Accordingly, 59.1 = 24.8 + 2.37(14.5). 61 CRA International List of References Burnett, R. T.; Brook, J.; Dann, T.; Delocla, C.; Philips, O.; Cakmak, S.; Vincent, R.; Goldberg, M. S.; Krewski, D. (2000) “Association Between Particulate- and Gas-phase Components of Urban Air Pollution and Daily Mortality in Eight Canadian Cities” In: Grant, L. D., ed. 
PM2000: Particulate Matter and Health. Inhalation Toxicology 12(suppl. 4): 15-39.
Burnett, R. T.; Goldberg, M. S. (2003) “Size-fractionated Particulate Mass and Daily Mortality in Eight Canadian Cities” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 85-90.
Chock, D. P.; Winkler, S.; Chen, C. (2000) “A Study of the Association Between Daily Mortality and Ambient Air Pollutant Concentrations in Pittsburgh, Pennsylvania” Journal of the Air & Waste Management Association 50: 1481-1500.
Clyde, M. A.; Guttorp, P.; Sullivan, E. (2000) “Effects of Ambient Fine and Coarse Particles on Mortality in Phoenix, Arizona” Seattle, WA: University of Washington, National Research Center for Statistics and the Environment; NRCSE technical report series, NRCSETRS no. 040.
Delfino, R. J.; Coate, B. D.; Zeiger, R. S.; Seltzer, J. M.; Street, D. H.; Koutrakis, P. (1996) “Daily Asthma Severity in Relation to Personal Ozone Exposure and Outdoor Fungal Spores” American Journal of Respiratory and Critical Care Medicine 154: 633-641.
Delfino, R. J.; Murphy-Moulton, A. M.; Burnett, R. T.; Brook, J. R.; Becklake, M. R. (1997) “Effects of Air Pollution on Emergency Room Visits for Respiratory Illnesses in Montreal, Quebec” American Journal of Respiratory and Critical Care Medicine 155: 568-576.
Delfino, R. J.; Murphy-Moulton, A. M.; Becklake, M. R. (1998) “Emergency Room Visits for Respiratory Illnesses Among the Elderly in Montreal: Association with Low Level Ozone Exposure” Environmental Research 76: 67-77.
Dockery, D. W.; Pope, C. A., III; Xu, X.; Spengler, J. D.; Ware, J. H.; Fay, M. E.; Ferris, B. G., Jr.; Speizer, F. E. (1993) “An Association Between Air Pollution and Mortality in Six U.S. Cities” New England Journal of Medicine 329: 1753-1759.
Dominici, F.; Daniels, M.; McDermott, A.; Zeger, S.; Samet, J. (2003) “Shape of the Exposure-Response Relation and Mortality in the NMMAPS Database” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 9-24.
Environmental Protection Agency (2005) “Review of the National Ambient Air Quality Standards for Particulate Matter: Policy Assessment of Scientific and Technical Information” OAQPS Staff Paper, U.S. Environmental Protection Agency, June 2005.
Fairley, D. (1999) “Daily Mortality and Air Pollution in Santa Clara County, California: 1989-1996” Environmental Health Perspectives 107: 637-641.
Fairley, D. (2003) “Mortality and Air Pollution for Santa Clara County, California, 1989-1996” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 97-106.
Gold, D. R.; Schwartz, J.; Litonjua, A.; Verrier, R.; Zanobetti, A. (2003) “Ambient Pollution and Reduced Heart Rate Variability” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 107-112.
Goldberg, M. S.; Burnett, R. T. (2003) “Revised Analysis of the Montreal Time-series Study” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 113-132.
Grant, L.; Wilson, W.; and NCEA PM Team Staff (2002) “Overview of Third External Review Draft (April 2002) of Air Quality Criteria for Particulate Matter: Key Revisions and Issues for CASAC Discussion,” presented to CASAC, USEPA, Research Triangle Park, NC, July 18-19.
Health Effects Institute (2003) “Revised Analyses of Time-Series Studies of Air Pollution and Health”, Special Report, Boston, MA, May. Available at: http://www.healtheffects.org/Pubs/TimeSeries.pdf.
Ito, K. (2003) “Associations of Particulate Matter Components with Daily Mortality and Morbidity in Detroit, Michigan” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 143-156.
Janes, H.; Sheppard, L.; Lumley, T. (2004) “Referent Selection Strategies in Case-Crossover Analyses of Air Pollution Exposure Data: Implications for Bias” UW Biostatistics Working Paper Series, University of Washington, Paper 214. http://www.bepress.com/uwbiostat/paper214
Klemm, R. J.; Mason, R. (2003) “Replication of Reanalysis of Harvard Six-City Mortality Study” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 165-172.
Korrick, S. A.; Neas, L. M.; Dockery, D. W.; Gold, D. R.; Allen, G. A.; Hill, L. B.; Kimball, K. D.; Rosner, B. A.; Speizer, F. E. (1998) “Effects of Ozone and Other Pollutants on the Pulmonary Function of Adult Hikers” Environmental Health Perspectives 106: 93-99.
Krewski, D.; Burnett, R. T.; Goldberg, M. S.; Hoover, K.; Siemiatycki, J.; Jerrett, M.; Abrahamowicz, M.; White, W. H. (2000) “Reanalysis of the Harvard Six Cities Study and the American Cancer Society Study of Particulate Air Pollution and Mortality” A special report of the Institute's Particle Epidemiology Reanalysis Project. Cambridge, MA: Health Effects Institute.
Liao, D.; Creason, J.; Shy, C.; Williams, R.; Watts, R.; Zweidinger, R. (1999) “Daily Variation of Particulate Air Pollution and Poor Cardiac Autonomic Control in the Elderly” Environmental Health Perspectives 107: 521-525.
Linn, W. S.; Gong, H., Jr.; Clark, K. W.; Anderson, K. R. (1999) “Day-to-day Particulate Exposures and Health Changes in Los Angeles Area Residents with Severe Lung Disease” Journal of the Air & Waste Management Association 49: PM108-PM115.
Lipfert, F. W.; Morris, S. C.; Wyzga, R. E. (2000a) “Daily Mortality in the Philadelphia Metropolitan Area and Size-classified Particulate Matter” Journal of the Air & Waste Management Association 1501-1513.
Lipfert, F. W.; Perry, H. M., Jr.; Miller, J. P.; Baty, J. D.; Wyzga, R. E.; Carmody, S. E. (2000b) “The Washington University-EPRI Veterans' Cohort Mortality Study: Preliminary Results” In: Grant, L. D., ed. PM2000: Particulate Matter and Health. Inhalation Toxicology 12(suppl. 4): 41-73.
Lippmann, M.; Ito, K.; Nádas, A.; Burnett, R. T. (2000) “Association of Particulate Matter Components with Daily Mortality and Morbidity in Urban Populations” Cambridge, MA: Health Effects Institute; research report no. 95.
Mar, T. F.; Norris, G. A.; Koenig, J. Q.; Larson, T. V. (2000) “Associations Between Air Pollution and Mortality in Phoenix, 1995-1997” Environmental Health Perspectives 108: 347-353.
Mar, T. F.; Norris, G. A.; Larson, T. V.; Wilson, W. E.; Koenig, J. Q. (2003) “Air Pollution and Cardiovascular Mortality in Phoenix, 1995-1997” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 177-182.
Moolgavkar, S. H. (2003) “Air Pollution and Daily Deaths and Hospital Admissions in Los Angeles and Cook Counties” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 183-198.
Naeher, L. P.; Holford, T. R.; Beckett, W. S.; Belanger, K.; Triche, E. W.; Bracken, M. B.; Leaderer, B. P. (1999) “Healthy Women's PEF Variations with Ambient Summer Concentrations of PM10, PM2.5, SO42-, H+, and O3” American Journal of Respiratory and Critical Care Medicine 160: 117-125.
Neas, L. M.; Dockery, D. W.; Koutrakis, P.; Tollerud, D. J.; Speizer, F. E. (1995) “The Association of Ambient Air Pollution with Twice Daily Peak Expiratory Flow Rate Measurements in Children” American Journal of Epidemiology 141: 111-122.
Neas, L. M.; Dockery, D. W.; Burge, H.; Koutrakis, P.; Speizer, F. E. (1996) “Fungus Spores, Air Pollutants, and Other Determinants of Peak Expiratory Flow Rate in Children” American Journal of Epidemiology 143: 797-807.
Neas, L. M.; Dockery, D. W.; Koutrakis, P.; Speizer, F. E. (1999) “Fine Particles and Peak Flow in Children: Acidity versus Mass” Epidemiology 10: 550-3.
Ostro, B. D.; Lipsett, M. J.; Wiener, M. B.; Selner, J. C. (1991) “Asthmatic Responses to Airborne Acid Aerosols” American Journal of Public Health 81: 694-702.
Ostro, B.; Lipsett, M.; Mann, J.; Braxton-Owens, H.; White, M. (2001) “Air Pollution and Exacerbation of Asthma in African-American Children in Los Angeles” Epidemiology 12: 200-208.
Ostro, B. D.; Broadwin, R.; Lipsett, M. J. (2003) “Coarse Particles and Daily Mortality in Coachella Valley, California” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 199-204.
Peters, A.; Liu, E.; Verrier, R. L.; Schwartz, J.; Gold, D. R.; Mittleman, M.; Baliff, J.; Oh, J. A.; Allen, G.; Monahan, K.; Dockery, D. W. (2000) “Air Pollution and Incidence of Cardiac Arrhythmia” Epidemiology 11: 11-17.
Peters, A.; Dockery, D. W.; Muller, J. E.; Mittleman, M. A. (2001) “Increased Particulate Air Pollution and the Triggering of Myocardial Infarction” Circulation 103: 2810-2815.
Pope, C. A., III; Thun, M. J.; Namboodiri, M. M.; Dockery, D. W.; Evans, J. S.; Speizer, F. E.; Heath, C. W., Jr. (1995) “Particulate Air Pollution as a Predictor of Mortality in a Prospective Study of U.S. Adults” American Journal of Respiratory and Critical Care Medicine 151: 669-674.
Pope, C. A., III; Burnett, R. T.; Thun, M. J.; Calle, E. E.; Krewski, D.; Ito, K.; Thurston, G. D. (2002) “Lung Cancer, Cardiopulmonary Mortality, and Long-term Exposure to Fine Particulate Air Pollution” Journal of the American Medical Association 287: 1132-1141.
Post, E.; Watts, K.; Al-Hussainy, E.; Neubig, E. (2005) “Particulate Matter Health Risk Assessment for Selected Urban Areas” Report prepared for Office of Air Quality Planning and Standards, U.S. Environmental Protection Agency, June 2005.
Ross, M.; Langstaff, J. (2005) “Updated Statistical Information on Air Quality Data from Epidemiologic Studies” Memorandum to PM NAAQS review docket EPA–HQ–OAR–2001–0017. January 31, 2005.
Ross, M.; Langstaff, J. (2006) “Statistical Information on Air Quality Data from Additional Epidemiologic Studies” Memorandum to PM NAAQS review docket EPA–HQ–OAR–2001–0017. April 5, 2006.
Schwartz, J.; Dockery, D. W.; Neas, L. M. (1996) “Is Daily Mortality Associated Specifically with Fine Particles?” Journal of the Air & Waste Management Association 46: 927-939.
Schwartz, J.; Neas, L. M. (2000) “Fine Particles are More Strongly Associated than Coarse Particles with Acute Respiratory Health Effects in Schoolchildren” Epidemiology 11: 6-10.
Schwartz, J. (2003a) “Daily Deaths Associated with Air Pollution in Six US Cities and Short-term Mortality Displacement in Boston” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 219-226.
Schwartz, J. (2003b) “Airborne Particles and Daily Deaths in 10 U.S. Cities” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 211-218.
Sheppard, L. (2003) “Ambient Air Pollution and Nonelderly Asthma Hospital Admissions in Seattle, Washington, 1987-1994” Revised Analyses of Time-series Studies of Air Pollution and Health, Special Report. Boston, MA: Health Effects Institute; pp. 227-230.
Smith, A. (2003) “Comments on EPA’s ‘Particulate Matter Health Risk Assessment for Selected Urban Areas: Draft Report’” (submitted to EPA as an attachment to comments from Utility Air Regulatory Group), EPA Docket No. EPA-HQ-OAR-2001-0017-0200 (October 28, 2003).
Smith, A. (2005) “Comments on the Second Drafts of EPA’s Particulate Matter ‘Staff Paper’ and the ‘Particulate Matter Health Risk Assessment for Selected Urban Areas’” (submitted to EPA as an attachment to comments from Utility Air Regulatory Group), EPA Docket No. EPA-HQ-OAR-2001-0017-0344 (March 31, 2005).
Smith, R. L.; Spitzner, D.; Kim, Y.; Fuentes, M. (2000) “Threshold Dependence of Mortality Effects for Fine and Coarse Particles in Phoenix, Arizona” Journal of the Air & Waste Management Association 50: 1367-1379.
Thurston, G. D.; Ito, K.; Hayes, C. G.; Bates, D. V.; Lippmann, M. (1994) “Respiratory Hospital Admissions and Summertime Haze Air Pollution in Toronto, Ontario: Consideration of the Role of Acid Aerosols” Environmental Research 65: 271-290.
Tolbert, P. E.; Klein, M.; Metzger, K. B.; Flanders, W. D.; Todd, K.; Mulholland, J. A.; Ryan, P. B.; Frumkin, H. (2000) “Interim Results of the Study of Particulates and Health in Atlanta (SOPHIA)” Journal of Exposure Analysis and Environmental Epidemiology 10: 446-460.
Tsai, F. C.; Apte, M. G.; Daisey, J. M. (2000) “An Exploratory Analysis of the Relationship Between Mortality and the Chemical Composition of Airborne Particulate Matter” Inhalation Toxicology 12(suppl.): 121-135.
United States Environmental Protection Agency (2005) “EPA’s Review of the National Ambient Air Quality Standards for Particulate Matter (Second Draft PM Staff Paper, January 2005) – A Review by the Particulate Matter Review Panel of the EPA Clean Air Scientific Advisory Committee” EPA-SAB-CASAC-05-007, June 2005.
Zhang, H.; Triche, E.; Leaderer, B. (2000) “Model for the Analysis of Binary Time Series of Respiratory Symptoms” American Journal of Epidemiology 151: 1206-1215.

Attachment 3

Comment on Proposal to Require Submission of Data on PM2.5 Field Blank Mass in Addition to PM2.5 Filter-Based Measurements

Prepared for
John J. Jansen
Southern Company
Birmingham, AL 35203

Prepared by Eric S.
Edgerton
Atmospheric Research & Analysis, Inc.
Cary, NC 27513

Note: This report contains data collected by the Alabama Department of Environmental Management (ADEM), Florida Department of Environmental Protection (FDEP), Jefferson County Department of Health (JCDH) and Mississippi Department of Environmental Quality (MDEQ). The interpretation and conclusions presented in this report are those of the author and not the collecting agencies.

The proposal to require reporting of field blank mass data is strongly endorsed. Field blank data will improve the scientific and regulatory understanding of PM2.5 and will facilitate comparisons across networks, across space and across time. Field blank data can and should be used to estimate method detection limits and the uncertainties associated with reported PM2.5 measurements. The proposal does not go far enough, however, in that it does not require the use of field blank data for blank-correction of PM2.5 concentration data. Without this requirement, blank-correction will be applied haphazardly (perhaps only by the scientific community) and the full benefit of field blank data (specifically, comparison of blank-corrected PM2.5 with the NAAQS) will not be realized.

Support for the use of blank correction is shown in Table 1, which is a statistical summary of field blank mass data from the Southeastern Aerosol Research and Characterization (SEARCH) network, as well as various State and Local Agency networks in the states of AL, FL and MS. Note that several types of blanks are included in the table. FRM-FB and FRM-TB are field blanks and trip blanks collected according to Federal Reference Monitor (FRM) protocols. Field blanks are installed in the sampler and go through all aspects of the sampling process, but have no air drawn through them. Trip blanks are similar to field blanks, except they are never installed on the sampler. STN-FB are field blanks collected according to the Speciation Trends Network protocols.
Data were obtained from individual networks, and then subjected to Dixon’s test for outlier removal prior to calculation of summary statistics. Outlier removal was employed in order to exclude grossly contaminated samples and samples with obvious weighing errors, and thus to obtain a “best estimate” of field blank mass.

Data in Table 1 show important similarities and differences across networks and across types of blanks. Mean mass loadings range from 5.2 micrograms (µg) for SEARCH to 10.9 µg for ADEM. Minimum and maximum loadings range from -16 µg (JCDH) to 30 µg (MDEQ), indicating that field blanks can undergo losses in mass as well as gains in mass. Calculation of 95 percent confidence intervals (95% CI) shows that mean loadings are statistically different from zero with a very high degree of confidence, while the percentage of blanks greater than 0 (% >0) shows that the preponderance of blanks collected by each network exhibit positive mass loadings.

Comparison of field and trip blanks from the ADEM network shows very similar results, with field blanks only about 5 percent (0.5 µg) higher than trip blanks. Although data are limited to only one network, this result suggests that filter contamination occurs, for the most part, in the handling and shipping process, not in the FRM sampler. Field blank loadings, therefore, are not the result of passive contamination while residing in the FRM sampler. From this, it can be concluded that field blank loadings are independent of ambient PM2.5 at the monitoring site.

There is clear evidence of field blank contamination and that this contamination produces a positive bias in measured FRM sample mass and calculated PM2.5 concentrations. The magnitude of the bias, in terms of PM2.5 concentrations, is shown in Table 1 as Effective PM2.5, which is simply the mean field blank mass divided by the nominal sample volume (24.0 cubic meters for FRM samplers and 9.7 cubic meters for the JCDH STN sampler).
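The two summary calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's own procedure: the function names are hypothetical, the inputs are the means, standard deviations and sample counts reported in Table 1, and the confidence interval uses a normal approximation (mean ± 1.96 × s.d./√N), which is an assumption because the report does not state how its intervals were computed.

```python
import math

# Nominal sample volumes (cubic meters) stated in the text.
FRM_VOLUME_M3 = 24.0   # FRM samplers
STN_VOLUME_M3 = 9.7    # JCDH STN sampler


def effective_pm25(mean_blank_mass_ug: float, nominal_volume_m3: float) -> float:
    """Mean field blank mass (µg) divided by nominal sample volume (m3)."""
    return mean_blank_mass_ug / nominal_volume_m3


def approx_95_ci(mean_ug: float, sd_ug: float, n: int) -> tuple[float, float]:
    """Approximate 95% CI for the mean blank mass.

    Assumption: normal approximation with z = 1.96; the report does not
    say which method was actually used.
    """
    half_width = 1.96 * sd_ug / math.sqrt(n)
    return mean_ug - half_width, mean_ug + half_width


# Mean field blank masses (µg) from Table 1.
search_frm = effective_pm25(5.2, FRM_VOLUME_M3)    # SEARCH FRM-FB
adem_frm = effective_pm25(10.9, FRM_VOLUME_M3)     # ADEM FRM-FB
jcdh_stn = effective_pm25(9.3, STN_VOLUME_M3)      # JCDH STN-FB

# SEARCH FRM-FB: mean 5.2 µg, s.d. 3.9 µg, N = 665 (Table 1).
search_low, search_high = approx_95_ci(5.2, 3.9, 665)

print(f"SEARCH {search_frm:.2f}, ADEM {adem_frm:.2f}, JCDH STN {jcdh_stn:.2f} µg/m3")
print(f"SEARCH 95% CI: {search_low:.1f} to {search_high:.1f} µg")
```

Under these assumptions the sketch gives the Effective PM2.5 values listed in Table 1 for these three columns, and a SEARCH interval that excludes zero by a wide margin.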
For FRM field blanks alone, Effective PM2.5 varies by about a factor of two (0.22 µg/m3 for SEARCH to 0.45 µg/m3 for ADEM). Field blank mass and Effective PM2.5 differ significantly between these two networks, despite the fact that they employ the same FRM sampler and adhere to the same sampling protocols.

Effective PM2.5 for FRM field blanks represents about 1.5 percent to 3.0 percent of the annual PM2.5 NAAQS (15.0 µg/m3). Although these percentages are small, they can be significant for sites near, or slightly above, the NAAQS, and may become more significant if the NAAQS is revised to a lower value. If we include the JCDH STN-FB data, then Effective PM2.5 varies by more than a factor of four and the bias approaches 1.0 µg/m3 (6.7 percent of the NAAQS).

We also urge the EPA to consider blank-correction of data from the chemical speciation networks. In this case, field blanks are being reported, but they are not being used for blank-correction or other purposes. Table 2 contains a summary of chemical speciation field blank data from the SEARCH network and the JCDH network (three Speciation Trends Network sites). Note that only the major chemical components of PM2.5 are shown in Table 2. Field blank data are available for a large number of minor species, such as trace elements analyzed by x-ray fluorescence (XRF), but space precludes presentation here. Field blank data should be used to blank-correct the minor species. In addition, given that many minor species are below XRF detection limits, field blanks should be used to determine which of the minor species can be meaningfully reported and interpreted.

Data in Table 2 show that the major chemical species are frequently present at detectable levels in field blanks, but that, with one exception, mass loadings are quite low. The one exception is organic carbon (OC), which is always detected in field blanks and has a mean mass of 2.8 µg for SEARCH and 9.6 µg for JCDH.
Effective Concentrations (mean mass/nominal sample volume) for OC are 0.12 µg/m3 for SEARCH and 0.99 µg/m3 for JCDH. These Effective Concentrations are on the order of 3 percent of ambient concentrations reported by the SEARCH network and 18 percent of those reported by the JCDH sites.

Data in Table 2 show that the effect of blank-correction can be very significant for OC. As an example of this, we compare total carbon (TC, the sum of OC and EC) measurements at the North Birmingham, AL site, which is common to both the SEARCH and JCDH networks. TC must be compared because differences in laboratory analytical methods preclude direct comparison of OC. For the 2-year period December 14, 2002 through December 13, 2004 (roughly 220 samples), SEARCH and JCDH reported TC concentrations at North Birmingham of 6.17 µg/m3 and 7.41 µg/m3, respectively. The Effective Concentration for TC in JCDH field blanks is 1.04 µg/m3, or approximately 80 percent of the difference in reported concentrations at North Birmingham. Stated another way, without blank-correction JCDH TC concentrations are 20 percent higher than SEARCH; with blank-correction, JCDH TC concentrations are only 3 percent higher than SEARCH. The latter is well within combined measurement uncertainties of the two networks.

In summary, field blanks are an integral component of any air quality measurement program, and blank-correction should be used to remove extraneous contamination from the data. Inspection of data from several networks in the southeastern U.S. shows that FRM field blanks exert a positive bias on the data. The bias is statistically significant and relatively consistent within networks, but not necessarily consistent across networks. This suggests that there will be significant benefits to blank-correction and that it should be performed on a network-by-network basis.
Field blanks should be collected, analyzed, reported and used to interpret (i.e., to blank-correct and to determine detection limits for) measurement data from all Federally-sponsored FRM and chemical speciation networks.

Table 1. Summary of field blank mass data from various networks in the southeastern U.S.
Statistic | SEARCH | ADEM | ADEM | FDEP | JCDH | JCDH | MDEQ
Blank Type | FRM-FB | FRM-FB | FRM-TB | FRM-FB | FRM-FB | STN-FB | FRM-FB
Monitor | R&P seq. | R&P seq. | R&P seq. | R&P seq. | BGI man. | SASS | R&P seq.
Years | 2000-4 | 2005 | 2005 | 2003-4 | 2004 | 2003-4 | 2004
N | 665 | 256 | 128 | 2881 | 116 | 38 | 92
Mean (µg) | 5.2 | 10.9 | 10.4 | 7.0 | 7.2 | 9.3 | 5.8
Min. (µg) | -5 | -7 | -1 | -12 | -16 | -2 | -8
Max. (µg) | 26 | 27 | 21 | 28 | 29 | 23 | 30
s.d. (µg) | 3.9 | 5.8 | 4.8 | 6.5 | 8.0 | 6.2 | 8.0
Lower 95% CI (µg) | 4.9 | 10.2 | 9.5 | 6.7 | 5.7 | 7.4 | 4.1
Upper 95% CI (µg) | 5.5 | 11.6 | 11.2 | 7.2 | 8.6 | 11.3 | 7.5
% >0 | 94 | 96 | 98 | 85 | 84 | 92 | 71
Effective PM2.5 (µg/m3) | 0.22 | 0.45 | 0.43 | 0.29 | 0.30 | 0.96 | 0.24
FRM = Federal Reference Monitor; STN = Speciation Trends Network monitor; FB = Field Blank; TB = Trip Blank
R&P seq. = Rupprecht & Patashnick sequential FRM; BGI man. = BGI manual FRM

Table 2. Field blank summary for speciation samplers operated by SEARCH (2000-4) and JCDH (2003-4).

SEARCH
Statistic | SO42- | Total NO3 | NH4+ | EC | OC | TC
N | 456 | 451 | 455 | 352 | 357 | 352
Mean (µg) | 0.51 | 0.21 | 0.07 | 0.0 | 2.8 | 2.8
Min. (µg) | <0.25 | <0.05 | <0.05 | <0.5 | 0.9 | 0.9
Max. (µg) | 1.86 | 0.55 | 1.83 | 0.40 | 10.6 | 10.8
s.d. (µg) | 0.07 | 0.12 | 0.16 | 0.01 | 2.2 | 2.2
Lower 95% CI (µg) | 0.49 | 0.20 | 0.06 | 0.0 | 2.6 | 2.6
Upper 95% CI (µg) | 0.53 | 0.22 | 0.08 | 0.0 | 3.1 | 3.2
% > det. limit | 4 | 6 | 9 | 5 | 100 | 100
Effective Conc. (µg/m3) | 0.02 | 0.01 | 0.03 | 0.00 | 0.12 | 0.13

JCDH
Statistic | SO42- | Total NO3 | NH4+ | EC | OC | TC
N | 38 | 40 | 40 | 40 | 40 | 40
Mean (µg) | 0.18 | 0.36 | 0.01 | 0.5 | 9.6 | 10.1
Min. (µg) | 0.00 | 0.00 | 0.00 | 0.0 | 4.9 | 4.9
Max. (µg) | 0.78 | 1.20 | 0.32 | 2.9 | 16.8 | 18.2
s.d. (µg) | 0.24 | 0.35 | 0.05 | 0.9 | 2.6 | 2.8
Lower 95% CI (µg) | 0.11 | 0.25 | -0.01 | 0.2 | 9.3 | 9.5
Upper 95% CI (µg) | 0.26 | 0.47 | 0.03 | 0.8 | 10.9 | 11.6
% > det. limit | 42 | 40 | 5 | 60 | 100 | 100
Effective Conc. (µg/m3) | 0.02 | 0.04 | 0.01 | 0.05 | 0.99 | 1.04
EC = elemental carbon; OC = organic carbon; TC = total carbon (i.e., EC+OC).
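The North Birmingham total carbon comparison discussed in this attachment can be reproduced with the short Python sketch below. It is illustrative only: the function names are hypothetical, and it assumes, as the arithmetic in the text implies, that the blank correction is applied to the JCDH value alone while the SEARCH value is used as reported.

```python
# Reported TC concentrations at North Birmingham (µg/m3), from the text.
SEARCH_TC = 6.17
JCDH_TC = 7.41

# Effective Concentration of TC in JCDH field blanks (µg/m3), from Table 2.
JCDH_TC_BLANK = 1.04


def blank_corrected(reported_conc: float, blank_effective_conc: float) -> float:
    """Subtract the field blank Effective Concentration from a reported value."""
    return reported_conc - blank_effective_conc


def percent_higher(value: float, reference: float) -> float:
    """Percent by which `value` exceeds `reference`."""
    return 100.0 * (value - reference) / reference


# Assumption: only the JCDH value is blank-corrected in this comparison.
jcdh_corrected = blank_corrected(JCDH_TC, JCDH_TC_BLANK)

pct_before = percent_higher(JCDH_TC, SEARCH_TC)
pct_after = percent_higher(jcdh_corrected, SEARCH_TC)

# Share of the reported JCDH-SEARCH difference accounted for by the blank.
blank_share = JCDH_TC_BLANK / (JCDH_TC - SEARCH_TC)

print(f"Without blank-correction: JCDH is {pct_before:.0f}% higher than SEARCH")
print(f"With blank-correction:    JCDH is {pct_after:.0f}% higher than SEARCH")
print(f"Blank accounts for about {100 * blank_share:.0f}% of the difference")
```

Because field blank bias differs from network to network, a correction of this kind would need each network's own blank statistics rather than a single shared value.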