August 27, 2019

Ms. Seema Verma, Administrator
Centers for Medicare & Medicaid Services
Department of Health and Human Services
Hubert H. Humphrey Building, Room 445-G
200 Independence Avenue, S.W.
Washington, D.C. 20201

RE: Medicare and Medicaid Programs; Policy and Technical Changes to the Medicare Advantage, Medicare Prescription Drug Benefit, Program of All-inclusive Care for the Elderly (PACE), Medicaid Fee-For-Service, and Medicaid Managed Care Programs for Years 2020 and 2021 (“Proposed Rule”)

Dear Administrator Verma:

America’s Health Insurance Plans (AHIP) appreciates the opportunity to comment on the Risk Adjustment Data Validation (RADV) proposal put forward by the Center for Program Integrity (CPI) and included in the Proposed Rule.

AHIP is the national association whose members provide coverage for health care and related services for millions of Americans. Through these offerings, including Medicare Advantage (MA) plans, we improve and protect the health and financial security of consumers, families, businesses, communities, and the nation. We are committed to market-based solutions and public-private partnerships that improve affordability, value, access, and well-being for consumers. The MA program is critical to achieving national policy goals for improved health care, and we share your strong commitment to delivering better health outcomes, value, and satisfaction to Medicare beneficiaries.

Proposed Rule Is Fatally Flawed and Should Be Withdrawn

The MA program and its payment structure are designed to encourage MA plans to maximize the efficient provision of high-quality health care treatments and services to Medicare beneficiaries. They are also designed to ensure that MA plans have the resources needed to provide high-quality benefits and coordinated care to seniors and people with disabilities. As part of this structure, Congress directed CMS to adjust payments to each plan to account for the health status of its population. 1 CMS carries out this mandate through a risk adjustment model that is based on data from traditional Medicare. Congress also required CMS to ensure the risk adjustment model achieves actuarial equivalence between MA and traditional Medicare. 2 This actuarial equivalence requirement is fundamental to the proper functioning of the MA risk adjustment model, and therefore is a core component of a stable MA program.

1 42 USC 1395w-23(a)(3).
2 42 USC 1395w-23(a)(1)(C)(i) – “…the Secretary shall adjust the payment…for such risk factors as age, disability status, gender, institutional status, and such other factors as the Secretary determines to be appropriate, including adjustment for health status under paragraph (3), so as to ensure actuarial equivalence.” (emphasis added)

As we note in our comments below, we believe the Proposed Rule violates this critical actuarial equivalence requirement. First and foremost, it fails to include a fee-for-service (FFS) adjuster to ensure actuarial equivalence between payments to MA plans and payments under the traditional Medicare program. 3 The rule also suffers from other serious substantive and procedural defects. Given our very strong legal and policy objections, we urge CMS to withdraw the RADV proposal.

Proposed Rule Undermines MA and Confidence in Government as Fair Business Partner

We appreciate CMS’ decision to provide data and additional methodological explanations regarding its technical study relating to the proposal in response to requests from stakeholders and the agency’s decision to extend the comment deadline on the RADV proposal, accordingly. However, these data and explanations do not solve or mitigate our serious concerns with the RADV proposal.

Health insurance providers are accountable to the consumers they serve as well as the taxpayers who fund the MA program. Since 2010, MA plan sponsors have requested that CMS engage in a dialogue to develop a fair and appropriate oversight process.
Rather than engage with its private-sector partners, CPI released a proposal that is neither fair nor appropriate. It would reverse the well-established and long-held principle that a FFS adjuster is necessary to meet statutory and actuarial requirements. 4 The Proposed Rule would also permit actions that exceed the agency’s legal authority, including collecting contract-wide payment amounts and retroactively changing rules going back almost a decade (to 2011).

If CMS were to finalize the RADV provisions in the Proposed Rule, it would undermine stakeholder confidence in the agency’s willingness to comply with the law and to act as a fair partner with the private sector. Private-sector partners must be able to rely on the government’s word and know that the government will adhere to its commitments, whether stemming from statute or otherwise. A lack of trust injects significant uncertainty and risk into the system, undermines how the free market and public programs work together, and fundamentally weakens the integrity of the MA program. As a result, seniors and hardworking taxpayers might see higher costs, reduced benefits, and fewer MA plan options.

There is a better way. We ask that CMS withdraw these provisions and work with us on real solutions that are fair, accurate, and legally permissible.

3 The FFS adjuster accounts for the fact that the documentation standard used in RADV – that claims must be submitted absolutely free of diagnosis coding errors – is different from the documentation standard used to calculate the risk adjustment model, which includes unsubstantiated FFS claims data with diagnosis coding errors.
4 Centers for Medicare & Medicaid Services, Notice of Final Payment Error Calculation Methodology for Part C Medicare Advantage Risk Adjustment Data Validation Contract-Level Audits, February 24, 2012. “…to determine the final payment recovery amount, CMS will apply a Fee-for-Service Adjuster (FFS Adjuster) amount as an offset to the preliminary recovery amount…The FFS adjuster accounts for the fact that the documentation standard used in RADV audits to determine a contract’s payment error (medical records) is different from the documentation standard used to develop the Part C risk-adjustment model (FFS claims).” Available at: https://www.cms.gov/Research-Statistics-Data-and-Systems/Monitoring-Programs/recovery-audit-program-parts-c-and-d/Other-Content-Types/RADV-Docs/RADV-Methodology.pdf.

Growing Value and Attractiveness of MA over Traditional Medicare

MA plans deliver better care and better value through innovative, patient-centered programs that improve quality and reduce costs. In the past decade, enrollment in MA has nearly doubled. More than 22 million Americans – over one-third of all Medicare beneficiaries – have chosen to enroll in MA plans. These plans provide financial security by limiting out-of-pocket costs, offering integrated drug coverage, and providing a rich array of benefits not available in traditional Medicare, including dental, vision, hearing, and other supplemental benefits. Enrollees in MA plans are highly satisfied with the MA program. 5 MA plans provide these benefits at the same cost as traditional Medicare. 6 And in areas of the country where MA is popular, additional enrollment leads to slower traditional Medicare spending growth as providers employ MA practice patterns and care guidelines for their remaining traditional Medicare patients. 7

Summary of Key RADV Concerns

We have attached comments with significantly more detail about our legal and policy concerns with the RADV changes in the Proposed Rule. They include the following:

• A FFS Adjuster is required to ensure actuarial equivalence.
We have attached an analysis from the actuarial firm Milliman, based on CMS’ data and methodologies as presented in the CMS technical study included with the Proposed Rule and subsequent data releases, that shows a FFS adjuster, or other similar adjustment, is necessary to ensure actuarial equivalence between payments to MA plans and payments under the FFS program. This adjustment is required due to the different documentation standards for the determination of diagnoses under MA and traditional Medicare. Milliman makes adjustments to address errors in CMS’ methodology to find that a FFS adjuster would be both positive and material, and concludes that the CMS technical analysis “cannot appropriately be used to conclude a FFS adjuster is not required.” Another study highlighted in our comments, and two recent federal district court decisions, also conclude that a FFS adjuster is required. In addition, the CMS technical study and addendum fail to address the key issue of actuarial equivalence in the context of RADV audits, contain multiple flaws and questionable assumptions, and in short, appear to have been designed to minimize error rates, enabling CMS to arrive at the conclusion that a FFS adjuster is not warranted. Further, CMS’ claim that a FFS adjuster would create inequity among plans is neither credible, reasonable, nor consistent with a recent court decision which confirms that the statute requires CMS adjust payments to ensure actuarial equivalence. We believe strongly that CMS is required to implement a FFS adjuster in payment recovery activities, and that such an adjustment is not only necessary to achieve actuarial equivalence but is equitable for both audited and unaudited plans in that context. • CPI has no legal authority for extrapolation and, even if it did, is proposing to use a flawed methodology. In RADV audits, CMS reviews medical records from a sample of beneficiaries. 
The Proposed Rule would provide for “extrapolation” – i.e., CMS would use the sample to calculate a contract-wide error rate and recover payments accordingly. The Social Security Act (SSA) provides authority to extrapolate, but only for Medicare contractors auditing providers under Parts A and B, and only in limited circumstances. 8 The Proposed Rule provides no legal justification for extrapolation. CPI simply asserts it has such authority. 9 Further, based on the findings of a study by Wakely (attached) that identified several significant areas of concern, we believe the extrapolation methodology CMS published in 2012 raises serious policy concerns because it will produce arbitrary results. 10

5 Morning Consult National Poll. November 28-29, 2018. In this poll, 90 percent of MA members reported satisfaction with their health care coverage and preventive services, and 84 percent reported satisfaction with their prescription drug coverage.
6 Medicare Payment Advisory Commission. Report to the Congress: Medicare payment policy. March 2019. For 2019, MA plan payments are equivalent to traditional Medicare costs.
7 Johnson, Garret, Figueroa, Jose F., Zhou, Xiner, et al. Recent growth in Medicare Advantage enrollment associated with decreased fee-for-service spending in certain US counties. Health Affairs 35(9): 1707-1715. September 2016.

• Retroactivity is prohibited by federal law and is unnecessary and unjustified. CMS proposes to grant itself authority to extrapolate audits for plan years 2011 and forward. The SSA clearly prohibits retroactive rules absent a statutory requirement, significant public safety concern, or other critical need 11 – none of which are present here. In addition, retroactivity poses major operational barriers for plans and providers. For example, CPI has recently begun conducting RADV audits for 2014 which review services rendered in 2013.
This long passage of time could make it extremely difficult for plans to obtain medical records with respect to providers who, for example, are deceased, closed their practices, changed to new recordkeeping systems, etc.

• Implementation of the RADV audit methodology would violate rulemaking requirements. The methodology used by CPI for 2011, 2012, and 2013 audits was never subject to prior notice and comment rulemaking. And while CPI is now soliciting comments, the Proposed Rule indicates that CPI can implement changes solely through Health Plan Management System (HPMS) notices. CPI also has begun moving forward with audits for 2014, using a new methodology, without allowing a comment opportunity or broadly providing any details to the public on the methodology it is using. These actions are inconsistent with the SSA requirements for notice-and-comment rulemaking as indicated by the U.S. Supreme Court in its recent ruling in Azar v. Allina Health Services. 12

RADV Recommendations

Based upon the legal and policy reasons above, we urge CMS to take the following steps:

• Withdraw the RADV proposal. The RADV provisions in the Proposed Rule should not be finalized. The provisions should be withdrawn in their entirety so together we can develop a collaborative and constructive solution.

8 42 U.S.C. 1395ddd.
9 See, e.g., 83 Fed. Reg. 54982, 54984 (Nov 1, 2018), where CMS asserts that extrapolation is “based on longstanding case law and best practices from HHS and other federal agencies” but provides no citations to or analysis of this authority.
10 Murray, T., Morgan, E., Sauter, M. Medicare RADV: Review of CMS sampling and extrapolation methodology. Wakely Consulting Group. July 2018. Available at: https://www.ahip.org/wp-content/uploads/2018/07/WakelyMedicare-RADV-Report-2018.07.pdf.
11 42 USC 1395hh(e)(1)(A) – “A substantive change in regulations, manual instructions, interpretative rules, statements of policy, or guidelines of general applicability under this subchapter shall not be applied (by extrapolation or otherwise) retroactively to items and services furnished before the effective date of the change, unless the Secretary determines that- (i) such retroactive application is necessary to comply with statutory requirements; or (ii) failure to apply the change retroactively would be contrary to the public interest.”
12 139 S.Ct. 1804 (2019).

• Affirm that regulatory changes cannot be applied retroactively. CMS should follow the clear directive in the SSA to avoid retroactive application of new requirements. Any changes that impose new obligations on MA plans should be developed after appropriate and substantive interaction with the industry and apply only to payment years arising after the RADV proposal is finalized (i.e., prospectively). In other words, CMS can only apply changes in RADV methodology to payment years after publication of a final rule, and plans must have the ability to factor the RADV rules into their bids. Thus, even if CMS were to finalize any changes to the RADV methodology in 2019, the earliest it could apply would be 2021.

• Acknowledge that a FFS adjuster is required under statute and improve CMS’ audit methodology. For many years, CMS expressly stated that a FFS adjuster is needed to meet statutory requirements for actuarial equivalence. CMS has reversed that long-held position with this rule. We strongly urge CMS to keep its word and develop a FFS adjuster, taking into account the multiple recent independent analyses finding such an adjustment is necessary, material, and legally required.
In addition, the agency should improve the RADV audit methodology, including the design of a better process for determining whether a patient in fact has a given health condition through use of pertinent data sources.

• Engage in meaningful, collaborative dialogue with plan partners to implement changes. We urge CMS to create a fair and open process to develop appropriate payment oversight standards, similar to processes used in the traditional Medicare program and in certain aspects of the Food and Drug Administration’s oversight of the pharmaceutical industry. This is especially important given the complexities of the MA payment system where various components (from benchmark-setting to risk adjustment to oversight) determine payments. Adopting such an approach is critical to the continued strength of the MA program and the ability of plans to meet the needs of the people they serve.

The industry stands ready to work closely and collaboratively with CMS on the issues described here and other matters related to oversight of the MA program.

Conclusion

The RADV proposal violates numerous statutory requirements and is fundamentally unfair and ill-conceived. We urge CMS in the strongest possible terms to withdraw it and establish a collaborative process with stakeholders to create a workable alternative. We look forward to providing any additional information you may need and to continuing to work together to improve the health of the millions of Medicare beneficiaries our members serve.

Sincerely,

Matthew Eyles
President and CEO

Enclosure

Medicare and Medicaid Programs; Policy and Technical Changes to the Medicare Advantage, Medicare Prescription Drug Benefit, Program of All-Inclusive Care for the Elderly (PACE), Medicaid Fee-for-Service, and Medicaid Managed Care Programs for Years 2020 and 2021 (“Proposed Rule”)

I. Summary: The Centers for Medicare & Medicaid Services’ (CMS) proposed changes to the Risk Adjustment Data Validation (RADV) program fail to satisfy the Social Security Act, are based on flawed data, and are procedurally defective

The Proposed Rule includes several critical changes relating to RADV audits:

• CMS would extrapolate RADV audit findings to Medicare Advantage (MA) contract level payments.
• The extrapolation would be applied retroactively, going back to audits from 2011 and after.
• RADV audits for payment years 2011, 2012, and 2013 would be extrapolated based on a methodology described in a notice dated February 24, 2012 posted on the CMS website 1 (the 2012 RADV Notice).
• Audits for payment years 2014 and beyond would also be extrapolated. CMS has subsequently noted that it will extrapolate some, but not all, of the 2014 audits. In the Proposed Rule, CMS indicates it could use a different methodology for extrapolation including a potential approach that would use sub-cohorts of enrollees and has subsequently indicated that a sub-cohort methodology will be used for the 2014 audits.
• In the 2012 RADV Notice, CMS stated that it would apply a fee-for-service (FFS) adjuster as an offset to any RADV recovery amount. CMS also stated that it would conduct a study to determine the amount of the FFS adjuster. Specifically, in the 2012 RADV Notice, CMS stated: “The FFS adjuster accounts for the fact that the documentation standard used in RADV audits to determine a contract’s payment error (medical records) is different from the documentation standard used to develop the Part C risk-adjustment model (FFS claims).”
• However, under the Proposed Rule, CMS reverses course by stating that a FFS adjuster is not appropriate as an offset for RADV recoveries.
CMS states that based on a technical analysis of “audit miscalibration error” in the risk adjustment model (the CMS technical study 2), a FFS adjuster is unnecessary because the impact of audit miscalibration is negative and extremely close to zero. Separately, CMS asserts that a FFS adjuster is not appropriate, regardless of what the study found, because it would only correct payments to audited plans, which would be inequitable to unaudited plans.
• Following the publication of the Proposed Rule, CMS published additional data regarding the technical study and also released an addendum that contained additional information on the study’s assumptions and methodology. 3

1 Notice of Final Payment Error Calculation Methodology for Part C Medicare Advantage Risk Adjustment Data Validation Contract-Level Audits, accessed at: https://www.cms.gov/Research-Statistics-Data-and-Systems/Monitoring-Programs/recovery-audit-program-parts-c-and-d/Other-Content-Types/RADV-Docs/RADV-Methodology.pdf.
2 Fee for Service Adjuster and Payment Recovery for Contract Level Risk Adjustment Data Validation Audits Technical Appendix, available at: https://www.cms.gov/Research-Statistics-Data-and-Systems/Monitoring-Programs/Medicare-Risk-Adjustment-Data-Validation-Program/Other-Content-Types/RADV-Docs/FFS-Adjuster-Technical-Appendix.pdf.

Below we highlight our strong policy and legal objections to each component of the RADV proposal, beginning with the FFS adjuster. We have attached an analysis from the actuarial firm Milliman 4, based on CMS’ data and methodologies as presented in the CMS technical study and the addendum to the study, that shows a FFS adjuster, or other similar adjustment, is necessary to ensure actuarial equivalence between payments to MA plans and payments under the FFS program, as required by the Social Security Act (SSA).
Milliman finds that the CMS technical analysis “cannot appropriately be used to conclude a FFS adjuster is not required.” Milliman adjusts for errors in CMS’ methodology to find that an FFS adjuster is both positive and material. As such, Milliman’s study refutes CMS’ conclusion that a FFS adjuster is not necessary. Another study highlighted below, and two recent federal district court decisions, also conclude that a FFS adjuster is required. In addition, CMS’ separate argument that it would be inequitable to have a FFS adjuster is a distraction from the question at hand – which is whether or not a FFS adjuster is required to ensure actuarial equivalence. The issue of actuarial equivalence arises whenever CMS seeks to apply a documentation standard for payment (medical records) that differs from the standard used in developing the risk adjustment model (claims). Plans not subject to a RADV audit may still be subjected to different document standards if they face overpayment claims by the government, and two district courts have recently held that the actuarial equivalence requirement must be satisfied in the overpayment context. Thus, equity in fact requires the application of a FFS adjuster to RADV audits. Further, unaudited plans that do not face payment recovery issues are not adversely affected by different documentation standards, and therefore the use of a FFS adjuster would not adversely affect them. In any event, as a membership organization representing the MA industry, we can say, without hesitation, that our members support a FFS adjuster regardless of whether one of their contracts is selected for audit. The proposal also includes several other substantive and procedural defects. For example, CMS does not have statutory authority to use its proposed extrapolation methodology. Even if it did, CMS’ published 2012 extrapolation methodology is so flawed that implementation would be arbitrary and capricious. 
(We have also attached an analysis from the actuarial firm Wakely that identifies the significant concerns with the 2012 RADV methodology.) Further, CMS’ proposed retroactive application of the regulation going back almost a decade – to audit results for the 2011 payment year, based on diagnosis data from 2010 – is impermissible under the law. In addition, CMS proposes to apply its 2012 published extrapolation methodology to 2011-2013 audits even though that methodology was never developed through rulemaking. CMS is also moving forward with a new methodology for audits of the 2014 plan year without providing adequate detail or any opportunity for comment, despite statutory rulemaking requirements. These actions are inconsistent with the SSA requirement for notice-and-comment rulemaking as indicated by the U.S. Supreme Court in its recent ruling in Azar v. Allina Health Services. 5

3 Addendum to the Fee-for-Service Adjuster study, available at: https://www.cms.gov/Research-Statistics-Data-and-Systems/Monitoring-Programs/Medicare-Risk-Adjustment-Data-Validation-Program/Other-Content-Types/RADV-Docs/RADV-Provision-CMS-4185-N4-Data-Release-June-2019.zip.
4 Available at: http://us.milliman.com/insight/2019/Medicare-Advantage-RADV-FFS-adjuster/.

Given the substantive and procedural defects in CMS’ proposals, we urge CMS to take the following steps:

• Withdraw these proposals in their entirety.
• Affirm that the agency cannot change regulations retroactively.
• Acknowledge that a FFS adjuster is required under statute to ensure actuarial equivalence since documentation standards applied to payment differ from the standards used in developing the risk adjustment model.
• Improve the agency’s audit methodology.
• Note that CMS does not have statutory authority to conduct extrapolations under RADV.
• Engage in meaningful, collaborative dialogue with the industry to address these issues going forward.

II. FFS adjuster is required by law

The Medicare statute clearly requires actuarial equivalence in payments between traditional Medicare and the MA program. 6 Actuarial equivalence can either be achieved through a FFS adjuster in assessing payment errors in the MA program, or, alternatively, through CMS estimating the risk adjustment model using audited FFS data. This interpretation was supported by CMS itself and has been upheld by two recent court decisions. The Milliman study shows that a FFS adjuster, or other similar adjustment, is necessary to ensure actuarial equivalence between payments to MA plans and payments under the FFS program. In addition, CMS’ claim that equity concerns would prevent application of a FFS adjuster is meritless. Each of these issues is discussed in detail below.

A. FFS adjuster is required by statutory language ensuring actuarial equivalence

i. Background on statute and risk adjustment

Section 1853(a)(1)(C)(i) of the Social Security Act (SSA) states in relevant part that the Secretary of HHS:

shall adjust the payment amount [to an MA plan] for such risk factors as age, disability status, gender, institutional status, and such other factors as the Secretary determines to be appropriate, including adjustment for health status . . . , so as to ensure actuarial equivalence. The Secretary may add to, modify, or substitute for such adjustment factors if such changes will improve the determination of actuarial equivalence.

5 139 S.Ct. 1804 (2019).
6 See Section 1853(a)(1)(C)(i) of the SSA.

CMS makes the required statutory adjustments for health status through a risk adjustment model that applies a risk score to the MA payment for each enrollee. A risk score represents the relative costs of an individual compared to that for an average beneficiary.
CMS uses the CMS-Hierarchical Condition Category (HCC) model to calculate these risk scores, which are determined based on demographic (e.g., age, gender, Medicaid status) and disease characteristics. Diseases are assigned to HCCs. In this sense, each HCC represents a disease group (e.g., diabetes, congestive heart failure). In general, each plan is paid its bid, multiplied by the risk score for that enrollee. The bid represents the average costs for an enrollee in that plan to receive Medicare Parts A and B items and services, standardized to a 1.0 risk score. The risk model pays more for sicker enrollees, and less for healthier enrollees.

The CMS-HCC risk model is estimated – or calibrated – on claims data from traditional Medicare (also known as Medicare FFS). Diagnoses from a previous year are used to estimate costs in a current year (e.g., 2013 diagnoses to predict 2014 costs). CMS uses a weighted least squares regression to determine the dollar amount associated with each HCC (e.g., chronic obstructive pulmonary disease, diabetes, etc.) and for demographic characteristics (e.g., age, gender, etc.). These estimated dollar amounts for each HCC or demographic characteristic are also known as model coefficients.

Through a process that CMS describes as “normalization”, CMS runs the model on FFS claims data to determine the total predicted spending for the population. CMS divides the total predicted spending by the number of enrollees to determine the average predicted spending for the population. CMS then converts the dollar amounts for the model coefficients to relative factors by dividing each coefficient by the average predicted costs for the population. The sum of these relative factors is the beneficiary risk score. This process leads to an average risk score of 1.0, because all dollar coefficients are divided by the average predicted spending for the population.
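The normalization and payment steps described above can be illustrated with a minimal sketch. The coefficient values, characteristic names, four-person population, and $1,000 bid are hypothetical and chosen only to keep the arithmetic easy to follow; they are not CMS-HCC model values, and the sketch skips the regression step that produces the dollar coefficients.

```python
# Hypothetical dollar coefficients (illustrative only, not CMS-HCC values).
dollar_coefficients = {
    "age_70_74": 6000.0,
    "diabetes": 2000.0,
    "congestive_heart_failure": 6000.0,
}

# Hypothetical population: the characteristics recorded for each beneficiary.
population = {
    "A": ["age_70_74"],
    "B": ["age_70_74", "diabetes"],
    "C": ["age_70_74", "congestive_heart_failure"],
    "D": ["age_70_74"],
}


def predicted_spending(characteristics):
    """Sum the dollar coefficients that apply to one beneficiary."""
    return sum(dollar_coefficients[c] for c in characteristics)


# Normalization: average predicted spending across the population.
total_predicted = sum(predicted_spending(c) for c in population.values())
average_predicted = total_predicted / len(population)

# Convert each dollar coefficient to a relative factor.
relative_factors = {
    name: dollars / average_predicted
    for name, dollars in dollar_coefficients.items()
}


def risk_score(characteristics):
    """A beneficiary's risk score is the sum of the applicable relative factors."""
    return sum(relative_factors[c] for c in characteristics)


scores = {person: risk_score(c) for person, c in population.items()}
average_score = sum(scores.values()) / len(scores)


def monthly_payment(standardized_bid, score):
    """Payment is the standardized bid multiplied by the enrollee's risk score."""
    return standardized_bid * score


print(scores)                                 # A and D: 0.75, B: 1.0, C: 1.5
print(average_score)                          # 1.0
print(monthly_payment(1000.0, scores["C"]))   # 1500.0
```

Because every coefficient is divided by the same average predicted spending, the average score over the calibration population is 1.0 by construction, which is the property the paragraph above describes.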
As noted earlier, plans submit bids to CMS that estimate the expected costs for their population to provide Medicare Part A and B items and services. For example, if the plan’s standardized bid is $1,000 per member per month, and the risk score for an enrollee is 1.1, the plan will be paid $1,100 for that enrollee.

CMS conducted audits for plan years 2011, 2012, and 2013 to determine whether diagnoses submitted by plans and used by CMS to determine risk scores were supported by medical records. 7 CMS used a documentation standard for these audits that differed from the documentation standard used to develop the risk adjustment model. The standard CMS used to determine payment errors under a RADV audit of an MA plan was the medical record, while the documentation standard used to develop the relative values in the HCC model was unaudited FFS claims data. This different documentation standard is why a FFS adjuster is necessary.

7 CMS also conducted pilot and contract-level audits for plan year 2007.

ii. Actuarial experts and court decisions affirm the need for a FFS adjuster to ensure actuarial equivalence

The Actuarial Standards Board’s Actuarial Standards of Practice (ASOP) provide guidelines for an actuarial review of risk adjustment models. These standards expressly provide that “[t]he type of input data that is used in the application of risk adjustment should be reasonably consistent with the type of data used to develop the model.” 8 ASOP No. 45 requires consistency between how the model is developed and how it is applied in payment. However, the documentation standards used in the RADV audit to determine a plan’s payment error are different from the documentation standards used to develop the risk adjustment model.
In addition, the United States District Court for the District of Columbia recently examined the underlying principle of actuarial equivalence and its applicability to section 1853(a)(1)(C)(i)’s actuarial equivalence requirement in a clearly written and well-reasoned opinion. See UnitedHealthcare Ins. Co. v. Azar, 330 F. Supp. 3d 173 (D.D.C. 2018) (Collyer, J.), appeal docketed, No. 18-5326 (D.C. Cir. Nov. 14, 2018). 9 The UnitedHealthcare court found that section 1853(a)(1)(C)(i) imposes a non-discretionary duty on CMS whereby the agency must ensure that “two modes of payment” – payment to providers and suppliers under Medicare Parts A and B, on the one hand, and payment to MA plans under Medicare Part C, on the other – result in “present values [that] are equal under a given set of actuarial assumptions.” 330 F. Supp. 3d at 185 (citations omitted). And by “given,” the UnitedHealthcare court meant “‘the same,’ as in two figures are actuarially equivalent when they share the same set of actuarial assumptions.” Id. at 185-86. And that “[d]ifferent assumptions behind the elements of a calculation would, necessarily, result in actuarially non-equivalent results.” Id. at 186 (emphasis added). In other words, the assumptions used in Medicare FFS payments must hold true for payments to MA plans. Applying the plain meaning of what constitutes “actuarial equivalence”, the UnitedHealthcare court vacated the CMS final rule on overpayments promulgated in 2014. A critical part of the holding was that CMS violated the actuarial equivalence requirement by calculating risk adjustment payments to MA plans using unsubstantiated FFS diagnosis codes to determine the expected additional costs of providing coverage to a beneficiary with a particular medical condition, while holding MA plans to a standard of perfection whereby diagnosis codes on claims submitted by MA plans had to be absolutely free of diagnosis error. 
The court found that “CMS cannot subject the diagnosis codes underlying [MA] payments to a different level of scrutiny than it applies to its own payments under traditional Medicare without impermissibly skewing the calculus: by doing so, it ensures that there will not be actuarial equivalence between traditional Medicare payments and [MA] payments for comparable patients.” Id. at 186.

8 Actuarial Standard of Practice No. 45 § 3.2.
9 However, despite the significance of this decision, the Proposed Rule makes only passing mention of it in a footnote that states: “We are aware of the district court’s recent ruling in United HealthCare Insurance Co. v. Azar, No. 16-cv157 (D.D.C. September 7, 2018), and the government is reviewing that decision and considering its response. In any event, that ruling was made on the basis of the administrative record before the court, which did not include the results of our study.” 83 Fed. Reg. at 55,040 n.29. A few days after the Proposed Rule was published, the government filed a motion for reconsideration in the district court citing the study and claiming that it constitutes “new evidence.” See Defs.’ R. 60(b) Mot. for Partial Recons., UnitedHealthcare Ins. Co. v. Azar, No. 1:16-cv-00157 (D.D.C. Nov. 5, 2018), ECF No. 76.

The court took particular note of the fact CMS had acknowledged this very principle in 2012 when it promised that in determining the final payment recovery amount in RADV audits, CMS would “apply a Fee-for-Service Adjuster (FFS Adjuster) amount as an offset to the preliminary recovery amount.” Id. at 188. That is, if the documentation standard used in determining payments to MA plans is not the same as the standard used to calibrate risk model coefficients based on FFS claims data, there must be an adjustment. Alternatively, CMS could achieve actuarial equivalence by estimating the risk adjustment model using audited FFS data.
However, by eliminating the FFS adjuster and continuing to estimate the risk model based on unaudited FFS claims data, CMS would hold MA organizations to a perfection standard for medical record documentation that is clearly not applied in the FFS program – as shown in the error rates by HCC that CMS identified in its own study. To do so would contravene the SSA’s “actuarial equivalence requirement.” In another recent decision, the United States District Court for the Central District of California also examined the statutory provision in section 1853(a)(1)(C)(i) on actuarial equivalence. See United States ex rel. Benjamin Poehling v. UnitedHealth Group, Inc., No. 16-08697, 2019 WL 2353125 (C.D. Cal. Mar. 28, 2019). The government argued that the statutory language “merely arms the Secretary with broad discretionary power to adjust payment levels based on the health status of Medicare beneficiaries.” Id. at 5 (emphasis in original). In denying the government’s motion for partial summary judgment, the court stated that it was “unpersuaded by the Government’s argument in light of the plain language of the statute, which provides that the Secretary shall adjust the payment amount for factors the Secretary deems appropriate so as to ensure actuarial equivalence. Such language is far from discretionary.” Id. at 6 (emphasis in original). The court also cited the decision in UnitedHealthcare Ins. Co. v. Azar on the need for a FFS adjuster as “persuasive authority”. Id. iii. Independent actuarial analyses clearly demonstrate the need for a FFS adjuster Since CMS published the Proposed Rule, two independent analyses – one by Avalere (the Avalere study10) and one by Milliman (the Milliman study) – clearly demonstrate the need for a FFS adjuster. Importantly, neither analysis estimates the appropriate amount of the FFS adjuster. Rather, each study demonstrates that CMS’ methodology, after adjusting for methodological flaws, would lead to a FFS adjuster that is not zero. 
Thus, CMS’ conclusion – that the audit miscalibration error is negative and extremely close to zero and therefore a FFS adjuster is not necessary – cannot be supported by its own technical study. 11
10 Avalere Health. Eliminating the FFS Adjuster from the RADV methodology may affect plan payment. March 2019. Available at: https://avalere.com/wp-content/uploads/2019/03/20190318-FFS-Adjuster-Analysis-Final-.pdf.
11 The methodological flaws demonstrate that, at a minimum, CMS has “relied on factors which Congress has not intended it to consider, entirely failed to consider an important aspect of the problem, offered an explanation for its decision that runs counter to the evidence before the agency, or [proposed a course of action that] is so implausible that it could not be ascribed to a difference in view or the product of agency expertise.” Motor Vehicle Mfrs. Ass’n v. State Farm Mut. Auto. Ins. Co., 463 U.S. 29, 43 (1983).
a. Avalere study
Avalere analyzed CMS’ conclusion in the Proposed Rule that re-estimating risk scores from the HCC model, based on coding errors in FFS claims, does not have an impact on risk scores. According to CMS, the risk scores from the re-estimated model were almost equal to those produced by the original model. This led CMS to determine that erroneous coding in FFS has minimal impact on MA risk scores, and therefore no FFS adjuster is needed. Avalere noted that “certain key assumptions embedded in CMS’ analysis do not appropriately capture the full variation in the data and minimized the impact of documentation error.” For example, Avalere explained that CMS’ simplifying assumption that each person in the sample has an average number of claims supporting a particular disease group, or HCC, is flawed because the distribution of Medicare claims is skewed. That is, the average number of claims is higher than the median, or midpoint.
Avalere found, just by applying CMS’ methodology to the actual distribution of error rates, that MA risk scores from the re-estimated model accounting for coding errors in FFS claims would be almost 8 percent lower than the original model. Avalere also says that “assuming that each claim supporting an HCC has an equal probability of error suggests that coding and documentation errors occur randomly. However, it is probable that there are correlations in errors.” b. Milliman study AHIP sponsored an analysis by Milliman to evaluate CMS’ conclusion that a FFS adjuster is not necessary. Milliman reproduced CMS’ methodology and used CMS’ published assumptions, the data, and the related files CMS provided. 12 The Milliman study identifies multiple significant issues in the CMS assumptions and methodologies. The Milliman study focused on and adjusted for significant shortcomings in CMS’ assumptions relating to how: 1) CMS normalized the risk adjustment model, and 2) CMS derived and applied beneficiary-level diagnosis error rates. 
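The averaging concern raised in both studies can be illustrated with a small Python sketch. It is not the calculation performed by CMS, Avalere, or Milliman: the claim-level error rate, the claim counts, and the rule that an HCC is unsupported only when every claim supporting it is in error are all assumptions chosen for illustration. Under those assumptions, applying the error rate to the average claim count yields a much lower HCC error rate than applying it to the actual, skewed distribution of claim counts.

```python
from statistics import mean, median

# Hypothetical inputs, chosen only to illustrate the averaging issue.
claim_error_rate = 0.30                                # assumed chance one claim is unsupported
claims_per_beneficiary = [1, 1, 1, 1, 1, 1, 2, 2, 3, 17]  # assumed skewed claim counts for one HCC


def hcc_error_rate(n_claims, p):
    """Chance an HCC is unsupported, ASSUMING errors are independent across
    claims and the HCC fails only if every supporting claim is in error."""
    return p ** n_claims


avg_claims = mean(claims_per_beneficiary)      # pulled up by the one heavy user
mid_claims = median(claims_per_beneficiary)    # the typical beneficiary

# Approach 1: give every beneficiary the average number of claims.
rate_using_average = hcc_error_rate(avg_claims, claim_error_rate)

# Approach 2: use each beneficiary's actual claim count, then average.
rate_using_distribution = mean(
    hcc_error_rate(n, claim_error_rate) for n in claims_per_beneficiary
)

# Approach 3: full dependence, where all claims for an HCC are right or wrong together.
rate_full_dependence = claim_error_rate

print(f"mean claims = {avg_claims}, median claims = {mid_claims}")
print(f"HCC error rate using average count:      {rate_using_average:.4f}")
print(f"HCC error rate using actual distribution: {rate_using_distribution:.4f}")
print(f"HCC error rate under full dependence:     {rate_full_dependence:.4f}")
```

With these hypothetical inputs the average-based rate is about 2.7 percent, the distribution-based rate is about 20 percent, and full dependence gives 30 percent. The gap arises because the assumed error function is convex in the number of claims, so averaging the claim counts first understates the result.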
Milliman found that CMS did not state that it calculated a FFS adjuster in the technical analysis accompanying the Proposed Rule and that “the CMS analysis measured a model calibration difference rather than addressing the question of whether a FFS adjuster is required in RADV audits.” The study explains it did not attempt to identify all potential issues and makes no judgment about the appropriateness of other methodologies that could be used to determine an appropriate FFS adjuster. Milliman further notes that, depending on other potential issues and alternative assumptions and methodologies used, other valid analyses may lead to reasonable FFS adjusters that are outside the ranges considered in the paper.
12 These data, released by CMS in March and June 2019, include: diagnosis data used to calibrate the CMS-HCC model through 2011; the model calibration file used to calculate 2009 MA payments; medical record review findings from the 2008 Comprehensive Error Rate Testing review; a mapping of ICD-9 diagnosis codes to version 12 of the CMS-HCC model; MA data for a sample of RADV eligible and non-RADV eligible beneficiaries from the CMS Enrollment Data Base, Model Output File, and Monthly Membership Report for 2011; dollar coefficients and risk factors for the original data, as well as 50 simulated ‘corrected’ iterations of the data, both before and after an adjustment to account for deletion bias is made to each iteration; text file versions of the SAS programs used to conduct the analysis summarized in the study and addendum; and a variable crosswalk and sort file used in the program to conduct the analysis. Upon review of the published SAS code, Milliman verified that the CMS implementation of the process described in the technical appendix was not materially different from its reproduction of the CMS analysis.
Milliman states in the study that “we have not been able to conceive of a reasonable methodology that would lead to the conclusion a FFS adjuster is unnecessary.” Milliman summarizes its findings in the Executive Summary of the study as follows: The Centers for Medicare and Medicaid Services (CMS) issued a proposed rule13 on November 1, 2018, which contained provisions regarding risk adjustment data validation (RADV) audits. In particular, this proposed rule removed what is known as the fee-for-service (FFS) adjuster, which is a mechanism for adjusting RADV audit recoveries to ensure actuarial equivalence between FFS and MA payments. Actuarial equivalence is required by law 14. Based on the analysis described in this white paper, we determined: • A FFS adjuster, or other similar adjustment, is necessary to ensure actuarial equivalence between payments to Medicare Advantage Organizations (MAOs) and payments under Medicare FFS. • CMS analyzed the difference between two calibrations of the CMS Hierarchical Condition Category (HCC) model to investigate what it referred to as “audit miscalibration.” 15 CMS normalized the revised model inconsistently within the context of a FFS adjuster or a RADV audit; therefore its technical analysis cannot appropriately be used to conclude a FFS adjuster is not required. • CMS underestimates the level of diagnosis coding errors present in FFS claims data. Notably: o CMS assumes diagnosis coding errors are independent from each other, which materially understates HCC error rates in FFS. o CMS uses an average number of claims per HCC in its estimation of error rates rather than a distribution of the number of claims, which materially understates HCC error rates in FFS. 
o CMS excludes claims that do not have medical records or necessary documentation available, which also understates the HCC error rates in FFS relative to RADV audit procedures.
13 Medicare and Medicaid Programs; Policy and Technical Changes to the Medicare Advantage, Medicare Prescription Drug Benefit, Program of All-inclusive Care for the Elderly (PACE), Medicaid Fee-For-Service, and Medicaid Managed Care Programs for Years 2020 and 2021, 83 Fed. Reg. 54982 (2018).
14 Title 42 U.S. Code § 1395w–23(a)(1)(C)(i).
15 CMS coins the term “audit miscalibration” in its FFS adjuster executive summary. Retrieved December 20, 2018, from https://www.cms.gov/Research-Statistics-Data-and-Systems/Monitoring-Programs/Medicare-Risk-AdjustmentData-Validation-Program/Other-Content-Types/RADV-Docs/FFS-Adjuster-Excecutive-Summary.pdf. The proposed rule describes a similar concept. 83 Fed. Reg. 55041 (2018).
This white paper discusses and supports our findings that a FFS adjuster is required in RADV audits. The CMS technical analysis excluded simulated unsupported diagnoses in the calibration of the CMS-HCC model, but included them in the normalization of the model. CMS should have excluded unsupported FFS diagnoses in all steps of creating the CMS HCC model to properly address the question of whether a FFS adjuster is required in RADV audits. This paper shows had CMS excluded unsupported diagnoses from all steps, their analysis would have confirmed that a FFS adjuster is required.
Milliman further explains the purpose of their study as follows:
The purpose of this study is to evaluate the CMS conclusion that a FFS adjuster is not appropriate; it is not to determine the appropriate amount of a FFS adjuster. The study shows that using CMS’ methodology and data but adjusting for certain issues with that methodology, as described in this paper, leads to a conclusion that a FFS adjuster is required and is likely significantly greater than zero.
As described in various sections of this paper, including those titled (a) ‘CMS underestimated error rates for HCCs – Overview’, (b) ‘CMS underestimated error rates for HCCs – Is the sample size sufficient?’, (c) ‘Technical analysis - Model and data selection’, and (d) ‘Conclusion’, further study of error rates is necessary to determine the true magnitude of a FFS adjuster.
While the Milliman study does not determine the appropriate amount of a FFS adjuster, it includes an estimate of what the FFS adjuster would be if it were to be calculated using the CMS error rates and methodology with an adjustment for the normalization process and the actual number of diagnoses per beneficiary (rather than the average). 16 Milliman explains that:
Under this approach, we calculated a FFS adjuster using claim level error rates, actual distributions of the number of diagnoses (assuming full independence 17), and an HCC error rate of 33% (assuming full dependence 18), in addition to several scenarios in between. This approach resulted in estimated values of a FFS adjuster 19 between 8% and 21%. For perspective, 8% of federal payments to MAOs exceeds $16 billion and 21% exceeds $42 billion per year, 20 the majority of which are risk-adjusted. A FFS adjuster, based on CMS’s data modified to reflect reasonable error rates using an adjusted methodology (e.g., adjusts for the normalization process, the distribution of claims, and claim independence) likely lies somewhere between the two endpoints, 8% and 21%.
We also note that CMS clarified in the June 2019 Addendum that they “…excluded claims where providers refused to submit medical records, or did not provide sufficient documentation.” Although we do not have the information to evaluate the impact of these exclusions on the error rates, this exclusion is inconsistent with the RADV audit process. Properly including these unsupported diagnoses in the calculation of error rates would increase the magnitude of a FFS adjuster from the figures described in this paper.
As noted above, we make no judgment about the appropriateness of other methodologies that could be used to determine an appropriate FFS adjuster. Depending on other potential issues and alternative assumptions and methodologies used, other valid analyses may lead to reasonable FFS adjusters that are outside the range considered in this paper.
The magnitude of a FFS adjuster is highly sensitive to the specific HCC error rates used in the analysis, and the HCC error rates in the CMS analysis are highly sensitive to both the use of an average number of claims (versus a distribution of the number of claims) within an HCC and how independent the coding of one claim is to the next. Further analysis must be completed to calculate an accurate FFS adjuster.
In any case, the range is wide and even the bottom end is material and significant.
16 Milliman’s study is based on CMS’ data and methodology. We discuss additional flaws with the CMS data and methodology that Milliman did not correct for in Sections II.B.ii-iv below.
17 Independence, in this context, means diagnosis coding errors on individual claims are not related to diagnosis coding errors on other claims.
18 Dependence, in this context, means diagnosis coding errors on claims are made in the same way for all claims for a particular HCC for each beneficiary.
19 We define the FFS adjuster as the percentage reduction to a risk score based upon claim diagnoses to move to a medical record diagnosis basis for a FFS population. We calculated this percentage including beneficiaries with no HCCs and beneficiaries with one or more HCCs. When applying a FFS adjuster, care must be taken to apply it to the correct population, as the difference between the two definitions is significant. If this adjuster is applied to only beneficiaries who are RADV-eligible under the current CMS rules, the adjuster would need to be grossed up to apply only to that population.
iv.
Simplified illustrations of why actuarial equivalence requires a FFS adjuster
To see how actuarial equivalence works in practice, and why CMS violates actuarial equivalence if it does not apply a FFS adjuster, consider the following example. It is described in Table 1 and discussed in the Milliman study. This simplified example is based on an example that CMS developed when considering the need for a FFS adjuster. 21 Assume CMS develops a risk model based on four individuals in FFS Medicare. In the example, the only cost of treatment is associated with diabetes. Each of the four is coded as having diabetes. The cost of treating a person with diabetes (which is supported in the medical record) is $4,000. The cost of a person who actually does not have diabetes (the medical record has no support for diabetes) is $0. Because CMS estimates the HCC model on diagnosis codes from claims, regardless of whether they are supported by the medical record, CMS divides the $12,000 of total cost by the count of beneficiaries with a diabetes diagnosis on a claim. In this example, because four beneficiaries have diabetes on the claims, the estimated payment to treat diabetes is $12,000 divided by 4, or $3,000. Importantly, beneficiary D in Table 1 below does not have diabetes coded on the medical record, but since it is included on the claim, that person is used to determine the payment for an individual with diabetes.
20 Based on $204.7 billion in 2017 Part C federal spending. See HHS FY 2017 Budget in Brief - CMS – Medicare, available at https://www.hhs.gov/about/budget/fy2017/budget-in-brief/cms/medicare/index.html.
21 See Decl. of Daniel Meron, Ex. B at 8, UnitedHealthcare Ins. Co. v. Azar, No. 1:16-cv-00157 (D.D.C. Oct. 2, 2017), ECF No. 44-3.
Table 1. Example Showing Calculation of MA Payment Amount for Diabetes
                             Beneficiary A  Beneficiary B  Beneficiary C  Beneficiary D  Total    Diabetes Value for MA Payment
Diabetes on Claim?           Yes            Yes            Yes            Yes
Diabetes in Medical Record?  Yes            Yes            Yes            No
FFS Cost                     $4,000         $4,000         $4,000         $0             $12,000  $3,000
Now consider the example illustrated in Table 2 below, which is also described in the Milliman study and based on an example from CMS. In this example, a plan has five enrollees who had diabetes coded on claims, but three of them have diabetes supported in the medical record, and two do not. The total cost for the five beneficiaries in FFS is $12,000. However, if CMS were to recover funds for unsupported codes in a RADV audit without a FFS adjuster, CMS would take back $6,000 (for Beneficiaries D and E). This means the plan would be paid only $9,000, which is $3,000 less than under FFS. 22 The example clearly demonstrates there is not actuarial equivalence between FFS and MA when a RADV audit is performed without a FFS adjuster.
Table 2. Example Showing Actuarial Equivalence Not Achieved
               Diabetes on Claim?  Diabetes in Medical Record?  CMS Payment to Plan  Plan Cost  RADV      CMS Payment to Plan
Beneficiary A  Yes                 Yes                          $3,000               $4,000     -         $3,000
Beneficiary B  Yes                 Yes                          $3,000               $4,000     -         $3,000
Beneficiary C  Yes                 Yes                          $3,000               $4,000     -         $3,000
Beneficiary D  Yes                 No                           $3,000               $0         ($3,000)  $0
Beneficiary E  Yes                 No                           $3,000               $0         ($3,000)  $0
Total          -                   -                            $15,000              $12,000    ($6,000)  $9,000
22 As Milliman notes, in this example, no normalization step is required because total FFS dollar costs are shown; therefore the $12,000 is already effectively normalized to a risk score of 1.0.
v. Applied example demonstrating the need for a FFS adjuster that includes risk adjustment model calibration
The Milliman study builds on the above example to show the need for a FFS adjuster within the MA payment framework. Below and in Table 3 are the key elements of the analysis. For this example, Milliman created a simplified risk model using a least squares regression 23 that includes demographic and disease components, which is what CMS does when it estimates the CMS-HCC model.
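The arithmetic behind Tables 1 and 2 can be checked with a short Python sketch. The variable names are illustrative; the dollar amounts come from the two tables. The final step, which recalibrates the payment on audited records only, is an added illustration of the alternative route to actuarial equivalence noted earlier (estimating the model on audited FFS data) and is not taken from the Milliman study.

```python
# Each record: (diabetes coded on claim, diabetes supported in medical record, cost)
ffs_sample = [(True, True, 4000)] * 3 + [(True, False, 0)]          # Table 1
plan_enrollees = [(True, True, 4000)] * 3 + [(True, False, 0)] * 2  # Table 2


def count(records, predicate):
    """Number of records satisfying the predicate."""
    return sum(1 for r in records if predicate(r))


# Table 1: the payment value is calibrated on claims, so the denominator
# includes Beneficiary D even though the medical record shows no diabetes.
ffs_total_cost = sum(cost for _, _, cost in ffs_sample)
payment_per_coded = ffs_total_cost / count(ffs_sample, lambda r: r[0])

# Table 2: the plan is paid per coded enrollee, then RADV recovers the full
# payment for every code the medical record does not support.
payment_before_radv = payment_per_coded * count(plan_enrollees, lambda r: r[0])
radv_recovery = payment_per_coded * count(plan_enrollees, lambda r: r[0] and not r[1])
payment_after_radv = payment_before_radv - radv_recovery
plan_cost = sum(cost for _, _, cost in plan_enrollees)
shortfall = plan_cost - payment_after_radv

# Added illustration (assumption, not in the source tables): calibrate on
# audited records instead, then pay only for supported diagnoses.
payment_per_supported = ffs_total_cost / count(ffs_sample, lambda r: r[0] and r[1])
audited_basis_payment = payment_per_supported * count(
    plan_enrollees, lambda r: r[0] and r[1]
)

print(f"Payment per coded diabetes diagnosis: ${payment_per_coded:,.0f}")
print(f"Plan payment before RADV:             ${payment_before_radv:,.0f}")
print(f"RADV recovery without FFS adjuster:   ${radv_recovery:,.0f}")
print(f"Plan payment after RADV:              ${payment_after_radv:,.0f}")
print(f"Plan cost:                            ${plan_cost:,.0f}")
print(f"Shortfall relative to cost:           ${shortfall:,.0f}")
print(f"Payment if calibrated on audited data: ${audited_basis_payment:,.0f}")
```

The sketch reproduces the $3,000 payment value, the $6,000 recovery, and the $9,000 post-audit payment against $12,000 of cost. When calibration and audit use the same documentation standard, the payment matches cost.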
In this example, there are four individuals – two are 70 years old, one is 75, and one is 80 – and all have diabetes coded on their claims. Milliman estimated the model using the same set of assumptions that CMS uses, where only the claim is used as documentation.
Table 3. Model Estimated Based on Claim Information
                 On Claim?  FFS Cost (Actual)  Predicted FFS Cost  Relative Coefficient
Beneficiary 1
  70 year old    -          -                  $6,500              0.650
  Diabetes       Yes        -                  $3,000              0.300
  Subtotal       -          $9,000             $9,500              0.950
Beneficiary 2
  70 year old    -          -                  $6,500              0.650
  Diabetes       Yes        -                  $3,000              0.300
  Subtotal       -          $10,000            $9,500              0.950
Beneficiary 3
  75 year old    -          -                  $7,000              0.700
  Diabetes       Yes        -                  $3,000              0.300
  Subtotal       -          $10,000            $10,000             1.000
Beneficiary 4
  80 year old    -          -                  $8,000              0.800
  Diabetes       Yes        -                  $3,000              0.300
  Subtotal       -          $11,000            $11,000             1.100
Total            -          $40,000            $40,000             1.000
Milliman then applies these figures to a case in which a plan has four individuals with a claim of diabetes, but only three have the diagnosis supported in the medical record. This example assumes a plan bid of $10,000 per year. See Table 4. As discussed further in the attached Milliman report, without a FFS adjuster, actuarial equivalence will not be achieved because plan payments will be $37,000 when the actuarially equivalent amount is $40,000.
23 As noted in the Milliman study, due to the simplistic nature of this example, the least squares regression does not produce a unique solution. Milliman used SAS for the regression calculations and seeded the starting values to ensure the particular solution would most resemble the original CMS example we are expanding upon.
Table 4. RADV With and Without a FFS Adjuster
                                                                  MA Payment Without FFS Adjuster  MA Payment With FFS Adjuster
                         On Claim?  On Medical Record?  Coefficient  Before RADV  After RADV       Before RADV  After RADV
Beneficiary 1
  70 year old            -          -                   0.650        $6,500       $6,500           $6,500       $6,500
  Diabetes               Yes        Yes                 0.300        $3,000       $3,000           $3,000       $3,000
  Subtotal               -          -                   0.950        $9,500       $9,500           $9,500       $9,500
Beneficiary 2
  70 year old            -          -                   0.650        $6,500       $6,500           $6,500       $6,500
  Diabetes               Yes        Yes                 0.300        $3,000       $3,000           $3,000       $3,000
  Subtotal               -          -                   0.950        $9,500       $9,500           $9,500       $9,500
Beneficiary 3
  75 year old            -          -                   0.700        $7,000       $7,000           $7,000       $7,000
  Diabetes               Yes        Yes                 0.300        $3,000       $3,000           $3,000       $3,000
  Subtotal               -          -                   1.000        $10,000      $10,000          $10,000      $10,000
Beneficiary 4
  80 year old            -          -                   0.800        $8,000       $8,000           $8,000       $8,000
  Diabetes               Yes        No                  0.300        $3,000       $0               $3,000       $0
  Subtotal               -          -                   1.100        $11,000      $8,000           $11,000      $8,000
Total                    -          -                   1.000        $40,000      $37,000          $40,000      $37,000
Raw RADV Recovery        -          -                   -            -            $3,000           -            $3,000
FFS Adjuster             -          -                   -            -            $0               -            $3,000
Final RADV Recovery      -          -                   -            -            $3,000           -            $0
Final Payment to MAO     -          -                   -            $40,000      $37,000          $40,000      $40,000
Actuarially Equivalent?  -          -                   -            Yes          No               Yes          Yes
The Milliman study includes several additional scenarios that review the impact of calibrating the risk model with one set of documentation standards yet recovering funds in RADV audits using a different set of documentation standards. All these examples demonstrate that when the CMS-HCC model is calibrated and normalized based on unaudited claims data, a FFS adjuster is necessary under a RADV audit to maintain actuarial equivalence as required by statute.
vi. Nothing in SSA sections 1853(a)(1)(C)(ii) or (iii) changes the requirement for a FFS adjuster.
In the request for additional comment published in the Federal Register on June 28, 2019, CMS sought input “on whether section 42 U.S.C.
1395w–23 [section 1853 of the SSA] – and in particular clause (a)(1)(C), which requires risk adjustment in subclause (a)(1)(C)(i), mandates a downward adjustment of risk scores in subclause (a)(1)(C)(ii), and includes provisions about risk adjustment for special needs individuals with chronic health conditions in subclause (a)(1)(C)(iii) – mandates an FFS Adjuster, prohibits an FFS Adjuster, or should otherwise be read to inform our proposal not to apply an FFS Adjuster in any RADV extrapolated audit methodology.”
As stated above, section 1853(a)(1)(C)(i) clearly requires actuarial equivalence in payments between traditional Medicare and the MA program. 24 Actuarial equivalence can either be achieved through a FFS adjuster in assessing payment errors in the MA program, or, alternatively, through CMS estimating the risk adjustment model using audited FFS data. This interpretation was supported by CMS itself and has been upheld by two recent court decisions.
The provisions in subsections (ii) and (iii) relate to adjustments for coding intensity and for risk adjustment for new enrollees in chronic condition special needs plans. They have nothing to do with the requirement for a FFS adjuster under subsection (i). For example:
• Under the plain language of the statute, subsection (i) does not refer to subsections (ii) or (iii). The requirements are completely independent. Congress added subsections (ii) and (iii) a number of years after subsection (i), but despite multiple chances to change the actuarial equivalence language in subsection (i), did not do so. Thus, there is nothing in the statute to suggest subsections (ii) and (iii) support removal of the FFS adjuster from RADV methodology.
• The coding intensity adjustment in subsection (ii) addresses coding pattern differences between MA and FFS. CMS has expressly stated that RADV audits address coding accuracy issues, not coding pattern differences. 25 Accordingly, even if the statute gave CMS discretion to not apply a FFS adjuster based on provisions in subsection (ii) (which it does not), CMS could not avoid applying a FFS adjuster without explaining its shift in legal interpretation that the two provisions are unrelated; providing a detailed analysis demonstrating how the coding intensity adjustment allegedly undercuts the need for a FFS adjuster; and providing a comment opportunity through formal notice-and-comment rulemaking.
• Subsection (iii) requires CMS to apply a higher risk score for new enrollees in special needs plans for those with chronic conditions. This provision is clearly irrelevant to the general actuarial equivalence requirement in subsection (i) or the need for a FFS adjuster.
24 See Section 1853(a)(1)(C)(i) of the Social Security Act.
25 Announcement of Calendar Year (CY) 2019 Medicare Advantage Capitation Rates and Medicare Advantage and Part D Payment Policies, at 38-39 (April 2, 2018).
B. CMS’ study fails to support its position on the FFS adjuster
Notwithstanding the UnitedHealthcare decision and its reliance on the 2012 RADV Notice, CMS now proposes to eliminate the FFS adjuster and offers two flawed and unsupportable reasons for doing so:
• Systematic Effect: Ignoring the agency’s previous conclusion that the FFS adjuster “accounts for the fact that the documentation standard used in RADV audits . . . is different from the documentation standard used to develop the Part C risk adjustment model,” the Proposed Rule relies on the results of the CMS technical study. According to CMS, the study results “suggest[] that errors in FFS claims data do not have any systematic effect on the risk scores calculated by the CMS-HCC risk adjustment model, and therefore do not have any systematic effect on the payments made to MA organizations.”
• Inequities Between Audited and Unaudited Plans: CMS also asserts that “even if [it] had found that diagnosis error in FFS claims data led to systematic payment error in the MA program, we no longer believe that a RADV-specific payment adjustment would be appropriate. . . . Doing so would introduce inequities between audited and unaudited plans, by only correcting the payments made to audited plans.”
Below we discuss the limitations in the CMS technical study at length. In general, the CMS technical study fails to address the key issue of actuarial equivalence in the context of RADV audits. In addition, as the Milliman study points out, the level of the FFS adjuster depends in large part on the assumptions used. However, the CMS technical study contains multiple flaws and questionable assumptions that led to the calculation of artificially low error rates and, as a result, to CMS concluding that a FFS adjuster was not necessary. We reference the findings from the Milliman study where appropriate in each section on these limitations. After our discussion on the limitations of the CMS technical study, we discuss our strong disagreement with the rationale used to justify the inequities argument.
The limitations in CMS’ study are as follows:
i. Limitation #1: An analysis of the systematic effects of the risk model does not address the actuarial equivalence question in the context of RADV audits
The question CMS identified in the 2012 RADV Notice related to how the FFS adjuster applies within the context of RADV, when recoveries are made if a medical record does not support a diagnosis code, but the risk model is developed based on claims and medical records that are not reviewed.
This is the same issue addressed in the UnitedHealthcare case, where the court noted that “two figures are actuarially equivalent when they share the same set of actuarial assumptions. Different assumptions behind the elements of a calculation would, necessarily, result in actuarially non-equivalent results.” UnitedHealthcare, 330 F. Supp. 3d at 186. The Proposed Rule, however, makes no effort to address this meaning of “actuarial equivalence” in the context of RADV. As discussed in Section II.A.vi above, CMS did not even seek public comment on whether the statutory language mandating actuarial equivalence at section 42 U.S.C. 1395w–23(a)(1)(C) should be considered in the context of applying a FFS adjuster in the RADV audit methodology until the agency released additional data in June 2019. Instead, the Proposed Rule, through the CMS technical study, purports to ask and answer a different question: namely, whether diagnosis errors in FFS claims have a “systematic effect on the risk scores calculated by the CMS-HCC risk adjustment model, and therefore [a] systematic effect on the payments made to MA organizations.” This issue is irrelevant to the question of whether a FFS adjuster is needed to ensure actuarial equivalence in the context of RADV audits. That is, even assuming for the sake of argument that the CMS study actually supports the aggregate negative “systematic effect” conclusion for which it is cited (which in fact it does not), section 1853(a)(1)(C)(i) of the SSA imposes a mandatory duty on CMS to ensure actuarial equivalence. This mandatory duty was recognized in recent rulings by district courts in both the UnitedHealthcare Ins. Co. v. Azar and Poehling proceedings, as described above. This requirement does not terminate when payment is made to an MA plan. The statute’s use of the word “ensure” confirms that the actuarial equivalence requirement remains in effect through and including whatever post-payment audit process CMS may devise.
The actuarial equivalence requirement is not met when CMS estimates the model on unaudited FFS data because it uses an audit methodology that applies a documentation standard drastically different from that applied to FFS claims. The use of these different documentation standards in model estimation and payment must therefore be addressed through the application of a FFS adjuster. Not doing so violates the fundamental standards of actuarial equivalence, which require consistency between the way a risk adjustment model is developed and how it is applied. The continuing nature of the statutory duty imposed by section 1853(a)(1)(C)(i) of the SSA was at the heart of the decision in UnitedHealthcare. CMS stated before the court that the statute only imposed a duty regarding the manner in which the agency calculated initial payments made to MA plans. However, as Judge Collyer explained when questioning counsel for CMS during oral argument: “Their argument [referring to the UnitedHealthcare plaintiffs] is that by figuring the coefficients on unaudited files and then paying out but requiring repayment on anything that is not substantiated in a medical record is to start at the beginning as if it were actuarially equivalent, but set up a system whereby it no longer is. The beginning is arguably equivalent, the process is not.” Hr’g Tr. 39:14–24, UnitedHealthcare Ins. Co. v. Azar, No. 1:16-cv-00157 (D.D.C. Aug. 9, 2018), ECF No. 73. Therefore, even if errors in diagnosis coding under Medicare Parts A and B do not have a “systematic effect” on the aggregate risk scores used to calculate payments made to all MA plans, section 1853(a)(1)(C)(i) of the SSA requires CMS to take into account those diagnosis coding errors when determining how much, if anything, a particular MA plan may owe as the result of a RADV audit. 
This requires considering the particular enrollees included in the sample under review and, if contract-level extrapolation is deemed lawful, the particular enrollees included in the MA contract under review. The statute permits no construction—let alone a reasonable construction—that applies a different standard. See, e.g., Michigan v. EPA, 135 S. Ct. 2699, 2708 (2015) (“Chevron allows agencies to choose among competing reasonable interpretations of a Page 16 statute; it does not license interpretive gerrymanders under which an agency keeps parts of statutory context it likes while throwing away parts it does not.”). ii. Limitation #2: The CMS technical study is based on inconsistent and arbitrary data CMS uses three sources of input data for its technical study: 1) Calendar Year (CY) 2008 Comprehensive Error Rate Testing (CERT) audit data 2) 2004-2005 FFS claims data 3) Two million MA records sampled from the 2011 overpayment run, split evenly between RADV eligible and non-RADV eligible beneficiaries 26 CMS does not explain why it chose each of these sources, in terms of the data or the time periods which the data represent. However, the data and time periods raise numerous questions. For example: • • • • More recent FFS claims data and MA records could have been used, rather than 20042005 FFS claims and 2011 MA data. By conducting its analysis using data sources from different years, the study may be inappropriately accounting for differences in health care treatment patterns between these different time periods. One-half of the MA records relate to beneficiaries who are not eligible to be included in RADV audits, which raises serious questions about whether the study could accurately assess the impact of removing diagnosis errors in a RADV audit of beneficiaries who are eligible to be included. 
• CMS acknowledges on page 55037 of the Proposed Rule that the CMS-HCC model is “recalibrated approximately every 2 years to reflect newer treatment and coding patterns in Medicare FFS.” Given this practice, it is unclear why CMS is relying on a point-in-time analysis of coding patterns from over 10 years ago.

The absence of any explanation for these data sources raises transparency concerns as commenters are left without an indication of the rationale for CMS’ decision to use these data.

26 Per Section 1128J(d) of the SSA and the overpayment regulation 42 CFR §422.326, all MA plans are required to report and return overpayments. CMS recovers these overpayments on an annual basis by conducting “risk score reruns” for prior payment years within a six-year lookback period. From the data subsequently released by CMS in March 2019, we understand that CMS sampled MA records from those submitted for PY 2011 (2010 dates of service). However, CMS does not clarify that these data had been processed for overpayment recovery or specify during which calendar year the 2011 overpayment run took place (the most recent deadline for PY 2011 overpayment submission was July 6, 2018 and risk scores for 2011 were rerun for overpayment recovery purposes in payments made to MA plans on October 1, 2018).

iii. Limitation #3: The CMS technical study is not a RADV-like review

In the 2012 RADV Notice, CMS states that the amount of the FFS adjuster would be “calculated by CMS based on a RADV-like review of records submitted to support FFS claims data.” However, CMS did not conduct a RADV-like review of FFS claims data. In a RADV audit, CMS randomly selects 201 beneficiaries from an MA contract in equal numbers across three strata based on risk score, determines whether the HCCs submitted for payment for each of these beneficiaries are substantiated based on a prescriptive medical record review, and then determines a payment error based on the difference between the original risk scores calculated for these beneficiaries and the corrected risk scores based on the medical record review. This payment error would then be extrapolated to determine a final recovery amount.

Instead of following the RADV methodology in reviewing FFS claims data, CMS designed its flawed study to determine the impact on the CMS-HCC risk model of diagnosis coding errors in FFS claims data. In addition, the RADV parameters were not strictly followed. For example:

• Instead of measuring diagnosis discrepancies using data sampled at the beneficiary level, CMS generated an estimate of these errors by using data sampled at the claims level. In its methodology, CMS commits a number of errors in assigning the beneficiary-level error rate based on this review of the sampled claims data. As a result, CMS’ methodology is fatally flawed and does not provide an accurate representation of the beneficiary-level error rates.
  o Specifically, instead of selecting a random sample of FFS beneficiaries and reviewing the medical records to support each risk adjustment-eligible diagnosis reported for those beneficiaries, CMS reviewed a random sample of 8,630 FFS outpatient claims from the CERT audit data.
  o The CERT data do not resemble a RADV audit sample in any way, and CMS also admits that these data lack a large enough sample size for many HCCs to generalize error rates to the total population.
  o Even if the CERT audit data were somehow shown to be appropriate, which has not occurred, the 2008 data predate the implementation of ICD-10 codes and therefore have questionable applicability to diagnosis coding trends in today’s environment.
• CMS includes beneficiaries who are not eligible for inclusion in a RADV study. Given that one-half of the MA records used in the CMS technical study were not RADV eligible, the study does not represent a ‘RADV-like’ review of claims data, as CMS stated it would conduct to generate the FFS adjuster.
• The CMS methodology does not take into account the process plans must follow to validate HCCs in RADV medical record review, which allows submission of a specified number of medical records to substantiate an HCC.
• As noted in its addendum to the technical study, CMS “excluded claims where providers refused to submit medical records, or did not provide sufficient documentation” rather than determining the diagnoses on these claims were not supported, as would have occurred in a RADV audit.
• Instead of comparing original risk scores from a sample of beneficiaries to corrected risk scores based on medical record review, CMS calculated coefficients based on FFS claims data reflecting the simulated error rates and applied those coefficients to MA data.

As the technical analysis that CMS performed in no way resembled a “RADV-like review” – which the agency stated it would conduct in order to calculate the amount of the FFS adjuster – and relies on statistical concepts not relevant to the key issue of actuarial equivalence, the study cannot be used to support a position that the FFS adjuster is not necessary.

iv. Limitation #4: CMS makes invalid statistical assumptions about claim independence

A critical assumption that allows CMS to find that a FFS adjuster is not necessary is that Medicare claims are independent of one another. By making this assumption, CMS dramatically understates the likelihood that a beneficiary will have a diagnosis coding error corresponding to a given HCC. For example, in its review of the 2008 CERT data, CMS finds that for HCC 80 (Congestive Heart Failure), 156 of 519 claims were discrepant, for an error rate of 30.1 percent.
The agency further notes that an average enrollee with HCC 80 would have about six claims (6.1 on average), which, under the independence assumption, leads to a probability that HCC 80 would be in error for a beneficiary of 0.301 ^ 6.1 ≈ 0.0007, or less than 0.1 percent. Not surprisingly, when CMS applies such low error rates to its data, the impact of removing diagnosis codes is minimal. On page 9 of the CMS technical study appendix, CMS asserts – without any additional support – that “each enrollee HCC potentially has multiple claims with independent supportive medical records.” In reality, coding errors will not be independent from one claim to the next, especially if the patient is seeing the same healthcare provider. As pointed out in the Milliman study: “We believe it is more likely that a provider or medical coder would tend to make similar errors from one claim to the next based upon their work habits, training, office practices, and by looking at their own prior diagnosis coding when coding a subsequent claim; thus errors would be correlated to at least some degree. The assumption that providers code randomly must hold to assume independence.” Further, Milliman points out: “This independence assumption can be expected to result in HCC-level error rates that are significantly lower than if providers or medical coders make errors that are related to each other, perhaps from copying diagnoses from a prior visit or from particular personnel repeatedly making the same type of error.” Avalere makes a similar point in its study, noting that “it is probable that there are correlations in errors. For example, a healthcare provider submitting multiple claims for the same beneficiary might repeat the same coding or documentation error.” To summarize, CMS cannot multiply probabilities when the events are not independent. If the same provider has seen the enrollee, it is far more likely that the events are dependent.
If there is a 50 percent chance that a provider will make an error when seeing an enrollee, that same probability applies regardless of how many visits the enrollee has to the provider. This assumption by CMS – which Milliman critiques in its analysis – is simply not credible.

v. Limitation #5: The average number of claims per beneficiary cannot be used to determine a beneficiary level error rate

Milliman notes that using an average number of claims per member in calculating a beneficiary level error rate ignores the fact that the number of claims per person will vary. Avalere makes a similar argument in its study. 27 That is, some beneficiaries have more than the average number of claims, and some have fewer. In addition, the data are not normally distributed, as noted in the Avalere study. That is, a small number of enrollees have a large number of claims, which skews the distribution of claims underlying the CMS study.

27 Op cit. 9

Milliman describes an example using HCC 55 (Major Depressive, Bipolar, and Paranoid Disorders), which CMS assumes has about a 50 percent error rate on a HCC basis. Milliman presents an illustrative example where two beneficiaries have HCC 55; one has two claims, while the other has 10. In this case, the error rate, assuming independence of diagnosis coding errors, is 0.5 ^ 2 = 0.25 for the first beneficiary and 0.5 ^ 10 = 0.001 for the second. By averaging these error rates, Milliman finds the average error rate is 0.125. By contrast, CMS’ methodology, which focused on an average number of claims for all beneficiaries, would result in an average error rate of 0.5 ^ 6 = 0.016. In other words, Milliman finds an average error rate in this example that is nearly eight times higher than the error rate calculated using CMS’ approach. In its analysis, Milliman adjusts for this methodological error by using the actual distribution of claims, rather than the average number of claims.
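The arithmetic behind Limitations #4 and #5 can be reproduced with a short sketch. It is illustrative only: the function name `all_claims_in_error` and the variable names are our own, the inputs are the figures quoted above (a 30.1 percent claim-level rate with 6.1 claims for HCC 80, and Milliman's two-beneficiary illustration for HCC 55), and it is not the code used by CMS, Milliman, or Avalere.

```python
from statistics import mean


def all_claims_in_error(claim_error_rate, n_claims):
    """Probability that every claim supporting an HCC is in error,
    ASSUMING claim-level errors are independent (the assumption criticized above)."""
    return claim_error_rate ** n_claims


# HCC 80 figures quoted from the CMS study: 30.1% claim-level rate, 6.1 claims on average.
hcc80_independent = all_claims_in_error(0.301, 6.1)

# Milliman's illustration for HCC 55: two beneficiaries with 2 and 10 claims.
claims_per_beneficiary = [2, 10]
per_beneficiary = [all_claims_in_error(0.5, n) for n in claims_per_beneficiary]

# Average of the beneficiary-level rates (actual distribution of claims).
average_of_rates = mean(per_beneficiary)

# Rate evaluated at the average claim count (the approach attributed to CMS).
rate_at_average = all_claims_in_error(0.5, mean(claims_per_beneficiary))

# If errors were perfectly correlated across claims, the claim count would not matter.
fully_correlated = 0.5

print(f"HCC 80, independence assumed: {hcc80_independent:.5f}")
print(f"HCC 55 per-beneficiary rates: {per_beneficiary}")
print(f"Average of rates:             {average_of_rates:.4f}")
print(f"Rate at average claim count:  {rate_at_average:.4f}")
print(f"Ratio:                        {average_of_rates / rate_at_average:.1f}x")
print(f"Fully correlated errors:      {fully_correlated}")
```

Because the error probability falls off exponentially with the number of claims, evaluating it at the average claim count understates the average of the beneficiary-level probabilities whenever claim counts vary. Any positive correlation among errors raises the beneficiary-level rate further, toward the claim-level rate itself.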
Due to limitations in the data provided by CMS, Milliman used the 5 percent Limited Data Set (LDS) claims files for this analysis. The Avalere study also used the LDS files. Milliman’s and Avalere’s use of the actual number of claims represents a much more accurate depiction of how error rates can be applied to determine a FFS adjuster. By using the average number of claims instead of the actual number of claims, CMS calculated an inaccurate estimate of the audit miscalibration. Making this adjustment, as Milliman demonstrates, dramatically increases the level of the necessary FFS adjuster and proves that CMS’ conclusion that a FFS adjuster is not necessary is flawed.

vi. Limitation #6: CMS does not properly “normalize” the risk adjustment model in its simulations

In its methodology, CMS excludes unsupported diagnosis codes in the calibration of the CMS-HCC model. However, when transforming the coefficients calculated by the risk model to relative factors used to determine a risk score – a process referred to as normalization, which ensures the overall risk score is 1.0 – CMS includes the unsupported diagnosis codes and therefore does not correctly normalize the model. As Milliman finds, this step leads CMS to its erroneous conclusion that a FFS adjuster is not necessary, when in fact adjusting for this error alone demonstrates that a FFS adjuster is necessary. In addition, the magnitude of the FFS adjuster is material (that is, nonzero) – even when using the average number of claims, which as noted above is not a valid assumption.

CMS, in its Addendum to the FFS study, 28 provided additional information on the normalization process. This additional detail, however, does not correct for any of the flaws inherent in the agency’s chosen methodology. In particular, CMS includes a mathematical “explanation” of its approach to “offset deletion bias”.
CMS also described the Inflated Post-Audit Risk Score (IPARS) adjustment that it made to “offset the bias that the deletion procedure itself creates in expenditures.”

28 Op cit. 3

Milliman reviewed the mathematical arguments put forward by CMS in the Addendum and found notable flaws in CMS’ approach. In particular, Milliman notes the following:

The mathematical explanation contains some errors. For example, step 2 defines $I_{ji}$ as the complete matrix of all HCC disease indicators and further that the sumproduct of all coefficients and indicators is equal to the total FFS expenditure (E):

$$\sum_{j=1}^{m} \sum_{i=1}^{p} b_{ji} I_{ji} = \sum_{j=1}^{m} E_{j}$$

However, the disease indicators do not include demographic variables, which are included in the CMS HCC model and explain a significant portion of expenditures. Further, the use of averages to describe coefficient values in step 5 is inconsistent with Ordinary Least Squares (OLS) because it ignores the difference in weight and frequency of the coefficients and independent variables within the regression model. If regression concepts were considered rather than average coefficient values, then the removal of a disease indicator for a beneficiary with above average spend for that HCC would decrease, rather than increase (as CMS described in step 6), the coefficient value resulting from OLS. However, these mathematical problems with the CMS explanation should not be expected to invalidate the overall conclusion that, when the CMS HCC model is calibrated and normalized to produce the total FFS expenditures on separate sets of independent variables, the total always balances to the total FFS expenditure.

With respect to the IPARS, Milliman finds that CMS’ calculation of the IPARS is itself evidence of the need for a FFS adjuster.
IPARS represents an interim step in the methodology that was implicit in the description CMS provided in its technical appendix but made explicit in the Addendum – IPARS is the name CMS ascribes to the process through which the CMS-HCC model was normalized inconsistently in order for CMS to conclude that a FFS adjuster is not necessary. In particular, Milliman states that:

“CMS calculates IPARS to be 0.9%. The CMS Addendum does not discuss the significance of IPARS; however, a non-zero IPARS demonstrates the need for a FFS adjuster. Further, the CMS IPARS calculation is consistent with our calculation of a payment discrepancy of 1.1% in the next section titled ‘Adjustment of CMS technical approach.’ As demonstrated in the examples and conceptual discussion, above, this difference in risk score and payment results is evidence of the need for a FFS adjuster in RADV audits. If the technical issues with CMS’s estimated HCC error rates were resolved, IPARS would be dramatically larger, emphasizing the critical need for a FFS adjuster.”

C. CMS’ claim that a FFS adjuster would result in inequity among plans is not credible or reasonable

CMS provides an alternative rationale for not applying a FFS adjuster: even if the CMS technical study might otherwise support a FFS adjuster, the agency believes a FFS adjuster should not be applied because it would create inequities between audited and unaudited plans. We strongly disagree. The FFS adjuster is needed to ensure compliance with the actuarial equivalence requirement in the statute. The issue of actuarial equivalence arises whenever CMS seeks to apply a documentation standard for payment (medical records) that differs from the standard used in developing the risk adjustment model (claims).
Plans not subject to a RADV audit may still be subjected to different document standards if they face overpayment claims by the government, and two district courts have recently held that the actuarial equivalence requirement must be satisfied in the overpayment context. Thus, equity requires the application of a FFS adjuster to RADV audits. Further, unaudited plans that do not face payment recovery issues are not adversely affected by different documentation standards, and therefore the use of a FFS adjuster would not adversely affect them. In addition, even assuming CMS was otherwise correct in identifying certain cases where there is a potential for different treatment, it would still be unreasonable for CMS to use the alternative rationale to avoid implementing a FFS adjuster. CMS seeks to justify not applying corrective action when such action is required to maintain actuarial equivalence for payments made to a specific MA plan undergoing a RADV audit, simply because no such corrective action will be taken by CMS with respect to MA plans not undergoing RADV audits. In other words, CMS wants to withhold fairness for some because the agency refuses to do justice to all. That interpretation is clearly not permitted under section 1853(a)(1)(C)(i) of the SSA. See Global Tel*Link v. FCC, 866 F.3d 397, 418 (D.C. Cir. 2017) (Silberman, J., concurring) (“Chevron’s second step can and should be a meaningful limitation on the ability of administrative agencies to exploit statutory ambiguities, assert farfetched interpretations, and usurp undelegated policymaking discretion.”). Moreover, the recent decision in Poehling confirms that the statute requires that CMS adjust payments to ensure actuarial equivalence. See United States ex rel. Benjamin Poehling v. UnitedHealth Group, Inc., No. 16-08697, 2019 WL 2353125 (C.D. Cal. Mar. 28, 2019). CMS does not have the discretion to ignore that statutory requirement simply because it believes that requirement to be inequitable. 
As a membership organization representing the MA industry, we can say without hesitation that our members support a FFS adjuster regardless of whether one of their contracts is selected for audit. We believe strongly that CMS is required to implement a FFS adjuster in payment recovery activities, and that such an adjustment is not only necessary to achieve actuarial equivalence but is equitable for both audited and unaudited plans in that context.

D. The FFS adjuster is required under the “same methodology” language in the statute

Section 1853(b)(4)(D) of the SSA requires that in computing expenditures for traditional Medicare, CMS must use the “same methodology as is expected to be applied in making payments to [MA plans].” CMS violates this statutory command if the “‘methodology’ applied in ‘making payments’ to [MA] insurers involves reconciliation based strictly on audited diagnosis codes for [MA] patients, in sharp contrast to unverified diagnosis codes for traditional Medicare patients from which payment rates were set.” UnitedHealthcare, 330 F. Supp. 3d at 187. Just like the 2014 Overpayment Rule vacated in UnitedHealthcare, the Proposed Rule “fails to recognize a crucial data mismatch, and without correction, it fails to satisfy [1853(b)(4)(D)].” Id. A similar statutory interpretation – that this section of the statute is applicable to risk adjustment payments, and not just to CMS’ annual reporting requirements as the Government had argued – was also recently upheld in the U.S. District Court for the Central District of California in its ruling in United States ex rel. Benjamin Poehling v. UnitedHealth Group, Inc.: “But the Court is unpersuaded that the statute is so limited [to CMS’ annual reporting requirement], given that the face of the statute also requires ‘computation [of] … [t]he average risk factor for the covered population . . . using the same methodology as is expected to be applied in making payments’ to MA plans.” No. 16-08697, 2019 WL 2353125, at *5 (C.D. Cal. Mar. 28, 2019) (citing 42 U.S.C. § 1395w-23(b)(4)) (emphasis in original).

III. CMS’ extrapolation proposal is procedurally defective, exceeds the Agency’s statutory authority, and is arbitrary and capricious

A. The Proposed Rule fails to adequately reference the legal authority for extrapolation

CMS asserts that it may use contract-level extrapolation “based on longstanding case law and best practices from [the Department of Health and Human Services] and other federal agencies” (Preamble p. 54984). However, agency authority must be derived from statute, and the Proposed Rule never specifies what statute CMS believes grants it the authority to use extrapolation with respect to MA plans. As a result, the Proposed Rule violates the fundamental requirement imposed by the Administrative Procedure Act (APA) whereby a notice of proposed rulemaking must include “reference to the legal authority under which the rule is proposed.” 29 The Proposed Rule also does not identify the “longstanding case law” referred to by the agency, thereby requiring the public to speculate regarding the decision(s) on which CMS might be relying. The APA’s notice-and-comment requirements do not permit such an approach.

29 5 U.S.C. § 553(b)(2); see also Attorney General’s Manual on the Administrative Procedure Act 29 (1947) (“The reference [to legal authority required by § 553(b)(2)] must be sufficiently precise to apprise interested persons of the agency’s legal authority to issue the proposed rule.”).

B. CMS does not have statutory authority to extrapolate RADV audit results

CMS does not have authority to use contract-level extrapolation against MA plans under the SSA. Most case law related to extrapolation does not address the threshold question of whether a federal agency has statutory authority to use extrapolation.
Instead, it addresses the separate question of whether the use of extrapolation violates the constitutional right to due process. 30 The only appellate decision of which we are aware that addressed a somewhat similar statutory-authority question did so solely with respect to the use of extrapolation in FFS Medicare at a time when such extrapolation had already become a “long-standing and well-established practice” as applied to providers of services and suppliers under Medicare Parts A and B. Chaves Cnty. Home Health Serv., Inc. v. Sullivan, 931 F.2d 914, 923 (D.C. Cir. 1991). Even in that instance, however, the D.C. Circuit openly acknowledged that the question of statutory authority to use extrapolation was “close.” Id. at 923. The D.C. Circuit also found that nothing in the Medicare Act at the time spoke directly to the use of extrapolation. See id. at 916–18. However, after repeatedly noting that the appellants (three home health agencies) failed to challenge the statistical validity of the calculations at issue, the court found that the use of extrapolation in the particular context before it represented a reasonable interpretation of the “authority to recoup overpayments from providers.” Id. at 916-17, 921-22 (emphasis added).

Yet much has changed since the D.C. Circuit’s decision in Chaves County, the statutory-authority holding of which has essentially gone untested in any other circuit court of appeals. Not only has that holding been undermined by the Supreme Court’s rejection of the “novel project” of “Trial by Formula,” 31 the government itself has acknowledged the need for legislation before proceeding as suggested in the Proposed Rule, explaining in testimony before Congress:

The President’s Budget includes seven legislative and administrative proposals that will strengthen efforts to fight Medicare and Medicaid fraud and abuse . . .
Legislative Proposals Included in the Budget

Extrapolate MA Plan Sample Error Rate to Entire Plan Payment in Risk Adjustment Audits: Historically, CMS has only recovered overpayments from risk adjustment errors found in the audited sample. This proposal would require that CMS recover risk adjustment overpayments by extrapolating sample error rates to all audited plans through risk adjustment validation (RADV) audits. The plan payment will only be adjusted on a statistically valid sample of beneficiaries . . . 32

30 See, e.g., Ratanasen v. Cal. Dep’t of Health & Human Servs., 11 F.3d 1467, 1469–71 (9th Cir. 1993) (addressing bankruptcy court’s use of extrapolation with respect to amounts owed to state Medicaid program); Yorktown Med. Lab., Inc. v. Perales, 948 F.2d 84, 89–90 (2d Cir. 1991) (addressing state Medicaid agency’s use of extrapolation); Ill. Physicians Union v. Miller, 675 F.2d 151, 154–56 (7th Cir. 1982) (same); see also Mich. Dep’t of Educ. v. U.S. Dep’t of Educ., 875 F.2d 1196, 1204–06 (6th Cir. 1989) (addressing whether federal agency’s use of extrapolation in recouping vocational-rehabilitation funds from State satisfied substantial-evidence standard); Georgia ex rel. Dep’t of Human Res. v. Califano, 446 F. Supp. 404, 409–10 (N.D. Ga. 1977) (addressing whether federal agency’s use of extrapolation in recouping Medicaid funds from State was arbitrary and capricious).

31 Wal-Mart Stores, Inc. v. Dukes, 564 U.S. 338, 367 (2011). Writing for the Supreme Court in Dukes, Justice Scalia found that it was improper to certify a class action on the premise that the defendant would only be able to litigate its defenses with respect to monetary claims asserted by a sample of class members, the outcome of which would then be extrapolated to the class as a whole. See id. The Supreme Court recently went further by limiting the use of extrapolation to those instances where statistical evidence would be relevant in adjudicating an individual claim of liability. See Tyson Foods, Inc. v. Bouaphakeo, 136 S. Ct. 1036, 1046 (2016).

It would “strain[] credulity to suggest that” the government submitted such a request to Congress “without analyzing the relevant statutes.” U.S. House of Representatives v. Burwell, 185 F. Supp. 3d 165, 186 (D.D.C. 2016).

Further, Congress has not stood silent with respect to the use of extrapolation. Instead, it has authorized CMS to use extrapolation, but only with respect to a limited universe of Medicare overpayments and only under carefully prescribed circumstances. In 2003, Congress added section 1893(f) to the SSA, entitled “RECOVERY OF OVERPAYMENTS.” The new subsection (f) combined a collection of overpayment-related provisions specific to a “provider of services or supplier,” which are terms of art that refer to physicians, hospitals, and other entities but do not include MA organizations. The list of overpayment-related provisions for a “provider of services or supplier” included the use of repayment plans; limitations on recoupment; the provision of supporting documentation; the use of consent settlements; notice of code overutilization; and payment audits. Importantly, subsection (f)(3), entitled “LIMITATION ON USE OF EXTRAPOLATION,” states:

A [M]edicare contractor may not use extrapolation to determine overpayment amounts to be recovered by recoupment, offset, or otherwise unless [CMS] determines that—
(A) there is a sustained or high level of payment error; or
(B) documented educational intervention has failed to correct the payment error.
There shall be no administrative or judicial review under section 1869 [referring to appeal rights specific to Medicare Parts A and B], section 1878 [referring to additional appeal rights specific to certain providers of services under Part A], or otherwise, of determinations by [CMS] of sustained or high levels of payment errors under this paragraph.
The language of paragraph (3), which is included in the midst of a subsection focused entirely on overpayment issues related to providers and suppliers under Medicare Parts A and B, does not provide CMS with authority to use extrapolation with respect to anyone other than providers and suppliers under Medicare Parts A and B. “Statutory language cannot be construed in a vacuum. It is a fundamental canon of statutory construction that the words of a statute must be read in their context and with a view to their place in the overall statutory scheme.” Sturgeon v. Frost, 136 S. Ct. 1061, 1070 (2016) (internal quotation marks and citation omitted). Nor can CMS use Congress’s specific grant of extrapolation authority with respect to Medicare Parts A and B as an implicit grant of such authority with respect to Medicare Part C. See, e.g., Ry. Lab. Executives’ Ass’n v. Nat’l Mediation Bd., 29 F.3d 655, 670 (D.C. Cir. 1994) (en banc) (“Unable to link its assertion of authority to any statutory provision, the [agency’s] position in this case amounts to the bare suggestion that it possesses plenary authority to act within a given area simply because Congress has endowed it with some authority to act in that area. We categorically reject that suggestion.”).

32 Departments of Labor, Health and Human Services, Education, and Related Agencies Appropriations for 2011: Hearings Before the H.R. Comm. on Appropriations, 111th Cong. pt. 7 at 14 (2010) (written statement of William Corr, Dep’t Sec’y, Dep’t of Health & Human Servs.); see also Ctrs. for Medicare & Medicaid Servs., Dep’t of Health & Human Servs., Fiscal Year 2011 Performance Budget 177 (2010) (describing proposal that would “[c]larify in statute that CMS can extrapolate the error rate found in the risk adjustment validation (RADV) audits to the entire MA plan payment for a given year when recouping overpayments”).
Furthermore, even if one were to view section 1893(f)(3) in isolation, the Proposed Rule makes no effort to explain how the statute’s prerequisites for the use of extrapolation (i.e., a determination of a “sustained or high level of payment error” or “documented educational intervention [that] has failed to correct the payment error”) have been satisfied with respect to those MA plans selected to undergo RADV audits. See also H.R. Rep. No. 108-391, at 785 (2003) (Conf. Rep.) (explaining that “[e]xtrapolation is limited to those circumstances where there is a sustained or high level of payment error, as defined by [CMS] in regulation, or document[ed] educational intervention has failed to correct the payment error”).

C. CMS’ extrapolation proposal creates an unlawful presumption

Even if CMS has statutory authority to use extrapolation under certain circumstances with respect to MA plans, it would not mean CMS has statutory authority to use extrapolation in any manner that it chooses. For example, decisions such as Chaves County speak to the use of extrapolation on a provider-specific basis, where it is at least plausible that a provider who is demonstrated to have submitted incorrect payment claims under certain circumstances with respect to certain patients did so with respect to other, similarly situated patients under the same provider’s care during the same time period. The Proposed Rule, in contrast, would establish a system whereby CMS reviews a sample of patients treated by certain healthcare providers under contract with an MA plan, determines whether those particular providers maintained (in CMS’ view) adequate medical documentation to support the diagnoses they reported to the MA plan, and applies an error rate to a universe of diagnoses reported to the MA plan by thousands of other, unrelated providers simply because they, too, are under contract with the MA plan.
In doing so, CMS would essentially establish a presumption that is impossible for an MA plan to rebut. The MA plan would have no way of establishing the existence of sufficient supporting medical documentation related to the extrapolated universe of cases because the total overpayment amount will not be tied to specific providers and specific patients. The establishment of such a presumption exceeds CMS’ statutory authority. “[A]n agency is not free to ignore statutory language by creating a presumption on grounds of policy to avoid the necessity for finding that which the legislature requires to be found.” United Scenic Artists v. NLRB, 762 F.2d 1027, 1034 (D.C. Cir. 1985). The creation of such a presumption “is beyond the [agency’s] statutory authority.” Id. at 1035. At a minimum, there must be a “sound factual connection . . . between the facts giving rise to the presumption and the facts then presumed.” Holland Livestock Ranch v. United States, 714 F.2d 90, 92 (9th Cir. 1983) (internal quotation marks and citation omitted). No such connection exists with respect to the presumption created by the extrapolation regime contained in the Proposed Rule, which avoids using the word “presumption” even though CMS has previously acknowledged that the use of statistical sampling and extrapolation does, in fact, create a presumption. See, e.g., Use of Statistical Sampling to Project Overpayments to Medicare Providers and Suppliers, HCFA Ruling No. 86-1, at 11 (Feb. 20, 1986) (“Sampling only creates a presumption of validity as to the amount of an overpayment which may be used as the basis for recoupment.”).
Just because a contracted provider maintained documentation that CMS, years after the fact, believes is insufficient to support diagnoses reported for a particular patient, does not logically suggest that the same is true of all patients treated by that provider, let alone that the same is true of all other providers under contract with the MA plan treating other patients. “The conclusion . . . simply does not follow from the premise,” rendering the presumption beyond CMS’ statutory authority. United Scenic Artists, 762 F.2d at 1035.
IV. CMS’ published extrapolation methodology is so flawed that implementation would be arbitrary and capricious
AHIP commissioned a study by Wakely Consulting Group that examined the RADV sampling and extrapolation methodology applied in the contract-level audits conducted by CMS, but not finalized, for payment years 2011 to 2013. 33 The Wakely study identified several significant areas of concern including:
• CMS’ extrapolation approach is subject to a high degree of randomness and could result in inequitable treatment of similar contracts, because the application of the RADV process to contracts with similar average error rates may yield materially different payment penalties. The use of relatively small samples (201 enrollees), as well as the fact that coding errors can be rare, may result in erratic penalty results. To examine this issue, Wakely explored various scenarios of contract size and assumed coding error rates. For each scenario, Wakely ran 100,000 simulations of the RADV sampling process (selecting 201 enrollees). In one of the scenarios, Wakely assumed that 10 percent of HCCs are unsupported (coded but not supported by medical record), while 6 percent are supported but not reported (i.e., not coded but supported in medical record). 34 Wakely assumes a standardized bid of $850 per member per month (PMPM) for their simulations. They find wide variation in the penalties – from $0 PMPM to $67.07 PMPM – and note that “such variation in payment penalty for randomly chosen RADV samples from the same contract is obviously problematic.” 35
• The methodology is sensitive to which beneficiaries and conditions are included in the sample, because certain diseases can have a disproportionate impact on the payment error. As one example, based on a simulation of the RADV process, a single unsupported diagnosis of metastatic cancer could increase the payment penalty by 16.7 percent. Because the methodology does not take into account diagnosis-specific error rates, which are acknowledged by CMS to vary, penalties from one sample may be higher due to the “luck of the draw” for which diagnoses are selected (i.e., two different samples from the same plan could yield different payment penalties due to randomly selected diagnoses having a higher incidence of coding errors in the industry). Wakely notes that “random chance could drive material swings in extrapolated payment penalties” as a result. 36
• The methodology could drive bias against higher enrollment contracts and contracts with low absolute risk scores. The sampling approach makes proportionally higher penalties more likely for larger enrollment contracts compared to smaller contracts. Wakely finds that “larger contract sizes are generally penalized by greater randomness in penalties.” 37
• The Proposed Rule explains that in choosing the enrollee population from which a sample will be taken for each MA contract selected for a RADV audit, CMS requires that enrollees have had “at least one diagnosis during the data collection year leading to at least one CMS-HCC assignment in the payment year.” In other words, CMS excludes enrollees with no HCCs, thereby eliminating the possibility that such enrollees will be included in the sample from which CMS derives an overall payment error rate. Wakely notes that this practice “biases the sample payment error rate upwards.” The report further explains: “Excluding non-HCC members from the RADV audit samples biases the observed payment error by removing potential supported but not reported codes for non-HCC members. This makes the expected observed payment error rate higher than the true payment error rate over the entire contract (RADV-eligible plus non-eligible).” In addition, the process for RADV that CMS uses for the exchange plans specifically considers the no-HCC population as its own stratum. 38
• The methodology has a nonzero probability of yielding a payment penalty higher than the actual payment error, which is problematic even at very low probabilities. Such a penalty could yield a significant forfeiture of funding due to the randomness of the sampling methodology, and not due to coding accuracy.
33 Murray, T., Morgan, E., Sauter, M. Medicare RADV: Review of CMS sampling and extrapolation methodology. Wakely Consulting Group. July 2018. Available at: https://www.ahip.org/wp-content/uploads/2018/07/WakelyMedicare-RADV-Report-2018.07.pdf.
34 Wakely also ran other scenarios based on different assumptions about the level of unsupported versus supported diagnoses – the assumption here is primarily used for illustrative purposes.
35 Ibid., p. 11.
36 Ibid., p. 12.
37 Ibid.
38 Centers for Medicare & Medicaid Services, Center for Consumer Information and Oversight. 2017 benefit year protocols: PPACA HHS risk adjustment data validation [Version 2.0]. August 10 2018. Available at: https://www.regtap.info/uploads/library/HRADV_2017Protocols_Updates_v2.0_081018_v1_5CR_081018.pdf. For example, CMS notes on page 28 of this document “With the No-HCC population, the risk score errors will likely be under-statements, meaning the No-HCC risk scores should be adjusted upward.”… “Consequently, there is some risk that CMS may be understating the error rate, variance, and risk score assumptions for the No-HCC stratum.”
Therefore, even assuming for the sake of argument that CMS has statutory authority to extrapolate RADV audit results, AHIP believes the extrapolation methodology is so flawed that, if it were finalized, it would amount to an arbitrary and capricious agency action. For example, the methodological flaws, which lead to random results that cause similarly situated MA plans to be treated differently, are a hallmark of arbitrary agency action. See, e.g., Cnty. of L.A. v. Shalala, 192 F.3d 1005, 1022 (D.C. Cir. 1999) (“A long line of precedent has established that an agency action is arbitrary when the agency offers insufficient reasons for treating similar situations differently.”) (internal quotation marks, citations, and brackets omitted). Further, as noted in Section III.C above, CMS presumes that provider documentation practices are uniform throughout the universe of providers under contract with an MA plan, a presumption that appears clearly arbitrary without evidence, particularly given that, to our knowledge, CMS has never applied extrapolation on anything other than a provider- or supplier-specific basis throughout the history of Medicare Parts A and B.
V. CMS’ retroactive application of the regulation is impermissible and not necessary or justified
Section 1871(e)(1) of the SSA specifies that a “substantive change in regulations, manual instructions, interpretative rules, statements of policy, or guidelines of general applicability under” title XVIII of the SSA shall not be applied retroactively to “items and services furnished before the effective date of the change” unless CMS makes one of two determinations.
Either “such retroactive application is necessary to comply with statutory requirements;” or “failure to apply the change retroactively would be contrary to the public interest.”
The Proposed Rule states that CMS intends to extrapolate RADV audit results beginning with payment year 2011. The Proposed Rule also states that even though the 2012 RADV Notice promised that CMS would apply the FFS adjuster in RADV audits beginning with those related to payment year 2011, CMS intends to break its promise with respect to past payment years. CMS solicits comment on whether applying the methodology to previous plan year audits would constitute retroactive rulemaking. However, CMS also indicates that even if doing so would constitute retroactive rulemaking, CMS will invoke authority under section 1871(e)(1)(A) to engage in such rulemaking. The changes contained in the Proposed Rule clearly constitute retroactive rulemaking. We also believe the changes clearly exceed CMS’ limited authority to engage in retroactive rulemaking.
A. The Proposed Rule would constitute retroactive rulemaking
“To determine whether a rule is impermissibly retroactive, [a court] first look[s] to see whether it effects a substantive change from the agency’s prior regulation or practice.” Ne. Hosp. Corp. v. Sebelius, 657 F.3d 1, 14 (D.C. Cir. 2011) (emphasis added) (internal quotation marks and citation omitted). The Proposed Rule does both. The Proposed Rule significantly revises the regulations that govern MA plans. For example, CMS would amend 42 C.F.R. § 422.310(e) by adding new language, stating: “MA organizations must remit improper payments based on RADV audits and established in accordance with stated methodology, in a manner specified by CMS. For RADV audits, CMS may extrapolate RADV Contract-Level audit findings to Payment Year 2011 forward.” Similarly, CMS would amend 42 C.F.R.
§ 422.311 by adding the following language: “Recovery of improper payments from MA organizations will be conducted according to the Secretary’s payment error extrapolation and recovery methodologies. CMS will apply extrapolation to plan year audits for payment year 2011 forward.” The 2012 RADV Notice promised that CMS “would apply a FFS Adjuster as an offset before finalizing the audit recovery.” CMS attempts to use that notice as evidence that implementing the Proposed Rule “would not upset any settled interest” as it relates to the use of extrapolation generally (Preamble p. 55040). However, this is clearly incorrect given the reversal on the FFS adjuster. 39 Further, existing case law demonstrates that the 2012 RADV Notice cannot be used to thwart a claim of retroactivity. See, e.g., Bowen v. Georgetown Univ. Hosp., 488 U.S. 204, 215 (1988) (finding rule change impermissibly retroactive even though it had first been announced in a notice published years earlier in the Federal Register); Nat’l Mining Ass’n v. Dep’t of Lab., 292 F.3d 849, 868 (D.C. Cir. 2002) (rejecting agency’s argument against retroactivity where past agency practice was “encapsulated only in a manual, not in a regulation promulgated pursuant to notice-and-comment rulemaking”). We note that retroactivity in this context is not limited to RADV audits already undertaken for plan years 2011-2013; it covers any periods before the final rule is implemented and includes audits for 2014 that CMS recently initiated. In other words, CMS can only apply changes in RADV methodology to payment years after publication of a final rule, and plans must have the ability to factor the RADV rules into their bids. Thus, even if CMS were to finalize a proposal on extrapolation in MA in 2019, the earliest it could apply would be CY 2021. B. 
Retroactive application is not necessary to satisfy a statutory requirement
The Proposed Rule asserts in passing that in retroactively applying the proposed changes, “CMS would be acting in compliance with” the Improper Payments Elimination and Recovery Improvement Act of 2012 (IPERIA). However, choosing a course of action that the agency (mistakenly) believes to be “in compliance with” a particular statute is fundamentally different from the determination required by section 1871(e)(1)(A)(i): namely, a determination that “such retroactive application is necessary to comply with statutory requirements.” CMS made no such necessity determination in the Proposed Rule. Moreover, nothing in IPERIA requires CMS to engage in retroactive rulemaking in this context.
39 Separate from the general question of authority for retroactive rulemaking, we believe a refusal to honor the promise of a FFS adjuster would be arbitrary and capricious. In explaining a changed position, an agency must be “cognizant that longstanding policies may have ‘engendered serious reliance interests that must be taken into account.’” Encino Motorcars, LLC v. Navarro, 136 S. Ct. 2117, 2126 (2016) (quoting FCC v. Fox Television Stations, Inc., 556 U.S. 502, 515 (2009)). The Supreme Court has specifically cautioned agencies that “[i]t would be arbitrary and capricious to ignore such matters.” Fox Television, 556 U.S. at 515. The entire MA community has reasonably relied on the 2012 RADV Notice, which CMS claimed at the time was a product of the agency “carefully review[ing] the more than 500 comments received on the draft methodology” that CMS published on its website in late 2010.
C. Retroactive application is not justified as being in the public interest
A public-interest determination under section 1871(e)(1)(A)(ii) is, at a minimum, subject to review under the arbitrary-and-capricious standard of the APA. See, e.g., Sec’y Br. at 43, St. Francis Med. Ctr. v. Azar, 894 F.3d 290 (D.C. Cir.
2018). The APA, in turn, requires that an agency “examine the relevant data and articulate a satisfactory explanation for its action including a rational connection between the facts found and the choice made.” State Farm, 463 U.S. at 43 (internal quotation marks and citation omitted). “In reviewing that explanation, [a court] must consider whether the decision was based on a consideration of the relevant factors and whether there has been a clear error of judgment.” Id. (internal quotation marks and citations omitted). “Normally, an agency rule would be arbitrary and capricious if the agency has relied on factors which Congress has not intended it to consider, entirely failed to consider an important aspect of the problem, offered an explanation for its decision that runs counter to the evidence before the agency, or is so implausible that it could not be ascribed to a difference in view or the product of agency expertise.” Id.
The public-interest determination is not justified in this case for several reasons. First, if CMS could simply claim financial recovery as a basis, as it does here, CMS effectively would have almost limitless authority to implement changes retroactively. Essentially the public-interest exception would swallow the general rule against retroactive rules. And this interpretation would not be limited to RADV in the MA program; it in theory could apply to any one of the payment systems governing the traditional Medicare program.
Second, “agencies do not have free rein to use inaccurate data.” Dist. Hosp. Partners, L.P. v. Burwell, 786 F.3d 46, 56 (D.C. Cir. 2015). As the D.C. Circuit recently emphasized in a case involving CMS, an agency “is required to ‘examine the relevant data and articulate a satisfactory explanation for its action including a rational connection between the facts found and the choice made.’” Id. at 56–57 (quoting State Farm, 463 U.S. at 43) (emphasis supplied by D.C. Circuit).
“If an agency fails to examine the relevant data—which examination could reveal, inter alia, that the figures being used are erroneous—it has failed to comply with the APA.” Id. at 57.
In this case, the public-interest determination is predicated on the assumption that extrapolation of RADV audit results in past payment years will result in the “recoupment of millions of dollars of public money improperly paid to private insurers.” To arrive at these estimates, the Proposed Rule mischaracterizes the level of alleged MA improper payments. CMS asserts that MA plans have had “high levels of payment error in the Part C program” (Preamble p. 55039, footnote 27). CMS says the “amount of improper payments” identified under the MA program is $14.35 billion or 8.31 percent of total MA payments in FY 2017 (Preamble p. 55039). 40 However, this figure represents the “gross” improper payment rate, which is a combination of two payment error estimates: 1) ‘overpayments’ to MA plans, and 2) ‘underpayments’ to MA plans.
40 This figure comes from the annual National RADV audit, conducted in accordance with IPERIA.
An overpayment is defined as an instance where a diagnosis code submitted for payment purposes was not supported by the beneficiary’s medical record. An underpayment occurs when the medical record review identifies an additional diagnosis code that should have been submitted to CMS and used for payment. The gross improper payment rate represents the sum of overpayments and underpayments; the two numbers are not netted. Therefore, underpayments increase the gross improper payment rate to the same degree as overpayments. Use of the gross improper payment rate vastly overstates the purported impact to the government of errors in the MA program. CMS’ estimates show that underpayments for FY 2017 comprise 35 percent of the $14.35 billion estimate of improper payments. 41 And that level is increasing; in FY 2018, underpayments were 42 percent of total improper payments.
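The difference between the gross and net measures can be illustrated with a short calculation. The following is a minimal Python sketch, not CMS’ or HHS’ methodology: the function name `split_gross` is hypothetical, and the inputs are the rounded FY 2017 figures quoted above ($14.35 billion, 8.31 percent, and a 35 percent underpayment share), so the outputs are approximations that will not exactly match the figures in the Agency Financial Reports.

```python
def split_gross(gross, under_share):
    """Split a gross improper payment figure into its components.

    gross       -- overpayments plus underpayments (the two are summed, not netted)
    under_share -- fraction of the gross figure that consists of underpayments
    Returns (overpayments, underpayments, net), where net = over - under.
    """
    under = gross * under_share
    over = gross - under
    return over, under, over - under


# Rounded FY 2017 inputs taken from the text; results are approximate.
over_bn, under_bn, net_bn = split_gross(14.35, 0.35)   # dollars, in billions
_, _, net_rate_pct = split_gross(8.31, 0.35)           # percent of MA payments

print(f"Overpayments:  ${over_bn:.2f} billion")
print(f"Underpayments: ${under_bn:.2f} billion")
print(f"Net:           ${net_bn:.2f} billion ({net_rate_pct:.2f} percent)")
```

Under these rounded inputs, the net figure is roughly 30 percent of the gross figure, because each dollar of underpayment both reduces the overpayment share and offsets it.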
42 We also note CMS has consistently found that the MA program has a significantly lower net improper payment rate than the FFS Medicare program. Chart 1 below shows the difference in the FFS and MA program gross and net improper payment amounts over time.
[Chart 1. Underpayments and Overpayments as a Percent of Improper Payments, FFS vs. MA (FY2012-2018). Two panels: “Medicare Advantage - Breakdown of Improper Payments, FY2012-FY2018” and “Medicare Fee-for-Service - Breakdown of Improper Payments, FY2012-FY2018.” Each panel shows underpayments and overpayments as a percent of improper payments.]
If the prevalence of underpayments in the MA program were properly considered, it would show a net improper payment rate of 1.37 percent or $2.6 billion in MA for FY 2018. 43 As shown in Chart 2 below, the MA net improper payment rate has decreased considerably since FY 2012, and in any event is much smaller than the FFS net improper payment rate. Accordingly, even if retroactive application of extrapolation were legally permissible in theory under the public-interest exception, we believe the relevant data demonstrates that the problem is small in relative terms, is shrinking, and therefore does not support a step as extraordinary as retroactive rulemaking.
[Chart 2. Gross vs. Net Improper Payment Rates, FFS vs. MA (FY2012-2018). Two panels: “Medicare Fee-for-Service - Gross vs. Net Improper Payment Rate, FY2012-FY2018” and “Medicare Advantage - Gross vs. Net Improper Payment Rate, FY2012-FY2018.” Each panel shows the gross improper payment rate and the net improper payment rate.]
41 Department of Health and Human Services. FY2017 Agency Financial Report. November 2017.
42 Department of Health and Human Services. FY2018 Agency Financial Report. November 2018.
43 Ibid.
Third, CMS’ public-interest determination fails to consider the interest of finality, which is conspicuous for at least two reasons. CMS has long cited the interest of finality as a principal reason not to upset Medicare payment determinations. In addition, CMS has gone so far as to cite the interest of finality as a reason to engage in retroactive rulemaking under section 1871(e)(1)(A)(ii). 44 The interest of finality serves more than just financial considerations. As CMS recently explained to the D.C. Circuit, the interest of finality also reflects “evidentiary and administrability considerations. Records grow stale, memories fade, personnel move on, and retention is costly.” 45 This is especially true in the case of RADV audits. Such audits are predicated on the review of medical records related to services that may have been provided many years earlier.
44 See Hospital Outpatient Prospective Payment and Ambulatory Surgical Center Payment Systems, 78 Fed. Reg. 74,826, 75,165 (Dec. 10, 2013). In this rule, CMS asserted that not applying proposed regulatory changes retroactively would “undermine . . . the interests of both the Medicare program and Medicare providers in the finality of reimbursement determinations.”
45 Sec’y Br. at 47, St. Francis Med. Ctr. v. Azar, 894 F.3d 290 (D.C. Cir. 2018).
These operational barriers to retroactive rulemaking are illustrated in the RADV audits for 2014 that CMS began earlier this year. Under these audits, plans must collect medical records from providers for services rendered in 2013. Because of the passage of time, medical records from 2013 may not exist or may be nearly impossible for plans to retrieve, for numerous reasons:
• Solo practitioners have passed away, with no accessible repository for old medical records.
• Providers switched electronic medical record systems and have IT challenges in producing medical records prior to the switch (particularly for patients no longer seen by the provider and whose records were never migrated to the new system).
• Providers placed their records into off-site storage facilities whose personnel cannot locate the records in a timely fashion.
• Mental health providers are unwilling to provide data due to privacy concerns (that is, these providers will not release medical records without explicit beneficiary permission that may be difficult or impossible to obtain).
In addition to increasing the risk that records cannot be found to substantiate diagnoses, the long delay in the 2014 plan year audit will contribute to other documentation challenges for diagnoses. For example, provider signatures may be incomplete on some medical records, but if the providers can no longer be located to complete the attestations, CMS will not accept the records. In addition, since October 1, 2015, providers have used ICD-10 codes. However, the HCCs for the 2014 audit were based on ICD-9 codes which are no longer in use. Moreover, two different CMS-HCC risk adjustment models were used for payment in 2014. Even assuming plans still employ or can hire medical record reviewers who are knowledgeable about these outdated provisions, the situation lends itself to more disputes between plans and CMS.
Thus, if CMS is allowed to apply extrapolation retroactively, the agency will artificially inflate the number of cases identified as coding errors and the resulting amount of alleged overpayments. Diagnoses may be found unsubstantiated, not because the beneficiaries’ clinical conditions did not exist, but because their medical records could not be obtained or validated (e.g., because providers are deceased or retired).
These impacts stem directly from CMS’ long delay in initiating the 2014 audits.
Fourth, we are concerned that CMS further justifies its proposal by failing to acknowledge the clinical importance of accurate diagnosis coding in MA. CMS states that “there is an incentive for plans to potentially over-report diagnoses so that they can increase their payment” (Preamble p. 55037). This statement fails to acknowledge that identifying accurate diagnoses is a crucial mechanism for understanding the health status of a patient. Due to the nature of capitated payments, MA promotes accurate diagnosis coding to support coordinated and integrated care because plans must consider the entire patient and how each of their clinical conditions interact. More accurate and detailed diagnosis coding helps plans identify and support the specific health care needs of their enrollees to ensure they receive integrated care coordination and chronic disease management.
Fifth, we note it would not be in the public interest to invoke section 1871(e)(1)(A)(ii) to retroactively “fix” CMS’ failure to engage in notice-and-comment rulemaking with respect to the 2012 RADV Notice (see Section VI below). Cf. Georgetown Univ. Hosp. v. Bowen, 821 F.2d 750, 758 (D.C. Cir. 1987) (“The Secretary’s suggestion that retroactive rulemaking is permissible to remedy a procedural defect in a rule would, if accepted, make a mockery of the provisions of the APA. Obviously, agencies would be free to violate the rulemaking requirements of the APA with impunity if, upon invalidation of a rule, they were free to ‘reissue’ that rule on a retroactive basis.”), aff’d on other grounds, 488 U.S. 204 (1988).
VI. Implementation of the RADV audit methodology would violate rulemaking requirements
The Proposed Rule is procedurally invalid because it fails to comply with the notice-and-comment provisions applicable to substantive rules under the Administrative Procedure Act (APA). See 5 U.S.C. § 553.
CMS, in addition, is bound by the Medicare-specific rulemaking requirements in the Medicare Act. Section 1871(a)(2) of the SSA states that “[n]o rule, requirement, or other statement of policy (other than a national coverage determination) that establishes or changes a substantive legal standard governing . . . the payment for services . . . under this title shall take effect unless it is promulgated by the Secretary by regulation.” The U.S. Supreme Court in Azar v. Allina Health Services recently upheld a D.C. Circuit Court opinion which invalidated how CMS calculated certain hospital payments under Medicare Part A because the methodology was not issued through regulation as required by section 1871(a)(2). 139 S. Ct. 1804 (2019). The Court noted that the policy “dramatically—and retroactively—reduced payments to hospitals serving low-income patients” and that “[b]ecause affected members of the public received no advance warning and no chance to comment first, and because the government has not identified a lawful excuse for neglecting its statutory notice-and-comment obligations” the agency’s policy must be vacated. Id. at 1808. 46 The Proposed Rule has components that are very similar to the policy invalidated in the Allina case. For example, the Proposed Rule applies extrapolation to 2011-2013 audits, a process that can result in “dramatically—and retroactively—reduced payments” to MA plans, using an extrapolation methodology never developed through rulemaking. CMS proposed its methodology for conducting RADV audits, including extrapolation, via the CMS website on December 20, 2010 instead of the Federal Register (Preamble p. 55038). The agency asked for comments 30 days after publication, and not 60 days, as is required under section 1871(b)(1). 
CMS then published the final methodology in the 2012 RADV Notice, which was essentially the same methodology that they had proposed, with one notable change – CMS acknowledged the need for the FFS adjuster, to adjust for different substantiation standards between MA payment and model development. Neither the 2010 proposal nor the 2012 RADV Notice had any discussion of alternatives considered, the impact on the industry, the rationale for the policy, or how the methodology fit with existing regulatory or statutory requirements.
The Proposed Rule attempts to justify the process for developing the 2012 RADV Notice, noting that “we invited public comment on this proposed methodology, and received more than 500 comments, which we carefully reviewed” (Preamble p. 55038). However, as summarized above, that process failed to incorporate key elements of a formal rulemaking process. Moreover, while CMS may have carefully reviewed 500 comments, they did not respond substantially or directly to any of them, including comments raised by AHIP, except as related to one item: the FFS adjuster. This prevented the public from having an opportunity to learn why the agency decided to make policy as it did, and how the agency would respond to concerns of affected stakeholders. 47
46 While the Court did not expressly endorse how the appeals court defined “substantive legal standard” under 1871(a)(2) or provide detailed guidance about how to interpret that phrase, the Court upheld the lower court because none of the Government’s legislative history and policy-based arguments for avoiding the rulemaking requirement were persuasive. Azar v. Allina Health Services, 139 S. Ct. 1804, 1814-16 (2019).
We recognize CMS is now requesting comments through the Proposed Rule on the audit methodology.
However, we understand the comment request does not affect audits already conducted from 2011 to 2013, because it would be entirely impractical to conduct new audits on those years given the passage of time.
In addition, the Proposed Rule states that CMS will develop a new RADV methodology for audit years after 2013. We believe this new RADV methodology would also reflect a change in a substantive legal standard governing payments and therefore require rulemaking under Allina. However, CMS clearly does not intend to provide stakeholders with the opportunity to meaningfully analyze and provide comments on the proposed methodology for audit years after 2013. For example, the Preamble states that “CMS is not required to set forth the methodology for calculating an extrapolated payment error through regulatory provisions.” (Preamble p. 55038). While CMS says that in the “interest of transparency”, it would describe its intent to develop a new RADV methodology, CMS’ description of the methodology in the Proposed Rule for audit years after 2013 is far too vague to meet rulemaking requirements. 48
In addition, the agency is actively conducting audits for 2014, using a new methodology, before the comment period on the rule closed. This shows the agency has pre-judged the issues raised in the Proposed Rule. Moreover, despite initiating audits for 2014, CMS has yet to provide the type of technical details that stakeholders need to understand CMS’ methodology. For example, CMS held a training on April 2, 2019 limited to only those contracts selected for the 2014 audit. CMS provided limited details on the new methodology, including a two-tiered approach to sampling that reflects a subcohort methodology. 49 No information in the training was proposed for public comment prior to being shared. This training has not been publicly posted on the CMS RADV website. Accordingly, the public has not had an opportunity to determine whether CMS’ approach would be statistically valid or to otherwise adequately assess the proposal.
47 American College of Emergency Physicians v. Price, 264 F.Supp.3d 89, 94 (D.D.C. 2017) (“Although an agency ‘need not address every comment’ made during the notice and comment period, ‘it must respond in a reasoned manner to those that raise significant problems.’” (citations omitted)).
48 In a brief outline of the new methodology, CMS says that it “would calculate improper payments made on the audited MA contract for a particular sub-cohort or sub-cohorts in a given payment year.” CMS further says that its methodology would be based on “statistically valid sampling and extrapolation methodologies.” The agency also indicates that sub-cohorts could be “enrollees for whom a particular HCC or one of a related set of HCCs was reported.” CMS “could often use a much smaller sample size” while generating “statistically significant recoveries.” (Preamble p. 55039).
49 In what CMS calls “Tier One”, the sample would consist of 299 enrollees across 131 MA contracts. This cohort was based on the enrollees with “the highest predicted overpayment.” CMS would not extrapolate results from Tier One audits. In what CMS calls “Tier Two”, CMS would apply a sub-cohort methodology to enrollees with a high predicted overpayment rate – estimated through a regression model, the details for which have not been provided – and that have diabetes. Only 32 beneficiaries are being sampled per contract for this sub-cohort methodology, which is well below the sample size of 201 beneficiaries used in the 2011-2013 audits. CMS would extrapolate the results from Tier Two audits after the proposed regulation is finalized. Additionally, rather than auditing 30 contracts as CMS has historically selected (and suggested in its impact analysis of the Proposed Rule), CMS has selected 188 contracts for Tier Two audits.
Critical issues that are not addressed or left unclear include the following:
• The process for selecting sub-cohorts for audit purposes.
• The sizes needed to calculate a “statistically significant extrapolated recovery” or even what is meant by a “statistically significant extrapolated recovery.”
• How CMS determines which contracts would be audited.
• How CMS would extrapolate the findings from these audits.
CMS has also not disclosed which contracts were selected for the 2014 RADV audit. However, CMS has a public document available on its website that lists every contract selected for RADV audit by year since 2007. 50 In addition to publicly releasing more detailed information about the audit methodology being used for the 2014 audit, we urge CMS to update this document with the list of contracts selected.
We also have serious concerns with the CMS statement in the Preamble that “we would make any future changes to that methodology (or those methodologies) through the Health Plan Management System.” The agency is clearly stating an intent to use sub-regulatory guidance to issue RADV policy in the future, much as it did to establish the 2011 to 2013 audit methodology. We believe this is inconsistent with the SSA requirement for notice-and-comment rulemaking, as indicated in the Allina case. It also signals an unfortunate lack of CMS willingness to engage stakeholders on this critical issue.
VII. Other Issues
A. CMS’ substantiation standards are insufficient for determining if the patient has the disease
On page 55037 of the Preamble of the Proposed Rule, CMS discusses the need for medical record documentation of diagnosis codes to support payment.
The agency points to sub-regulatory guidance – none of which has ever been subject to notice-and-comment rulemaking – in which it has explained this requirement “since the beginning of the MA program.” The only CMS RADV guidance that arguably satisfied the requirement was CMS’ rule proposed in 2009 and finalized in 2010. 51,52 Commenters to the 2009 rule expressed concern that the medical record requirement was overly prescriptive. It did not adequately consider the fundamental issue at stake – whether the person in question actually has a disease.
50 Centers for Medicare & Medicaid Services. Medicare Advantage Risk Adjustment Data Validation Audits Fact Sheet (updated June 1, 2017). Available at: https://www.cms.gov/Research-Statistics-Data-and-Systems/MonitoringPrograms/recovery-audit-program-parts-c-and-d/Other-Content-Types/RADV-Docs/RADV-Fact-Sheet-2013.pdf
51 Medicare Program; Policy and Technical Changes to the Medicare Advantage and the Medicare Prescription Drug Benefit Programs; Proposed Rule, 74 Fed. Reg. 54633 (2009).
52 Medicare Program; Policy and Technical Changes to the Medicare Advantage and the Medicare Prescription Drug Benefit Programs; Final Rule, 75 Fed. Reg. 19677 (2010).
CMS noted such concerns, stating that “commenters contended that the one best medical record policy forces plans to omit relevant data that could be supported through documentation that CMS does not permit – such as prescription drug data and lab results.” However, CMS did not propose any changes to address these concerns. The agency responded instead that “the RADV risk adjustment model is based upon FFS claims data from specific risk adjustment provider types, and not alternative data sources, such as, prescription drug data or lab results.
Therefore, the RADV audit process is based upon supporting medical record documentation from provider data sources that are used to calibrate the model.” If CMS were to finalize its proposal on extrapolation, we believe it would need to revisit its position on documentation. The purpose of the RADV program should be to determine if a person’s diagnosis is correct. Absence of documentation in a RADV audit does not mean a person does not have the disease or is not being treated for it. Rather, it simply means that a medical record could not be located that supports the diagnosis code. CMS has never adequately explained its continued refusal to consider other sources of information that could substantiate a diagnosis code. Medical records themselves are indications of what the person is being actively treated for, and not necessarily of what conditions the person has. For example, a person who has diabetes under control through medication may see a provider for a different condition. The medical record in this situation may not reflect that this person has diabetes, yet a simple check of prescription data for that person would show the person is taking insulin and therefore does, in fact, have diabetes. The person may also be receiving other services related to the diabetes. Because the HCC is meant to capture the incremental costs of an individual with diabetes, excluding the diagnosis code here would be inaccurate. Yet that is what CMS would do in the RADV audit by virtue of depending only on the medical record for proof that the person has diabetes. CMS also seems to be under the impression that diagnosis coding is precise. In reality, coding is not an “exact science” and reasonable people can interpret the same medical record in different ways. Coding guidelines can be unclear and interpreted differently by different people. And while CMS references the ICD-10-CM guidelines, it did not publish the guidelines that it was using for the 2011 to 2013 RADV audits. 
In this sense, plans were completely in the dark about what guidelines CMS would use for these audits. For these reasons, CMS should revisit its requirement that HCCs can only be supported by the medical record and should consider alternative sources of data to substantiate diagnosis codes.
B. Potential recovery based on OIG findings raises serious concerns
We strongly oppose the suggestion in the Preamble that MA organizations could be forced to remit payments based on findings from OIG audits. As CMS indicates in footnote 25 on page 55039, OIG “does not seek comment on its methodology for risk adjustment audit work that may lead to overpayment recoveries from MA organizations.” Although OIG is required by statute to follow generally accepted government auditing standards, this requirement does not adequately account for the necessary actuarial and statistical methodological principles at issue in the complex system of MA risk adjustment and payment recovery. Several MA organizations raised similar concerns about the potential for lack of consistency in methodology and audit process between CMS and OIG when CMS proposed to expand authority for conducting RADV audits in 2014. 53 At that time, CMS finalized its proposal to specify that OIG has the authority to conduct RADV audits but did not address these concerns. Because OIG does not seek comment on its methodology and is not required to employ the same methodology as finalized by CMS after formal notice-and-comment rulemaking, it is possible that CMS and OIG could arrive at different conclusions about the payment accuracy of the same MA contract due to different methodologies.
For example, CMS released the “Contract-Level Risk Adjustment Validation Medical Record Reviewer Guidance” 54 that would be used to determine whether a medical record supported a given diagnosis for any RADV audit that occurs after September 2017; however, OIG has never released the diagnosis coding guidelines that it is currently using or will use in the future to establish supporting documentation for a diagnosis in a medical record. In addition, CMS does not provide any additional information on what remedies would be available to plans that do not concur with audit findings by the OIG. For example, if plans were forced to remit payments based on OIG audit findings, then they should have the same right to appeal those findings as to appeal CMS audit findings, and CMS in conjunction with OIG would need to establish such an appeals process. Therefore, MA organizations could be subject to conflicting RADV audit methodologies employed by different government agencies with possibly divergent diagnosis coding guidelines that potentially could have different appeal rights. In fact, unless CMS and OIG coordinate appropriately, it is possible an MA organization could be subject to conflicting results for the same people. From the perspective of an MA organization, it would be entirely arbitrary which government agency chose one of its contracts to audit. Furthermore, the regulations governing RADV audits apply to all RADV audits and do not distinguish between RADV audits conducted by CMS and those conducted by OIG or any other government agency. Unless OIG is required to conduct RADV audits using the exact same methodology employed by CMS, we recommend that CMS rescind this proposal.
C. Further detail is needed on potential CMS RADV appeals proposal
CMS indicates that the agency is considering whether to explicitly expand MA organizations’ appeal rights in RADV.
It describes one option as adopting practices for MA plans to appeal the RADV payment error calculation methodology similar to those for providers and suppliers in Medicare FFS Parts A and B.
53 Medicare Program; Contract Year 2015 Policy and Technical Changes to the Medicare Advantage and the Medicare Prescription Drug Benefit Programs; Final Rule, 79 Fed. Reg. 29843 (2014).
54 This document was first released on September 27, 2017 for audits commencing after that date. On March 20, 2019, CMS released an updated document in effect as of that date but applied retrospectively for audits commencing after September 27, 2019.
As CMS has not made a specific proposal to expand MA organization appeal rights under RADV audit, it is not clear how the agency is considering applying the Medicare FFS appeals framework to the RADV program. We therefore are unable to provide specific feedback in response to CMS’ invitation for comments on this point.
In addition, the RADV audit appeals process in 42 C.F.R. § 422.311 unnecessarily restricts the rights of MA organizations by allowing them to appeal only one medical record per HCC. We urge CMS to expand the number of medical records that can be appealed for each HCC audited, to allow a more complete review of key evidence that can substantiate a clinical diagnosis.
D. RADV guidance related to encounter data is needed
CMS points out that plans submit diagnoses through two systems – the Risk Adjustment Processing System (RAPS) and the Encounter Data System (EDS) (Preamble p. 55037). However, these two systems are quite different. In RAPS, plans pre-identify all diagnoses to be submitted to CMS. In EDS, plans submit all claims data – also known as encounter data – and CMS then identifies which diagnoses should be used for risk adjustment using what is known as the “filtering logic”. Nowhere in the Preamble does CMS address the application of RADV in an encounter data setting.
Plans have submitted encounter data to CMS since 2012, and CMS has selected diagnoses from these data to calculate payment since 2015. By 2022, CMS anticipates that EDS will be the only source for diagnoses used for payment (Part I of the Advance Notice for 2019). 55 The EDS rules are such that plans are required to submit all data to CMS – regardless of whether these data are to be used for risk adjustment. CMS then determines through its own set of rules which diagnoses are allowed for risk adjustment. We believe CMS needs to give serious consideration to potential changes that may be required in RADV to reflect the EDS process. The agency should collaborate closely with industry to develop that approach and address this issue through future rulemaking.
VIII. Recommendations
Based on the foregoing:
• We urge the agency to withdraw the RADV Proposed Rule.
• We ask that the agency affirm that it cannot apply regulations retroactively.
• The agency should acknowledge that, in the absence of recalibrating the HCC model using audited FFS diagnosis data, a FFS adjuster is required under statute whenever it attempts to determine the accuracy of risk adjusted payments to MA plans by auditing MA diagnosis data against the medical records, and improve the audit methodology.
• We urge CMS to engage in meaningful, collaborative dialogue with the industry to develop RADV methodological changes going forward, and to ensure they are implemented solely through notice-and-comment rulemaking and on a prospective basis.
55 Centers for Medicare & Medicaid Services. Advance Notice of Methodological Changes for Calendar Year (CY) 2019 for the Medicare Advantage (MA) CMS-HCC Risk Adjustment Model. December 27, 2017. Available at: https://www.cms.gov/Medicare/Health-Plans/MedicareAdvtgSpecRateStats/Downloads/Advance2019Part1.pdf.
WHITE PAPER
Medicare Advantage RADV FFS adjuster: White paper
Commissioned by America's Health Insurance Plans
August 23, 2019
Rob Pipich,
FSA
Executive Summary
The Centers for Medicare and Medicaid Services (CMS) issued a proposed rule1 on November 1, 2018, which contained provisions regarding risk adjustment data validation (RADV) audits. In particular, this proposed rule removed what is known as the fee-for-service (FFS) adjuster, which is a mechanism for adjusting RADV audit recoveries to ensure actuarial equivalence between FFS and MA payments. Actuarial equivalence is required by law.2 Based on the analysis described in this white paper, we determined:
• A FFS adjuster, or other similar adjustment, is necessary to ensure actuarial equivalence between payments to Medicare Advantage Organizations (MAOs) and payments under Medicare FFS.
• CMS analyzed the difference between two calibrations of the CMS Hierarchical Condition Category (HCC) model to investigate what it referred to as "audit miscalibration."3 CMS normalized the revised model inconsistently within the context of a FFS adjuster or a RADV audit; therefore, its technical analysis cannot appropriately be used to conclude a FFS adjuster is not required.
• CMS underestimates the level of diagnosis coding errors present in FFS claims data. Notably:
  - CMS assumes diagnosis coding errors are independent from each other, which materially understates HCC error rates in FFS.
  - CMS uses an average number of claims per HCC in its estimation of error rates rather than a distribution of the number of claims, which materially understates HCC error rates in FFS.
  - CMS excludes claims that do not have medical records or necessary documentation available, which also understates the HCC error rates in FFS relative to RADV audit procedures.
This white paper discusses and supports our findings that a FFS adjuster is required in RADV audits. The CMS technical analysis excluded simulated unsupported diagnoses in the calibration of the CMS-HCC model, but included them in the normalization of the model.
CMS should have excluded unsupported FFS diagnoses in all steps of creating the CMS HCC model to properly address the question of whether a FFS adjuster is required in RADV audits. This paper shows that, had CMS excluded unsupported diagnoses from all steps, its analysis would have confirmed a FFS adjuster is required.
1 Medicare and Medicaid Programs; Policy and Technical Changes to the Medicare Advantage, Medicare Prescription Drug Benefit, Program of All-inclusive Care for the Elderly (PACE), Medicaid Fee-For-Service, and Medicaid Managed Care Programs for Years 2020 and 2021. 83 Fed. Reg. 54982 (2018). Retrieved December 20, 2018, from
2 Title 42 U.S. Code
3 CMS coins the term "audit miscalibration" in its FFS adjuster executive summary. Retrieved December 20, 2018, from The proposed rule describes a similar concept. 83 Fed. Reg. 55041 (2018).
Medicare Advantage RADV FFS adjuster: White paper, August 2019. MILLIMAN WHITE PAPER
The key items presented in this white paper include:
• An explanation for why a FFS adjuster is required in a RADV audit to maintain actuarial equivalence, as required by statute and confirmed in UnitedHealthcare Ins. Co. v. Azar4.
• A simplified numeric example demonstrating the argument described in the prior bullet. This is an example expanded upon from an example created by CMS.
• A summarized description of CMS's detailed technical analysis and an explanation of why we believe the methodology does not support the removal of a FFS adjuster.
• An adjusted version of the CMS analysis using a consistent set of diagnoses throughout the entire analysis showing why we believe a FFS adjuster or similar adjustment mechanism is necessary.
• A discussion of CMS's development of the Medicare FFS HCC error rates, which we conclude results in a significant understatement of the HCC error rates and therefore should not be used in assessing the magnitude of the FFS adjuster.
PURPOSE OF THIS STUDY
The purpose of this study is to evaluate the CMS conclusion that a FFS adjuster is not appropriate; it is not to determine the appropriate amount of a FFS adjuster. The study shows that using CMS methodology and data, but adjusting for certain issues with that methodology as described in this paper, leads to a conclusion that a FFS adjuster is required and is significantly greater than zero. As described in various sections of this paper, including those titled 'CMS underestimated error rates for HCCs - Overview', 'CMS underestimated error rates for HCCs - Is the sample size sufficient?', 'Technical analysis - Model and data selection', and 'Conclusion', further study of error rates is necessary to determine the true magnitude of a FFS adjuster.
This study uses CMS published assumptions, methodology, and data, and identifies multiple significant issues in CMS assumptions and methodologies. We did not attempt to identify all potential issues. We make no judgment about the appropriateness of other methodologies that could be used to determine an appropriate FFS adjuster. Depending on other potential issues and alternative assumptions and methodologies used, other valid analyses may lead to reasonable FFS adjusters that are outside the ranges considered in this paper. However, we have not been able to conceive of a reasonable methodology that would lead to the conclusion a FFS adjuster is unnecessary.
BACKGROUND AND DEFINITIONS
MAOs are paid, in large part and with certain adjustments, based upon the expected cost of the individual beneficiaries who enroll in the MAO's plans had those beneficiaries received benefits through the Medicare FFS program. Generally, CMS uses a risk adjustment system to multiply a fixed capitation payment times a beneficiary-specific risk score to adjust payments to MAOs based on health status.
That approach to determining the capitation payment results in higher payments to MAOs for less healthy beneficiaries and lower payments for healthier beneficiaries.
Title 42 of the United States Code states that the risk adjustment mechanism used by CMS should be implemented in a manner that achieves actuarial equivalence between Medicare FFS and Medicare Advantage (MA). CMS recognized this requirement in its February 24, 2012, notice6, which set forth the methodology for RADV audit recovery calculations. The notice acknowledged that the CMS HCC risk score model is developed based upon diagnoses from FFS claims, including those not supported by medical records. Therefore, if a RADV audit removes unsupported diagnoses from an MAO's risk score calculation, the MAO must be allowed the same level of unsupported diagnoses as FFS in order to maintain actuarial equivalence. Failing to do so would result in CMS paying less, on average, for an identical beneficiary under the MA program than under the FFS program, violating the principle of actuarial equivalence.
4 330 F.Supp.3d 173 (D.D.C. 2018) (Collyer, J.), appeal docketed, No. 18-5326 (D.C. Cir. Nov. 14, 2018).
5 Available at
6 CMS (February 24, 2012). Notice of Final Payment Error Calculation Methodology for Part C Medicare Advantage Risk Adjustment Data Validation Contract-Level Audits. Retrieved December 20, 2018, from
To avoid confusion throughout this paper, we define a few terms. The term "calibrate," as it applies to the HCC model, is often loosely used to refer to both the process where CMS calibrates the HCC model and then normalizes the model. In this white paper, we use the term calibrate to refer to the application of a least squares regression to calculate the relative cost of medical conditions and demographic indicators included in the HCC model.
We use the term normalization to refer to the process by which CMS ensures that the HCC model, when applied to the FFS population, results in a 1.0 average risk score.
ANALYSIS AND RESULTS
In the February 24, 2012 notice, CMS acknowledged the need for a FFS adjuster and included it in the RADV audit procedures. CMS is now proposing to remove the FFS adjuster. The CMS proposal to remove the FFS adjuster is primarily supported by a technical analysis7 showing that calibrating the HCC model using a data set containing all diagnoses versus only supported diagnoses (i.e., diagnoses supported by medical records) does not materially impact overall MAO payment levels. CMS argues this occurs because of the normalization process CMS uses to ensure that the average risk score for the FFS population is 1.0. However, it appears that CMS performed the normalization process by including unsupported diagnoses that should have been excluded. The result is that the portion of the CMS analysis intended to represent a scenario without unsupported diagnoses does not, in fact, remove the unsupported diagnoses.
The CMS Technical Appendix8 did not provide all the details surrounding the technical calculations in the CMS analysis, and so to initially confirm our understanding of what CMS did, we successfully reproduced the CMS technical analysis described above. We reproduced the CMS analysis using both the data CMS released in March 2019 to support its technical analysis and using the 2014/2015 Limited Data Set 5% Samples (5% Samples).9 CMS subsequently (June 2019) published an 'Addendum to the Fee-For-Service Study' (Addendum)10, which included many of the previously missing technical calculation details and certain CMS SAS code; we verified that the CMS implementation of the process described in the technical appendix was not materially different from our reproduction of the CMS analysis.
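As a rough illustration of why the choice of diagnoses used in normalization matters, the following Python sketch normalizes a toy model two ways: once using every diagnosis coded on FFS claims, and once using only diagnoses supported by medical records. Everything in it is hypothetical. The six-beneficiary population, the single HCC, and the raw factors (`DEMOGRAPHIC`, `HCC_COEF`) are invented for illustration and are not CMS data or CMS code. The raw factors are also held fixed in both cases, which isolates the normalization step and ignores any change that recalibration would cause.

```python
# Toy illustration only: every number below is hypothetical, not CMS data.
DEMOGRAPHIC = 0.40  # assumed raw demographic factor shared by all beneficiaries
HCC_COEF = 0.60     # assumed raw coefficient for one illustrative HCC

# Each FFS beneficiary: (HCC coded on a claim, HCC supported by the medical record)
ffs_population = [
    (True, True),
    (True, True),
    (True, False),   # coded on a claim but not supported by the record
    (False, False),
    (False, False),
    (False, False),
]


def raw_score(has_hcc: bool) -> float:
    """Raw (un-normalized) risk score for one beneficiary."""
    return DEMOGRAPHIC + (HCC_COEF if has_hcc else 0.0)


def average_raw_score(population, supported_only: bool) -> float:
    """Average raw score of the FFS population under one documentation standard."""
    flags = [supported if supported_only else coded for coded, supported in population]
    return sum(raw_score(flag) for flag in flags) / len(flags)


# Normalization divides raw scores by the FFS average so the FFS average becomes 1.0.
norm_claims = average_raw_score(ffs_population, supported_only=False)
norm_records = average_raw_score(ffs_population, supported_only=True)

# Normalized score of a beneficiary whose HCC is fully supported by the record.
score_claims_basis = raw_score(True) / norm_claims
score_records_basis = raw_score(True) / norm_records

print(f"normalization factor, all coded diagnoses: {norm_claims:.3f}")
print(f"normalization factor, supported diagnoses: {norm_records:.3f}")
print(f"normalized score (claims basis):  {score_claims_basis:.3f}")
print(f"normalized score (records basis): {score_records_basis:.3f}")
```

In this toy case the claims-based denominator is larger than the record-based one, so a beneficiary whose HCC is fully supported receives a lower normalized score under the claims-based normalization. A comparison that removes unsupported diagnoses from calibration but leaves them in normalization would not surface that gap.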
The CMS analysis includes certain simplifying assumptions that result in materially understated FFS HCC error rates. The CMS simulations used those understated FFS HCC error rates. We analyzed several variations of the CMS technical analysis: excluding the unsupported diagnoses (from not only the HCC model calibration process, but also from the normalization process), calculating HCC level error rates based on the CMS claim level error rates and the actual distribution of claims per beneficiary (as opposed to the average across all beneficiaries), and testing several levels of HCC error rates.
We used the CMS error rates and methodology with an adjustment for the normalization process and the actual number of diagnoses per beneficiary (rather than the average). Under this approach, we calculated a FFS adjuster using claim level error rates, actual distributions of the number of diagnoses (assuming full independence11), and an HCC error rate of 33% (assuming full dependence12), in addition to several scenarios in between.
7 CMS (October 26, 2018). RADV Resources. Retrieved December 20, 2018, from
8 Available at
9 The 5% Samples are Limited Data Sets made available by CMS and we utilized the particular files that contain approximately 5% of Medicare members' FFS claims. Additional information is available at
10 Retrieved June 26, 2019 from
11 Independence, in this context, means coding errors on individual claims are not related to coding errors on other claims.
12 Dependence, in this context, means coding errors on claims are made in the same way for all claims for a particular HCC for each beneficiary.
This approach resulted in estimated values of a FFS adjuster13 between 8% and 21%. For perspective, 8% of federal payments to MAOs exceeds $18 billion and 21% exceeds $42 billion per year,14 the majority of which are risk-adjusted. A FFS adjuster,
based on data modified to reflect reasonable error rates using an adjusted methodology (i.e., one that adjusts for the normalization process, the distribution of claims, and claim independence), likely lies somewhere between the two endpoints, 8% and 21%.
We also note that CMS clarified in the June 2019 Addendum that they "...excluded claims where providers refused to submit medical records, or did not provide sufficient documentation." Although we do not have the information to evaluate the impact of these exclusions on the error rates, this exclusion is inconsistent with the RADV audit process. Properly including these unsupported diagnoses in the calculation of error rates would increase the magnitude of a FFS adjuster from the figures described in this paper.
As noted above, we make no judgment about the appropriateness of other methodologies that could be used to determine an appropriate FFS adjuster. Depending on other potential issues and alternative assumptions and methodologies used, other valid analyses may lead to reasonable FFS adjusters that are outside the range considered in this paper. The magnitude of a FFS adjuster is highly sensitive to the specific HCC error rates used in the analysis, and the HCC error rates in the CMS analysis are highly sensitive to both the use of an average number of claims (versus a distribution of the number of claims) within an HCC and how independent the coding of one claim is to the next. Further analysis must be completed to calculate an accurate FFS adjuster. In any case, the range is wide and even the bottom end is material and significant.
We conclude that not applying a FFS adjuster in a RADV audit, as proposed by CMS, would violate actuarial equivalence. Additionally, applying a FFS adjuster based on the HCC error rates in the CMS Technical Appendix would also violate actuarial equivalence because the HCC error rates CMS uses are biased.
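The sensitivity to averages versus distributions can be seen in a small Python sketch. It assumes, purely for illustration, that an HCC is unsupported only when every claim carrying it is miscoded. Under that assumption, with claim-level error rate e, full independence gives a beneficiary with n claims an HCC error rate of e^n, and full dependence gives simply e. The 33% rate echoes the full-dependence figure cited above; the distribution of claims per HCC is hypothetical and is not taken from CMS or Milliman data.

```python
# Hypothetical inputs: the error model and the distribution are assumptions.
CLAIM_ERROR_RATE = 0.33  # illustrative claim-level error rate e

# Assumed share of beneficiaries by number of claims carrying a given HCC.
claims_distribution = {1: 0.5, 2: 0.2, 3: 0.1, 9: 0.2}


def hcc_error_rate_independent(n_claims: float) -> float:
    """HCC is unsupported only if all n claims are independently miscoded: e ** n."""
    return CLAIM_ERROR_RATE ** n_claims


# Shortcut: apply the formula once, at the average number of claims.
average_claims = sum(n * share for n, share in claims_distribution.items())
rate_from_average = hcc_error_rate_independent(average_claims)

# Distribution-based: apply the formula per claim count, then average the results.
rate_from_distribution = sum(
    share * hcc_error_rate_independent(n) for n, share in claims_distribution.items()
)

# Full dependence: every claim for the HCC is right or wrong together.
rate_full_dependence = CLAIM_ERROR_RATE

print(f"average claims per HCC:          {average_claims:.2f}")
print(f"independence, using the average: {rate_from_average:.4f}")
print(f"independence, using the distribution: {rate_from_distribution:.4f}")
print(f"full dependence:                 {rate_full_dependence:.4f}")
```

Because e^n is convex in n, evaluating it at the average number of claims can never exceed the average of e^n over the distribution (Jensen's inequality). Under this assumed error model, the average-based shortcut therefore understates the independence-case HCC error rate whenever the number of claims varies across beneficiaries.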
A FFS adjuster must be developed consistent with the intended application to ensure actuarial equivalence.
I, Rob Pipich, am a Member of the American Academy of Actuaries and I meet the Qualification Standards of the American Academy of Actuaries to render the actuarial opinions expressed herein.
13 We define the FFS adjuster as the percentage reduction to a risk score based upon claim diagnoses to move to a medical record diagnosis basis for a FFS population. We calculated this percentage including beneficiaries with no HCCs and beneficiaries with one or more HCCs. When applying a FFS adjuster, care must be taken to apply it to the correct population, as the difference between the two definitions is significant. If this adjuster is applied to only beneficiaries who are RADV-eligible under the current CMS rules, the adjuster would need to be grossed up to apply only to that population.
14 Based on $204.? billion in 2017 Part C federal spending. See HHS FY 2017 Budget in Brief - CMS Medicare, available at ovlabouUbud et/ 2017i?bud
Introduction
The issues involved in Medicare risk scores, RADV audits, and actuarial equivalence are complex. We organize this white paper to facilitate a simpler way to understand the issues. The executive summary above provides an overview of our analysis and findings. The remaining sections describe our analysis in more detail and provide support for our findings. The following is a list of the topics in the order we address them:
• Background
• Actuarial equivalence requires a FFS adjuster in RADV
• CMS technical analysis should not include unsupported FFS diagnoses
• CMS underestimated error rates for HCCs
• A CMS example demonstrating the need for a FFS adjuster
• An expanded example incorporating normalization and RADV audits
• Discussion of our technical analysis, which mirrors the CMS analysis
• Additional context and considerations surrounding the HCC risk model and a FFS adjuster
• Conclusion
• Appendices of additional charts and examples
Background
MAOs are paid fixed per beneficiary amounts to deliver care to Medicare beneficiaries. These fixed amounts are calculated based upon a combination of amounts MAOs submit to CMS in the annual bid process and the projected health status of each beneficiary as determined from their actual diagnoses and demographic information. While the complexities of the bid process are outside the scope of this paper, the majority of funding from CMS to MAOs is calculated by multiplying the plan bid amount at a 1.0 risk score times the actual risk score of the beneficiary. As a result, the actual beneficiary risk scores are a key determinant of total revenue for MAOs.
Risk scores are calculated based upon diagnosis information from claims data using the CMS HCC model. Generally, more diagnoses result in higher payments by triggering more HCCs. It is important to note that not all diagnoses map to an HCC and coding the same HCC more than once for an individual does not impact the risk scores.
CMS calculates the dollar amount each HCC is worth in the CMS HCC model utilizing a weighted least squares regression, with certain constraints,15 based on one year of FFS diagnosis data from claims and the following year's FFS claims cost data. In essence, the dollar amount each HCC is worth, divided by the overall average claims cost for a FFS beneficiary, is referred to as the coefficient for each HCC. The steps thus far are typically referred to as "calibration." To normalize the model to a 1.0 risk score for the FFS population, CMS calculates an average risk score for the FFS population and then divides all model coefficients by that average FFS population risk score. Additional details regarding the creation of the CMS HCC model can be found in "Risk Adjustment of Medicare Capitation Payments Using the CMS-HCC Model,"
published in Health Care Financing Review.16
The term "calibrate," as it applies to the HCC model, is widely used to refer to both the process where CMS calibrates the HCC model and then normalizes the model. In this white paper we clarify and distinguish the terms and use calibrate to refer to the application of a least squares regression to calculate the relative cost of medical conditions included in the HCC model. We use the term "normalization" to refer to the process by which CMS ensures that the HCC model, when applied to the FFS population, results in a 1.0 risk score on average.
After diagnoses have been reported and CMS issues final payments to MAOs based upon the final diagnoses, CMS then performs RADV audits on a selected set of MAOs. CMS's stated intent for RADV audits is to validate the accuracy of risk-based payments by validating the diagnoses, through medical records, submitted by MAOs that map to an HCC for payment. Conceptually, through these RADV audits, CMS intends to recover overpayments made to MAOs.
15 The constraints are technical in nature, such as disallowing negative coefficients.
16 Pope, G.C., Kautter, J., Ellis, R.P., et al. (2004). Risk adjustment of Medicare capitation payments using the CMS-HCC model. Health Care Financing Review, Summer 2004, Vol. 25 No. 4. Retrieved December 20, 2018, from Review/downloads/04summerpg119.pdf.
As described in the notice dated February 24, 201217, RADV audits, in general simplified terms, involve:
1. Excluding end-stage renal disease (ESRD) and hospice beneficiaries as well as any beneficiary not continuously enrolled from January of the diagnosis year to January of the payment year and who does not have an HCC.
2. Ranking beneficiaries in each MA contract by risk score and dividing them into three equal groups.
3. Sampling 67 beneficiaries from each group.
4. Requesting and auditing medical records from the MAO for each HCC recorded among the sampled beneficiaries.
5. Calculating a "payment error" based on the difference in the original payment and the RADV-audit-adjusted payment.
6. Calculating a 99% confidence interval (CI) for the annual payment error per MA contract.
7. Selecting the lower bound of the CI and, if it is above zero, reducing it by the FFS adjuster.
8. Extrapolating (for recovery) the result of step 7 to every RADV eligible beneficiary in the contract if the result of step 7 is a positive value.
CMS also stated the following in the February 24, 2012 notice for the rationale for a FFS adjuster: "The FFS adjuster accounts for the fact that the documentation standard used in RADV audits to determine a contract's payment error (medical records) is different from the documentation standard used to develop the Part C risk-adjustment model (FFS claims)."
On November 1, 2018, CMS published the proposed rule proposing to eliminate the FFS adjuster, with a comment deadline of December 31, 2018. Subsequently, CMS extended the comment deadline for stakeholders to April 30 after announcing it would publish additional data. CMS again extended the deadline to August 28, 2019, publishing the programming code and additional data from 50 new simulations that CMS ran. Milliman obtained and evaluated the additional data. The proposed rule's provisions on RADV are the subject of this white paper.
Actuarial equivalence requires a FFS adjuster in RADV
If CMS applies different standards for determinations of diagnoses under Medicare FFS and MA, the required actuarial equivalence is not achieved. In its proposed rule, CMS proposes to apply the claims diagnoses to Medicare FFS and the medical record diagnoses to MAOs under RADV audit.
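Steps 5 through 8 of the audit procedure listed above can be sketched in Python as follows. This is a simplified sketch under stated assumptions, not CMS's implementation: it ignores the three-group stratification, uses a normal approximation for the 99% confidence interval, and treats the FFS adjuster as a hypothetical per-beneficiary dollar offset (`ffs_adjuster_amount`). The function name, its parameters, and all figures are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def radv_recovery(sample_errors, eligible_count, ffs_adjuster_amount, confidence=0.99):
    """Simplified steps 5-8: per-beneficiary payment errors to extrapolated recovery.

    sample_errors: original payment minus audit-adjusted payment, one value per
        sampled beneficiary (hypothetical dollars).
    eligible_count: number of RADV-eligible beneficiaries in the contract.
    ffs_adjuster_amount: assumed per-beneficiary dollar offset (an illustration
        of the FFS adjuster, not the CMS calculation).
    """
    n = len(sample_errors)
    # Step 6: two-sided confidence interval via a normal approximation (assumption).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower_bound = mean(sample_errors) - z * stdev(sample_errors) / sqrt(n)
    # Step 7: the adjuster is applied only when the lower bound is above zero.
    if lower_bound <= 0:
        return 0.0
    adjusted = lower_bound - ffs_adjuster_amount
    # Step 8: extrapolate only a positive result to every eligible beneficiary.
    return max(adjusted, 0.0) * eligible_count


# Hypothetical sample of 200 audited beneficiaries with alternating errors.
errors = [120.0, 80.0] * 100
with_adjuster = radv_recovery(errors, eligible_count=1000, ffs_adjuster_amount=10.0)
without_adjuster = radv_recovery(errors, eligible_count=1000, ffs_adjuster_amount=0.0)
print(f"recovery with adjuster:    {with_adjuster:,.0f}")
print(f"recovery without adjuster: {without_adjuster:,.0f}")
```

Calling the function with `ffs_adjuster_amount=0.0` mimics an audit with no FFS adjuster: the full lower bound, measured against the medical-record standard, is extrapolated to every eligible beneficiary.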
This approach will not generate actuarially equivalent results without an adjustment to account for the difference between the claims diagnoses and the medical record diagnoses present in the FFS data (i.e., a FFS adjuster).

The Secretary of the U.S. Department of Health and Human Services (HHS) implemented the CMS HCC risk score model under authority granted by Title 42 U.S. Code (underline and bold added for emphasis):

"Demographic adjustment, including adjustment for health status. In general. The Secretary shall adjust the payment ... for such risk factors as age, disability status, gender, institutional status, and such other factors as the Secretary determines to be appropriate, including adjustment for health status under paragraph (3), so as to ensure actuarial equivalence. The Secretary may add to, modify, or substitute for such adjustment factors if such changes will improve the determination of actuarial equivalence."

[17] During the comment period for the proposed rule, CMS released a notice revising the RADV audit procedures for 2014. Since these procedures are not finalized, will be subject to the final rule, and are not included in the CMS analysis accompanying the proposed rule, we do not comment on the 2019 notice in this paper.

As stated by CMS in the February 24, 2012 notice, the documentation standard used to determine payment errors under a RADV audit of an MAO is medical records, but the documentation standard used to develop the HCC model is FFS claims data. The introduction of this different documentation standard violates actuarial equivalence unless a FFS adjuster is included.

In UnitedHealthcare Ins. Co. v. Azar,[18] the court ruled both that Title 42 U.S. Code 1395w-23 requires the Secretary to implement a risk adjustment program that effectuates actuarially equivalent risk adjustment of payments between the FFS and MA programs, and that the varying documentation standards violate actuarial equivalence. Further, CMS itself, in internal documents released in response to a Freedom of Information Act (FOIA) request, agrees and states: "We think this approach makes sense and from a technical point of view is the right thing to do,"[19] in reference to including a FFS adjuster to address the issue of differing coding standards.

CMS technical analysis should not include unsupported FFS diagnoses

CMS included both supported and unsupported diagnoses in the technical analysis it described as simulating HCC model creation with only supported diagnoses. In effect, the CMS technical analysis compared a model created with all diagnoses to another model created with all diagnoses, making the analysis irrelevant to the discussion of whether or not a model calibrated and normalized using only supported diagnoses would produce different payments to MAOs. Stated differently, the CMS analysis did not serve to address the question of whether or not a FFS adjuster is necessary. We discuss specific assertions and explanations put forth by CMS in the "Technical Analysis" section, below. The remainder of this section focuses on a conceptual discussion, followed by examples, both of which clearly demonstrate the need for a FFS adjuster and that the CMS technical analysis should not include unsupported FFS diagnoses.

The CMS technical analysis was put forth to demonstrate that MAO payments do not materially change based upon calibrating the CMS HCC model, including or excluding unsupported HCCs. However, the calibration of the model is only a portion of the issue, and for the other portion of the issue, which is normalization, CMS did not exclude the simulated unsupported HCCs. We summarize the CMS description of its process as:

1. Calibrate the HCC model utilizing the original uncorrected data set (where the uncorrected data set includes unsupported diagnoses).
2. Normalize this HCC model using the original uncorrected data set to achieve a 1.0 risk score in total.
3. Calculate claims-level error rates using the FFS Comprehensive Error Rate Testing (CERT) data.
4. Convert the claims-level error rates into HCC-level error rates.
5. Utilize the HCC error rates to simulate the removal of unsupported diagnoses from the original uncorrected data set to produce a simulated corrected data set.
6. Calibrate the HCC model utilizing the simulated corrected data set.
7. Normalize this HCC model using the original uncorrected data set to achieve a 1.0 risk score in total.[20]
8. Apply both models to a sample MAO data set and compare the resultant risk scores.

[18] 330 F.Supp.3d 173 (D.D.C. 2018) (Collyer, J.), appeal docketed, No. 18-5326 (D.C. Cir. Nov. 14, 2018). Retrieved December [...], 2018.
[19] See Appendix below. Acquired from: DOCKET 44. UNITED PRICE NO.

The calibration of the HCC model utilizes a weighted least squares regression (see the Statistical Background section below for more details) to determine how the risk score coefficient for each HCC relates to the coefficients for other HCCs. For example, calibration might determine that the coefficient for HCC 1 is 10% higher than HCC 2, 25% higher than HCC 3, etc. The calibration step does not determine the final level of the coefficients. It is the normalization step that determines the final level of the coefficients for each HCC. Specifically, CMS applies the calibrated risk score model to the uncorrected FFS data set, and divides all the coefficients by the resulting risk score to ensure that the final normalized model produces an average risk score of 1.0 for the FFS population.
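The distinction between calibration and normalization can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CMS's code: the factor names, the coefficient values, and the four-person population are all hypothetical. It shows that dividing every coefficient by the population's average risk score leaves the relative values from calibration unchanged while forcing the average to 1.0 for whichever data set is used to normalize.

```python
# Minimal sketch of the normalization step. All names and values below are
# hypothetical and chosen for illustration; this is not CMS's code or data.

def risk_score(coefficients, factors):
    """Sum the coefficients for the factors present on one beneficiary."""
    return sum(coefficients[f] for f in factors)


def average_risk_score(coefficients, population):
    """Mean risk score across a population (a list of factor lists)."""
    total = sum(risk_score(coefficients, b) for b in population)
    return total / len(population)


def normalize(coefficients, population):
    """Divide every coefficient by the population's average risk score."""
    mean_score = average_risk_score(coefficients, population)
    return {name: value / mean_score for name, value in coefficients.items()}


# Hypothetical calibrated coefficients: only their relative sizes matter here.
calibrated = {"age_70": 1.2, "age_80": 1.8, "hcc_x": 1.0}

# Hypothetical FFS population used for normalization.
ffs_population = [
    ["age_70", "hcc_x"],
    ["age_70"],
    ["age_80", "hcc_x"],
    ["age_80"],
]

normalized = normalize(calibrated, ffs_population)

print("average before:", round(average_risk_score(calibrated, ffs_population), 3))
print("average after: ", round(average_risk_score(normalized, ffs_population), 3))
print("hcc_x / age_70 before:", round(calibrated["hcc_x"] / calibrated["age_70"], 3))
print("hcc_x / age_70 after: ", round(normalized["hcc_x"] / normalized["age_70"], 3))
```

Because the result depends on the population passed to `normalize`, normalizing on a data set that includes or excludes particular diagnoses changes the final level of every coefficient, even though the ratios between coefficients stay the same.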
CMS used the uncorrected data set to normalize both the HCC model that was calibrated with the uncorrected data set and the HCC model that was calibrated with the simulated corrected data set. The calibration step is not significant in the context of determining the overall risk score; it simply adjusts the relative value of each HCC. The normalization step is critical because it scales how much each HCC counts in determining an overall risk score. As mentioned, CMS utilized the uncorrected data set to normalize both HCC models, so neither model reflects the removal of the simulated unsupported diagnoses. Stated differently, CMS removed the simulated unsupported diagnoses for the calibration step and then immediately put them back into the analysis for the normalization step.

When CMS compared the MA risk scores produced by the two different models, it was really calculating the effect of MA having a higher incidence of certain HCCs than FFS and a lower incidence of others (which is the small difference it identified). In CMS's technical analysis comment, CMS references a potential difference of this sort and discards it as possible but immaterial. The CMS analysis does not compare the effect of calibrating and normalizing the model with and without unsupported HCCs, which is critical for calculating a FFS adjuster.

An example, developed by CMS, illustrating this concept is included in Appendix A. We expanded the example explicitly to include the impact of model normalization in the section entitled "Example demonstrating actuarial equivalence is violated."

There also appears to be an inconsistency in the CMS Technical Appendix. Specifically, CMS describes a proper procedure but then uses a different procedure in practice. On page 12 of the appendix, CMS states (emphasis added): "Although fundamentally based on expenditures, the regression is adjusted such that the HCC and demographic factors will provide an average risk score of one on the calibrating FFS dataset."
As described on the next page of CMS's Technical Appendix (emphasis added): "We then estimate the CMS-HCC model on the simulated corrected data. In the next step, we take the new coefficients and apply them on the original FFS data set, normalizing a new set of relative factors to one."

[20] In documents such as rate announcements and proposed rules on risk scores, CMS describes a process of creating the CMS HCC risk model as including a step to divide dollar-based HCC coefficients by a total denominator year predicted cost. The Technical Appendix does not describe this step, but does describe normalization. As we interpret the CMS Technical Appendix, the normalization step is comparable to the denominator year adjustment. This understanding is supported by additional details provided by CMS in the June Addendum.

Because CMS has normalized back to the "original FFS data set" and not the "simulated corrected data," which was the "calibrating FFS dataset," CMS effectively added the simulated unsupported diagnoses back into the data set, which sets the documentation standard back to a claims diagnosis basis. Thus, the CMS analysis measured a model calibration difference rather than addressing the question of whether a FFS adjuster is required in RADV audits.

CMS underestimated error rates for HCCs

OVERVIEW

CMS established HCC error rates for the purpose of evaluating a FFS adjuster utilizing data and methodologies that led to underestimation of the HCC error rates. In the Technical Appendix, CMS recognizes certain shortcomings in the calculation of error rates and the data used to calculate the error rates. Utilizing the potential range of HCC error rates from the CMS analysis that would result from alternative assumptions regarding the degree of independence of claims-level error rates, we estimate that CMS significantly understated the HCC error rates.
Specifically, CMS utilized an aggregate HCC error rate of 2% when the true error rate, based on CMS data and varying the degree of dependence, is likely to be between 12% and 33%.

Appropriate testing of FFS data to support the calculation of an HCC error rate must be performed to properly calculate the magnitude of a FFS adjuster. In particular, all claims for a sample of beneficiaries must be used rather than a sample of claims from a wide array of beneficiaries that are converted to a beneficiary basis. Claims must not be excluded simply because the provider did not provide sufficient medical records or documentation, because a RADV audit would include such claims and count them as unsupported diagnoses (i.e., errors). Further, claims from all settings of care should be used with an appropriate sample size. Stratified sampling by HCC combined with oversampling for low-frequency HCCs may be an appropriate method to reduce the required sample size.

CLAIMS CODING ERROR INDEPENDENCE

The CMS Center for Program Integrity (CPI) "performed a review on the CERT data," which included 2008 outpatient FFS diagnosis data. Claims were only included if they had diagnoses that mapped to HCCs. 8,630 unique claims were included, which is a relatively small total sample size given the large number of diagnosis codes and HCCs.

While CMS stated that it used "RADV-like review" procedures, CMS deviated from RADV procedures in several important ways. CMS did not include claims for which providers did not provide sufficient medical record support. Further, CMS did not review all claims for individual beneficiaries; rather, CMS reviewed and calculated error rates on individual outpatient claims.

An audit using all claims mapped to an HCC for a representative sample of individual beneficiaries is necessary to properly estimate the HCC error rate for the Medicare FFS program. Diagnoses can be coded by different providers in different settings.
Coding of a single supported diagnosis that maps to a particular HCC is sufficient to include that HCC for a beneficiary. As such, accurate estimation of HCC error rates must be completed by reviewing all the claims with diagnoses that trigger an HCC for an individual beneficiary and determining whether or not at least one diagnosis is supported by the medical record.

Many coding errors are not independent from one claim to the next. CMS's approach ignores any correlation between coding errors, effectively assuming that providers randomly make coding errors without regard to errors they have made in the past. We believe it is more likely that a provider or medical coder would tend to make similar errors from one claim to the next based upon their work habits, training, office practices, and by looking at their own prior diagnosis coding when coding a subsequent claim; thus errors would be correlated, at least to some degree. The assumption that providers code randomly must hold to assume independence.

For example, for a beneficiary with Major Depressive, Bipolar, and Paranoid Disorders (HCC 55),[21] CMS calculated a claims-level coding error rate of about 50%, the same probability as flipping heads in a coin toss. CMS further calculated that a beneficiary with HCC 55 is likely to have about six claims per year with diagnoses mapping to HCC 55. CMS then assumed each claim is independent, as flips of a coin are independent. Under this assumption of independence, we would statistically expect three codes for an average beneficiary to be supported and three codes to be unsupported. Under this scenario where providers behave randomly (like a coin flip), it would be extremely unlikely to have six coding errors on six visits (like flipping heads six times in a row).
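The coin-toss reasoning above can be made concrete with a short Python sketch. It is illustrative only: the function name is ours, and the inputs are the rounded HCC 55 figures used in the text (a 50% claims-level error rate and six claims per year). Under the independence assumption, the number of unsupported claims follows a binomial distribution, and the HCC is unsupported only when all six claims are unsupported.

```python
from math import comb

# Illustrative sketch only. The function name is hypothetical and the inputs
# are the rounded HCC 55 figures from the text (50% error rate, six claims).


def prob_unsupported(k, n_claims, claim_error_rate):
    """Binomial probability that exactly k of n_claims are unsupported,
    assuming each claim's error is independent of the others."""
    return (
        comb(n_claims, k)
        * claim_error_rate ** k
        * (1 - claim_error_rate) ** (n_claims - k)
    )


N_CLAIMS = 6
CLAIM_ERROR_RATE = 0.5

# Expected number of unsupported claims under independence.
expected_unsupported = N_CLAIMS * CLAIM_ERROR_RATE

# Probability that every claim is unsupported, so the HCC itself is unsupported.
all_unsupported = prob_unsupported(N_CLAIMS, N_CLAIMS, CLAIM_ERROR_RATE)

for k in range(N_CLAIMS + 1):
    p = prob_unsupported(k, N_CLAIMS, CLAIM_ERROR_RATE)
    print(f"{k} unsupported claims: {p:.4f}")

print(f"expected unsupported claims: {expected_unsupported}")
print(f"all six unsupported: {all_unsupported:.4f}")
```

The last figure is the probability of "heads six times in a row," which is why the independence assumption pushes the beneficiary-level error rate far below the claims-level rate.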
This independence assumption can be expected to result in HCC-level error rates that are significantly lower than if providers or medical coders make errors that are related to each other, perhaps from copying diagnoses from a prior visit or from particular personnel repeatedly making the same type of error.

Nevertheless, in calculating HCC error rates CMS has assumed independence of errors among claims. CMS assumes that each claim is equally and independently likely to have an unsupported diagnosis coded. As such, CMS raises the probability of an error on a single claim to the power of the average number of claims. In our example with HCC 55, CMS assumes the probability of that error occurring six times for the same beneficiary is 0.5 ^ 6 = 1.6%.

Another scenario where the claims-level error rate is 50% for beneficiaries with HCC 55 can be illustrated simply by considering two beneficiaries. Assume both beneficiaries have HCC 55, visited a provider six times, and have had HCC 55 for several years. Beneficiary A's provider reviewed the patient history and copied support for HCC 55 in the electronic medical record from prior visits and pasted it in the medical records again for the current year, but Beneficiary B's provider continued treating the patient without rerecording the support in the medical record. In this example, Beneficiary A has six supported diagnoses and Beneficiary B has zero, resulting in a claims-level error rate of 6 / 12 = 50%. The HCC error rate is also 50% (1 / 2 = 50%).

The assumption of independence significantly reduces the HCC error rate. In the example illustrated here and looking solely at the issue of independence, the true HCC error rate can be expected to be between 1.6% and 50%, depending upon the specific coding patterns of the providers and medical coders involved.

A provider's work habits, job training, and office operating procedures all lead to an increase in the degree of dependence in coding errors.
For example, if a particular provider's office has a gap in its training of medical coders around coding diagnoses that map to HCC 55, those coders are likely to repeatedly make the same mistake. This could lead to every beneficiary who is treated by the office having the same coding error for every claim. This leads to the same result described in the previous paragraph's example. Under this assumption, each beneficiary has a 50% chance of an HCC error being recorded. Again, the true HCC error rate can be expected to be between this scenario and the full independence error rate, that is, between 1.6% and 50% for HCC 55.

THE AVERAGE NUMBER OF CLAIMS PER BENEFICIARY CANNOT BE USED TO TRANSLATE TO A BENEFICIARY-LEVEL ERROR RATE

Using the average number of claims per beneficiary materially understates error rates when translating claims-level error rates to beneficiary-level error rates. We describe our approach for adjusting for this issue in the "Technical analysis: Model and data selection" section of this paper, below.

The CMS technical analysis uses the average number of claims per beneficiary to convert a claims-level error rate to a beneficiary-level HCC error rate. Ignoring the issue with independence, as discussed above, failing to account for the distribution of the number of claims per beneficiary within an HCC will bias the error rate downward from the true value.

[21] HCC 55 included 448 claims in the 2008 FFS CERT sample, a number likely to be credible to calculate the error rate on claims. The exact error rate CMS calculated was 51.80%. Further, CMS estimated 6.1 visits per year per beneficiary with a diagnosis code mapping to HCC 55.

Some beneficiaries will have more claims than the average and some will have fewer. The approach CMS uses applies an exponent, which represents the average number of claims per HCC, to claims-level error rates that are below 1.0.
As the number of claims increases, adding an additional claim does not materially change the assumed HCC-level error rate. However, at a lower number of claims per HCC, each additional claim does make a material difference.

Consider a continuation of the HCC 55 example from above. CMS assumes an average claims error rate of 50% with an average of six claims per beneficiary. If there are two beneficiaries with HCC 55 and one has two claims while the other has 10, the HCC error rate (assuming independence for simplicity only) is 0.5 ^ 2 = 0.25 for the first beneficiary and 0.5 ^ 10 = 0.001 (rounded) for the second beneficiary. Averaging these error rates yields an average HCC error rate of 0.125. A similar calculation utilizing an average number of claims for all beneficiaries yields an average error rate of 0.5 ^ 6 = 0.016 for each beneficiary. The true average error rate in this example is nearly eight times higher than the error rate calculated using an average number of claims per beneficiary.

SENSITIVITY OF A FFS ADJUSTER TO ERROR RATES

The results of the CMS study are very sensitive to the specific error rates used in the analysis. The error rates are highly sensitive to how independent the coding of one claim is from the next as well as to the distribution of the number of claims with a diagnosis mapping to a particular HCC. We performed sensitivity analyses and present the FFS adjusters we calculated when assuming (a) full independence with an average number of diagnoses per beneficiary, (b) full independence with a distribution of the number of diagnoses per beneficiary in the 2014 5% Sample, (c) complete dependence, and 25%, 50%, and 75% of the way between the full independence with a distribution of diagnoses and full dependence scenarios.

We calculated the following FFS adjuster percentages, by the percentage of independence assumed in claims coding errors, as shown in Figure 1.
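The two-beneficiary HCC 55 averaging example earlier in this section can be checked numerically. The sketch below is illustrative, uses hypothetical function and variable names, and assumes independence for simplicity only, as the example does. It compares the average of the per-beneficiary error rates with the error rate obtained by applying the exponent to the average number of claims.

```python
# Illustrative sketch only; names are hypothetical. Inputs follow the
# two-beneficiary HCC 55 example (50% claim error rate; 2 and 10 claims).


def hcc_error_rate(claim_error_rate, n_claims):
    """Probability that every one of n_claims is unsupported (independence)."""
    return claim_error_rate ** n_claims


claim_error_rate = 0.5
claims_per_beneficiary = [2, 10]

# Approach 1: compute each beneficiary's error rate, then average.
per_beneficiary = [hcc_error_rate(claim_error_rate, n) for n in claims_per_beneficiary]
mean_of_rates = sum(per_beneficiary) / len(per_beneficiary)

# Approach 2: average the claim counts first, then apply the exponent.
average_claims = sum(claims_per_beneficiary) / len(claims_per_beneficiary)
rate_at_mean = hcc_error_rate(claim_error_rate, average_claims)

print(f"per-beneficiary rates:      {[round(r, 4) for r in per_beneficiary]}")
print(f"average of rates:           {mean_of_rates:.4f}")
print(f"rate at average claims:     {rate_at_mean:.4f}")
print(f"ratio of the two estimates: {mean_of_rates / rate_at_mean:.1f}")
```

Because an error rate below 1.0 raised to the number of claims is a convex function of that number, the average of the per-beneficiary rates can never be lower than the rate computed at the average claim count, so the shortcut can only understate the beneficiary-level error rate.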
FIGURE 1: FFS ADJUSTER PERCENTAGES

PERCENTAGE OF INDEPENDENCE                                    FFS ADJUSTER
100% (fully independent), average diagnoses / beneficiary     [...]
100% (fully independent), actual diagnoses / beneficiary      8.2%
75%                                                           11.6%
50%                                                           14.9%
25%                                                           18.1%
0% (fully dependent)                                          21.3%

The scenario using average diagnoses is shown only for reference as a crosswalk from the CMS analysis. Average diagnoses per beneficiary is not a reasonable scenario for calculating a FFS adjuster.

Higher error rates produce similarly larger deviations from actuarial equivalence under the scenario where CMS does not utilize a FFS adjuster in RADV audits. Our simulations of the CMS methodology with varying HCC error rates produce a relatively direct relationship between the error rate and the impact to a FFS adjuster. That is, when the HCC error rates doubled, the deviation from actuarial equivalence also approximately doubled.

COMPUTATIONAL ISSUES

CMS cites the average (mean) error rate at 3% with a median of [...]. CMS does not describe how those estimates were calculated, but based upon the data provided in the Technical Appendix, it appears the error rate it calculated for each HCC was equally weighted without regard to the prevalence of each HCC in the data set. We utilized the prevalence of HCCs in the 2014 5% Sample and weighted the error rates CMS calculated by HCC to produce an error rate of [...]. This does not impact the results of either the CMS analysis or our analysis. We mention it to identify what may otherwise appear to be an inconsistency in HCC error rates cited in this white paper versus the CMS Technical Appendix.

IS THE SAMPLE SIZE

As a result of CMS's decision to use CERT data, which samples claims rather than beneficiaries for RADV-like reviews of FFS data, it is not possible to definitively determine whether the sample CMS utilized is of sufficient size to be credible to determine the overall HCC error rate.
The CMS Technical Appendix asserts statistical calculations to demonstrate the sample is large enough in total, but those statistics require an assumption of independence, which is inappropriate, as previously discussed.

The CMS Technical Appendix does recognize that the error rates CMS calculates are not credible at the HCC level: "One of the principle challenges of using FFSOS for this purpose is that the CERT sample was not designed to produce a representative sample of diagnoses. As a consequence, for many of the diagnoses and by extension, the HCCs, we have an insufficient sample size to develop reliable discrepancy rates at the HCC level. As shown in Table 2a, discrepancy rates ranged from 0-100%. As expected, sample size was an issue for a number of the HCCs. Nearly half of the HCCs had fewer than 28 observations."

As asserted by many MAOs in their criticism of the 2007 RADV audit methodology and demonstrated by CMS in highlighting the widely varying error rates by HCC, the distribution of HCCs in a sample is very important to the results of a RADV audit. CMS has not demonstrated that the sample size utilized in its analysis is large enough to properly calculate a FFS adjuster.

Example demonstrating actuarial equivalence is violated

The theoretical arguments that a FFS adjuster is required in RADV audits are compelling, and we supplement these arguments with concrete examples. The concepts and statistical work required for full calculation of risk scores, calibration, normalization, and RADV audits are extremely complex. Both we and CMS have created simplified examples to highlight the relevant concepts.

CMS EXAMPLE

The CMS-developed example is simpler and it was created before the recent proposed rule; however, it does not highlight all of the concepts discussed herein. Appendix A includes this example[22] and clearly demonstrates the need for a FFS adjuster. We acquired this example from the briefs filed in the UnitedHealthcare Ins. Co. v. Azar case.
The first table in the CMS example (reproduced in Figure 2 below) shows four beneficiaries, all of whom have diabetes indicated on their claims records. The first three also have diabetes coded in their medical records, while the fourth does not. CMS then lays out an illustrative cost of $4,000 for each beneficiary who has diabetes coded in their medical record. Other conditions and treatments are ignored. This results in a total FFS cost of $12,000 for all four beneficiaries.

[22] The CMS example would be clearer if CMS did not add Beneficiary E in the second slide; however, the result remains the same. This beneficiary increases the initial payment to the plan from the original four beneficiaries but does not change the actual cost to provide care nor does it change the final payment to the plan. In the example as presented, the plan is still underpaid by $3,000 relative to FFS.

Because CMS calibrates and normalizes the HCC model on diagnoses that are on claims, CMS divides the $12,000 of cost by the count of beneficiaries with a diabetes diagnosis on a claim. In this example, there are four beneficiaries, resulting in $12,000 / 4 = $3,000 of cost for each diabetes diagnosis.

FIGURE 2: CMS EXAMPLE, FIRST TABLE

                 DIABETES      DIABETES IN
                 ON CLAIM      MEDICAL RECORD     FFS COST
Beneficiary A    Yes           Yes                $4,000
Beneficiary B    Yes           Yes                $4,000
Beneficiary C    Yes           Yes                $4,000
Beneficiary D    Yes           No                 $0
Total                                             $12,000
Diabetes Value for MA Payment                     $3,000

The second table in the CMS example (reproduced in Figure 3) demonstrates how an MAO is paid. In this example CMS includes five beneficiaries, all with diabetes coded on claims. The MAO is paid $3,000 each for a total of $15,000. However, Beneficiaries D and E do not have diabetes coded on their medical records. As a result, under a RADV audit, CMS recovers the $6,000 paid to the MAO for Beneficiaries D and E, resulting in a final payment to the MAO of $9,000.
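The arithmetic in the CMS example can be traced with a short Python sketch. It is a reading of the example as described above, not CMS's own calculation; the variable names are ours, and the labels A through E are used only to keep the five records apart.

```python
# Illustrative sketch of the CMS diabetes example. Variable names are
# hypothetical; dollar amounts are the illustrative figures from the example.

COST_PER_DOCUMENTED_CASE = 4_000

# Each record: (diabetes on claim, diabetes in medical record)
ffs_beneficiaries = {
    "A": (True, True),
    "B": (True, True),
    "C": (True, True),
    "D": (True, False),
}

# FFS cost arises only where diabetes is documented in the medical record.
total_ffs_cost = sum(
    COST_PER_DOCUMENTED_CASE
    for _, in_record in ffs_beneficiaries.values()
    if in_record
)

# The payment value is spread over every beneficiary with a claims diagnosis.
claims_count = sum(1 for on_claim, _ in ffs_beneficiaries.values() if on_claim)
diabetes_value = total_ffs_cost / claims_count

# The MA plan enrolls the same four beneficiaries plus Beneficiary E.
ma_beneficiaries = dict(ffs_beneficiaries, E=(True, False))

initial_payment = diabetes_value * sum(
    1 for reported, _ in ma_beneficiaries.values() if reported
)
radv_recovery = diabetes_value * sum(
    1 for reported, in_record in ma_beneficiaries.values()
    if reported and not in_record
)
final_payment = initial_payment - radv_recovery
shortfall_vs_ffs = total_ffs_cost - final_payment

print(f"diabetes value for MA payment: ${diabetes_value:,.0f}")
print(f"initial payment to MAO:        ${initial_payment:,.0f}")
print(f"RADV recovery:                 ${radv_recovery:,.0f}")
print(f"final payment to MAO:          ${final_payment:,.0f}")
print(f"shortfall relative to FFS:     ${shortfall_vs_ffs:,.0f}")
```

The shortfall arises because the $3,000 payment value already averages in the beneficiary whose claims diagnosis is not documented, and the audit then removes undocumented diagnoses a second time.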
Beneficiaries A through D are identical beneficiaries in the two tables. Under FFS the cost for the four beneficiaries is $12,000,[23] but under the scenario where the MAO undergoes a RADV audit without a FFS adjuster, the MAO is paid $9,000, which is $3,000 less than under FFS.[24] CMS's example clearly demonstrates actuarial equivalence does not exist between FFS and MA when a RADV audit is performed without a FFS adjuster.

FIGURE 3: CMS EXAMPLE, SECOND TABLE

                 DIABETES       DIABETES IN      CMS PAYMENT                            CMS PAYMENT
                 REPORTED BY    MEDICAL          TO PLAN                    RADV        TO PLAN
                 MA PLAN        RECORD           (BEFORE RADV)  PLAN COST   RECOVERY    (AFTER RADV)
Beneficiary A    Yes            Yes              $3,000         $4,000                  $3,000
Beneficiary B    Yes            Yes              $3,000         $4,000                  $3,000
Beneficiary C    Yes            Yes              $3,000         $4,000                  $3,000
Beneficiary D    Yes            No               $3,000         $0          ($3,000)    $0
Beneficiary E    Yes            No               $3,000         $0          ($3,000)    $0
Total                                            $15,000        $12,000     ($6,000)    $9,000

CMS EXAMPLE: EXPANDED

This section expands the prior CMS example with the inclusion of risk scores, normalization, and the calculation of a FFS adjuster to illustrate the normalization effect and the need for a FFS adjuster.

First, for ease of calculations we assume the MAO's bid, that is, the risk-adjusted portion of payments from CMS to the MAO, is $10,000 per beneficiary per year. That is, CMS will pay $10,000 to the MAO for a beneficiary with a 1.0 risk score and will pay $11,000 ($10,000 times 1.1) for a beneficiary with a 1.1 risk score.

[23] CMS assumes the plan cost is the same as the FFS cost and that Beneficiaries D and E do not have diabetes, so there is no cost.
[24] In this example, no normalization step is required because total FFS dollar costs are shown; therefore, the $12,000 is already effectively normalized to a 1.0.

We utilize the same four beneficiaries from the CMS example and use the same costs. However, we add a demographic component and assign each beneficiary a different demographic status and cost. Our full example is presented in Appendix B,
and we present pieces in tabular format throughout the discussion in this section.

Figure 4 shows the four beneficiaries who all have diabetes coded on a claim along with their actual and assumed costs under FFS. We also performed a least squares regression[25] to calibrate our simplified HCC model (which contains three demographic factors and one HCC for diabetes), and the resulting risk score coefficients are shown in Figure 4.

FIGURE 4: EXPANDED DEMOGRAPHICS
TABLE 1: MODEL CALIBRATED AND NORMALIZED WITH UNADJUSTED FFS DIAGNOSES

                                      FFS COST/    FFS COST/
FFS BENEFICIARIES      ON CLAIM?      ACTUAL       PREDICTED    COEFFICIENT
Beneficiary 1
  70 yr old                                        $6,500       0.650
  Diabetes             Yes                         $3,000       0.300
  Subtotal                            $9,000       $9,500       0.950
Beneficiary 2
  70 yr old                                        $6,500       0.650
  Diabetes             Yes                         $3,000       0.300
  Subtotal                            $10,000      $9,500       0.950
Beneficiary 3
  75 yr old                                        $7,000       0.700
  Diabetes             Yes                         $3,000       0.300
  Subtotal                            $10,000      $10,000      1.000
Beneficiary 4
  80 yr old Dual                                   $8,000       0.800
  Diabetes             Yes                         $3,000       0.300
  Subtotal                            $11,000      $11,000      1.100
Total                                 $40,000      $40,000      1.000

For the purposes of this example, we assume these four beneficiaries represent the entire universe of FFS beneficiaries. The total cost for these beneficiaries is $40,000 and we see the model is predicting $40,000 of cost in the "Cost/Predicted" column. Weighting together the coefficients, we see the model produces a 1.000 risk score for the entire FFS population and so is already normalized to a 1.000 risk score, using diagnoses coded on FFS claims (not medical records).

The modeling in Figure 4 corresponds to the first model calibration and normalization in the CMS technical analysis, that is, the version where diagnoses are calibrated and normalized on a FFS claims diagnosis basis.

Next, we repeat these steps after reviewing the medical records and finding Beneficiary 4 does not have diabetes documented.
We apply least squares regression to recalibrate our simplified HCC model to the medical record diagnoses and calculate the new "Cost/Predicted" and "Coefficients" columns shown in the table in Figure 5.

[25] Due to the simplistic nature of this example, the least squares regression does not produce a unique solution. We used SAS for the regression calculations and seeded the starting values to ensure the particular solution would most resemble the original CMS example we are expanding upon.

FIGURE 5: RECALIBRATED
TABLE 2: MODEL CALIBRATED AND NORMALIZED WITH ADJUSTED FFS DIAGNOSES

                       ON MEDICAL     FFS COST/    FFS COST/
FFS BENEFICIARIES      RECORD?        ACTUAL       PREDICTED    COEFFICIENT
Beneficiary 1
  70 yr old                                        $5,500       0.550
  Diabetes             Yes                         $4,000       0.400
  Subtotal                            $9,000       $9,500       0.950
Beneficiary 2
  70 yr old                                        $5,500       0.550
  Diabetes             Yes                         $4,000       0.400
  Subtotal                            $10,000      $9,500       0.950
Beneficiary 3
  75 yr old                                        $6,000       0.600
  Diabetes             Yes                         $4,000       0.400
  Subtotal                            $10,000      $10,000      1.000
Beneficiary 4
  80 yr old Dual                                   $11,000      1.100
  Diabetes             No                          $0           -
  Subtotal                            $11,000      $11,000      1.100
Total                                 $40,000      $40,000      1.000

Again, the total actual and predicted FFS cost is $40,000 and our model produces a total risk score of 1.000 when calibrated and normalized using the medical record diagnoses. However, comparing Figures 4 and 5, we observe diabetes has a coefficient of 0.300 in the first scenario and 0.400 in the second scenario. Note the total cost to provide care has not changed and the total risk score for the FFS population is 1.000 in both instances. This second scenario is not performed in the CMS technical analysis, though it should have been because it represents the entire process completed without diagnoses that are not supported on medical records.

The table in Figure 6 illustrates the process that CMS used to develop the revised HCC model in its technical analysis.
It shows the risk scores from the model calibrated with the simulated diagnoses documented on medical records in the "Before Normalizing" column. The "On Claim?" column shows a "Yes" where the HCC is applied for a beneficiary, and in this case, shows that the unadjusted claims-based diagnoses are used. Note that the risk scores total to 1.100 for the same four beneficiaries. CMS then applies the normalization step using unadjusted claims-based diagnoses and divides all coefficients by the total risk score for all FFS beneficiaries, which is 1.100. This step is required to ensure the model produces a 1.0 risk score for the FFS population. The resulting new coefficients are in the column labeled "After Normalizing."

FIGURE 6: CMS PROCESS
TABLE 3: MODEL CALIBRATED WITH ADJUSTED FFS DIAGNOSES BUT NORMALIZED WITH UNADJUSTED DIAGNOSES

                                      BEFORE         AFTER
FFS BENEFICIARIES      ON CLAIM?      NORMALIZING    NORMALIZING
Beneficiary 1
  70 yr old                           0.550          0.500
  Diabetes             Yes            0.400          0.364
  Subtotal                            0.950          0.864
Beneficiary 2
  70 yr old                           0.550          0.500
  Diabetes             Yes            0.400          0.364
  Subtotal                            0.950          0.864
Beneficiary 3
  75 yr old                           0.600          0.545
  Diabetes             Yes            0.400          0.364
  Subtotal                            1.000          0.909
Beneficiary 4
  80 yr old Dual                      1.100          1.000
  Diabetes             Yes            0.400          0.364
  Subtotal                            1.500          1.364
Total                                 1.100          1.000

Figures displayed in Figure 6 are rounded to three decimals. Unrounded values are used to produce Figure 7.

Next, we calculate how an MAO would be paid for these identical beneficiaries under four scenarios: (1) no FFS adjuster without a RADV audit, (2) no FFS adjuster with a RADV audit, (3) FFS adjuster without a RADV audit, and (4) FFS adjuster with a RADV audit. The table in Figure 7 shows these four scenarios.
FIGURE 7: HOW MAOS ARE PAID: FOUR SCENARIOS
TABLE 4

| FFS BENEFICIARIES | WITHOUT FFS ADJUSTER: BEFORE RADV | WITHOUT FFS ADJUSTER: RADV IMPACT | WITHOUT FFS ADJUSTER: AFTER RADV | WITH FFS ADJUSTER: BEFORE RADV | WITH FFS ADJUSTER: RADV IMPACT | WITH FFS ADJUSTER: AFTER RADV |
| --- | --- | --- | --- | --- | --- | --- |
| Beneficiary 1: 70 yr old | $5,000 | | $5,000 | $5,000 | | $5,000 |
| Beneficiary 1: Diabetes | $3,636 | | $3,636 | $3,636 | | $3,636 |
| Beneficiary 1: Subtotal | $8,636 | | $8,636 | $8,636 | | $8,636 |
| Beneficiary 2: 70 yr old | $5,000 | | $5,000 | $5,000 | | $5,000 |
| Beneficiary 2: Diabetes | $3,636 | | $3,636 | $3,636 | | $3,636 |
| Beneficiary 2: Subtotal | $8,636 | | $8,636 | $8,636 | | $8,636 |
| Beneficiary 3: 75 yr old | $5,455 | | $5,455 | $5,455 | | $5,455 |
| Beneficiary 3: Diabetes | $3,636 | | $3,636 | $3,636 | | $3,636 |
| Beneficiary 3: Subtotal | $9,091 | | $9,091 | $9,091 | | $9,091 |
| Beneficiary 4: 80 yr old Dual | $10,000 | | $10,000 | $10,000 | | $10,000 |
| Beneficiary 4: Diabetes | $3,636 | ($3,636) | $0 | $3,636 | ($3,636) | $0 |
| Beneficiary 4: Subtotal | $13,636 | | $10,000 | $13,636 | | $10,000 |
| Total | $40,000 | | $36,364 | $40,000 | | $36,364 |
| Raw RADV Recovery | | | $3,636 | | | $3,636 |
| FFS Adjuster | | | $0 | | | $3,636 |
| Final RADV Recovery | | | $3,636 | | | $0 |
| Final Payment to MAO | $40,000 | | $36,364 | $40,000 | | $40,000 |
| Actuarially equivalent? | | | No | | | Yes |

The payments to the MAO are calculated by multiplying the applicable risk scores or coefficients by the annual MAO bid of $10,000. We utilize the risk scores under the scenario CMS modeled (in the "After Normalizing" column of Figure 6 above), where the model was calibrated with adjusted diagnoses but normalized with unadjusted diagnoses. Unsurprisingly, the two scenarios without a RADV audit produce the same payment as would have been made under FFS, $40,000. However, with a RADV audit, payments to the MAO are reduced to $36,364 because Beneficiary 4 is found to not have diabetes documented in the medical record. The scenario without a FFS adjuster recovers $3,636 from the MAO, paying the MAO 9% less than would have been paid for identical beneficiaries under FFS, thus violating actuarial equivalence. To calculate the final payment under the final scenario, with a FFS adjuster, we first must calculate a FFS adjuster.
Because we know the risk score under the applicable HCC model for the entire FFS population is 1.100 with claims-based diagnoses and 1.000 with medical record diagnoses, the FFS adjuster is 1.100 divided by 1.000 minus 1, that is, 10%. We calculate the FFS adjuster amount by multiplying the 10% by the payment the RADV audit found to be supported by the medical records, $36,364, and find the FFS adjuster to be $3,636. Finally, the RADV recovery is reduced for the FFS adjuster and the recovery is $0. Under this scenario the MAO is paid $40,000, exactly the same amount as the identical beneficiaries would have cost under FFS. This confirms actuarial equivalence.

For completeness, Appendices C and D repeat the expanded example described here utilizing an HCC model that is calibrated and normalized under the other two scenarios described in this section (Figures 4 and 5). While the size of the FFS adjuster varies, the result is exactly the same. A FFS adjuster is required to maintain actuarial equivalence.

In summary, from these examples it is clear a FFS adjuster is required to maintain actuarial equivalence, as required by statute. The failure to include a FFS adjuster violates actuarial equivalence in every case.

Technical analysis

MODEL AND DATA SELECTION

CMS utilized a model calibration data set of diagnoses from 2004 and claims from 2005 for the FFS portion of the technical analysis. We acquired those data sets in March 2019 when CMS released them. The CMS data set does not include claim level diagnoses that can be mapped to member level demographic and payment data. As a result, certain analyses on the data set cannot be performed. Specifically, when re-calibrating the CMS HCC model using the CMS 2004/2005 data, the actual distribution of the number of claims per HCC cannot be used. To analyze the effect of the CMS simplifying assumption of an average number of diagnoses per beneficiary,
we utilized the 2014 and 2015 5% Sample data sets to supplement the FFS portion of our analysis. This data and approach allow us to apply the CMS claim level error rates to claims, and then to calculate HCC level error rates without assuming an average number of diagnoses per HCC. As described further in the "Reproduction of CMS technical approach" section below, we note that our calculation of a FFS adjuster utilizing the CMS data set and the 5% Sample both produced 1.1% under the full independence scenario when using the CMS HCC level error rates and calibrating and normalizing the HCC model to the respective audited data sets.

Similar to CMS, we used version 12 of the CMS HCC model, which was the model in effect for payment years through 2015 (payment years 2014 and 2015 utilized a blend of this model and a newer model). We utilized the MA diagnosis data published by CMS in the March 2019 data release to calculate the effect of the various model recalibration scenarios on MA plans.

The particular model or year of data utilized does not impact the conclusion of whether a FFS adjuster is required to maintain actuarial equivalence, though it may impact the magnitude of a FFS adjuster calculated. In the next section, we discuss our reproduction of the CMS results, serving as confirmation that the particular year and version of the model are not material in demonstrating the concepts discussed in this paper.

REPRODUCTION OF CMS TECHNICAL APPROACH

We contacted CMS on several occasions to ensure our interpretation of CMS's analysis was correct. When we contacted CMS directly, CMS cited the Administrative Procedure Act, declined to answer questions, and declined to confirm that the text in the Federal Register and the technical backup were correct and as CMS intended. We also asked the same questions on the call CMS hosted to discuss the proposed rule, but the appropriate subject matter experts (SMEs) were not on the phone to answer the questions.
CMS also indicated there would be no follow-up call with the SMEs and there would not be time for an FAQ before the end of the comment period. Absent confirmation of our interpretation of the methods CMS utilized, we rely upon the text CMS released, as published, in combination with our reproduction of the methods and the results CMS described.

We reproduced the CMS technical analysis using the CMS data set underlying the technical analysis, as well as the 2015 5% Sample data set (with 2014 diagnoses from the 2014 5% Sample) utilizing the 2013 CMS HCC model. We then applied the recalibrated and renormalized HCC model to the CMS MA HCC data set. In reproducing the CMS methodology, we confirmed that our process also showed that when the CMS HCC model was calibrated with a simulated corrected FFS data set and then normalized with an uncorrected data set, applying the resulting model to MAO beneficiaries does not result in a significant change to MAO risk scores.

In June 2019, subsequent to our initial technical analysis, CMS released an Addendum including additional information, additional data, and SAS programs, which further confirmed we correctly understood and reproduced the CMS analysis.

CMS ADDENDUM TO THE FEE-FOR-SERVICE ADJUSTER STUDY AND IPARS

The Addendum included explicit confirmation of technical details we had inferred from prior CMS information releases. The Addendum also included a mathematical "explanation" of the CMS approach to calculating a calibration bias in the CMS-HCC model in section IV.B., titled "General Expenditure Adjustment to Offset Delete Bias." The mathematical explanation contains some errors. For example, step 2 defines I_{ji} as the complete matrix of all HCC disease indicators and further that the sumproduct of all coefficients and indicators is equal to the total FFS expenditure (E):

$$\sum_{j=1}\sum_{i=1} b_i I_{ji} = \sum_{j=1} E_j$$

However,
the disease indicators do not include demographic variables, which are included in the CMS HCC model and explain a significant portion of expenditures. Further, the use of averages to describe coefficient values in step 5 is inconsistent with Ordinary Least Squares (OLS) because it ignores the difference in weight and frequency of the coefficients and independent variables within the regression model. If regression concepts were considered rather than average coefficient values, then the removal of a disease indicator for a beneficiary with above average spend for that HCC would decrease, rather than increase (as CMS described in step 6), the coefficient value resulting from OLS. However, these mathematical problems with the CMS explanation should not be expected to invalidate the overall conclusion that, when the CMS HCC model is calibrated and normalized to produce the total FFS expenditures on separate sets of independent variables, the total always balances to the total FFS expenditure.

By way of this explanation, CMS confirms it asked and answered a question that does not address the need for a FFS adjuster. CMS addressed a question of accuracy in CMS HCC model coefficient calibration but has not calculated a proper FFS adjuster and has not addressed actuarial equivalence or the issue of consistently applying the CMS HCC model to the calibration dataset and the payment dataset. We described this issue in the "Actuarial equivalence requires a FFS adjuster" section and further expound upon it in the "technical analysis should not include unsupported FFS diagnoses" section, above. We illustrate the need for a FFS adjuster in a RADV audit using CMS's example and an expansion of that example in the "Example demonstrating actuarial equivalence is violated" section, above. Further, in the next section we discuss one potential adjustment to the CMS approach that could address the question of whether or not a FFS adjuster is required.

Finally,
the Addendum repeats the original CMS 50 simulations that measured "audit miscalibration." CMS completes a new set of 50 simulations, publishing the same results plus an intermediate step that focuses on the ratio of expenses projected by the simulated "corrected" CMS HCC model using "un-perturbed" FFS HCCs to the average actual FFS expenses. CMS refers to this quantity as inflated Post-Audit Risk Scores (IPARS). In the Addendum, CMS calculates IPARS to be [...]. The CMS Addendum does not discuss the significance of IPARS; however, a non-zero IPARS demonstrates the need for a FFS adjuster. Further, the CMS IPARS calculation is consistent with our calculation of a payment discrepancy of 1.1% in the next section titled "Adjustment of CMS technical approach." As demonstrated in the examples and conceptual discussion, above, this difference in risk score and payment results is evidence of the need for a FFS adjuster in RADV audits. If the technical issues with CMS's estimated HCC error rates were resolved, IPARS would be dramatically larger, emphasizing the critical need for a FFS adjuster.

We calculated a mean "audit miscalibration" of 0.002 versus the CMS calculation of 0.001, which we consider to demonstrate successful reproduction of the CMS calculations. Note the calibrated CMS HCC models CMS created in this study do not follow all of the steps CMS uses when creating the final model for actual payment to MA plans and, as such, demonstration of small differences is not sufficient to conclude an actual difference exists.

ADJUSTMENT OF CMS TECHNICAL APPROACH

After confirming we could reproduce the CMS results, we adjusted the normalization process to be completed excluding the simulated unsupported diagnoses.
We then applied the new model that was calibrated and normalized on a simulated corrected data set to the MA HCC data and produced MA risk scores, which were, on average, 1% higher than the original model that did not exclude unsupported diagnoses. It is important to note that this 1% effect is certainly material; however, we believe it to be dramatically understated due to the CMS assumptions utilized to create the error rates discussed previously.

We then repeated the analysis, as described throughout this white paper, using a range of HCC error rates that varied by the degree of assumed independence between coding errors from one claim to the next. To summarize, we completed the following steps to perform an adjusted technical analysis:

1. Filtered diagnoses
   a. Within the CMS data set, we utilized the flag provided by CMS indicating that diagnoses were valid for risk adjustment.
   b. For the 2014 5% Sample data set, we used Encounter Data System (EDS) filtering rules. (We tested for the impact of filtering with Risk Adjustment Processing System rules, found no material difference for the purpose of this study, and elected to use EDS rules for simplicity.)
2. Calibrated the CMS HCC model on unadjusted CMS data / 2014 and 2015 5% Sample data. For the 5% Sample data we utilized the July 2015 cohort of non-hospice, community population with 12 months of Medicare Part A and Part B enrollment in 2014.
3. Normalized the resulting model to produce a 1.0 risk score for the same total FFS population, again without adjustment to simulate removal of unsupported HCCs.
4. Performed reasonability checks to ensure that the model was reasonably similar to the actual CMS model.
5. Applied the resulting model to the CMS MAO data set to produce a starting point MAO risk score.
6. Set the error rates
   a. For HCC error rate scenarios, set the HCC error rate to be consistent with the particular HCC error rate scenario being processed.
   b. For the claim error rate scenario,
set the claim level error rates to the CMS published claim level error rates.
7. Simulation
   a. Simulated adjustments to the filtered CMS data HCCs to produce simulated corrected HCCs.
   b. Simulated claim level adjustments to filtered 2014 diagnoses to produce simulated corrected HCCs.
8. Repeated steps 2 through 5 above, using the simulated corrected HCCs for all steps, including the normalization step.
9. Compared the resulting risk scores for FFS using the original and simulated corrected HCCs under both versions of the HCC models. Under both models using the CMS data, the ratio of the risk scores using original uncorrected and simulated corrected HCCs was between 1% and 21%, depending upon the assumed HCC error rate. The full independence scenario processed at the claim level on the 5% Sample produced a 12% HCC error rate and an 8% FFS adjuster on the CMS FFS data. These scenarios result in the calculation of a FFS adjuster between 8% and 21% under these assumptions.
10. Compared the resulting risk scores for MA, based on the CMS MA diagnosis data file, using the original and simulated corrected HCC models. The impact on MA risk scores ranged from 10% to 32%, depending on the level of independence, which is larger than the FFS impact.

Under the midpoint HCC error rate scenario, we performed the simulations and calculated a FFS adjuster of 14.9%, with a range of 8% to 21% for all scenarios (excluding the average claims per HCC scenario). See the chart in Figure 8 for a summary of the key error rate scenarios we calculated. Appendix F, Chart B, contains the same results when calculated on the CMS MA data. We calculate the impact on the MA data as a comparison point to the CMS calculation on MA data included in the technical analysis; however, a FFS adjuster should be calculated on FFS data, not MA data.

As discussed earlier in this white paper,
properly calculating a FFS adjuster requires performing a credible sampling of FFS beneficiaries and then completing a RADV-type audit on all eligible claims for those beneficiaries. It is not sufficient to calculate error rates for HCCs based upon error rates of individual claims because the degree of independence cannot be known and the results are extremely sensitive to the degree of independence and the distribution of the number of diagnoses per beneficiary. Further study is needed.

50 SIMULATIONS PRODUCE SIMILAR RESULTS

We repeated the adjusted simulation process described above (steps 7 through 10) 50 times for each error rate scenario, as CMS did with its version of the analysis (but using a single error rate). We observed minimal variations in the resulting value of the FFS adjuster within each error rate scenario. Figure 8 shows the consistency of the FFS adjuster results across error rate scenario simulations. An appendix includes additional exhibits showing consistency of the impact on MA risk scores across simulations and highlighting selected key distributional statistics.

FIGURE 8: FFS ADJUSTERS USING COEFFICIENTS RECALIBRATED WITH VARIOUS ERROR RATES AND SIMULATED AUDITED FFS DATA

[Chart A: FFS adjuster (vertical axis, 0% to 25%) by iteration (horizontal axis, up to 50) for each error rate scenario: Full Dependence, 25%, 50%, 75%, Full Independence]

Under the midpoint error rate scenario and based upon 50 iterations, the FFS adjuster is between 14.85% and 14.90%, with a 99% level of confidence.

Context around the CMS HCC risk model

As set out in statute, the CMS HCC model is intended to adjust payment amounts made to MAOs by beneficiary health status. The HHS Secretary has broad authority to add or remove adjustment factors if such changes will improve the determination of actuarial equivalence,
which further highlights the emphasis on actuarial equivalence from Congress. Beyond requiring a risk adjustment model and actuarial equivalence, the statute goes on to require an adjustment for the coding pattern difference between FFS and MA. Understanding appropriate creation and application of the risk score model also requires an understanding of the background, procedures, and adjustments surrounding the implementation of the risk score model.

RISK MODEL DESIGN

Several considerations should go into designing a risk score model. In the case of the CMS HCC model, a strong model would compensate MAOs for the health status of the beneficiaries they enroll without creating an incentive to enroll certain types of beneficiaries over others.

A strong risk adjustment model could be based upon diagnoses from medical records, because these diagnoses most closely reflect the actual conditions beneficiaries are treated for. Given that this is impractical from an administrative cost perspective, CMS needed data to serve as a proxy for medical record diagnoses. To fill this void, CMS designed the CMS HCC model utilizing diagnoses from claims data. While providers have not historically had a strong incentive to accurately report diagnoses in claims data (with the exception of inpatient claims), the claims-based diagnoses are a reasonable proxy for medical record diagnoses in the context of establishing the disease burden of an individual beneficiary.

Predictive models and risk score models are often measured based upon how well they predict results for individual beneficiaries. However, as CMS points out in the "Weak Statistical Foundations" section of the Technical Appendix, referenced in the proposed rule, MAOs are paid to provide care for an entire population of beneficiaries. It is important to pay MAOs accurately for the entire population of beneficiaries, but it is less important to pay MAOs correctly for each individual beneficiary and HCC.
As CMS lays out with mathematical formulas, if the actual cost of providing care for a beneficiary with a particular HCC varies significantly, the quality of the risk score model, in the context of paying MAOs, is not reduced as long as variation from the average cost of providing care is not biased. That is, the quality of the model is not reduced if the costs to provide care above the average and below the average for an HCC are approximately equivalent. The commentary in the "Weak Statistical Foundations" section may be important in establishing a good risk score payment model, but it has no relevance for actuarial equivalence or a FFS adjuster.

CMS goes on to discuss, in the Technical Appendix, a concept it refers to as "Calibration Error Correction Limited to Recoveries is Economically Problematic." The arguments put forth focus upon the concept that there may be calibration errors in the CMS HCC model. While calibration errors may impact the relative values of one HCC against another, they have little bearing on total payments as a result of the CMS step that normalizes the CMS HCC model to a 1.0 risk score for the FFS population. While minimizing calibration error may be important to developing a risk model, this topic is also not relevant to actuarial equivalence or a FFS adjuster.

MODEL IMPLEMENTATION

CMS's Risk Adjustment Participant Guides focus upon rules and guidelines for plans to filter claims data and submit the diagnoses attached to such claims through the Risk Adjustment Processing System (RAPS). That is, CMS publishes the rules by which plans must abide when submitting claims-based diagnosis data. RADV audits, however, do not primarily measure how well a plan complies with the filtering and submission process set forth by CMS. Rather, the RADV audit compares the claims-based diagnoses to the diagnoses on the medical charts and cites the differences as errors made by the plan. Therefore,
RADV audit procedures primarily measure how well claims-based diagnoses approximate medical chart diagnoses.

The RADV audit process primarily measures the bias of the diagnosis proxy, that is, the difference between claims-based diagnoses and medical record diagnoses. Such a bias exists on both the FFS data and the MA data. Title 42 U.S. Code requires the risk model to "ensure actuarial equivalence" between FFS and MA. Removing the bias from either side without removing it from the other compromises the risk adjustment model by violating actuarial equivalence, and therefore statute.

If the bias is removed from the MAO side but not the FFS side, one solution to maintain actuarial equivalence is to apply a FFS adjuster in the implementation of the RADV audit. The addition of a FFS adjuster is akin to adjusting the CMS HCC model to be on a medical record diagnosis basis, consistent with the methodology of the RADV audit for MA diagnosis support.

OTHER ADJUSTMENT FACTORS

CMS implements other adjustments surrounding the risk score model and its implementation. A few of these adjustments are discussed here for completeness.

FFS normalization: Provider coding patterns change over time and the FFS Medicare population changes over time. Because the data required to create and calibrate an HCC model is several years old, CMS must project both changes in the FFS population and FFS provider coding practices in an attempt to maintain a 1.0 risk score for future years. The FFS normalization factor is the CMS projected estimate of what the risk score of the FFS population will be in a future payment year. All risk scores are then divided by this factor. This concept is very similar to the normalization step discussed throughout much of this white paper.

Medicare Secondary Payer (MSP) adjustment: Certain beneficiaries have medical insurance aside from Medicare.
For those beneficiaries who have other coverage that pays primary to Medicare, CMS estimates a reduction to Medicare's expense for those beneficiaries. This reduction is generally over 80% and is applied in the MA bid process as a reduction to risk scores.

MA coding pattern adjustment: The MA coding pattern adjustment is intended to capture any difference between how FFS and MA beneficiary diagnoses are coded. The difference between claims-based diagnoses and medical record-based diagnoses may be different between FFS and MA. To the extent they are different, that difference between the documentation error rates may already be included in the MA coding pattern adjustment. Further study would be required to separate the impact of a true coding pattern adjustment from a difference in the way claims-based diagnoses and medical record-based diagnoses vary between FFS and MA.

Other considerations for calculating FFS adjusters

This white paper is focused primarily upon overall actuarial equivalence between FFS and MA and properly calculating error rates (generally the difference between claims and medical record diagnoses). There are other considerations for calculating a final FFS adjuster, or simply performing a more precise analysis regarding the need for one.

The CMS technical analysis used a variety of data from a variety of time periods. The diagnosis-to-HCC mapping was from a single time period and so may not be consistent with portions of the underlying data. As codes do change over time, and CMS updates the mapping over time, the applicable year's model should be used.

The CMS technical analysis uses random numbers to simulate unsupported diagnoses. However, the CMS HCC model is built on a causal relationship between diagnoses and claims. A proper analysis of accuracy in model calibration must use actual coding errors to maintain the assumed causal relationship of the HCC model, not randomized changes.
Stated more technically, OLS (Ordinary Least Squares) regression measures correlation between dependent and independent variables. As such, modifying the independent variables in a random fashion compromises correlation and any conclusions drawn from OLS.

Further, error rates should be expected to change as CMS updates the HCC models and the mappings within them. For example, the 2014 HCC model included a clinical revision that was at least partially intended to address some of the coding differences present in MA versus FFS and this should be expected to impact the error rates.

Provider coding practices change over time and should have an effect on error rates. The advent of ICD-10 during the fourth quarter of 2015 and the ever-increasing penetration of electronic medical records should also be expected to change the error rates over time. As CMS considers different time periods, the error rates should be revisited and recalculated frequently to reflect the applicable time period's models, error rates, and coding practices.

Additional statistical background

Least squares regression approaches are a category of statistical methodologies intended to minimize the sum of the squares of the residuals. The residuals are the differences between the observed data used for calibrating the model and the amounts predicted for those data points. These residuals are raised to the second power (squared) and then added across all observed data points. The residuals can be thought of as amounts that the calibrated model does not predict. The goal of least squares regression is to minimize the sum of the squares of the residuals (error terms). OLS methodologies weight each data point equally, while weighted least squares applies a weight to each data point, for example the amount of claims or the number of months a beneficiary is enrolled in the projection year of the CMS HCC model.
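The least squares objective described above can be illustrated with the four-beneficiary example from Figure 5. The following is a minimal sketch, not the SAS procedure used for this paper: the function names and the "alternative" coefficient set are hypothetical and introduced only for illustration. It evaluates the sum of squared residuals for the Figure 5 coefficients and for a second coefficient set, and shows that both reach the same value, which is consistent with the footnote observation that the regression in this simplified example has no unique solution.

```python
# Illustrative sketch only. Indicators per beneficiary, using medical record
# diagnoses as in Figure 5: (age 70, age 75, age 80 dual, diabetes).
ROWS = [
    (1, 0, 0, 1),  # Beneficiary 1
    (1, 0, 0, 1),  # Beneficiary 2
    (0, 1, 0, 1),  # Beneficiary 3
    (0, 0, 1, 0),  # Beneficiary 4 (no diabetes in the medical record)
]
ACTUAL_COST = [9_000, 10_000, 10_000, 11_000]


def predict(coefs, row):
    """Predicted cost: sum of the coefficients whose indicator is set."""
    return sum(c * x for c, x in zip(coefs, row))


def sum_squared_residuals(coefs, rows, costs, weights=None):
    """Least squares objective. Equal weights correspond to OLS;
    unequal weights correspond to weighted least squares."""
    if weights is None:
        weights = [1.0] * len(rows)
    return sum(
        w * (cost - predict(coefs, row)) ** 2
        for w, row, cost in zip(weights, rows, costs)
    )


# Dollar coefficients from Figure 5: age 70, age 75, age 80 dual, diabetes.
figure_5 = (5_500, 6_000, 11_000, 4_000)
# Hypothetical alternative: move $1,000 from diabetes to the age 70 and 75 cells.
alternative = (6_500, 7_000, 11_000, 3_000)

sse_figure_5 = sum_squared_residuals(figure_5, ROWS, ACTUAL_COST)
sse_alternative = sum_squared_residuals(alternative, ROWS, ACTUAL_COST)
print(sse_figure_5, sse_alternative)
```

Both coefficient sets predict $9,500, $9,500, $10,000, and $11,000, so the objective cannot distinguish between them. Passing a `weights` list (for example, months of enrollment) turns the same function into a weighted least squares objective.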
Conclusion

CMS currently calibrates and normalizes the CMS HCC model on FFS data that is based upon diagnoses from claims records. Because RADV audits utilize medical records and a different coding standard, RADV findings must be adjusted by the difference between those coding standards within FFS, that is, a FFS adjuster. Failure to make an adjustment, such as a FFS adjuster in the context of RADV audits and the current risk adjustment system, violates actuarial equivalence, and actuarial equivalence is required by federal law.

The CMS technical analysis accompanying the proposed rule did not state CMS calculated a FFS adjuster and did not appropriately calculate a FFS adjuster in the context of RADV audits. Instead, it measured a calibration bias of a CMS HCC model, which does not answer the question of whether or not a FFS adjuster is required.

At a minimum, an analysis of a FFS adjuster must exclude unsupported diagnoses from all steps of the calibration and normalization process. Since the CMS analysis does not exclude unsupported diagnoses from the normalization process, it cannot be used to support the removal of a FFS adjuster.

Estimation of a FFS adjuster should be based upon data and models that are consistent with the data that will undergo a RADV audit. Further, FFS beneficiaries should be sampled and, at a minimum, all claims containing diagnoses mapping to HCCs should be audited. Error rates should then be calculated while considering the beneficiary as a whole and including diagnoses for which the provider does not provide documentation, which is how beneficiaries are treated for payment and how beneficiaries are evaluated for HCCs.

Appendix A: CMS documents from Docket 44. United Price No. 1

Why does FFS Diagnosis Error Matter?

Beneficiary A Beneficiary Beneficiary Beneficiary Diabetes reported by MA plan?
Diabetes on Yes Yes Yes Yes Yes $4,000 Yes $4,000 Yes $4,000 No $0 Total $12,000 Diabetes Value $3,000 for MA Payment Plan Cost Beneficiary A Beneficiary Beneficiary Beneficiary Beneficiary Yes Yes Yes Yes Yes No Total $3,000 $4,000 $3,000 $4,000 $3,000 $4,000 $3,000 $0 ($3,000) $3,000 $0 ($3,000) $15,000 $12,000 ($6,000) $0 $0 $9,000

Appendix B: Full expanded example of calibration and normalization of HCC model: Calibrated with adjusted diagnoses and normalized with unadjusted diagnoses: CMS proposed rule technical analysis approach

Left block: model calibrated and normalized with unadjusted FFS diagnoses. Right block: model calibrated and normalized with adjusted FFS diagnoses.

FFS beneficiaries     On FFS  Actual     Predicted  Coefficient  On medical  Actual     Predicted  Coefficient
                              FFS cost   FFS cost                            FFS cost   FFS cost
Beneficiary 1
  70 yr old                              $6,500     0.650                               $5,500     0.550
  Diabetes            Yes                $3,000     0.300        Yes                    $4,000     0.400
  Subtotal                    $9,000     $9,500     0.950                    $9,000     $9,500     0.950
Beneficiary 2
  70 yr old                              $6,500     0.650                               $5,500     0.550
  Diabetes            Yes                $3,000     0.300        Yes                    $4,000     0.400
  Subtotal                    $10,000    $9,500     0.950                    $10,000    $9,500     0.950
Beneficiary 3
  75 yr old                              $7,000     0.700                               $6,000     0.600
  Diabetes            Yes                $3,000     0.300        Yes                    $4,000     0.400
  Subtotal                    $10,000    $10,000    1.000                    $10,000    $10,000    1.000
Beneficiary 4
  80 yr old Dual                         $8,000     0.800                               $11,000    1.100
  Diabetes            Yes                $3,000     0.300        No                     $0         -
  Subtotal                    $11,000    $11,000    1.100                    $11,000    $11,000    1.100
Total                         $40,000    $40,000    1.000                    $40,000    $40,000    1.000

Coefficient columns: model calibrated with adjusted FFS diagnoses but normalized with unadjusted diagnoses. First pair of payment columns: MA payment without FFS adjuster. Second pair of payment columns: MA payment with FFS adjuster.

FFS beneficiaries         On    Before       After        Before    After     Before    After
                                normalizing  normalizing  RADV      RADV      RADV      RADV
Beneficiary 1
  70 yr old                     0.550        0.500        $5,000    $5,000    $5,000    $5,000
  Diabetes                Yes   0.400        0.364        $3,636    $3,636    $3,636    $3,636
  Subtotal                      0.950        0.864        $8,636    $8,636    $8,636    $8,636
Beneficiary 2
  70 yr old                     0.550        0.500        $5,000    $5,000    $5,000    $5,000
  Diabetes                Yes   0.400        0.364        $3,636    $3,636    $3,636    $3,636
  Subtotal                      0.950        0.864        $8,636    $8,636    $8,636    $8,636
Beneficiary 3
  75 yr old                     0.600        0.545        $5,455    $5,455    $5,455    $5,455
  Diabetes                Yes   0.400        0.364        $3,636    $3,636    $3,636    $3,636
  Subtotal                      1.000        0.909        $9,091    $9,091    $9,091    $9,091
Beneficiary 4
  80 yr old Dual                1.100        1.000        $10,000   $10,000   $10,000   $10,000
  Diabetes                Yes   0.400        0.364        $3,636    $0        $3,636    $0
  Subtotal                      1.500        1.364        $13,636   $10,000   $13,636   $10,000
Total                           1.100        1.000        $40,000   $36,364   $40,000   $36,364
Raw RADV Recovery                                                   $3,636              $3,636
FFS Adjuster                                                        $0                  $3,636
Final RADV Recovery                                                 $3,636              $0
Final Payment to MAO                                      $40,000   $36,364   $40,000   $40,000
Actuarially equivalent?                                   Yes       No        Yes       Yes

When the CMS HCC model is normalized with unadjusted diagnoses, actuarial equivalence is maintained at initial payment and under a RADV audit with a FFS adjuster, not with a RADV audit without a FFS adjuster.

Appendix C: Full expanded example of calibration and normalization of HCC model: Calibrated and normalized with adjusted diagnoses

Coefficient column: model calibrated and normalized with adjusted FFS diagnoses. Payment columns: MA payment.

FFS beneficiaries         On    After        Before    After
                                normalizing  RADV      RADV
Beneficiary 1
  70 yr old                     0.550        $5,500    $5,500
  Diabetes                Yes   0.400        $4,000    $4,000
  Subtotal                      0.950        $9,500    $9,500
Beneficiary 2
  70 yr old                     0.550        $5,500    $5,500
  Diabetes                Yes   0.400        $4,000    $4,000
  Subtotal                      0.950        $9,500    $9,500
Beneficiary 3
  75 yr old                     0.600        $6,000    $6,000
  Diabetes                Yes   0.400        $4,000    $4,000
  Subtotal                      1.000        $10,000   $10,000
Beneficiary 4
  80 yr old Dual                1.100        $11,000   $11,000
  Diabetes                Yes   0.400        $4,000    $0
  Subtotal                      1.500        $15,000   $11,000
Total                           1.100        $44,000   $40,000
Raw RADV Recovery                                      $4,000
FFS Adjuster                                           $0
Final RADV Recovery                                    $4,000
Final Payment to MAO                         $40,000   $40,000
Actuarially equivalent?*                     No        Yes

When the CMS HCC model is normalized with adjusted diagnoses, a FFS adjuster is not required and actuarial equivalence is achieved only after a RADV audit.

Appendix D: Full expanded example of calibration and normalization of HCC model: Calibrated and normalized with unadjusted diagnoses, status quo before the proposed rule

Coefficient column: model calibrated and normalized with unadjusted FFS diagnoses. First pair of payment columns: MA payment without FFS adjuster. Second pair of payment columns: MA payment with FFS adjuster.

FFS beneficiaries         On    After        Before    After     Before    After
                                normalizing  RADV      RADV      RADV      RADV
Beneficiary 1
  70 yr old                     0.650        $6,500    $6,500    $6,500    $6,500
  Diabetes                Yes   0.300        $3,000    $3,000    $3,000    $3,000
  Subtotal                      0.950        $9,500    $9,500    $9,500    $9,500
Beneficiary 2
  70 yr old                     0.650        $6,500    $6,500    $6,500    $6,500
  Diabetes                Yes   0.300        $3,000    $3,000    $3,000    $3,000
  Subtotal                      0.950        $9,500    $9,500    $9,500    $9,500
Beneficiary 3
  75 yr old                     0.700        $7,000    $7,000    $7,000    $7,000
  Diabetes                Yes   0.300        $3,000    $3,000    $3,000    $3,000
  Subtotal                      1.000        $10,000   $10,000   $10,000   $10,000
Beneficiary 4
  80 yr old Dual                0.800        $8,000    $8,000    $8,000    $8,000
  Diabetes                Yes   0.300        $3,000    $0        $3,000    $0
  Subtotal                      1.100        $11,000   $8,000    $11,000   $8,000
Total                           1.000        $40,000   $37,000   $40,000   $37,000
Raw RADV Recovery                                      $3,000              $3,000
FFS Adjuster                                           $0                  $3,000
Final RADV Recovery                                    $3,000              $0
Final Payment to MAO                         $40,000   $37,000   $40,000   $40,000
Actuarially equivalent?                      Yes       No        Yes       Yes

When the CMS HCC model is normalized with unadjusted diagnoses, actuarial equivalence is maintained at initial payment and under a RADV audit with a FFS adjuster, not with a RADV audit without a FFS adjuster.
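The Appendix D arithmetic can be traced in a few lines of code. The sketch below is illustrative only: it uses the coefficients and dollar amounts shown in Appendix D, assumes a base rate of $10,000 per 1.000 of risk score (implied by the table, where 0.650 corresponds to $6,500), and treats the FFS adjuster as a simple dollar offset to the raw recovery, as the table does. The names `BASE_RATE`, `payment`, and `final_payment` are hypothetical.

```python
# Illustrative figures from Appendix D (model calibrated and normalized with
# unadjusted FFS diagnoses). BASE_RATE is an assumption implied by the table.
BASE_RATE = 10_000

# (demographic coefficient, diabetes coefficient, diabetes supported on audit?)
beneficiaries = [
    (0.650, 0.300, True),   # Beneficiary 1, 70 yr old
    (0.650, 0.300, True),   # Beneficiary 2, 70 yr old
    (0.700, 0.300, True),   # Beneficiary 3, 75 yr old
    (0.800, 0.300, False),  # Beneficiary 4, 80 yr old dual; diabetes removed by RADV
]


def payment(demo_coef, hcc_coef, keep_hcc):
    """Payment for one beneficiary, rounded to whole dollars."""
    score = demo_coef + (hcc_coef if keep_hcc else 0.0)
    return round(BASE_RATE * score)


before_radv = sum(payment(d, h, True) for d, h, _ in beneficiaries)
after_radv = sum(payment(d, h, ok) for d, h, ok in beneficiaries)
raw_recovery = before_radv - after_radv


def final_payment(ffs_adjuster):
    """Final payment to the MAO after netting the FFS adjuster against the raw recovery."""
    final_recovery = max(0, raw_recovery - ffs_adjuster)
    return before_radv - final_recovery


print(before_radv, after_radv, raw_recovery)
print(final_payment(0), final_payment(3_000))
```

With no adjuster the final payment falls below the $40,000 initial payment; with a $3,000 adjuster it stays at $40,000, matching the last two rows of the Appendix D table.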
Appendix E: CMS documents from Docket 44. United Price No. 1 Case Document 44-4 Filed 10/02/17 Page 3 of 7

Model Calibration Factor. The first issue is the extrapolation methodology that we're going to use in RADV. The approach that we laid out in our December guidance was pretty straightforward and we are not recommending making any significant changes - with one possible exception. Plans have raised the concern that we are holding them to a standard of perfection for diagnosis coding but that physician claims in FFS Medicare often include diagnoses that aren't supported in the medical record. So they argue that we have two different documentation standards, one for MA and one for FFS. And this wouldn't matter except that we use FFS claims data to develop our risk adjustors for Medicare Advantage. In risk adjustment model, we are estimating the average relative cost of any given condition given the people who are reported to have it. So when we estimate the relative cost of any given condition, we use diagnosis and cost data from FFS Medicare. So implicit in all of the adjustments we make to plans payments to account for the relative risk of their populations, are the factors that we developed using FFS data. If we include diagnoses for beneficiaries who don't actually have the disease, or for whom the medical record documentation is not clear, this tends to reduce the estimated average cost of various conditions and therefore our risk adjustment factors. So plans argue that we are paying them as if they are getting beneficiaries who look like rather than the higher average cost of the beneficiaries we are allowing to be claimed in MA under the RADV audits. To address this issue, we are proposing to develop a model calibration factor that estimates how much higher the plan's payment would be if our risk adjustment model had been built using perfect data. This factor would reduce the estimated RADV over-payments due from the plan. We think this approach makes sense and from a technical point of view is the right thing to do. It also will help bring the overpayments into a range that is more realistic for plans to be able to accommodate.

Appendix F: Statistical results from 50 simulations

[Chart A: FFS Adjusters using Coefficients Recalibrated with Various Error Rates and Simulated Audited FFS Data. Horizontal axis: iteration. Series: Full Dependence, 25%, 50%, 75%, Full Independence.]

[Chart: Impact on MA Risk Scores of Coefficients Recalibrated and Normalized Using Simulated Audited FFS Data. Horizontal axis: iteration. Series: Full Dependence, 25%, 50%, 75%, Full Independence.]

Table 5: FFS Distributional Statistics

Degree of Independence        0%      25%     50%     75%     100%
Mean FFS adjuster             21.3%   18.1%   14.9%   11.6%   8.2%
Median FFS adjuster           21.3%   18.1%   14.9%   11.6%   8.2%
Minimum FFS adjuster          21.1%   18.0%   14.7%   11.5%   8.1%
Maximum FFS adjuster          21.5%   18.3%   15.0%   11.7%   8.3%
25th Percentile               21.3%   18.1%   14.8%   11.5%   8.2%
75th Percentile               21.4%   18.2%   14.9%   11.6%   8.2%
Sample Standard Deviation     0.09%   0.07%   0.06%   0.04%   0.04%
Lower 99% Confidence Bound    21.3%   18.1%   14.9%   11.5%   8.2%
Upper 99% Confidence Bound    21.3%   18.2%   14.9%   11.6%   8.2%

wakely.com

Medicare RADV: Review of CMS Sampling and Extrapolation Methodology
July 2018

Prepared by: Wakely Consulting Group
Tim Murray, FSA, MAAA, Senior Consulting Actuary
Evan Morgan, ASA, MAAA, PhD, Consulting Actuary
Matt Sauter, ASA, MAAA, Consulting Actuary

Contents

Executive Summary
Introduction
Background on MA Risk Adjustment
RADV Overview
Evaluation of CMS RADV Methodology
    Goals
    Evaluation Approach
    Key Findings
Conclusion
Considerations and Limitations
Appendix A – Monte Carlo Simulation Background and Results
Appendix B – CMS RADV Methodology
    Sampling and Extrapolation Methodology

Executive Summary

The Centers for Medicare & Medicaid Services (CMS) conducts Medicare Advantage (MA) Risk Adjustment Data Validation (RADV) audits as part of its program integrity efforts. Per CMS, MA RADV audits evaluate whether the diagnosis codes submitted by Medicare Advantage Organizations (MAOs), which directly influence CMS payments to MAOs, can be validated by supporting medical record documentation.
In February of 2012 CMS published incomplete details of its RADV audit payment error calculation methodology, which comprises a sampling method and a payment error (penalty) extrapolation method. This report endeavors to provide an overview and technical evaluation of CMS methodologies, but not to provide a comprehensive evaluation of RADV audit process operations. The CMS RADV audit methodology seeks not only to measure the payment error rate of selected MA contracts, but also to retroactively adjust CMS payments downward in instances where the CMS-derived payment error is higher than a to-be-defined coding accuracy standard. It is not administratively feasible for CMS to review the universe of medical record documentation for RADV contracts. Therefore, CMS must rely on a sampling method to approximate the payment error rate. CMS has indicated its intent to extrapolate the observed sample payment error across the MA contract’s RADV-eligible population (oftentimes the large majority of the contract population). As a result, the CMS payment error extrapolation approach means that payment recoupment will affect not only revenue associated with MA beneficiaries whose medical records are audited, but also beneficiaries whose records are not audited. Our technical evaluation explored how well the CMS sampling approach approximates the true payment error rate. The payment error rate reflects the combined impact of the coding error rate (frequency of coding errors) and the magnitude (risk score value or severity) of coding errors. The coding error rate reflects both the percentage of unsubstantiated codes and the percentage of supported but not submitted codes. The magnitude (severity) of coding errors is driven by the risk score value of specific coding errors, which may vary widely based on the morbidity profile (mix of diagnoses) of each MAO contract. 
Given the CMS-stated intent to extrapolate sample payment errors to retrospectively recoup MAO payments, we evaluated potential drivers of bias and inequity in the payment error extrapolation calculation. Specifically, we evaluated whether contract attributes other than the coding error frequency (e.g. contract size, diagnostic profile, average risk score) could potentially drive inequitable penalties. We also evaluated the risk that contracts with the same average payment error rate may experience inequitable payment penalties.

In order to perform a technical evaluation of CMS’s RADV methodology, we simulated the RADV process on Medicare Limited Data Set (LDS) Standard Analytical Files. We used Monte Carlo simulation to replicate the CMS RADV process more than two million times on actual Medicare beneficiary claim and diagnosis data, varying MA contract sizes and assumed coding error rates. As detailed in Appendix A, Monte Carlo simulation is a commonly-used mathematical technique to measure the statistical characteristics of processes that involve random variables and random sampling.

Our detailed analysis yielded a number of key findings and observations, summarized briefly below:

• The extrapolated payment error calculation is subject to a high degree of randomness - the sampling methodology utilized by CMS has the potential to accurately estimate the contract-specific payment error rate, but payment error extrapolation allows for MA contracts with identical coding error rates to pay vastly different penalties.

• The payment error calculation is very sensitive to small variations in diagnostic mix of the sample population - MA contracts with varying risk profiles by disease state may be subject to materially variant extrapolated penalties. We illustrate via simulation that even a single coding error in the randomly chosen RADV sample may drastically impact the RADV payment error penalty result.

• The CMS methodology tends to levy disproportionate payment error penalties on higher enrollment contracts and low absolute risk score contracts.

• CMS’s MA RADV approach gives no consideration to diagnosis-specific substantiation rates – an MA contract may have a high prevalence of hard-to-substantiate diagnosis codes and therefore a high expected coding error rate. CMS does not adjust its penalty calculation to account for this dynamic despite publicly acknowledging the potential for such diagnosis-specific variation in several recent regulatory publications.

• Extrapolated payment penalties have the potential to be materially larger than the true payment error rate, a problematic situation even if occurring with very low probability. An extrapolated payment error rate that is larger than the true payment error rate, when extrapolated over an MAO’s RADV-eligible population, would expose MAOs to significant financial risk based not on MAO coding accuracy but rather on the volatility of the CMS RADV payment error calculation methodology.

• CMS has yet to release information on the magnitude or derivation methodology of its FFS Adjuster offset to RADV payment penalties. The FFS Adjuster is intended to account for the fact that the documentation standard used to develop the MA risk adjustment model is inconsistent with the documentation standard used in RADV audits. Since CMS has yet to release details on its derivation, our technical analyses exclude consideration of the FFS Adjuster.

In summary, while our simulation work indicates that the CMS RADV sampling approach has the potential to accurately approximate the payment error rate of a contract, the payment error extrapolation approach exposes MA contracts to materially inequitable treatment based on characteristics independent of coding accuracy. The method is also exposed to the risk of unintended and problematic consequences such as payment penalties larger than actual payment error rates, albeit with low probability. Such sources of bias and inequity exist independent from the to-be-defined FFS Adjuster. Although the FFS Adjuster would directionally mitigate the RADV financial risk exposure to MA contracts, as currently contemplated it would not lessen the bias and inequity evident in CMS’s extrapolation approach.

We detail our technical evaluation methodologies and findings in subsequent sections of this report.

Introduction

The Medicare Advantage (MA) program has long relied on risk adjustment to ensure that payments to Medicare Advantage Organizations (MAO) reflect the relative health care risk of MAO beneficiaries. A well-designed risk adjustment system facilitates the alignment of plan payments with expected medical claims costs and therefore helps to support equity for Medicare beneficiaries in seeking coverage. The Centers for Medicare & Medicaid Services (CMS) relies on MAOs to regularly submit beneficiary diagnosis data to substantiate the health care risk of beneficiaries. CMS reserves the right to audit such substantiation to ensure payment accuracy.

In 2012 CMS proposed a Risk Adjustment Data Validation (RADV) methodology that comprised a beneficiary sampling approach as well as a payment error calculation that involves extrapolation beyond the sampled audit population. The financial stakes of such an approach are significant since payment errors observed on a subset of MA beneficiaries could be used to levy financial penalties across a much larger population of MA beneficiaries.
In this report we provide a brief background on MA risk adjustment, an overview of the CMS 2012 published RADV methodology, and an evaluation of the risks associated with the methodology. As detailed in this report, we found that the CMS sampling approach has the potential to approximate actual coding error rates, but the payment error extrapolation approach may expose MA contracts to materially inequitable treatment based on characteristics independent of coding accuracy.

America’s Health Insurance Plans (AHIP) engaged Wakely Consulting Group (Wakely) to perform a technical evaluation of CMS’s RADV payment error calculation methodology. This report summarizes the approach and key findings of the technical evaluation performed by Wakely.

Background on MA Risk Adjustment

MAOs receive monthly capitated payments from CMS that are adjusted to reflect the health care risk of enrolled beneficiaries. CMS uses a prospective risk adjustment system whereby MAOs submit health care diagnosis data to substantiate the risk profile of enrolled members. MAO-submitted diagnoses directly influence the risk scores assigned to MAO enrolled members, and in turn directly influence monthly CMS payments to MAOs.

CMS calibrates the MA risk score model by correlating categories of diagnosis codes called Hierarchical Condition Categories (HCCs) with expected health care costs. Using a regression model that correlates demographic factors and HCCs to expected claims costs, each HCC is assigned a risk score value or “coefficient.” If a beneficiary is identified via diagnosis code as being afflicted with a particular condition, the applicable HCC is triggered, which may increase the risk score, and therefore the CMS payment, assigned to the beneficiary. Demographic characteristics, Medicaid eligibility status, and comorbidities among HCCs are among the numerous characteristics that the CMS risk adjustment model endeavors to account for in its payments to MAOs.
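The additive structure described above can be illustrated with a short sketch. All coefficients, age bands, and HCC labels below are hypothetical placeholders rather than CMS values, and the $850 base rate is borrowed from the standardized PMPM payment this report assumes in its simulations. Interactions, hierarchies, and Medicaid status are deliberately omitted.

```python
# Hypothetical coefficients for illustration only; these are not CMS values.
DEMOGRAPHIC_COEFFICIENTS = {"70-74": 0.40, "75-79": 0.50}
HCC_COEFFICIENTS = {"HCC_A": 0.30, "HCC_B": 0.25}
BASE_RATE_PMPM = 850.0  # assumed standardized (1.0 risk score) payment


def risk_score(age_band, hccs):
    """Demographic coefficient plus the coefficient of each distinct triggered HCC."""
    return DEMOGRAPHIC_COEFFICIENTS[age_band] + sum(
        HCC_COEFFICIENTS[h] for h in set(hccs)
    )


def monthly_payment(age_band, hccs):
    """Capitated payment scales linearly with the risk score."""
    return BASE_RATE_PMPM * risk_score(age_band, hccs)


# Adding a diagnosis that triggers an HCC raises both the score and the payment.
print(risk_score("70-74", []), risk_score("70-74", ["HCC_A", "HCC_B"]))
print(monthly_payment("70-74", []), monthly_payment("70-74", ["HCC_A"]))
```

Because the score is a sum of coefficients, removing a single unsupported HCC on audit lowers the payment by that HCC's coefficient times the base rate, which is why the value of an error depends on which diagnosis is involved.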
For perspective, a beneficiary with an average health care risk profile is assigned a risk score of 1.0, whereas a beneficiary with a risk score of 2.0 would be expected to have costs twice as high as the average beneficiary. An optimal risk adjustment program attempts to correlate beneficiary-specific funding with beneficiary-specific risk. This helps to ensure that financial resources are appropriately directed to plans that enroll complex, chronically ill, and costly beneficiaries.

Complete and accurate documentation of beneficiary diagnoses is an important component of the MA risk adjustment system. MAOs and CMS both invest resources to ensure that diagnosis codes submitted to CMS are accurate and appropriately documented, which supports the goals of payment accuracy and optimizing clinical care management.

RADV Overview

CMS conducts RADV audits on MA contracts as part of its program integrity efforts. Per CMS, MA RADV audits evaluate “whether the diagnosis codes submitted by MAOs can be validated by supporting medical record documentation.”¹

In February of 2012 CMS published incomplete details of its RADV payment error calculation methodology, which comprises a member sampling method and a payment error extrapolation method.¹ CMS indicated that the extrapolation method would apply for the first time to RADV contract-level audits conducted on payment year 2011. Since publishing the methodology in early 2012, CMS has also conducted RADV audits on 2012 and 2013 payment years but has not yet released complete details on the payment error penalty calculation. Notably, the component of the methodology yet to be determined is a Fee-for-Service (FFS) Adjuster that is intended to account for the fact that the documentation standard used to develop the MA risk adjustment model is inconsistent with the documentation standard used in RADV audits.
Refer to Appendix B for a more comprehensive summary of CMS’s RADV methodology. Below we summarize the key elements.

1. CMS selects a set of approximately thirty (30) MA contracts for each RADV audit cycle (calendar year).
2. Within each selected contract members are flagged as “RADV-eligible” by satisfying a number of specific criteria, including the requirement that the member has at least one diagnosis code that resulted in the assignment of an HCC for the payment year.
3. CMS orders the RADV-eligible contract population by payment year risk score and divides the population into three equal-size strata (equal in terms of the number of beneficiaries in each stratum).
4. CMS randomly selects sixty-seven (67) enrollees from each of the three strata.
5. MAOs submit detailed medical records to support the HCCs represented in the RADV sample selected.
6. Based on the medical record documentation submitted, CMS calculates a RADV-corrected risk score and corrected payment amount.
7. CMS derives the MA contract-level payment error (penalty) by extrapolating the sample observed payment error to the entire RADV-eligible population – the extrapolation uses the lower bound of a ninety-nine (99) percent confidence interval around the estimated sample payment error.
8. The payment error is constrained to zero (0) if negative, meaning that CMS intends to recoup initial net overpayments, but does not intend to correct initial net underpayments.
9. CMS has indicated the intent to reduce any positive (non-zero) payment penalties by a to-be-defined FFS Adjuster.

¹ Notice of Final Payment Error Calculation Methodology for Part C Medicare Advantage Risk Adjustment Data Validation Contract-Level Audits. Available at: https://www.cms.gov/Research-Statistics-Data-and-Systems/Monitoring-Programs/recovery-audit-program-parts-c-and-d/Other-Content-Types/RADV-Docs/RADV-Methodology.pdf
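Steps 3, 4, 7, and 8 above can be sketched in code. The sketch below is a simplified illustration and not the CMS estimator: it assumes each member's audited payment error is already known, uses a standard stratified expansion estimator, ignores any finite population correction, and approximates the 99 percent confidence bound with a normal critical value of 2.576. The function and constant names are hypothetical, and the FFS Adjuster in step 9 is omitted because it has not been defined.

```python
import random
import statistics

N_STRATA = 3
SAMPLE_PER_STRATUM = 67
Z_99_TWO_SIDED = 2.576  # assumption: normal approximation to the 99% interval


def stratify(population):
    """Order members by risk score and split into three near-equal strata.

    population: list of (risk_score, payment_error) tuples for RADV-eligible members.
    """
    ordered = sorted(population, key=lambda member: member[0])
    size = len(ordered) // N_STRATA
    return [ordered[:size], ordered[size:2 * size], ordered[2 * size:]]


def radv_penalty(population, rng):
    """Extrapolated penalty: lower 99% bound of the estimated total error, floored at zero."""
    point_estimate = 0.0
    variance = 0.0
    for stratum in stratify(population):
        sample = rng.sample(stratum, SAMPLE_PER_STRATUM)
        errors = [member[1] for member in sample]
        stratum_size = len(stratum)
        # Expand the sample mean error to the whole stratum.
        point_estimate += stratum_size * statistics.mean(errors)
        variance += stratum_size ** 2 * statistics.variance(errors) / SAMPLE_PER_STRATUM
    lower_bound = point_estimate - Z_99_TWO_SIDED * variance ** 0.5
    return max(0.0, lower_bound)  # step 8: net underpayments are not corrected
```

Running `radv_penalty` repeatedly on the same population with different random seeds is a quick way to see how much the penalty can move when only the sampled members change.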
Payment penalties are derived based on the lower bound of a ninety-nine (99) percent confidence interval around the observed sample payment error. While this approach materially reduces the derived payment penalty as compared to the observed sample payment error, it also introduces significant volatility into the calculation that, as we demonstrate via simulation, may potentially result in inequitable treatment of RADV audited contracts. Other aspects of the extrapolation methodology lead to results that are sensitive to sampling and subject to a high degree of randomness, which indicate additional, potentially problematic consequences of the CMS approach, as described further below.

Evaluation of CMS RADV Methodology

Goals

The principal goal of our technical evaluation of CMS’s RADV methodology was to answer, through simulation, a few key questions:

1. Does the CMS sampling approach accurately estimate the simulated payment error rate of an MA contract?
2. Do contract attributes other than coding error rates influence the payment error calculation in a manner that drives potential inequitable treatment of contracts? Examples of such attributes:
   a. Contract size – number of enrollees
   b. Diagnostic mix (HCC profile) of MA contract population
   c. Diagnostic mix (HCC profile) of sample population
   d. Average risk score of the MA contract population
   e. Average risk score of the sample population
3. Is the CMS extrapolation approach exposed to the risk of unintended consequences, such as payment penalties that are larger than the actual payment error rate?

Evaluation Approach

In order to perform a data-driven evaluation of CMS’s RADV sampling and extrapolation methodologies, we utilized the 2013 Medicare Limited Data Set (LDS) Standard Analytical Files and CMS HCC model coefficients² used to risk adjust 2013 MA payments.
The LDS data comprises diagnostic information for approximately eight hundred thousand (800,000) Medicare beneficiaries that account for over two million HCCs. The publicly-available LDS data is distinctive in that it represents the HCC prevalence and mix of the Medicare-eligible population. We utilized the 2013 LDS data and MA payment year 2013 HCC coefficients to ensure alignment of payment year with risk adjustment model year.

We replicated the CMS RADV sampling and payment error extrapolation procedures on the LDS data using Monte Carlo simulation to first derive mock MA contracts of varying size (by enrollment), and then to mimic the CMS RADV methodology on such randomly generated contracts. Monte Carlo simulation, sometimes known as probability simulation, is a mathematical technique that uses random sampling to model the probability of different outcomes in a process. Monte Carlo simulation is a particularly valuable approach when applied to measuring uncertainty in processes that are impacted by random variables. This is an appropriate method for simulating the Medicare Advantage RADV process since there are several random variables involved, most notably: which two hundred and one RADV-eligible contract beneficiaries are randomly sampled, the diagnostic profile (HCCs) of the randomly selected beneficiaries in the sample, and the contract-specific coding error rates.

We explored various scenarios of contract size and assumed coding error rates. For each combination of contract size and assumed coding error rate considered, we first randomly selected members to constitute a simulated MA contract. We then simulated the RADV audit process one hundred thousand (100,000) times and derived summary statistics on the simulated RADV payment error penalties. Please refer to Appendix A for a more detailed description of the step-by-step Monte Carlo simulation approach utilized in our technical evaluation.

² To simplify the analysis, disease interactions/comorbidities and disabled status/disease interactions were excluded from all risk scores and coding events. Additionally, lower-severity conditions that are “trumped” by more severe manifestations of the same disease hierarchy are excluded from the analysis. For example, a member with metastatic lung cancer will get scored for metastatic cancer (HCC8) and not lung cancer (HCC9) due to the cancer acuity hierarchy within CMS’s model. If, upon RADV audit review, the metastatic cancer diagnosis is found to be unsubstantiated, it is possible that the member would be re-scored with the lower acuity lung cancer diagnosis (vs. removing the cancer diagnosis completely).

As CMS has yet to publish details on the FFS Adjuster (magnitude or derivation methodology), our scenarios do not assume a FFS Adjuster penalty offset. Note also that Wakely did not evaluate CMS’s approach to selecting contracts for RADV audits – to our knowledge the methodology is not public. Finally, note that Wakely did not perform a comprehensive evaluation of CMS’s operational approach to conducting RADV audits, including the required documentation and administrative procedures. Instead our evaluation focuses on the statistical elements of CMS’s approach and the uncertainty inherent therein.

Note that CMS applies its payment error calculation to MAO contract-specific payments. Since our approach involved “simulated” MA contracts, we assume an average “standardized” (1.0 risk score) payment of $850 per member per month (PMPM) for purposes of dollarizing illustrative payment penalties throughout this report.

Key Findings

Our RADV simulation work yielded several key findings which are outlined below.
A consistent theme in our findings is that CMS’s RADV payment error extrapolation approach is prone to risk of inequitable treatment of contracts that vary in enrollment size, HCC mix, and absolute risk score. Refer to Appendix A for a broad range of scenarios tested and the statistical characteristics of resulting payment penalties.

Payment Error Penalties Are Subject to a High Degree of Randomness

CMS’s sampling approach has the potential to accurately approximate the simulated payment error rate, and typically does so with a low variance. Despite the CMS approach’s capability of approximating the payment error rate, the extrapolated RADV payment penalties are subject to a high degree of randomness. Small samples (201), combined with the fact that coding errors are somewhat rare, contribute to erratic penalty results even at the same assumed true coding error rate. The observed randomness in payment penalties remains despite CMS’s stratification approach, which contributes, albeit insufficiently, to more stable results as compared to an unstratified sampling methodology.

For a tangible example, please see Table 1 below (duplicated in Appendix A). For three different contract sizes tested, we randomly sampled two hundred and one (201) beneficiaries one hundred thousand (100,000) times (three hundred thousand (300,000) scenarios tested in total for this “batch” of samples). For this set we assumed that ten (10) percent of HCCs were unsupported (coded when they should not have been) and six (6) percent of HCCs were supported but not reported (not coded when they should have been).
Note that the scenario parameters are loosely based on a Fiscal Year (FY) 2016 Department of Health and Human Services (HHS) Agency Financial Report.³ This report reflects an average gross payment error rate of approximately ten (10) percent and an average net payment error rate of approximately four (4) percent that takes into account supported but not reported codes. The HHS report reflects data through June of 2015, and implies that supported but not reported coding errors represent a material offset to reported but unsupported coding errors.

As an illustrative example, if we assume an average RADV sample beneficiary risk score of 1.04 and a standardized bid of $850, and we assume that the net payment error rate is four (4) percent, we would expect the ‘true’ risk score to be 1.0 and the average payment penalty PMPM to be:

($850 PMPM * 1.04) – ($850 PMPM * 1.0) = $34 PMPM

Table 1: Monte Carlo Simulation Assuming 10% HCCs Unsupported, 6% Supported But Not Reported

Contract        PMPM Average   PMPM Average     PMPM Payment   PMPM      PMPM      %           % High
RADV-Eligible   Sample         Sample Payment   Error Average  Min       Max       Penalties   Penalties
Enrollment      Payment Error  Error Variance   Penalty        Penalty   Penalty   >0
1,000           $33.55         $0.26            $2.26          $0.00     $46.80    26.8%       0.102%
10,000          $33.58         $0.33            $2.60          $0.00     $67.07    27.6%       0.238%
100,000         $32.43         $0.33            $2.66          $0.00     $57.98    27.6%       0.326%

³ Department of Health and Human Services, Fiscal Year (FY) 2016 Department of Health and Human Services (HHS) Agency Financial Report. Available online at: https://www.hhs.gov/sites/default/files/fy2016-hhs-agency-financial-report.pdf

For reference, we define the column headings from Table 1 below (repeated in Appendix A):

Table 1 Column Heading: Definition
Contract RADV-eligible Enrollment: Number of beneficiaries in the simulated MA contract – each row in the tables represents a unique set of scenarios
PMPM Average Sample Payment Error: The per member per month (PMPM) average value of the simulated sample coding errors
PMPM Average Sample Payment Error Variance: The PMPM value of the variance of sample payment errors
PMPM Payment Error Average Penalty: The average PMPM value of the extrapolated payment penalties
PMPM Min Penalty: The minimum extrapolated penalty among all RADV simulations within the scenario set
PMPM Max Penalty: The maximum extrapolated penalty among all RADV simulations within the scenario set
% Penalties > 0: The percentage of RADV scenarios in which the payment penalty is greater than $0
% High Penalties: The percentage of RADV non-zero penalty scenarios in which the extrapolated penalty is larger than the true payment error rate

For each of the simulated contract sizes, we pulled one hundred thousand (100,000) random RADV samples, replicated the RADV payment penalty calculation, and derived a number of summary statistics. For a particular contract size, we derived the sample average payment error, the variance in payment penalty, the minimum penalty, the maximum penalty, the percentage of scenarios for which a nonzero penalty was generated, and the percentage of scenarios for which the penalty was larger than the assumed true payment error rate (in this case approximately 4 percent net error rate). As detailed in Appendix A, we modeled a wide array of true error rate simulations, but our key findings generally hold true across all scenarios tested.
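The summary statistics defined for Table 1 can be expressed as a small function. This is a sketch under stated assumptions, not the code used for the simulations: it takes per-simulation sample payment errors and extrapolated penalties (both PMPM) as plain lists, follows the column definitions above, and computes the variance column from the sample payment errors. The function name `summarize` and the dictionary keys are hypothetical.

```python
import statistics


def summarize(sample_errors, penalties, true_error):
    """Summary statistics over one scenario set of simulated RADV audits.

    sample_errors: PMPM sample payment error from each simulated audit
    penalties:     PMPM extrapolated penalty from each simulated audit
    true_error:    assumed true PMPM payment error for the contract
    """
    nonzero = [p for p in penalties if p > 0]
    high = [p for p in nonzero if p > true_error]
    return {
        "avg_sample_error": statistics.mean(sample_errors),
        "sample_error_variance": statistics.variance(sample_errors),
        "avg_penalty": statistics.mean(penalties),
        "min_penalty": min(penalties),
        "max_penalty": max(penalties),
        "pct_penalties_gt_0": len(nonzero) / len(penalties),
        # Per the column definition, the denominator is the non-zero penalty scenarios.
        "pct_high_penalties": len(high) / len(nonzero) if nonzero else 0.0,
    }


# Toy inputs for illustration only; these are not simulation results.
print(summarize([30.0, 34.0, 38.0], [0.0, 0.0, 10.0, 40.0], true_error=34.0))
```

Keeping the share of non-zero penalties separate from the share of high penalties makes it possible to distinguish how often a contract is penalized at all from how often the penalty overshoots the assumed true error.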
The sampling approach has the potential to accurately approximate the value of the expected payment error – based on a ten (10) percent unsupported coding error rate and a six (6) percent supported but not reported coding error rate, we would expect approximately a $34 PMPM error (0.04 risk score value), with very low variance. However, the payment penalties do yield a wide range of outcomes when simulating the RADV process on the same contract – ranging from no penalty ($0) to penalties that are approximately double ($67.07 PMPM) the average sample payment error ($33.58 PMPM). Such variation in payment penalty for randomly chosen RADV samples drawn from the same contract is obviously problematic. Our simulation exercise illustrates that if CMS runs the RADV process on the same contract twice, the resulting payment penalties may vary significantly. The instability in simulated payment penalty results suggests that the RADV extrapolation process cannot reliably and equitably align payment penalties with actual payment error rates.

Payment Penalties May be Higher than the True Coding Error Rate

We emphasize that payment penalties derived by CMS’s methodology have the potential to be materially higher than the true payment error rate, albeit with low probability. For example, refer to Table 1 above: the maximum penalty observed in the ten thousand (10,000) contract enrollment scenarios was $67.07 PMPM, nearly double the $33.58 PMPM average sample payment error observed over one hundred thousand (100,000) samples. Note that we define the “High penalty” in Table 1 as a penalty that is higher than the modeled true error rate of the contract.
The nonzero probability of such a penalty that is higher than the true payment error is problematic, and could lead to a situation where RADV-audited contracts forfeit material funding due to randomness in CMS’s sampling methodology, not due to coding accuracy.

CMS Methodology May Inequitably Penalize Contracts as Enrollment Increases

As previously noted, we define “High penalty” cases as those for which the CMS-derived penalty is greater than the true error rate. As can be seen in Table 1 and across the multitude of scenario “sets” tested in Appendix A, larger contract sizes are generally penalized by greater randomness in penalties. This is particularly evident when looking at the percentage of scenarios that yield a “High penalty” – a metric that generally increases with contract size based on our simulation work. Notice in Table 1 that both the average PMPM payment error penalty and the percentage of “High penalties” increase with contract enrollment. RADV sample size does not vary by contract size (for contracts with at least one thousand RADV-eligible beneficiaries). If two contracts of very different sizes have identically distributed errors over the entire population, there will tend to be more variance (i.e., more randomness) in the penalties drawn from the larger contract due to the fixed sample size.

Payment Error Penalties Are Sensitive to Small Variations in Sample Population HCCs

Our simulation work affirmed what we believe to be an intuitive observation: the HCC profile of the randomly selected RADV sample may drive significant variation in the payment penalty. In other words, the diagnoses/HCCs that the randomly selected RADV sample members happen to have may drive material variation in payment penalty. The CMS methodology randomly selects RADV-eligible beneficiaries, and each beneficiary may have a significantly different mix of HCCs.
Below is an example of how the inclusion or exclusion of a single HCC error from the sampled RADV population drives a significant change in payment penalty. Using one of the actual RADV samples simulated in our Monte Carlo work, we measured the sensitivity of the RADV payment error penalty to the inclusion of a single additional HCC coding error.

Figure 1: Sensitivity of RADV Penalty to Single Coding Errors

RADV Scenario Parameters:
• RADV Sample Members: 201
• Contract Enrollment Size (RADV-eligible): 100,000
• Simulated probability of unsupported coding error: 10%
• Simulated probability of supported but not reported coding error: 6%
• Simulated net coding error rate: 4%
• Assumed MA Standardized Bid (1.0 Risk Score) PMPM: $850.00

[Figure 1: bar chart of the RADV Extrapolated Payment Penalty ($million). RADV Sample Scenario: $25.7m. Add Single Unsupported Code to RADV Sample (Metastatic Cancer): $25.7m plus $4.3m.]

As illustrated above, the sensitivity of CMS’s payment penalty calculation to a single HCC error is problematic in that random chance could drive material swings in extrapolated payment penalties. For this particular example, adding a single unsupported coding error (metastatic cancer) to the RADV sample results in a 16.7 percent ($4.3 million) increase in the RADV penalty. Random chance associated with a single high acuity condition being present in a RADV audit sample exposes MA contracts to financial penalties that vary based not just on coding accuracy, but also on the “luck of the draw” in the randomly selected sample population.
CMS Methodology May Treat Two Contracts with Identical Payment Error Rates Differently The CMS approach of deriving the RADV payment penalty from the lower bound of a confidence interval may drive inequitable treatment of contracts with identical average error rates. As validated by our scenario testing, two contracts with virtually identical payment error rates may be subject to vastly different penalties. More specifically, for two issuers with the same payment error rate but with different observed error variances, the issuer with the greater observed error variance (i.e. more volatility) will face the lower penalty. Below we illustrate an example of two simulated RADV scenarios (actual scenarios from our simulation work) that reflect the same simulated true payment error rate ($32.41 PMPM), the same observed sample average error rate ($75.32 PMPM), but materially different RADV penalties. The disparity in penalties is driven by the difference in RADV sample error variance between the two scenarios. Note that these examples also illustrate that the random chance element of the RADV sampling process may yield a sample payment error ($75.32) materially larger than the true payment error ($32.41). For the specific examples summarized in Figure 2, scenario 99914 reflects a standard error (square root of variance) PMPM value of $24.46, whereas the standard error for scenario 22046 is $13.36 PMPM. This means that the RADV sample drawn for scenario 99914 reflects more volatility, or a larger variation from the sample average error rate, as compared to the sample drawn for scenario 22046. Since the CMS RADV payment penalty reflects the lower bound of the ninety-nine (99) percent confidence interval around the observed sample error, higher variation (higher standard error) drives a lower payment penalty. 
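The arithmetic behind this comparison can be checked with a short sketch. It assumes only what the report states: the penalty is the lower bound of the ninety-nine (99) percent confidence interval, computed with the 2.575 multiplier described in Appendix B and floored at zero. The function name `radv_penalty_pmpm` is a hypothetical helper introduced for illustration, not part of any CMS or Wakely tooling.

```python
# Illustrative sketch: recompute the two penalties from the values quoted in the text.
Z_99 = 2.575  # multiplier for the 99 percent confidence interval (Appendix B)


def radv_penalty_pmpm(observed_error: float, standard_error: float) -> float:
    """Lower bound of the 99% CI around the observed error, floored at zero."""
    return max(0.0, observed_error - Z_99 * standard_error)


observed = 75.32  # observed sample average error, PMPM (same in both scenarios)

penalty_22046 = radv_penalty_pmpm(observed, 13.36)  # lower standard error
penalty_99914 = radv_penalty_pmpm(observed, 24.46)  # higher standard error

print(f"Scenario 22046 penalty: ${penalty_22046:.2f} PMPM")
print(f"Scenario 99914 penalty: ${penalty_99914:.2f} PMPM")
```

The results agree with the $40.92 and $12.33 PMPM penalties shown in Figure 2 to within a cent of rounding, which confirms that the gap between the two penalties comes entirely from the difference in standard error.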
Figure 2: Variation in RADV Penalties for Contract with Identical Average Errors

RADV Scenario Parameters:
• RADV Sample Members: 201
• Contract Enrollment Size (RADV-eligible): 100,000
• Simulated probability of unsupported coding error: 10%
• Simulated probability of supported but not reported coding error: 6%
• Simulated net coding error rate: 4%
• Assumed MA Standardized Bid (1.0 Risk Score) PMPM: $850.00

[Figure 2: bar chart of RADV PMPM Payment Penalty by scenario. Values shown:]

| Measure (PMPM) | Scenario 22046 | Scenario 99914 |
|---|---|---|
| Simulated True Payment Error Rate | $32.41 | $32.41 |
| Observed Average Error | $75.32 | $75.32 |
| Standard Error (Square Root of Variance) | $13.36 | $24.46 |
| RADV Penalty | $40.92 | $12.33 |

CMS Methodology May Inequitably Penalize Low Risk Score Contracts

We note from our simulation work that the CMS RADV methodology may drive disproportionate and potentially inaccurate penalization of low risk score contracts. If two issuers have the same true payment error per 1.0 risk score value4 but materially different absolute average risk scores, then the contract with the higher risk score will likely have greater variance in its errors, and therefore a lower penalty per 1.0 risk score value, than the contract with the lower risk score. This inequitable treatment of contracts by absolute risk score is related to the confidence interval approach that CMS uses to calculate payment penalties. Since CMS uses the lower bound of the confidence interval to define the payment penalty, greater variance in the observed payment errors drives a lower penalty result.

4 For example, a five (5) percent error rate per 1.0 of risk score would mean that a contract with an average risk score of 2.0 would have an expected risk score value error of 0.1.

Other Comments on CMS Methodology

The CMS Methodology does not consider HCC-specific substantiation rates. CMS’s RADV process randomly selects beneficiaries and each selected beneficiary is assigned the same weight in the extrapolation penalty calculation. However, each sampled beneficiary has a unique HCC profile, or “mix” of HCCs. Further, each HCC has its own substantiation success rate within the industry, as some HCCs are harder to substantiate than others. Therefore, depending on the HCC profile of the contract population, as well as the RADV sample population, there may be material variation in “expected” coding error rates. CMS acknowledges its understanding that error rates may vary by HCC, as illustrated in several recent publications:

• Proposed 2019 Notice of Benefit and Payment Parameters5 (page 51074) – “HHS could also evaluate error rates within each HCC, or groups of HCCs, then only apply error rates to outlier issuers’ risk scores within each HCC or group of HCCs.”

• Final 2019 Notice of Benefit and Payment Parameters6 (page 16961) – “Our simulations of failure rates by HCC group suggest that such an approach yields a more equitable measure to evaluate statistically different HCC failure rates affecting an issuer’s error rate than an approach based on an overall failure rate, which may overly adjust issuers with abnormal distributions of certain HCCs due to their underlying populations rather than differences due to errors in diagnoses codes.”

• December 2015 Statement of Work for RADV Recovery Audit Contractors7 – “Condition Specific RADV Audits will be conducted for a subset of MA contracts not subject to a Comprehensive Audit for any given payment year.
The focus of Condition Specific Audits will be a set(s) of HCCs determined to have a higher probability of being erroneous, for example, it may be decided that the hierarchy of HCCs relating to ‘diabetes’ should be the subject of this targeted review.”

The MA RADV audit standard against which coding error rates are measured is not reflective of varying expected error rates by HCC. This is problematic for a few reasons. First, not considering HCC-specific substantiation rates virtually guarantees inequitable treatment of two different contracts selected for RADV audit. Both contracts will be held to the same coding standard despite the fact that varying HCC profiles would drive varying coding error rate expectations. Second, even within the same contract, randomly selected samples of beneficiaries will yield varying HCC profiles, which should be measured against HCC-specific substantiation standards. Therefore, two different samples from the same contract could yield materially different payment penalties as a result of varying HCC profiles of two randomly selected samples. As demonstrated in Figure 1, even a single coding error from a RADV sample can drive material variance in payment penalties.

5 Patient Protection and Affordable Care Act; HHS Proposed Notice of Benefit and Payment Parameters for 2019. Available online at: https://www.gpo.gov/fdsys/pkg/FR-2017-11-02/pdf/2017-23599.pdf

6 Patient Protection and Affordable Care Act; HHS Final Notice of Benefit and Payment Parameters for 2019. Available online at: https://www.gpo.gov/fdsys/pkg/FR-2018-04-17/pdf/2018-07355.pdf

7 The Medicare Advantage (MA) Risk Adjustment Data Validation (RADV) Recovery Audit Contractor (RAC) Request for Information. Available online at: https://www.fbo.gov/utils/view?id=e50f5bb5f02c9fc7d9815f163f0941a4.
Therefore, widely varying HCC profiles of two randomly selected samples may drive material variance in payment penalties based not on coding accuracy but rather on HCC mix. CMS has not yet released details on its FFS Adjuster. As noted by CMS in its February 2012 methodology release, CMS calibrates the MA risk score model using Medicare FFS claims experience. CMS acknowledges that the coding documentation standard used in RADV audits is different from the coding documentation standard used to develop the MA risk adjustment model. The FFS adjuster is intended to account for this disconnect, since it would be mathematically inconsistent to hold MAOs to a stricter coding standard than that used to develop the MA risk adjustment model.8 The details of the FFS Adjuster, its magnitude and derivation methodology, have not yet been released. Therefore, an evaluation of the FFS Adjuster is not possible at this point. CMS excludes members with zero HCCs (“zero HCC” or “non-HCC”) from the RADV-eligible contract population, which biases the sample payment error rate upwards. Any supported but not reported codes on the population that is not RADV-eligible are systematically ignored in the CMS approach. A Fiscal Year (FY) 2016 Department of Health and Human Services (HHS) Agency Financial Report reflects an average gross payment error rate of approximately ten (10) percent and an average net payment error rate of approximately four (4) percent that takes into account supported but not reported codes. The HHS report reflects data through June of 2015, and implies that supported but not reported coding errors represent a material offset to unsupported coding errors. Excluding non-HCC members from the RADV audit samples biases the observed payment error by removing potential supported but not reported codes for non-HCC members. This makes the expected observed payment error rate higher than the true payment error rate over the entire contract (RADV-eligible plus non-eligible). 
Supported but not reported codes for non-HCC beneficiaries are completely unaddressed in the CMS RADV methodology since these beneficiaries are excluded from the population from which members are sampled.

8 American Academy of Actuaries. “Re: Comment on RADV Sampling and Error Calculation Methodology.” Received by Cheri Rice, 21 January 2011. Available online at: https://www.actuary.org/pdf/health/RADV_comment_letter_012111_final.pdf

CMS operational processes may exclude supported but not reported coding errors from the calculation of the payment error. While we do not endeavor to provide a comprehensive review of RADV audit operational design, we do note that key operational parameters could serve to influence penalty calculation outcomes. One of these operational parameters involves medical chart submissions. For each HCC that CMS attempts to validate via RADV audit, MAOs are permitted to submit up to five charts that (potentially) substantiate the HCC in question. The RADV auditor identifies the first chart that substantiates the HCC, and then codes every diagnosis on that particular chart. The first substantiating chart may uncover supported but not reported codes which would be accounted for in CMS’s approach, but supported but not reported codes from the remaining charts would not be uncovered. If none of the charts substantiate the HCC, then it is not clear that un-submitted diagnoses from any of the five charts would be coded, which would eliminate any chance to uncover supported but not reported conditions. Instead of ensuring a comprehensive and accurate picture of a selected beneficiary’s health care status, the operational limitations of the RADV audit process potentially restrict CMS from gaining a complete picture of coding errors (both unsupported codes and supported but not reported codes).
Therefore, the potential exclusion of some medical charts from the RADV audit process biases the observed payment error upwards when compared to the true payment error. Such potential overstatement of the true payment error could inflate the observed sample payment error and therefore erroneously inflate extrapolated payment penalties.

Conclusion

CMS’s payment error extrapolation approach exposes MA contracts to materially inequitable treatment based on characteristics independent of coding accuracy. While the random sampling approach has the potential to accurately approximate payment error rates, the design of the payment error extrapolation calculation introduces the risk of unintended and problematic consequences such as larger payment penalties for contracts with low variance in coding errors. The inherent randomness in the HCC profile of a RADV sample, as well as CMS-acknowledged variations in HCC-specific substantiation rates, further contribute to the potential for inequitable treatment by contract. Such sources of bias and inequity exist independently from the to-be-defined FFS Adjuster. Although the FFS Adjuster would mitigate the absolute value of financial risk exposure to MA contracts, there is currently no evidence to suggest that it would lessen the bias and potential inequity evident in CMS’s extrapolation approach.

Considerations and Limitations

Wakely was commissioned by America’s Health Insurance Plans (AHIP) to perform a technical evaluation of the CMS RADV methodology. The report should be considered in its entirety. The report represents a technical evaluation and does not represent support for any particular policy or changes thereof. We do not intend this information to benefit any third party nor create a reliance by any third party on Wakely.
Wakely is not responsible for any use of the report or consequences of such use outside the specific purpose for which it was intended. Any mathematical estimates included in this report and produced by our Monte Carlo simulation exercise are inherently uncertain. Users of the report results should be qualified to use it and understand the results and the inherent uncertainty.

Appendix A – Monte Carlo Simulation Background and Results

Monte Carlo Simulation Background

Monte Carlo simulation, sometimes known as probability simulation, is a mathematical technique that uses random sampling to model the probability of different outcomes in a process. Monte Carlo simulation is a particularly valuable approach when applied to measuring uncertainty in processes that are impacted by random variables. Monte Carlo simulation is an appropriate method for simulating the MA RADV process since there are several random variables involved, most notably: which two hundred and one randomly selected RADV-eligible contract beneficiaries are sampled, the diagnostic profile (HCCs) of the randomly selected beneficiaries in the sample, the contract-specific true unsupported coding error rate, and the contract-specific true supported but not reported coding error rate. By holding constant the contract size and assumed true coding error rates for particular “scenario sets,”9 one can evaluate over a large number of samples the statistical attributes of CMS’s sampling and extrapolation methodology. More specifically, one can evaluate how closely the observed coding errors track the true coding error rate, the probability of nonzero payment penalties, whether there are risks of biases in the extrapolation calculations, the frequency and severity of unintended extrapolation results, and other statistical attributes.
While it is not possible to mimic all operational aspects of CMS’s RADV audit approach on actual Medicare Advantage coding data, the deployment of Monte Carlo simulation on actual Medicare beneficiary diagnostic data enables a robust mathematical evaluation.

Monte Carlo Simulation Approach

We started by limiting the 2013 Medicare LDS data set to beneficiaries that satisfy criteria for RADV eligibility. We then simulated MA contracts of varying sizes by randomly selecting enrollees to make up three contract sizes – one thousand (1,000) enrollees, ten thousand (10,000) enrollees, and one hundred thousand (100,000) enrollees. For each contract size tested, we defined coding error rates (unsupported codes and supported but not reported codes) and randomly assigned actual coding errors to the RADV-eligible population diagnostic data. Note that we did not assume varying coding error rates by HCC.

• Assumed unsupported coding error rate – the probability that a particular HCC is erroneously reported but unsupported, i.e., submitted by the MAO as a valid diagnosis code despite the supporting documentation being insufficient

• Assumed supported but not reported coding error rate – the probability that a particular HCC is supported but erroneously not reported, i.e., not submitted by the MAO as a valid diagnosis code despite the supporting documentation being sufficient

9 We use the term “scenario set” to refer to a particular combination of MA contract size and coding error rates assumed (e.g., one thousand (1,000) beneficiaries, ten (10) percent unsupported coding error rate, six (6) percent supported but not reported error rate).

We then replicated the RADV sampling process on the simulated MA contracts: we stratified the RADV-eligible contract population into three groups based on risk score, randomly selected sixty-seven (67) beneficiaries from each cohort, calculated the sample average errors and sample error variances, and finally derived an extrapolated payment penalty per the CMS-published formula. Note that a more complete summary of CMS’s methodology is captured in Appendix B. For each of the scenario sets explored, we simulated the RADV sampling and payment penalty extrapolation one hundred thousand (100,000) times. For each single scenario, we replicated CMS calculations to derive an extrapolated payment penalty (excluding the yet-to-be-defined Medicare FFS adjuster). For each scenario set we calculated the sample average risk score coding error, the variance in average errors of the samples, the average penalty as a percent of premium, the maximum/minimum penalties among scenarios tested, the percentage of scenarios that generated a nonzero positive penalty, and the percentage of penalties that were higher than the true error rate of the underlying contract (referred to as a “High penalty”). Refer to the tables below for summary statistics on a number of Monte Carlo simulations of RADV sampling and payment penalty calculations across varying contract sizes and assumed true error rates.
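The simulation loop described above can be sketched in a few dozen lines. This is a minimal illustration under loud assumptions, not the code used for this report: the population is synthetic rather than drawn from Medicare LDS data, the HCC coefficients (uniform between 0.1 and 0.6), the 0.4 demographic component, the contract size of 3,000, and the 500 repetitions are all hypothetical, and eligibility is not re-checked after errors are assigned. Only the sixty-seven per stratum sample size, the 2.575 multiplier, the zero floor, the $850 bid, and the ten and six percent error rates come from the report.

```python
import random
import statistics

Z_99 = 2.575       # 99 percent confidence interval multiplier
PER_STRATUM = 67   # enrollees sampled from each of the three strata
BID = 850.0        # assumed standardized bid PMPM at a 1.0 risk score


def build_contract(n, p_unsupported, p_unreported, rng):
    """Return (reported risk score, PMPM payment error) pairs sorted by risk score."""
    enrollees = []
    for _ in range(n):
        score, error = 0.4, 0.0              # 0.4: hypothetical demographic component
        for _ in range(rng.randint(1, 4)):   # one to four synthetic HCC slots
            coef = rng.uniform(0.1, 0.6)     # hypothetical HCC coefficient
            draw = rng.random()
            if draw < p_unsupported:         # reported but unsupported: overpayment
                score += coef
                error += coef
            elif draw < p_unsupported + p_unreported:
                error -= coef                # supported but not reported: underpayment
            else:
                score += coef                # correctly reported
        enrollees.append((score, BID * error))
    enrollees.sort(key=lambda e: e[0])
    return enrollees


def simulate_penalty_pmpm(enrollees, rng):
    """One simulated audit: stratify, sample, extrapolate, floor at zero."""
    n = len(enrollees)
    third = n // 3
    strata = [enrollees[:third], enrollees[third:2 * third], enrollees[2 * third:]]
    point, variance = 0.0, 0.0
    for stratum in strata:
        errors = [e[1] for e in rng.sample(stratum, PER_STRATUM)]
        size = len(stratum)
        point += size * statistics.mean(errors)   # weight (size / 67) times summed errors
        variance += size ** 2 * statistics.variance(errors) / PER_STRATUM
    lower_bound = point - Z_99 * variance ** 0.5
    return max(0.0, lower_bound) / n


def run(n_enrollees=3000, n_sims=500, seed=1):
    rng = random.Random(seed)
    contract = build_contract(n_enrollees, 0.10, 0.06, rng)
    true_error = statistics.mean(e[1] for e in contract)
    penalties = [simulate_penalty_pmpm(contract, rng) for _ in range(n_sims)]
    nonzero = [p for p in penalties if p > 0]
    pct_high = 100 * sum(p > true_error for p in nonzero) / len(nonzero) if nonzero else 0.0
    return {
        "true_error_pmpm": true_error,
        "avg_penalty_pmpm": statistics.mean(penalties),
        "min_penalty_pmpm": min(penalties),
        "max_penalty_pmpm": max(penalties),
        "pct_penalties_gt_0": 100 * len(nonzero) / n_sims,
        "pct_high_penalties": pct_high,
    }


if __name__ == "__main__":
    for name, value in run().items():
        print(f"{name}: {value:.2f}")
```

Because the inputs are synthetic, the output should not be compared numerically with the Appendix A tables. The sketch is useful mainly for seeing how the spread between minimum and maximum penalties arises from repeated sampling of a single fixed contract.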
We first define the table column headings that are used consistently across all six Appendix A scenario set summary tables:

| Appendix A Table Column Heading | Definition |
|---|---|
| Contract RADV-eligible Enrollment | Number of beneficiaries in the simulated MA contract – each row in the tables represents a unique set of scenarios |
| PMPM Average Sample Payment Error | The per member per month (PMPM) average value of the simulated sample coding errors |
| PMPM Average Sample Payment Error Variance | The PMPM value of the variance of sample payment errors |
| PMPM Payment Error Average Penalty | The average PMPM value of the extrapolated payment penalties |
| PMPM Min Penalty | The minimum extrapolated penalty among all RADV simulations within the scenario set |
| PMPM Max Penalty | The maximum extrapolated penalty among all RADV simulations within the scenario set |
| % Penalties > 0 | The percentage of RADV scenarios in which the payment penalty is greater than $0 |
| % High Penalties | The percentage of RADV non-zero penalties in which the extrapolated penalty is larger than the true coding error rate |

Table A1: 10% HCCs Unsupported, 6% Supported but Not Reported

| Contract RADV-Eligible Enrollment | PMPM Average Sample Payment Error | PMPM Average Sample Payment Error Variance | PMPM Payment Error Average Penalty | PMPM Min Penalty | PMPM Max Penalty | % Penalties > 0 | % High Penalties |
|---|---|---|---|---|---|---|---|
| 1,000 | $33.55 | $0.26 | $2.26 | $0.00 | $46.80 | 26.8% | 0.102% |
| 10,000 | $33.58 | $0.33 | $2.60 | $0.00 | $67.07 | 27.6% | 0.238% |
| 100,000 | $32.43 | $0.33 | $2.66 | $0.00 | $57.98 | 27.6% | 0.326% |

Table A2: 5% HCCs Unsupported, 5% Supported but Not Reported

| Contract RADV-Eligible Enrollment | PMPM Average Sample Payment Error | PMPM Average Sample Payment Error Variance | PMPM Payment Error Average Penalty | PMPM Min Penalty | PMPM Max Penalty | % Penalties > 0 | % High Penalties |
|---|---|---|---|---|---|---|---|
| 1,000 | -$4.08 | $0.16 | $0.00 | $0.00 | $9.75 | 0.1% | 100.000% |
| 10,000 | -$0.49 | $0.20 | $0.01 | $0.00 | $14.17 | 0.4% | 100.000% |
| 100,000 | -$0.13 | $0.21 | $0.02 | $0.00 | $19.67 | 0.4% | 100.000% |

Table A3: 30% HCCs Unsupported, 0% Supported but Not Reported

| Contract RADV-Eligible Enrollment | PMPM Average Sample Payment Error | PMPM Average Sample Payment Error Variance | PMPM Payment Error Average Penalty | PMPM Min Penalty | PMPM Max Penalty | % Penalties > 0 | % High Penalties |
|---|---|---|---|---|---|---|---|
| 1,000 | $242.61 | $0.52 | $182.59 | $110.84 | $262.94 | 100.0% | 0.044% |
| 10,000 | $240.84 | $0.63 | $181.53 | $107.66 | $278.98 | 100.0% | 0.133% |
| 100,000 | $243.48 | $0.63 | $184.59 | $111.35 | $277.17 | 100.0% | 0.171% |

Table A4: 10% HCCs Unsupported, 0% Supported but Not Reported

| Contract RADV-Eligible Enrollment | PMPM Average Sample Payment Error | PMPM Average Sample Payment Error Variance | PMPM Payment Error Average Penalty | PMPM Min Penalty | PMPM Max Penalty | % Penalties > 0 | % High Penalties |
|---|---|---|---|---|---|---|---|
| 1,000 | $80.19 | $0.17 | $46.53 | $13.77 | $86.52 | 100.0% | 0.008% |
| 10,000 | $81.11 | $0.21 | $46.85 | $12.21 | $92.64 | 100.0% | 0.052% |
| 100,000 | $81.00 | $0.21 | $46.94 | $12.23 | $94.05 | 100.0% | 0.075% |

Table A5: 5% HCCs Unsupported, 0% Supported but Not Reported

| Contract RADV-Eligible Enrollment | PMPM Average Sample Payment Error | PMPM Average Sample Payment Error Variance | PMPM Payment Error Average Penalty | PMPM Min Penalty | PMPM Max Penalty | % Penalties > 0 | % High Penalties |
|---|---|---|---|---|---|---|---|
| 1,000 | $38.32 | $0.07 | $16.43 | $0.00 | $42.36 | 100.0% | 0.006% |
| 10,000 | $41.55 | $0.12 | $16.33 | $0.00 | $48.44 | 99.9% | 0.035% |
| 100,000 | $40.64 | $0.11 | $16.79 | $0.00 | $48.23 | 99.9% | 0.045% |

Table A6: 1% HCCs Unsupported, 0% Supported but Not Reported

| Contract RADV-Eligible Enrollment | PMPM Average Sample Payment Error | PMPM Average Sample Payment Error Variance | PMPM Payment Error Average Penalty | PMPM Min Penalty | PMPM Max Penalty | % Penalties > 0 | % High Penalties |
|---|---|---|---|---|---|---|---|
| 1,000 | $7.45 | $0.01 | $0.19 | $0.00 | $7.71 | 20.4% | 0.001% |
| 10,000 | $7.88 | $0.02 | $0.14 | $0.00 | $8.86 | 13.7% | 0.001% |
| 100,000 | $8.02 | $0.02 | $0.15 | $0.00 | $9.64 | 14.2% | 0.006% |

Appendix B – CMS RADV Methodology

Sampling and Extrapolation Methodology10
In this section we paraphrase and summarize CMS’s February 2012 Notice of Final Payment Error Calculation methodology for Part C MA RADV Contract-Level Audits.

ELIGIBILITY FOR RADV SAMPLING

CMS selects11 a set of MA contracts for each RADV audit cycle. Within each contract selected for RADV audit, a sample of enrollees is selected in order for CMS to estimate a contract-level risk adjustment payment error. The sample enrollees are drawn from the contract population that CMS deems “RADV-eligible” by virtue of meeting all of the following criteria:

1. Beneficiary is enrolled in the selected MA contract as of January of the diagnosis collection period (calendar year prior to payment year) and continuously enrolled for twelve (12) months through January of the payment year.
2. Beneficiary is not identified by CMS as End-Stage Renal Disease (ESRD) status and is not identified as hospice status at any time from January of the diagnosis collection period through January of the payment year.
3. Beneficiary is enrolled in Medicare Part B coverage for all twelve (12) months of the diagnosis collection period.
4. Beneficiary has at least one diagnosis code submitted that led to the assignment of at least one HCC for the payment year.

SAMPLE SIZE AND STRATA

CMS orders the RADV-eligible contract population based on payment year risk score (lowest to highest) and divides the population into three equal-size groups, or strata. Sixty-seven (67) enrollees from each of the three strata are randomly selected, generating a total sample size of two hundred and one (201) enrollees. Note that smaller samples are drawn if a contract’s RADV-eligible population is less than one thousand (1,000) enrollees. At this stage, CMS defines a “stratum-based enrollee weight” as the number of enrollees in the stratum divided by the number of enrollees randomly selected (usually sixty-seven). For example, if a contract has fifteen thousand (15,000) RADV-eligible enrollees, sixty-seven (67) enrollees would be randomly selected from each of three strata of five thousand (5,000) enrollees. The stratum-based enrollee weight in this case would be five thousand (5,000) divided by sixty-seven (67), or 74.627. The stratum-based enrollee weight is used as a multiplier to extrapolate the payment error measured to the entire RADV-eligible population of the stratum.

10 See Notice of Final Payment Error Calculation Methodology for Part C Medicare Advantage Risk Adjustment Data Validation Contract-Level Audits. Available online at: https://www.cms.gov/ResearchStatistics-Data-and-Systems/Monitoring-Programs/recovery-audit-program-parts-c-and-d/Other-ContentTypes/RADV-Docs/RADV-Methodology.pdf

11 Thirty (30) contracts were selected for 2011 payment year MA RADV audits. This report does not include an assessment of CMS’s methodology for selecting MA contracts for RADV audits, as the methodology has not been published to our knowledge.

DOCUMENTATION SUBMISSION PARAMETERS

MAOs are required to submit detailed medical records to support all HCCs represented in the randomly selected beneficiary sample. CMS permits audited MA contracts to submit multiple medical records for each HCC being validated. However, all diagnoses identified in the first medical record that validates the HCC will be used. The “one best medical record” policy applies to the RADV audit dispute and appeal processes. In the event of a RADV audit dispute/appeal, CMS requires that MAOs submit a single medical record that best substantiates the HCC in question.

PAYMENT ERROR CALCULATION

Based on the medical record documentation submitted, CMS calculates a RADV-corrected risk score and corrected payment amount.
The risk score value of HCCs not substantiated by medical record documentation is removed from enrollee risk scores, and the HCC value of any previously undocumented diagnoses is added to the enrollee risk scores. Per member per month (PMPM) payment errors are defined as the difference between the original monthly CMS payment to the MAO and the RADV-corrected monthly payment for each enrollee. Note that payment errors at the enrollee level may be positive (overpayment to MAO) or negative (underpayment to MAO). CMS derives an annual payment error for each sampled enrollee by multiplying the PMPM payment error by the number of months the beneficiary was enrolled during the payment year.

PAYMENT ERROR EXTRAPOLATION

CMS derives the MA contract-level payment error by extrapolating the observed annual payment error to the entire RADV-eligible population. Put simply, CMS estimates the average payment error based on the randomly selected sample of beneficiaries and calculates a ninety-nine (99) percent confidence interval (CI) around that estimated error. In other words, CMS is implying that there is a ninety-nine (99) percent chance that the actual payment error will be between the lower and upper bounds of its confidence interval. The more intricate details of the extrapolation are outlined in the paragraph below:

CMS multiplies the annual payment error for each sampled enrollee by the stratum-based enrollee weight (previously defined). The extrapolated enrollee annual payment error is summed across all enrollees in the sample to derive a “point estimate” of the contract-level payment error. Importantly, a ninety-nine (99) percent CI for the payment error is calculated for each RADV audited contract. The ninety-nine (99) percent CI is derived by varying the estimated payment error observed (average observed payment error) by 2.575 times the Standard Error (SE).
The SE is derived as follows:

1. Derive the variance ($v_h$) of the unweighted enrollee payment errors within each of three strata ($h$).
2. Calculate the variance of the estimated total ($\hat{V}$) payment error, where $N_h$ represents the number of RADV-eligible enrollees in stratum $h$:

$$\hat{V} = \sum_{h=1}^{3} \frac{N_h^{2}\, v_h}{67}$$

3. Take the square root of that variance to obtain the Standard Error:

$$SE = \sqrt{\hat{V}}$$

PAYMENT RECOVERY AMOUNT AND FFS ADJUSTER

CMS uses the lower bound of the derived confidence interval to determine the payment recovery amount (the amount that CMS intends to recoup from the MAO). Note that we use “payment recovery amount” and “payment penalty” interchangeably throughout the report. CMS sets the recovery amount floor at zero. In other words, if the lower bound of the confidence interval is below zero, which indicates that CMS may have underpaid the MAO initially, CMS will not make an incremental payment to the MAO to correct for the initial underpayment. If the lower bound of the derived payment error confidence interval is above zero, then the lower bound of the confidence interval will define the “preliminary payment recovery amount.” This preliminary amount will be adjusted downward by a to-be-defined Fee-for-Service Adjuster (FFS Adjuster), but still constrained to the zero recovery floor. The concept of the FFS Adjuster is intended to account for the difference in coding documentation standards between the MAO medical records and the FFS claim medical records used to develop the Medicare Advantage risk adjustment model. CMS has indicated that the FFS Adjuster will be derived based on a “RADV-like review of records submitted to support FFS claims data.” To our knowledge, since the February 2012 release of the MA RADV Sampling and Extrapolation Methodology, CMS has yet to release any substantive information on the FFS Adjuster amount or its derivation.
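The extrapolation steps summarized in this appendix can be traced end to end with a short sketch. It is an illustration under stated assumptions rather than CMS’s implementation: the `Extrapolation` record and `extrapolate` function are hypothetical names, the sample variance (n − 1 denominator) is assumed because the summary does not say which variance estimator CMS uses, and the FFS Adjuster is omitted because it has not been defined.

```python
import statistics
from dataclasses import dataclass

Z_99 = 2.575  # multiplier for the 99 percent confidence interval


@dataclass
class Extrapolation:
    point_estimate: float        # extrapolated contract-level payment error
    standard_error: float
    lower_bound: float
    upper_bound: float
    preliminary_recovery: float  # lower bound floored at zero, before any FFS Adjuster


def extrapolate(strata_sizes, sampled_errors):
    """strata_sizes[h] is the stratum population; sampled_errors[h] holds the
    annual payment errors of the enrollees sampled from stratum h."""
    point, variance = 0.0, 0.0
    for n_h, errors in zip(strata_sizes, sampled_errors):
        weight = n_h / len(errors)      # stratum-based enrollee weight
        point += weight * sum(errors)   # weighted errors summed over the sample
        variance += n_h ** 2 * statistics.variance(errors) / len(errors)
    se = variance ** 0.5
    lower, upper = point - Z_99 * se, point + Z_99 * se
    return Extrapolation(point, se, lower, upper, max(0.0, lower))


# The 15,000-enrollee example: three strata of 5,000, sixty-seven sampled from each.
weight = 5000 / 67
print(f"Stratum-based enrollee weight: {weight:.3f}")

# Hypothetical annual payment errors, identical across strata for brevity.
sample = [120.0, -40.0, 300.0] + [0.0] * 64
result = extrapolate([5000, 5000, 5000], [sample, sample, sample])
print(f"Point estimate: {result.point_estimate:,.0f}")
print(f"Preliminary recovery amount: {result.preliminary_recovery:,.0f}")
```

Keeping the point estimate, both interval bounds, and the floored recovery amount in one record makes it easy to see how much of the observed error is discounted by the confidence interval before any recovery is assessed.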