Rationale for School Quality Metrics and Construction of Ranking Scores
Boston Globe School Ranking Project
Anil Nathan* and Jack Schneider+
College of the Holy Cross

Introduction

This document explains the procedure used by the Boston Globe to rank schools in Massachusetts. Schools are scored and ranked on six categories, with user-supplied input determining how much weight each category receives. The categories are 1) Massachusetts Comprehensive Assessment System (MCAS) Mathematics Growth Score; 2) MCAS English Language Arts Growth Score; 3) School Climate (which includes graduation rates, dropout rates, and intentions of attending a 2- or 4-year college); 4) College Readiness (which includes SAT Writing scores and the percentage of students scoring 3 or higher on Advanced Placement tests); 5) School Resources (as measured by expenditure per student); and 6) Diversity (the calculation is explained below). The most recent available data is used for all calculations.1

All variables were scaled (if necessary) and scored based on their deviations from the mean of the variable. This measure allows for some notion of distance between scores while removing some problems associated with the natural magnitude of the variables. Taking a weighted average of these deviations from the means (using category weights inputted by the user) allows for the calculation of a score, which is used to rank the schools.2 The remainder of the document explains the rationale for using the categories and variables listed above, as well as the specifics of how the variables are used to calculate a final score.

MCAS Growth (Mathematics and English Language Arts)

The Student Growth Percentile (SGP) score, unique to Massachusetts, is important because it attempts to identify the value added by the school in the process of education. After all, schools working with high-achieving populations will have high standardized test scores, even if those students gain little from their coursework.
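The deviation-from-mean scoring described in the Introduction can be sketched in a few lines of Python. The school names, SGP values, and weights below are hypothetical, and the function names are illustrative; this is a sketch of the method, not the Globe's actual code.

```python
# Illustrative sketch of the scoring approach (hypothetical data and names):
# each variable is scored as its deviation from the mean across schools, and
# a school's score is a user-weighted average of its category deviations.

def deviations_from_mean(values):
    """Map each school to its deviation from the group mean."""
    mean = sum(values.values()) / len(values)
    return {school: v - mean for school, v in values.items()}

# Hypothetical median SGP scores for three high schools
math_sgp = {"A": 60.0, "B": 50.0, "C": 40.0}
ela_sgp = {"A": 55.0, "B": 45.0, "C": 50.0}
math_dev = deviations_from_mean(math_sgp)
ela_dev = deviations_from_mean(ela_sgp)

# User-chosen category weights, e.g. 60% math growth, 40% ELA growth
weights = {"math": 0.6, "ela": 0.4}
score_A = weights["math"] * math_dev["A"] + weights["ela"] * ela_dev["A"]
print(score_A)  # 0.6 * 10 + 0.4 * 5 = 8.0
```

Because every category is expressed as a deviation from its mean, categories with very different natural scales can be averaged together without one dominating the others.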
Conversely, schools with low-achieving students might do an outstanding job of raising achievement scores, but still produce low overall test scores. Measuring student growth rather than net scores, then, theoretically makes it possible to compare schools with vastly different student populations.3 SGPs are determined by comparing one student's history of MCAS exam scores to the scores of all the other students in the state with a similar testing history.4 The median MCAS SGP scores by school from 2013 are used in this ranking system. The MCAS SGP scores naturally have a theoretical low point of 0 and a theoretical high point of 100, and so do not require scaling. Means are calculated for each variable conditional on school type, and then deviations from the relevant mean are derived for each school to use in scoring.

* Assistant Professor, Department of Economics and Accounting, College of the Holy Cross, Stein 511, 1 College Street, Worcester, MA 01610. Phone: (508) 793-2680. Email: anathan@holycross.edu
+ Assistant Professor, Department of Education, College of the Holy Cross, Stein 437, 1 College Street, Worcester, MA 01610. Phone: (508) 793-3731. Email: jschneid@holycross.edu

School Climate

School climate is challenging to measure. That said, graduation rates are a logical proxy for both student and adult commitment to the process of education.
Graduation from high school represents a significant commitment and can signal student feeling about the importance of school work, adult care for students, and broader community support for education.5 The inclusion of dropouts in such figures is also important because graduation rates are often calculated based on the percentage of enrolled students--potentially allowing a school with a high dropout rate to possess a misleadingly high graduation rate.6 In order to add some depth to this picture, the postsecondary ambitions of high school students--measured through surveys administered by the state--have also been included.7 Again, while such data correlates imperfectly with school climate, it does send a powerful signal about the degree to which students feel prepared for college work (whether at two- or four-year schools) and view education as valuable.8

Both graduation and dropout rates naturally have a theoretical low point of 0 and a theoretical high point of 100, and so do not require scaling. Means are calculated for each variable conditional on school type, and then deviations from the relevant mean are derived for each school to use in scoring.9 To combine graduation rates, dropout rates, and college intentions into the category of school climate, the deviations from the mean for graduation rates are multiplied by .50, the deviations from the mean for dropout rates are multiplied by .30, and the deviations from the mean for college intentions are multiplied by .20. These weighted deviations from the mean are then added together to form the school climate deviation from the mean for use in scoring.

College Readiness

College readiness can be measured in a number of ways, none of them perfect. Two relatively predictive and readily available measures, however, are SAT writing scores and the percentage of students taking AP exams who earn scores of 3 or higher. Certainly these measures are limited: not all students in every school take the SAT or enroll in AP courses.
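Returning briefly to the school climate composite, the combination of weighted deviations can be sketched as follows. The figures are hypothetical; the sign flip for dropout rates follows note 9, since a higher dropout rate is worse.

```python
# Sketch of the school climate composite (illustrative figures, not the
# Globe's actual code). Weights: graduation .50, dropout .30, college
# intentions .20. Per note 9, dropout deviations are multiplied by -1.

def climate_deviation(grad_dev, dropout_dev, intentions_dev):
    """Combine per-variable deviations into a school climate deviation."""
    return 0.50 * grad_dev + 0.30 * (-dropout_dev) + 0.20 * intentions_dev

# A school 4 points above the mean graduation rate, 2 points above the mean
# dropout rate (a negative), and 5 points above the mean on intentions:
print(round(climate_deviation(4.0, 2.0, 5.0), 2))  # 0.5*4 - 0.3*2 + 0.2*5 = 2.4
```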
And, of course, it is possible to perform poorly on the SAT or on an AP exam and do quite well in college. Still, SAT writing scores have proven to be reliable predictors of college success--enough so, in fact, that the SAT dropped writing as a separate subject exam and included it in the general test.10 Successful work on AP exams, similarly, has proven in a number of studies to be predictive of college success, particularly for subjects like Calculus BC or Physics.11

The SAT writing section is scored from a low of 200 to a high of 800. These values are scaled from 0 to 100, and then means and deviations from the relevant means are calculated. The percentage of students taking AP exams who score a 3 or higher naturally has a theoretical low of 0 and a theoretical high of 100, so no scaling is needed. Deviations from the relevant mean are then calculated. In order to derive a college readiness deviation from the mean, the SAT writing deviation and the AP deviation are each multiplied by .50 and added together.

School Resources

It is common wisdom among parents and policymakers that resources matter in education. Wealthier districts can lower average class sizes and recruit talented faculty, since teacher salaries and benefits make up the bulk of educational expenditures. And adequate resources also make it possible for schools to finance staff development, additional pupil support, music programs, and extra-curricular activities--all of which can enhance the academic and non-academic components of the school program. Some research does indicate that money may play a smaller role than the public imagines.12 Nevertheless, it is clear that money matters in schooling, providing students with a range of opportunities that might otherwise be unavailable.13 There is no theoretical low and high point for expenditure per student, which is used as a proxy for school resources.
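The college readiness calculation described above--rescaling SAT writing from its 200-800 range to 0-100, then averaging the two deviations with equal weight--can be sketched as follows. The inputs are hypothetical and the function names illustrative.

```python
# Sketch of the college readiness calculation (illustrative only).
# SAT writing (200-800) is rescaled to 0-100 before deviations are taken;
# the AP percentage is already on a 0-100 scale.

def scale_sat_writing(score):
    """Rescale an SAT writing score from the 200-800 range to 0-100."""
    return (score - 200) / (800 - 200) * 100

def college_readiness_deviation(sat_dev, ap_dev):
    """Equal-weight (.50/.50) combination of the two deviations."""
    return 0.50 * sat_dev + 0.50 * ap_dev

print(scale_sat_writing(500))  # 50.0: the midpoint of the SAT range
print(college_readiness_deviation(6.0, -2.0))  # 2.0
```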
The data is used to set low and high points to create a scale between 0 and 100.14 Deviations from the relevant mean are then calculated and used in scoring.

Diversity

All families value diversity differently. And some families intentionally select nondiverse schools and districts, believing that their children will benefit from racial or cultural homogeneity. Still, educational research indicates that a diverse mix of students can produce a range of positive outcomes for young people. Perhaps most obviously, students in diverse schools gain racial and cultural awareness, as well as an enhanced sense of openness that appears to carry forward into adult life.15 But such environments also appear to promote critical thinking among students.16 And though it appears that non-dominant groups (particularly African-American and Latino/Hispanic students) benefit the most from diverse school communities, research indicates that white students also benefit from such environments, or at the very least are not adversely affected.17

The four racial groups used to measure diversity are white, black, Latino/Hispanic, and other. A notion of "perfect diversity" is introduced where each group comprises 25% of the school's population.18 The Euclidean distance from this ideal point is then calculated for each school.19 This number naturally has a theoretical low of 0, and the data is used to determine the theoretical high in order to scale from 0 to 100 if necessary.20 Deviations from means are then calculated and used for scoring.21

Overall Score and Ranking

The overall score for a school is a weighted average of the deviations from the relevant means across the categories, where the weights are user determined. Any school missing data in a positively weighted category is not ranked. These scores are then used to rank the schools in numerical order from high to low.
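The diversity measure and the ranking rule can be sketched as follows, assuming the distance in note 19 is the standard (square-root) Euclidean form. The data, names, and the `None` convention for missing values are illustrative, not the Globe's actual implementation.

```python
import math

# Sketch of the diversity distance and the overall ranking rule
# (illustrative names and data). Per note 21, a larger distance from
# "perfect diversity" is worse, so its deviation is sign-flipped elsewhere.

def diversity_distance(pct):
    """Euclidean distance from the 'perfect diversity' point (25% each)."""
    groups = ("white", "black", "hispanic", "other")
    return math.sqrt(sum((pct[g] - 25.0) ** 2 for g in groups))

def overall_score(category_devs, weights):
    """User-weighted average of category deviations. A school missing data
    in any positively weighted category is left unranked (returns None)."""
    if any(category_devs.get(c) is None and w > 0 for c, w in weights.items()):
        return None
    total = sum(weights.values())
    return sum(w * category_devs[c] for c, w in weights.items() if w > 0) / total

# A perfectly diverse school has distance 0; a one-group school ~86.6
print(diversity_distance({"white": 25, "black": 25, "hispanic": 25, "other": 25}))  # 0.0
```

Note that the maximum possible distance (one group at 100%, the others at 0%) is √7500 ≈ 86.6, consistent with note 20's observation that the highest distances are "around 100" and need no rescaling.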
Notes

1 All the data is from the state Department of Elementary and Secondary Education. Data used includes: 2013 MCAS growth score results; 2010-2011 expenditures per pupil; 2011-2012 class size by race/ethnicity; 2011-2012 plans of high school graduates; 2011-2012 SAT writing scores; 2011-2012 Advanced Placement performance.

2 Schools are split into High Schools, Middle Schools, and Elementary Schools. A school is counted in the high school category if it has enrollment in grade 9 or above. A school is counted in the middle school category if it has enrollment in 6th, 7th, or 8th grade. A school is counted in the elementary school category if it has enrollment in any grade between pre-kindergarten and 5th grade. Note that a school can be in multiple categories, but only the relevant MCAS growth scores for each category are used for calculations. For example, a school with enrollment in grades K-12 will have only its 10th grade MCAS growth scores used in the high school category. The means and deviations are calculated from these three sets. Note that elementary and middle schools do not have "School Climate" or "College Readiness" variables as defined here.

3 For more, see the RAND Corporation's useful primer, "Value-Added Modeling 101: Using Student Test Scores to Help Measure Teaching Effectiveness" at http://www.rand.org/education/projects/measuring-teachereffectiveness/value-added-modeling.html.
For a perspective on the dangers of value-added modeling, see Jack Schneider, "The High Stakes of Teacher Evaluation," Education Week, June 5, 2012, http://academics.holycross.edu/files/Education/High_Stakes_of_Teacher_Eval.docx

4 Massachusetts Board of Elementary and Secondary Education, "Letter to Educators Explaining the Growth Model," September 10, 2010, http://www.doe.mass.edu/mcas/growth/0910letter.html

5 Melissa Roderick, Jenny Nagaoka, Vanessa Coca, Eliza Moeller, Karen Roddie, Jamiliyah Gilliam, and Desmond Patton, From High School to the Future: Potholes on the Road to College (Chicago: Consortium on Chicago School Research, 2008); Robert Slavin and Olatokunbo Fashola, "Effective Dropout Prevention and College Attendance Programs for Students Placed at Risk," Journal of Education for Students Placed at Risk (1998): 159-183.

6 C. Chapman, J. Laird, and A. KewalRamani, Trends in High School Dropout and Completion Rates in the United States: 1972-2008 (Washington, D.C.: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education, NCES 2011-012, 2010).

7 See "2011-12 Plans of High School Graduates Report" at http://profiles.doe.mass.edu/state_report/plansofhsgrads.aspx for these figures.

8 For a brief overview of the democratizing influence of two-year colleges, see the National Student Clearinghouse Research Center's snapshot report, "The Role of Two-Year Institutions in Four-Year Success," http://www.studentclearinghouse.info/snapshot/docs/SnapshotReport6-TwoYearContributions.pdf

9 Since a higher dropout rate is suboptimal, these deviations from the mean are multiplied by -1 for use in calculation.

10 Saul Geiser and Maria Veronica Santelices, "Validity of High-School Grades in Predicting Student Success Beyond the Freshman Year: High-School Record vs.
Standardized Tests as Indicators of Four-Year College Outcomes," Research & Occasional Paper Series: CSHE.6.07 (2007), http://cshe.berkeley.edu/publications/publications.php?id=265.

11 Saul Geiser and Maria Veronica Santelices, "The Role of Advanced Placement and Honors Courses in College Admissions," in P. Gandara, G. Orfield, and C. Horn, eds., Expanding Opportunity in Higher Education: Leveraging Promise (Albany, NY: SUNY Press, 2006); Chrys Dougherty, Lynn Mellor, and Shuling Jian, "The Relationship between Advanced Placement and College Graduation" (National Center for Educational Accountability, 2006); Phillip L. Ackerman, Ruth Kanfer, and Margaret E. Beier, "Trait Complex, Cognitive Ability, and Domain Knowledge Predictors of Baccalaureate Success, STEM Persistence, and Gender Differences," Journal of Educational Psychology 105, no. 3 (2013): 911-927.

12 Eric A. Hanushek, "Conclusions and Controversies about the Effectiveness of School Resources," Economic Policy Review 4, no. 1 (1998).

13 Bruce D. Baker, David G. Sciarra, and Danielle Farrie, Is School Funding Fair? A National Report Card (Newark, NJ: Education Law Center, 2010); Diana Epstein, Measuring Inequity in School Funding (Washington, DC: Center for American Progress, 2011); Steve Gibbons, Sandra McNally, and Martina Viarengo, Does Additional Spending Help Urban Schools? An Evaluation Using Boundary Discontinuities, Centre for the Study of the Economics of Education Discussion Paper No. 128 (2011); Jonathan Guryan, Does Money Matter? Regression-Discontinuity Estimates from Education Finance Reform in Massachusetts (National Bureau of Economic Research, 2001); Kevin J. Payne and Bruce J. Biddle, "Poor School Funding, Child Poverty, and Mathematics Achievement," Educational Researcher 28, no. 6 (1999): 4-13.

14 The expenditure per student ranges from about $5,000 to $35,000, which are used as the low and high points for scaling.

15 Jeffrey F.
Milem, "The Educational Benefits of Diversity: Evidence from Multiple Sectors," in M. Chang, D. Witt, J. Jones, and K. Hakuta, eds., Compelling Interest: Examining the Evidence on Racial Dynamics in Higher Education (Palo Alto, CA: Stanford University Press), 126-169.

16 Dan Barrett, "Encounters with Diversity, on Campuses and in Coursework, Bolster Critical Thinking Skills," Chronicle of Higher Education, November 19, 2012; Robert L. Linn and Kevin G. Welner, eds., Race-Conscious Policies for Assigning Students to Schools: Social Science Research and the Supreme Court Cases (Washington, DC: National Academy of Education, 2007).

17 Robert L. Linn and Kevin G. Welner, eds., Race-Conscious Policies for Assigning Students to Schools: Social Science Research and the Supreme Court Cases (Washington, DC: National Academy of Education, 2007).

18 Although achieving "perfect diversity" is virtually impossible, there are some schools that approach it. "Perfect diversity" also may not even be desirable, but it is used here as a fixed point to create a notion of distance from a theoretical ideal point.

19 The Euclidean distance is defined as √[(White % − 25)² + (Black % − 25)² + (Hispanic % − 25)² + (Other % − 25)²].

20 The highest Euclidean distances naturally are around 100 in this case, so no scaling is necessary.

21 A higher score is less desirable, so the deviations from the mean are multiplied by -1 to use for scoring purposes.