Stanford Summer Math Camp Defense Doesn’t Add Up, Either
Flawed, non-causal research that the proposed California framework embraces

Stanford University was the site of a summer math camp whose outcomes were studied.

I thank Jack Dieckmann for reading my critique of the proposed California State Math Framework ("California's New Math Framework Doesn't Add Up") and for writing a response ("Stanford Summer Math Camp Researchers Defend Study"). In my article, I point to scores of studies cited by What Works Clearinghouse Practice Guides as examples of high-quality research that the framework ignores. I also mention two studies of Youcubed-designed math summer camps as examples of flawed, non-causal research that the proposed California State Math Framework embraces.

I focused on outcomes measured by how students performed on four tasks created by the Mathematical Assessment Research Service. Based on MARS data, Youcubed claims that students gained 2.8 years of math learning by attending its first 18-day summer camp in 2015. Dieckmann defends MARS as being “well-respected” and having a “rich legacy,” but he offers no psychometric data to support assessing students with the same four MARS tasks pre- and post-camp and converting gains into years of learning. Test-retest using the same instrument within such a short period of time is rarely good practice. And lacking a comparison or control group prevents the authors from making credible causal inferences from the scores.

Is there evidence that MARS tasks should not be used to measure the camps’ learning gains? Yes, quite a bit. The MARS website includes the following warning: “Note: please bear in mind that these materials are still in draft and unpolished form.” Later that point is reiterated, “Note: please bear in mind that these prototype materials need some further trialing before inclusion in a high-stakes test.” I searched the list of assessments covered in the latest edition of the Buros Center’s Mental Measurements Yearbook, regarded as the encyclopedia of cognitive tests, and could find no entry for MARS. Finally, Evidence for ESSA and What Works Clearinghouse are the two main repositories for high quality program evaluations and studies of education interventions. I searched both sites and found no studies using MARS.

The burden of proof is on any study using four MARS tasks to measure achievement gains to justify choosing that particular instrument for that particular purpose.

Dieckmann is correct that I did not discuss the analysis of change in math grades, even though a comparison group was selected using a matching algorithm. The national camp study compared the change in pre- and post-camp math grades, converted to a 4-point scale, of camp participants and matched non-participants. One reason not to take the “math GPA data” seriously is that grades are missing for more than one-third of camp participants (36%). Moreover, baseline statistics on math grades are not presented for treatment and comparison groups. Equivalence of the two groups’ GPAs before the camps cannot be verified.

Let's give the benefit of the doubt and assume the two groups had similar pre-camp grades. Are post-camp grade differences meaningful? The paper states, "On average, students who attended camp had a math GPA that was 0.16 points higher than similar non-attendees." In a real-world sense, that's not very impressive on a four-point scale. We learn in the narrative that special education students made larger gains than non-special education students. Non-special education students' gain of one-tenth of a GPA point is underwhelming.

Moreover, as reported in Table 5, camp dosage, as measured in hours of instruction, is inversely related to math GPA. More instruction is associated with less impact on GPA. When camps are grouped into three levels of instructional hours (low, medium, and high dosage), effects decline from low (0.27) to medium (0.09) to high (0.04) dosage. This is precisely the opposite of the pattern of changes reported for the MARS outcome—and the opposite of what one would expect if increased exposure to the camps boosted math grades.

The proposed California Math Framework relies on Youcubed for its philosophical outlook on K-12 mathematics: prescribing how the subject should be taught, defining its most important curricular topics, providing guidance on how schools should organize students into different coursework, and recommending the best way of measuring the mathematics that students learn. By treating the research it cites as compelling and the research it ignores as inconsequential, the framework also sets a standard for the kind of empirical evidence it believes educators should follow in making the crucial daily decisions that shape teaching and learning.

It’s astonishing that California’s K-12 math policy is poised to take the wrong road on so many important aspects of education.

Tom Loveless, a former 6th-grade teacher and Harvard public policy professor, is an expert on student achievement, education policy, and reform in K–12 schools. He also was a member of the National Math Advisory Panel and U.S. representative to the General Assembly, International Association for the Evaluation of Educational Achievement, 2004–2012.

California’s New Math Framework Doesn’t Add Up
It would place Golden State 6th graders years behind the rest of the world—and could eventually skew education in the rest of the U.S., too

A bumpy road is ahead for California’s proposed new math framework.

California’s proposed math curriculum framework has ignited a ferocious debate, touching off a revival of the 1990s math wars and attracting national media attention. Early drafts of the new framework faced a firestorm of criticism, with opponents charging that the guidelines sacrificed accelerated learning for high achievers in a misconceived attempt to promote equity.

The new framework, first released for public comment in 2021, called for all students to take the same math courses through 10th grade, a “detracking” policy that would effectively end the option of 8th graders taking algebra. A petition signed by nearly 6,000 STEM leaders argued that the framework “will have a significant adverse effect on gifted and advanced learners.” Rejecting the framework’s notions of social justice, an open letter with over 1,200 signatories, organized by the Independent Institute, accused the framework of “politicizing K–12 math in a potentially disastrous way” by trying “to build a mathless Brave New World on a foundation of unsound ideology.”

About once every eight years, the state of California convenes a group of math educators to revisit the framework that recommends how math will be taught in the public schools. The current proposal calls for a more conceptual approach toward math instruction, deemphasizing memorization and stressing problem solving and collaboration. After several delays, the framework is undergoing additional edits by the state department of education and is scheduled for consideration by the state board of education for approval sometime in 2023.

Why should anyone outside of California care? With almost six million public school students, the state constitutes the largest textbook market in the United States. Publishers are likely to cater to that market by producing instructional materials in accord with the state’s preferences. California was ground zero in the debate over K–12 math curriculum in the 1990s, a conflict that eventually spread coast to coast and around the world. A brief history will help set the stage.

Historical Context

Standards define what students are expected to learn—the knowledge, skills, and concepts that every student should master at a given grade level. Frameworks provide guidance for meeting the standards—including advice on curriculum, instruction, and assessments. The battle over the 1992 California state framework, a document admired by math reformers nationwide, started slowly, smoldered for a few years, and then burst into a full-scale, media-enthralling conflict by the end of the decade. That battle ended in 1997 when the math reformers’ opponents, often called math traditionalists, convinced state officials to adopt math standards that rejected the inquiry-based, constructivist philosophy of existing state math policy.

The traditionalist camp was a unique coalition of parents and professional mathematicians—scholars in university mathematics departments, not education schools—organized via a new tool of political advocacy: the Internet.

The traditionalist standards lasted about a decade. By the end of the aughts, the standards were tarnished by their association with the unpopular No Child Left Behind Act, which mandated that schools show all students scoring at the “proficient” level on state tests by 2014 or face consequences. It was clear that virtually every school in the country would be deemed a failure, No Child Left Behind had plummeted in the public’s favor, and policymakers needed something new. Enter the Common Core State Standards.

The Common Core authors wanted to avoid a repeat of the 1990s math wars, and that meant compromise. Math reformers were satisfied by the standards’ recommendation that procedures (computation), conceptual understanding, and problem solving receive “equal emphasis.” Traditionalists were satisfied with the Common Core requirement that students had to master basic math facts for addition and multiplication and the standard algorithms (step-by-step computational procedures) for all four operations—addition, subtraction, multiplication, and division.

California is a Common Core state and, for the most part, has avoided the political backlash that many states experienced a few years after the standards’ widespread adoption. The first Common Core–oriented framework, published in 2013, was noncontroversial; however, compromises reflected in the careful wording of some learning objectives led to an unraveling when the framework was revised and presented for public comment in 2021.

Unlike most of the existing commentary on the revised framework, my analysis here focuses on the elementary grades and how the framework addresses two aspects of math: basic facts and standard algorithms. The two topics are longstanding sources of disagreement between math reformers and traditionalists. They were flashpoints in the 1990s math wars, and they are familiar to most parents from the kitchen-table math that comes home from school. In the case of the California framework, these two topics illustrate how reformers have diverged from the state’s content standards, ignored the best research on teaching and learning, and relied on questionable research to justify the framework’s approach.

Jo Boaler is a math education professor at Stanford and a member of the California Math Framework writing committee.

Addition and Multiplication Facts

Fluency in mathematics usually refers to students’ ability to perform calculations quickly and accurately. The Common Core mathematics standards call for students to know addition and multiplication facts “from memory,” and the California math standards expect the same. The task of knowing basic facts in subtraction and division is made easier by those operations being the inverse, respectively, of addition and multiplication. If one knows that 5 + 6 = 11, then it logically follows that 11 – 6 = 5; and if 8 × 9 = 72, then surely 72 ÷ 9 = 8.

Cognitive psychologists have long pointed out the value of automaticity with number facts—the ability to retrieve facts immediately from long-term memory without even thinking about them. Working memory is limited; long-term memory is vast. In that way, math facts are to math as phonics is to reading. If these facts are learned and stored in long-term memory, they can be retrieved effortlessly when the student is tackling more-complex cognitive tasks. In a recent interview, Sal Khan, founder of Khan Academy, observed, “I visited a school in the Bronx a few months ago, and they were working on exponent properties like: two cubed, to the seventh power. So, you multiply the exponents, and it would be two to [the] 21st power. But the kids would get out the calculator to find out three times seven.” Even though they knew how to solve the exponent exercise itself, “the fluency gap was adding to the cognitive load, taking more time, and making things much more complex.”

California’s proposed framework mentions the words “memorize” and “memorization” 27 times, but all in a negative or downplaying way. For example, the framework states: “In the past, fluency has sometimes been equated with speed, which may account for the common, but counterproductive, use of timed tests for practicing facts. . . . Fluency is more than the memorization of facts or procedures, and more than understanding and having the ability to use one procedure for a given situation.” (All framework quotations here are from the most recent public version, a draft presented for the second field review, a 60-day public-comment period in 2022.)

One can find the intellectual origins of the framework on the website of Youcubed, a Stanford University math research center led by Jo Boaler, who is a math education professor at Stanford and member of the framework writing committee. Youcubed is cited 28 times in the framework, including Boaler’s essay on that site, “Fluency without Fear: Research Evidence on the Best Ways to Learn Math Facts.” The framework cites Boaler an additional 48 times.

The framework’s attempt to divorce fluency from speed (and from memory retrieval) leads it to distort the state’s math standards. “The acquisition of fluency with multiplication facts begins in third grade and development continues in grades four and five,” the framework states. Later it says, “Reaching fluency with multiplication and division within 100 represents a major portion of upper elementary grade students’ work.”

Both statements are inaccurate. The state’s 3rd-grade standard is that students will know multiplication facts “from memory,” not that they will begin fluency work and continue development in later grades. After 3rd grade, the standards do not mention multiplication facts again. In 4th grade, for example, the standards call for fluency with multidigit multiplication, a stipulation embedded within “understanding of place value to 1,000,000.” Students lacking automaticity with basic multiplication facts will be stopped cold. Parents who are concerned that their 4th graders don’t know the times tables, let alone how to multiply multidigit numbers, will be directed to the framework to justify children falling behind the standards’ expectations.

After the release of Common Core, the authors of the math standards published “Progressions” documents that fleshed out the standards in greater detail. The proposed framework notes approvingly, “The Progressions for the Common Core State Standards documents are a rich resource; they (McCallum, Daro, and Zimba, 2013) describe how students develop mathematical understanding from kindergarten through grade twelve.” But the Progressions contradict the framework on fluency. They state: “The word fluent is used in the Standards to mean ‘fast and accurate.’ Fluency in each grade involves a mixture of just knowing some answers, knowing some answers from patterns (e.g., ‘adding 0 yields the same number’), and knowing some answers from the use of strategies.”

Students progress toward fluency in a three-stage process: use strategies, apply patterns, and know from memory. Students who have attained automaticity with basic facts have reached the top step and just know them, but some students may take longer to commit facts to memory. As retrieval takes over, the possibility of error declines. Students who know 7 × 7 = 49 but must “count on” by 7 to confirm that 8 × 7 = 56 are vulnerable to errors to which students who “just know” that 7 × 8 = 56 are impervious. In terms of speed, the analogous process in reading is decoding text. Students who “just know” certain words because they have read them frequently are more fluent readers than students who must pause to sound out those words phonetically. This echoes the point Sal Khan made about students who know how to work with exponents raised to another power but still need a calculator for simple multiplication facts.

Sal Khan, the founder of Khan Academy, observed that "the fluency gap was adding to the cognitive load, taking more time, and making things much more complex."

Standard Algorithms

Algorithms are methods for solving multi-digit calculations. Standard algorithms are simply those used conventionally. Learning the standard algorithms of addition, subtraction, multiplication, and division allows students to extend single-digit knowledge to multi-digit computation, while being mindful of place value and the possible need for regrouping.

Barry Garelick, a math teacher and critic of Common Core, wrote a series of blog posts about the standards and asked, "Can one teach only the standard algorithm and meet the Common Core State Standards?" Jason Zimba, one of the three authors of the Common Core math standards, responded:

Provided the standards as a whole are being met, I would say that the answer to this question is yes. The basic reason for this is that the standard algorithm is “based on place value [and] properties of operations.” That means it qualifies. In short, the Common Core requires the standard algorithm; additional algorithms aren’t named, and they aren’t required.

Zimba provides a table showing how exclusively teaching the standard algorithms of addition and subtraction could be accomplished, presented not as a recommendation, but as “one way it could be done.” Zimba’s approach begins in 1st grade, with students—after receiving instruction in place value—learning the proper way to line up numbers vertically. “Whatever one thinks of the details in the table, I would think that if the culminating standard in grade 4 is realistically to be met, then one likely wants to introduce the standard algorithm pretty early in the addition and subtraction progression.”

Note the term “culminating standard.” That implies the endpoint of development. The framework, however, interprets 4th grade as the grade of first exposure, not the culmination—and extends that misinterpretation to all four operations with whole numbers. “The progression of instruction in standard algorithms begins with the standard algorithm for addition and subtraction in grade four; multiplication is addressed in grade five; the introduction of the standard algorithm for whole number division occurs in grade six,” the framework reads.

This advice would place California 6th graders years behind the rest of the world in learning algorithms. In Singapore, for example, division of whole numbers up to 10,000 is taught in 3rd grade. The justification for delay stated in the framework is: “Students who use invented strategies before learning standard algorithms understand base-ten concepts more fully and are better able to apply their understanding in new situations than students who learn standard algorithms first (Carpenter et al., 1997).”

The 1997 Carpenter study, however, is a poor reference for the framework’s assertion. That study’s authors declare, “Instruction was not a focus of this study, and the study says very little about how students actually learned to use invented strategies.” In addition, the study sample was not scientifically selected to be representative, and the authors warn, “The characterization of patterns of development observed in this study cannot be generalized to all students.”

As for the Progressions documents mentioned above, they do not prohibit learning standard algorithms before the grade level of the “culminating expectation.” Consistent with Jason Zimba’s approach, forms of the standard addition and subtraction algorithms are presented as 2nd grade topics, two years before students are required to demonstrate fluency.

The selective use of evidence extends beyond the examples above, as is clear from the research that is cited—and not cited—by the framework.

Research Cited by the Framework

On June 1, 2021, Jo Boaler issued a tweet asserting, “This 4 week camp increases student achievement by the equivalent of 2.8 years.” The tweet included information on a two-day workshop at Stanford for educators interested in holding a Youcubed-inspired summer camp. The Youcubed website promotes the summer camp with the same claim of additional years of learning.

Where did the 2.8 years come from? The first Youcubed math camp was held on the Stanford campus in 2015 with 83 6th and 7th graders. For 18 days, students spent mornings working on math problems and afternoons touring the campus in small groups, going on scavenger hunts, and taking photographs. The students also received instruction targeting their mathematical mindsets, learning that there is no such thing as “math people” and “nonmath people,” that being fast at math is not important, and that making mistakes and struggling, along with thinking visually and making connections between mathematical representations, promote brain growth. Big ideas, open-ended tasks, collaborative problem solving, lessons on mindset, and inquiry-based teaching—these are foundational to the framework. The camp offers a test run of the proposed framework, the document asserting that the camps “significantly increase achievement in a short period of time.”

The claim of growth is based on an assessment the researchers administered on the first and last days of the camp. The test consisted of four open-ended problems, called “tasks,” scored by a rubric, with both the problems and the rubric created by the Mathematical Assessment Research Service, or MARS. Students were given four tasks on the first day and the same four tasks on the final day of camp. An effect size of 0.91 was calculated by dividing the difference between the group’s pre- and post-test average scores by the pre-test standard deviation. How this effect size was converted into years of learning is not explained, but researchers usually do this based on typical rates of achievement growth among students taking standardized math tests in consecutive years.
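For readers who want to see the arithmetic, here is a minimal sketch of the kind of calculation described above. It is not the study's code or data; the rubric scores and the assumed annual-growth figure (0.3 standard deviations per year, a common benchmark for middle-grades math) are illustrative placeholders, and only the general formula, mean gain divided by the pre-test standard deviation, comes from the study.

```python
# A hedged illustration, not the Youcubed study's actual code or data.
import statistics

def glass_delta(pre_scores, post_scores):
    """Effect size: mean gain divided by the pre-test standard deviation."""
    gain = statistics.mean(post_scores) - statistics.mean(pre_scores)
    return gain / statistics.stdev(pre_scores)

def years_of_learning(effect_size, annual_growth_sd=0.3):
    """Convert an effect size to 'years of learning' by dividing by an assumed
    typical one-year gain on standardized math tests (0.3 SD is an assumption)."""
    return effect_size / annual_growth_sd

# Hypothetical rubric totals on the four MARS tasks, first and last day of camp.
pre = [4, 6, 5, 7, 3, 5, 6, 4]
post = [5, 7, 6, 8, 5, 6, 7, 5]

d = glass_delta(pre, post)
print(f"effect size = {d:.2f}, or about {years_of_learning(d):.1f} 'years of learning'")
```

Note how sensitive the "years of learning" figure is to the assumed annual growth rate: halving that assumption doubles the claimed years of learning, which is one reason the unexplained conversion deserves scrutiny.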

In 2019, the Youcubed summer-camp program went national. An in-house study was conducted involving 10 school districts in five states where the camps served about 900 students in total and ranged from 10 to 28 days. The study concluded, “The average gain score for participating students across all sites was 0.52 standard deviation units (SD), equivalent to 1.6 years of growth in math.”

Let’s consider these reported gains in the context of recent NAEP math scores. The 2022 scores triggered nationwide concern as 4th graders’ scores fell to 236 scale score points from 241 in 2019, a decline of 0.16 standard deviations. Eighth graders’ scores declined to 274 from 282, equivalent to 0.21 standard deviations. Headlines proclaimed that two decades of learning had been wiped out by two years of pandemic. A McKinsey report estimated that NAEP scores might not return to 2019 levels until 2036.
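As a back-of-the-envelope check (my own arithmetic, not NAEP's), dividing the reported point declines by the reported standard-deviation declines recovers the approximate scale-score standard deviations implied by those conversions:

```python
# Rough check of the conversions reported above; the implied SDs are an inference.
grade4_points, grade4_sd_units = 241 - 236, 0.16   # 5-point decline
grade8_points, grade8_sd_units = 282 - 274, 0.21   # 8-point decline

print(grade4_points / grade4_sd_units)  # ~31 scale points per SD at grade 4
print(grade8_points / grade8_sd_units)  # ~38 scale points per SD at grade 8
```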

If the Youcubed gains are to be believed, all pandemic learning losses can be restored, and additional gains achieved, by two to four weeks of summer school.

There are several reasons to doubt the study’s conclusions, the most notable of which is the lack of a comparison group to gauge the program’s effects as measured by the MARS outcome. School districts recruited students for the camps. No data are provided on the number of students approached, the number who refused, and the number who accepted but didn’t show up. The final group of participating students comprises the study’s treatment group. The claim that these students experienced 1.6 years of growth in math is based solely on the change in students’ scores on the MARS tasks between the first and last day of the program.

This is especially problematic because the researchers gave students the same four MARS tasks before and after the program. Using the exact same instrument to test and re-test students within four weeks could inflate post-treatment scores, especially if the students worked on similar problems during the camp. No data are provided confirming that the MARS tasks are suitable, in terms of technical quality, for use in estimating the summer camp’s effect. Nor do the authors demonstrate that the tasks are representative of the full range of math content that students are expected to master, which is essential to justify reporting students’ progress in terms of years of learning. Even the grade level of the tasks is unknown, although camp attendees spanned grades 5 to 7, and MARS offers three levels of tasks (novice, apprentice, and expert).

The study’s problems extend to its treatment of attrition from the treatment sample. For one of the participating school districts (#2), 47 students are reported enrolled, but the camp produces 234 test scores—a mystery that goes unexplained. When this district is omitted, the remaining nine districts are lacking pre- and post-test scores for about one-third of enrolled students, who presumably were absent on either the first or last day. The study reports attendance rates in each district as the percentage of students who attended 75 percent of the days or more, with the median district registering 84 percent. Four districts reported less than 70 percent of students meeting that attendance threshold. A conventional metric for attendance during a school year is that students who miss 10 percent of days are “chronically absent.” By that standard, attendance at the camps appears spotty at best, and in four of the 10 camps, quite poor.

These are serious weaknesses. Just as the camps serve as prototypes of the framework’s ideas about good curriculum and instruction, the studies of Youcubed summer camps are illustrative of what the framework considers compelling research. The studies do not meet minimal standards of causal evidence.

Brian Conrad, professor of mathematics at Stanford University, has analyzed the framework’s citations and documented many instances where the original findings of studies were distorted.

Research Omitted by the Framework

It is also informative to look at research that is not included in the California framework.

The What Works Clearinghouse, housed within the federal Institute of Education Sciences, publishes practice guides for educators. The guides aim to provide concise summaries of high-quality research on various topics. A panel of experts conducts a search of the research literature and screens studies for quality, following strict protocols. Experimental and quasi-experimental studies are favored because of their ability to estimate causal effects. The panel summarizes the results, linking each recommendation to supporting studies. The practice guides present the best scientifically sound evidence on causal relationships in teaching and learning.

How many of the studies cited in the practice guides are also cited in the framework? To find out, I searched the framework for citations to the studies cited by the four practice guides most relevant to K–12 math instruction. Here are the results:

Assisting Students Struggling with Mathematics: Intervention in the Elementary Grades
(2021) 0 out of 43 studies

Teaching Strategies for Improving Algebra Knowledge in Middle and High School Students
(2015, revised 2019) 0 out of 12 studies

Improving Mathematical Problem Solving in Grades 4 Through 8
(2012) 0 out of 37 studies

Developing Effective Fractions Instruction for Kindergarten Through 8th Grade
(2010) 1 out of 22 studies

Except for one study, involving teaching the number line to young children using games, the framework ignores the best research on K–12 mathematics. How could this happen?

One powerful clue: key recommendations in the practice guides directly refute the framework. Timed activities with basic facts, for example, are recommended to increase fluency, with the “Struggling Students” guide declaring “the expert panel assigned a strong level [emphasis original] of evidence to this recommendation based on 27 studies of the effectiveness of activities to support automatic retrieval of basic facts and fluid performance of other tasks involved in solving complex problems.” Calls for explicit or systematic instruction in the guides fly in the face of the inquiry methods endorsed in the framework. Worked examples, in which teachers guide students step by step from problem to solution, are encouraged in the guides but viewed skeptically by the framework for not allowing productive struggle.

Bumpy Road Ahead

The proposed California Math Framework not only ignores key expectations of the state’s math standards, but it also distorts or redefines them to serve a reform agenda. The standards call for students to know “from memory” basic addition facts by the end of 2nd grade and multiplication facts by the end of 3rd grade. But the framework refers to developing fluency with basic facts as a major topic of 4th through 6th grades. Fluency is redefined to disregard speed. Instruction on standard algorithms is delayed by interpreting the grades for culminating standards as the grades in which standard algorithms are first encountered. California’s students will be taught the standard algorithm for division years after the rest of the world.

The framework’s authors claim to base their recommendations on research, but it is unclear how—or even if—they conducted a literature search or what criteria they used to identify high-quality studies. The document serves as a manifesto for K–12 math reform, citing sources that support its arguments and ignoring those that do not, even if the omitted research includes the best scholarship on teaching and learning mathematics. Brian Conrad, professor of mathematics at Stanford University, has analyzed the framework’s citations and documented many instances where the original findings of studies were distorted. In some cases, the papers’ conclusions were the opposite of those presented in the framework.

The pandemic took a toll on math learning. To return to a path of achievement will require the effort of teachers, parents, and students. Unfortunately, if the state adopts the proposed framework in its current form, the document will offer little assistance in tackling the hard work ahead.

Tom Loveless, a former 6th-grade teacher and Harvard public policy professor, is an expert on student achievement, education policy, and reform in K–12 schools. He also was a member of the National Math Advisory Panel and U.S. representative to the General Assembly, International Association for the Evaluation of Educational Achievement, 2004–2012.

San Francisco’s Detracking Experiment
Course enrollments are a means to an end—student learning—not an end unto themselves.

The San Francisco Unified School District (SFUSD) adopted a detracking initiative in the 2014–15 school year, eliminating accelerated middle and high school math classes, including the option for advanced students to take Algebra I in eighth grade. The policy stands today. High schools feature a common math sequence of heterogeneously grouped classes studying Algebra I in ninth grade and Geometry in tenth grade. After tenth grade, students are allowed to take math courses reflecting different abilities and interests.

Implementation of the Common Core was offered as the impetus for the change. When the policy was first proposed, district officials summed up the reform this way: "There would no longer be honors or gifted mathematics classes, and there would no longer be Algebra I in eighth grade due to the Common Core State Standards in 8th grade." Parents received a flyer from the district reinforcing this message, explaining, "The Common Core State Standards in Math (CCSS-M) require a change in the course sequence for mathematics in grades 6–12." Phil Daro, one of Common Core's coauthors, served as a consultant to the district on both the design and political strategy of the detracking plan.

The policy was controversial from the start. Parents showed up in community meetings to voice opposition, and a petition urging the district to reverse the change began circulating. District officials launched a public relations campaign to justify the policy. Focused on the goal of greater equity, that campaign continues today. SFUSD declared detracking a great success, claiming that the graduating class of 2018–19, the first graduating class affected by the policy when in eighth grade, saw a drop in Algebra 1 repeat rates from 40 percent to 8 percent and that, compared to the previous year, about 10 percent more students in the class took math courses beyond Algebra II. Moreover, the district reported enrollment gains by Black and Hispanic students in advanced courses.

Important publications applauded SFUSD and congratulated the district on the early evidence of success. Education Week ran a story in 2018, “A Bold Effort to End Tracking in Algebra Shows Promise,” that described the reforms with these words: “Part of an ambitious project to end the relentless assignment of underserved students into lower-level math, the city now requires all students to take math courses of equal rigor through geometry, in classrooms that are no longer segregated by ability.” The National Council of Teachers of Mathematics (NCTM) issued a policy brief portraying the detracking effort as a model for the country. Omitted from these reviews was the fact that the “lower-level math” to which non-algebra eighth-graders were assigned was Common Core Eighth Grade Math, which SFUSD and NCTM had spent a decade depicting as a rigorous math course, as they do currently.

Jo Boaler, noted math reformer, professor at Stanford, and critic of tracking, teamed up with Alan Schoenfeld, Phil Daro and others to write “How One City Got Math Right” for The Hechinger Report, and Boaler and Schoenfeld published an op-ed, “New Math Pays Dividends for SF Schools” in the San Francisco Chronicle.

In this public relations campaign, there was no mention of math achievement or test scores. Course enrollments and passing grades were presented as meaningful metrics by which to measure the success of detracking.

They are bad measures. Course enrollments are a means to an end—student learning—not an end unto themselves. If a district enrolls students in courses that fail to teach important content, nothing has been accomplished. Boosting enrollment in advanced courses, therefore, is of limited value.[1] It’s also a statistic, along with grades, that is easily manipulated. No matter the school district, if word spreads that the superintendent would like to see more kids enrolled in higher math classes and fewer D and F grades in those classes, enrollments will go up and the number of D’s and F’s will go down.

Families for San Francisco

Families for San Francisco, a parent advocacy group, acquired data from the district under the California Public Records Act (the state's version of the Freedom of Information Act). The group's analysis calls into question the district's assertions. As mentioned previously, repeat rates for Algebra I dropped sharply after the elimination of Algebra I in eighth grade, but whether the reform had anything to do with that is questionable. The falling repeat rate occurred after the district changed the rules for passing the course, eliminating a requirement that students pass a state-designed end-of-course exam in Algebra I before gaining placement in Geometry. In a presentation prepared by the district, speaker notes to the relevant slide admit, "The drop from 40 percent of students repeating Algebra 1 to 8 percent of students repeating Algebra 1, we saw as a one-time major drop due to both the change in course sequence and the change in placement policy."

The claim that more students were taking “advanced math” classes (defined here as beyond Algebra II) also deserves scrutiny. Enrollment in calculus courses declined post-reform. The claim rests on a “compression” course the district offers, combining Algebra II and precalculus into a single-year course. The Families for San Francisco analysis shows that once the enrollment figures for the compression course are excluded, the enrollment gains evaporate. Why should they be excluded? The University of California rejected the district’s classification of the compression course as “advanced math,” primarily because the course topics fall short of content specifications for precalculus.

Smarter Balanced scores

The conventional way to measure achievement gaps—and progress towards closing them—is with scores on achievement tests. California students take the Smarter Balanced assessments in grades three through eight and in grade eleven. Following SFUSD's analytical strategy, let's compare scores from 2015, the last cohort of eleventh-graders under the previous policy, and 2019, the last cohort with pre-pandemic test scores.[2] Be aware, however, that both analyses, SFUSD's and the one presented here, fall far short of supporting causal claims. The purpose of the current analysis is to show that SFUSD's public relations campaign omitted information crucial to understanding what is actually going on.

As displayed in Table 1, SFUSD’s scores for eleventh-grade mathematics remained flat from 2015 (scale score of 2611) to 2019 (scale score of 2610), moving only a single point. Table 1 shows the breakdown by racial and ethnic groups. Black students made a small gain (+2), Hispanic scores declined (-14), White students gained (+17), and Asian students registered the largest gains (+22).

Table 1. San Francisco Unified School District Smarter Balanced Scores, grade 11, 2015–19


Table 2 offers some context for interpreting the scores. Smarter Balanced is vertically scaled so that scores can be compared across grades. On Smarter Balanced results from twelve states, the mean fifth-grade math score was 2498, well above the 2479 score for eleventh-grade Black students in SFUSD and the same as the 2498 score registered by eleventh-grade Hispanic students.[3] The mean Smarter Balanced sixth-grade score was 2515, well above the scores of both groups of eleventh graders in SFUSD.

Table 2. 2019 Smarter Balanced summative assessment scores, mathematics, by grade


Summing up: Black and Hispanic eleventh-graders in San Francisco score about the same as or lower than the typical fifth-grader who took the same math test. Black eleventh-graders fall just short of the threshold for being considered proficient in fourth-grade math and well below the cut point for demonstrating fifth-grade proficiency. The situation is appalling.

Are test score gaps narrowing?

Contrary to the district's spin, the trend towards greater equity is not headed in the right direction. Gaps are widening. Perhaps this trend is statewide and not just an SFUSD phenomenon.

Table 3 supplies the gap calculations from the data above in Table 1, along with a comparison to statewide trends. For example, at the state level, the eleventh-grade Black-White gap grew by 11 points—from 94 to 105—while in SFUSD, the gap expanded by 15 points (from 143 to 158). The Hispanic-White gap provides a more dramatic contrast. The state level gap grew by only 5 points, but in San Francisco, it expanded by a whopping 31 points. Glancing back at Table 2 again will provide context. The 31-point expansion is larger than the 20-point difference in mean scores for Smarter Balanced’s eighth-grade and high school assessments. That’s a big change.
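The gap arithmetic is straightforward: a gap widens by the White group's score change minus the other group's change. A short sketch using the SFUSD score changes reported above reproduces the district figures:

```python
# SFUSD 11th-grade Smarter Balanced score changes, 2015 to 2019 (from the text above).
score_change = {"Black": 2, "Hispanic": -14, "White": 17, "Asian": 22}

black_white_gap_growth = score_change["White"] - score_change["Black"]        # 17 - 2 = 15
hispanic_white_gap_growth = score_change["White"] - score_change["Hispanic"]  # 17 - (-14) = 31

print(black_white_gap_growth, hispanic_white_gap_growth)  # 15 and 31 scale-score points
```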

With both gaps, SFUSD evidenced greater inequities than state averages in 2015, and that relative underperformance worsened by 2019. The district’s anti-tracking public relations campaign, by focusing on metrics such as grades and course enrollments, diverts attention from the harsh reality that SFUSD is headed in the wrong direction on equity.

Table 3. Black-White and Hispanic-White gap, grade 11, California and San Francisco, 2015–19, by Smarter Balanced scale scores, mathematics

Could the situation be even worse?

Finally, as bad as the preceding data look, the reality of the district’s poor math achievement is probably worse. SFUSD has exceptionally low rates of test participation on the state test, especially among Black and Hispanic students. Don’t forget: This is the test that state and district officials use for accountability purposes. Participation is mandated by both federal and state law. If the students who don’t take the test tend to be low achievers—usually a fair assumption—the district’s test score performance could fall even lower once those students are included.

Table 4. 11th-grade students tested as a percentage of students enrolled

Conclusion

San Francisco Unified School District embarked on a detracking initiative in 2015, followed by an extensive public relations campaign to portray the policy as having successfully narrowed achievement gaps. The campaign omitted assessment data indicating that the Black-White and Hispanic-White achievement gaps have widened, not narrowed, the exact opposite of the district’s intention and of the story the district was selling to the public. Only SFUSD possesses the data needed to conduct a formal evaluation that would credibly identify the causal factors producing such dismal results.

Whether detracking can assist in the quest for greater equity is an open question. It could, in fact, exacerbate inequities by favoring high-achieving children from upper-income families—who can afford private-sector workarounds—or children with parents savvy enough to negotiate the bureaucratic hurdles SFUSD has erected to impede acceleration. As I have written elsewhere, the voluminous literature on tracking is better at describing problems than at solving them. The evidence that detracking promotes equity is sparse, drawn mostly from case studies whose findings do not generalize well to other settings and whose research designs do not support causal inferences.

If SFUSD would now approach tracking with an open mind, officials need not look far to discover equitable possibilities. Across the bay, David Card, a scholar at the University of California, Berkeley, won the 2021 Nobel Prize in Economics for his research applying innovative econometrics to thorny public policy problems. Card's recent studies, conducted with colleague Laura Giuliano, investigate tracking. In 2014, Card and Giuliano published a paper evaluating an urban district's tracking program based on prior achievement. They found that disadvantaged students and students of color, in particular, benefited from an accelerated curriculum, with no negative spillover effects for students pursuing the regular course of study. Card and Giuliano concluded, "Our findings suggest that a comprehensive tracking program that establishes a separate classroom in every school for the top-performing students could significantly boost the performance of the most talented students in even the poorest neighborhoods, at little or no cost to other students or the District's budget."

Card and Giuliano’s current project studies two large urban districts in Florida, predominantly Black and Hispanic, that provide mathematically talented students with the opportunity to accelerate through middle school math courses. When these students enter high school, they will have already completed Algebra I and Geometry. They begin high school two years ahead of students in San Francisco, opening up greater opportunities to take Advanced Placement (AP) courses in later years.

Which system is more equitable?


Notes:

1. An analysis that I conducted in 2013 showed a steadily increasing percentage of students who had taken Algebra II; however, NAEP scores for students who had taken Algebra II steadily declined as those enrollments increased.

2. California employs “Black or African American” and “Hispanic or Latino” as reporting categories. After Table 1, for the sake of clarity, the terms are shortened to “Black” and “Hispanic” in both tables and the narrative.

3. Using 2018 scores (the cohort of eleventh-graders first affected by detracked eighth-grade courses in 2015) would not change the analysis significantly except in one respect: the achievement gaps associated with race and ethnicity were larger in 2018 because of higher scores for White students. The 2018 scores were Asian (2682), Black or African American (2479), Hispanic or Latino (2497), and White (2650).

Tom Loveless, a former sixth-grade teacher and Harvard public policy professor, is an expert on student achievement, education policy, and reform in K-12 schools. He also was a member of the National Math Advisory Panel.

From TomLoveless.com via the Fordham Flypaper.

A Decade On, Has Common Core Failed?
Assessing the impact of national standards

The Common Core State Standards, released in 2010, were rapidly adopted by more than 40 states. Champions maintained that these rigorous standards would transform American education, but the initiative went on to encounter a bumpy path. A decade on, what are we to make of this ambitious effort? What kind of impact, if any, has it had on the quality of instruction and student learning—or is it too early to say?

In this forum, three experts present their views on these questions: Morgan Polikoff, associate professor at the Rossier School of Education at the University of Southern California; Michael J. Petrilli, president of the Thomas B. Fordham Institute and an executive editor at Education Next; and Tom Loveless, past director of the Brown Center on Education Policy at the Brookings Institution and former policy professor at Harvard.

 

Common Standards Aren’t Enough
By Morgan S. Polikoff

Stay the Course on National Standards
By Michael J. Petrilli

Common Core Has Not Worked
By Tom Loveless

This article appeared in the Spring 2020 issue of Education Next. Suggested citation format:

Polikoff, M.S., Petrilli, M.J., and Loveless, T. (2020). A Decade On, Has Common Core Failed? Assessing the impact of national standards. Education Next, 20(2), 72-81.

Common Core Has Not Worked
Forum: A Decade On, Has Common Core Failed?

Education standards do not flop spectacularly. Their failure gives rise to nothing like the black-and-white films of early aeronautical experiments: no missiles exploding on launch pads or planes tumbling from the sky. But 10 years after 46 of the 50 states adopted the Common Core standards, the lack of evidence that they have improved student achievement is nonetheless remarkable. Despite the fact that Common Core enjoyed the bipartisan support of policy elites and commanded vast financial resources from both public and private sources, it simply did not accomplish what its supporters had intended. The standards wasted both time and money and diverted those resources away from more promising pursuits.

Three studies have now sought to examine the effects of Common Core and, more generally, “college- and career-ready” standards on student learning. The picture that emerges does not inspire confidence. The most recent study, conducted in 2019 by the federally funded Center on Standards, Alignment, Instruction, and Learning, or C-SAIL, found that college- and career-ready standards had negative effects on student performance on the National Assessment of Educational Progress, or NAEP, in both 4th-grade reading and 8th-grade math. A series of analyses that I conducted over several years revealed mixed effects from Common Core in states defined as “strong implementers” of the standards. And a 2017 study showed that adoption of Common Core standards did prompt many states to raise their performance benchmarks—that is, the minimum score at which students are judged as attaining “proficiency” on state tests. These higher proficiency bars, however, have not translated into higher student achievement. It is time to accept that Common Core didn’t fulfill its promise.

C-SAIL Study

C-SAIL’s 2019 study examined states’ average NAEP scores in 2010, the year in which most states adopted the standards, and in 2017. Researchers theorized that, among the states adopting Common Core, those that had weak standards before 2010 stood to incur the greatest gains, while those that had more rigorous standards in place before Common Core would experience the least change because they already had high expectations for students. Based on the Fordham Institute’s 2010 evaluations of state English language arts and math standards, the C-SAIL research team created a Prior Rigor Index, assigning states with weak standards to the “treatment” group and states with strong standards to the comparison group. (States scoring in the middle of Fordham’s rating scale were excluded from the analysis, to provide a sharper contrast, as were states adopting standards in any year other than 2010.)

In this analysis, researchers detected statistically significant negative effects in both 4th-grade reading and 8th-grade math.

The C-SAIL team conducted a second analysis using what they dubbed a Prior Similarity Index. A 2009 study by researchers at Michigan State University had determined that some states’ 2009 math standards were similar to Common Core in terms of focus and coherence, while other states’ standards were inferior on those qualities. The states with the “less similar” standards comprised the treatment group, since researchers assumed that Common Core imposed a substantial change in those places. States with prior math standards that were similar to Common Core’s were assigned to the comparison group.

This second analysis uncovered no statistically significant effects.

All of the estimated effects from both analyses are negative, with losses ranging from about 1.5 to 4 NAEP scale score points. The effects are also small, especially considering that they represent a policy unfolding over seven years. Consider these results in the context of the history of NAEP scores for the nation as a whole. Losses on NAEP are rare, but relatively large gains are common. NAEP advances of four or more points have been registered during short periods: 4th-grade reading (6 points, 2000–02), 8th-grade reading (4 points, 1994–98), 4th-grade math (9 points, 2000–03), and 8th-grade math (5 points, 2000–03).

Common Core supporters were understandably disappointed by these findings, but a particularly disheartening discovery was that the losses did not abate, and in fact, were still accumulating in 2015–17. It became harder for these advocates to urge patience and argue that Common Core’s positive impact would eventually emerge: the negative effects of Common Core were larger in 2017 than in any previous year.

Impact on State Proficiency Standards

While the evidence indicates that Common Core failed to improve academic achievement, the standards did prompt states to raise their benchmarks for student learning. In 2017, Jaekyung Lee and Yin Wu of the University at Buffalo, SUNY, investigated the effects of Common Core on state proficiency standards for reading and math (that is, the minimum scores set for students to be identified as "proficient" on state tests) and student achievement. They found that Common Core states raised the proficiency bar more than non-adopting states during this period. Raising this standard makes it more difficult for students to score as proficient and thereby raises expectations. Echoing previous research, though, the researchers found that raising or lowering the proficiency bar was not associated with gains in student achievement on NAEP from 2009 to 2015. The authors caution: "Although it is premature to make any verdict on the impact of the CCSS [Common Core] on student achievement, the findings of this study as well as previous studies raise concerns about implementation challenges and limitations of the current CCSS-based education policies."

Brown Center Report Studies

In 2014–16, I conducted a series of correlational analyses of Common Core, published in the Brookings Institution's Brown Center Report on American Education. In 2018, I released a follow-up study. The goal of these studies was to examine whether Common Core was more effective in states that took implementation of the standards seriously.

My method was to compare test results in the states that rejected Common Core (non-adopters) with those in states that were "strong implementers" of the standards. I conducted two sets of comparisons with different criteria for identifying states as strong implementers.

The first group of strong implementers comprised states that in 2011 reported spending federal stimulus funds on three activities to support standards implementation: professional development, new instructional materials, and joining a testing consortium.

For the second set of comparisons, I designated as "strong implementers" the states with ambitious timelines for fully implementing Common Core "in classrooms." These 11 states planned on full implementation by the end of the 2012–13 academic year. These criteria were designed to be dynamic. The composition of the groups changed over time with changes in state policy toward Common Core. After 2013, states that formally rescinded the standards were re-categorized as non-adopters for the NAEP period in which the policy change occurred. Non-adopters grew to 10 states in 2017 from 5 states in 2013, and strong implementers declined to 8 states in 2017 from 11 states in 2013.

For this essay, I developed a third strategy for identifying strong implementers, based on whether in 2017 a state used either of the two assessments that were specifically developed to align with the Common Core standards: the Partnership for Assessment of Readiness for College and Careers test or the Smarter Balanced test. (I counted the three states that used some items from these tests in a hybrid state assessment—Louisiana, Massachusetts, and Michigan—as among those using a “Common Core” test.)

The premise of this strategy is that states using a prominent Common Core–aligned test in 2017 were publicly indicating a strong commitment to the standards. This model has the advantage of producing larger comparison groups than the other two—with 23 states using a Common Core test and 27 not.

Results of the comparisons are mixed (see Table 1). Some of the changes in NAEP performance associated with Common Core are positive and some are negative. These effects are also quite small—plus or minus about 2 NAEP scale score points. The results are more favorable toward Common Core than those of the C-SAIL study, especially in reading: the improvement in 4th-grade reading ranged from 0.2 scale score points to 2.4 points—but these findings agree with C-SAIL’s conclusion that only minimal changes in NAEP scores are associated with states embracing or rejecting Common Core.
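To make the comparison strategy concrete, here is a minimal sketch of the basic computation: the average NAEP gain for states using a Common Core–aligned test minus the average gain for states that do not. The state names and scores below are hypothetical placeholders, not the figures used in the actual analysis, which come from the NAEP Data Explorer and the grouping criteria described above.

```python
# A sketch of the group-comparison calculation. All data are hypothetical.

# Hypothetical 4th-grade reading scale scores in two NAEP years.
naep_scores = {
    "State A": {"group": "common_core_test", "score_2013": 221, "score_2017": 222},
    "State B": {"group": "common_core_test", "score_2013": 219, "score_2017": 221},
    "State C": {"group": "no_common_core_test", "score_2013": 223, "score_2017": 222},
    "State D": {"group": "no_common_core_test", "score_2013": 220, "score_2017": 221},
}

def mean_gain(group):
    """Average change in scale score for all states in a group."""
    gains = [s["score_2017"] - s["score_2013"]
             for s in naep_scores.values() if s["group"] == group]
    return sum(gains) / len(gains)

# Positive values favor the states using a Common Core-aligned test.
difference = mean_gain("common_core_test") - mean_gain("no_common_core_test")
print(f"Difference in average NAEP gain: {difference:+.1f} scale score points")
```

As in the actual studies, a number near zero, whether positive or negative, indicates that test choice (and, by extension, commitment to the standards) is associated with essentially no difference in NAEP trends.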

Time to Cut Bait?

A decade after the release of the Common Core standards, the accumulated evidence reveals no meaningfully positive result. A limitation of this research is the difficulty of pinpointing precisely when Common Core should be considered fully implemented and of evaluating the fidelity of that implementation. Self-selection could also be a problem if unknown factors influenced states in adopting or rejecting Common Core and those factors subsequently influenced state NAEP scores. Yet the research to date on Common Core reinforces a larger body of evidence suggesting that academic-content standards bear scant relevance to student learning. In a recent blog post, Robert Slavin of Johns Hopkins University observes that “plentiful evidence from rigorous studies” indicates that adopting one set of standards over another “makes little difference in student achievement.” Slavin notes that of the dozens of favorable reviews of curricula posted by EdReports.org, a curriculum-evaluation organization that was founded to support Common Core implementation, only two programs with high ratings have any empirical evidence of effectiveness. Alignment with Common Core, not evidence of boosting student learning, is the first screen in the EdReports review process.

A curriculum-review process that gives greater weight to adherence to standards than to impact on learning is not identifying high-quality curricula; it is identifying conforming curricula. An example rich with irony can be found in the textbook series Math in Focus, which is based on the math standards of Singapore. Students in that nation consistently score near the top of international math assessments, and the authors of Common Core touted Singapore as one of the countries whose standards they consulted in developing Common Core. In the early days of implementation, Common Core supporters pointed to Singapore math as ideal for implementing their vision of high-quality mathematics instruction. Math in Focus produced impressive learning gains in three rigorous studies of effectiveness that involved about 3,000 children.

But Math in Focus failed the EdReports review. How can that be? The textbook series moves students more quickly through elementary math than Common Core dictates. A common refrain in the EdReports reviews is that topics from later grades are introduced, taking the program out of alignment with the standards. A program with rigorous evidence of effectively teaching math is vetoed while programs with no evidence of boosting learning are endorsed because they are compatible with Common Core.

In short, the evidence suggests student achievement is, at best, about where it would have been if Common Core had never been adopted, if the billions of dollars spent on implementation had never been spent, if the countless hours of professional development inducing teachers to retool their lessons had never been imposed. When will time be up on the Common Core experiment? How many more years must pass, how much more should Americans spend, and how many more effective curricula must be pushed aside before leaders conclude that Common Core has failed?

This piece is part of a forum, “A Decade On, Has Common Core Failed?” For alternate takes, please see “Common Standards Aren’t Enough” by Morgan S. Polikoff, and “Stay the Course on National Standards” by Michael J. Petrilli.

This article appeared in the Spring 2020 issue of Education Next. Suggested citation format:

Polikoff, M.S., Petrilli, M.J., and Loveless, T. (2020). A Decade On, Has Common Core Failed? Assessing the impact of national standards. Education Next, 20(2), 72-81.

The post Common Core Has Not Worked appeared first on Education Next.

Racial Disparities in School Suspensions https://www.educationnext.org/racial-disparities-school-suspensions/ Wed, 29 Mar 2017 00:00:00 +0000 http://www.educationnext.org/racial-disparities-school-suspensions/ Future efforts at discipline reform must reflect fundamental fairness while also ensuring orderly schools and welcoming learning environments.

The 2017 Brown Center Report (BCR) on American Education was released last week, and one of the report’s studies focuses on out-of-school suspensions. For the past several years, state education leaders in California have encouraged schools to reduce these exclusionary punishments. A major reason for doing so is that racial disparities associated with suspensions are glaring: Suspensions of African-American students occur at rates three to four times higher than the state average for all students.

Suspensions have declined dramatically. From 2012 to 2015, the number of suspensions in the state fell from 539,134 to 334,649, a decline of 37.9 percent. The decline has been evident among all major ethnic groups alike, so the racial disparities associated with this form of discipline have not disappeared. The BCR study calculated suspension rates as the number of suspensions involving a particular race divided by the student enrollment of that race. In 2015, the statewide African-American suspension rate was 17.8 percent, meaning 17.8 suspensions of African-Americans occurred for every 100 African-American students enrolled. The figure for Hispanics was 5.2 percent; for whites, 4.4 percent; and for Asians, 1.2 percent.
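As a quick illustration of that rate calculation, here is a minimal sketch. The suspension and enrollment counts are invented round numbers chosen only to show how the rates are derived; they are not California's actual data.

```python
# Hypothetical counts, for illustration only (not California's actual figures).
suspensions = {"African-American": 178, "Hispanic": 52, "White": 44, "Asian": 12}
enrollment = {"African-American": 1000, "Hispanic": 1000, "White": 1000, "Asian": 1000}

for group, count in suspensions.items():
    # Suspension rate = suspensions involving a group / enrollment of that group
    rate = 100 * count / enrollment[group]
    print(f"{group}: {rate:.1f} suspensions per 100 students enrolled")
```

Note that the numerator counts suspensions, not suspended students, so a student suspended twice is counted twice; that is why the figures are best read as suspensions per 100 students enrolled rather than as the percentage of students suspended.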

Where are high suspension rates for blacks more prevalent? The study sorted schools into two groups: High-suspension-rate schools, those with black suspension rates of 5 percent or higher, and low-suspension-rate schools, those with black suspension rates of less than 5 percent. Three school characteristics stood out as associated with higher rates: middle schools (as opposed to elementary or K-8 schools), large schools (especially if enrollment exceeds 1,300 students), and schools with a higher percentage of African-American enrollment (that pattern is seen in schools with greater than 16 percent African-American enrollment).

The study has implications for both research and policy. Today’s discipline reformers are promoting programs such as restorative justice interventions as alternatives to suspending students. Policymakers should also consider whether altering the structural characteristics of schools—reconfiguring large middle schools as smaller K-8 schools, for example—may prove helpful in reducing suspensions.

Evaluations of discipline reform should not be limited to the impact on students who are at risk of being suspended, but should also assess the impact of new approaches on school safety and the quality of the instructional environment. A recent report from Max Eden of the Manhattan Institute analyzed survey data from New York City and concluded that discipline reform may be contributing to a deterioration in school climate. Schools with over 90 percent minority enrollment “experienced the worst climate shifts.” The cause of equity will be ill-served if suspensions of African-Americans are reduced, but black students who come to school ready to learn are increasingly exposed to unruly peers.

Factoring suspension rates into school accountability systems may prove challenging for policymakers. The state of California recently debuted its new “California School Dashboard,” a multiple-measure accountability system. Suspension rates are one of the performance indicators of school quality. Each indicator has five levels ranging from “very low” to “very high.” Schools are graded on both current performance (status) and improvement over time (change). They are also graded on a curve, with performance measured relative to other schools. In recognition of the association noted above—that suspension rates differ by the grade configuration of schools—performance levels are different for elementary, middle, and high schools. That could cause confusion. An elementary school and a middle school can receive different performance ratings even though they have identical suspension rates.
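The sketch below illustrates the source of that confusion. The cut points are invented for illustration only; California's actual Dashboard uses different values and also folds in the "change" component. The point is simply that grade-span-specific performance levels can assign different ratings to identical suspension rates.

```python
# Illustrative only: hypothetical status cut points by school type.
import bisect

# Percent-suspended thresholds separating the five performance levels.
cut_points = {
    "elementary": [0.5, 1.5, 3.0, 6.0],
    "middle":     [1.0, 3.0, 6.0, 10.0],
}
levels = ["very low", "low", "medium", "high", "very high"]

def status_level(school_type, suspension_rate):
    """Map a suspension rate to one of five levels for a given school type."""
    return levels[bisect.bisect_left(cut_points[school_type], suspension_rate)]

rate = 4.0  # identical suspension rate at two schools
print("Elementary:", status_level("elementary", rate))  # -> "high"
print("Middle:    ", status_level("middle", rate))      # -> "medium"
```

With these hypothetical thresholds, the same 4 percent suspension rate rates "high" at an elementary school but only "medium" at a middle school, which is the kind of result that could puzzle parents comparing the two.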

Race and school discipline is a thorny issue. A study by Education Week found that school resource officers, who essentially function as law enforcement personnel, are more likely to be deployed on campuses with large numbers of black students. As a result of the “zero tolerance” approach to school discipline that was popular in decades past, many African-American students attend schools where student behavior is under intense scrutiny. A 2017 study by researchers from the University of Texas and Stanford University found that middle school is a time when black students begin to believe they don’t get a fair shake when it comes to discipline. Future efforts at discipline reform must reflect fundamental fairness while also ensuring orderly schools and welcoming learning environments. That won’t be easy. More research, instead of legislation or regulation, is needed to assist local educators in tackling this daunting task.

— Tom Loveless

Tom Loveless is a senior fellow in Governance Studies at the Brookings Institution in Washington, D.C.

This post originally appeared on the Brown Center Chalkboard.

The post Racial Disparities in School Suspensions appeared first on Education Next.

The Strange Case of the Disappearing NAEP https://www.educationnext.org/the-strange-case-of-the-disappearing-naep/ Thu, 20 Oct 2016 00:00:00 +0000 http://www.educationnext.org/the-strange-case-of-the-disappearing-naep/ Why has NAEP abandoned its foundational assessment and embarked on a new agenda?

The long term trend test of the National Assessment of Educational Progress (LTT NAEP) is the longest-running test of student achievement that provides a scientifically valid estimate of what American students have learned. For over four decades, beginning with a science assessment in 1969, the LTT NAEP has tested randomly selected groups of students ages 9, 13, and 17 in several school subjects. Reading and mathematics have been assessed the most. The reading test began in 1971 and the math test in 1973.[i]

The last LTT NAEP was given in 2012. It was scheduled to be given in 2016, but that assessment was cancelled because of budget cuts. It was then scheduled for 2020, but earlier this year, again for budgetary reasons, that assessment was also cancelled. Currently, the next LTT NAEP is scheduled for 2024. If it is indeed administered then—and there is no guarantee that it will be—twelve years will have passed between LTT NAEP tests. Up until 2012, the longest interval without an LTT NAEP was five years.

Researchers have questioned the wisdom of delaying the LTT NAEP for such a long period of time.[ii] Scores on the main NAEP—the other, better-known NAEP assessment—have stagnated in recent years. In 2016, the LTT NAEP could have provided another authoritative measure of national achievement, at a time when Common Core and other education reforms are changing U.S. schooling. In fact, when the main NAEP was introduced in the 1990s, it was designated the national assessment that would change periodically to reflect changes in curricular fashions. The LTT was designated the national assessment that would remain unchanged by shifting sentiments.

NAEP’s schedule is set by the National Assessment Governing Board (NAGB). At its November 2015 meeting, the board issued a press release explaining that “budget constraints and priorities for the NAEP program” necessitated several cutbacks, including postponement of the LTT NAEP.[iii]

Three of NAGB’s top priorities are questionable:

DIGITAL ASSESSMENTS

All NAEP assessments are to be digitally based by 2017. Digital assessments have merits, but NAGB has yet to release a cost-benefit study showing that exclusively using digital platforms is worth the cost.[iv] In addition, it’s quite likely, given the rapid advances of assessment technologies, that whatever NAGB decides to embrace today will be obsolete in a few years. The two key questions are: How much will going all-digital cost, not just now but in the immediate years ahead? What are the added benefits beyond paper-and-pencil tests that justify the exclusive use of digital assessments? The danger is that NAEP, with a severely constrained budget, is now on the hook for escalating costs that will produce negligible or ephemeral benefits.

TRIAL URBAN DISTRICT ASSESSMENT (TUDA)

Administered every two years in math and reading, the main NAEP oversamples in 21 large urban districts and releases the scores separately as TUDA. Is every two years necessary? The districts don’t depend on NAEP to tell them how students are doing; they already take part in annual state assessments. Moreover, even without TUDA, it’s possible to compare big city students’ NAEP scores from state to state. You can compare the performance of schools in New York state’s big cities to the performance of big city students in California. But you can’t disaggregate by city, to compare, for example, New York City students’ reading performance to that of students in Los Angeles. So the only true benefit of TUDA is for its 21 districts to compare themselves to other TUDA districts that are not in their own state. This seems like a pretty slim reward. Doesn’t it make sense to administer TUDA less frequently and to redeploy those funds so that we can compare U.S. students’ reading performance in 2016 to reading performance in 1971?

TECHNOLOGY AND ENGINEERING LITERACY (TEL)

TEL is NAEP’s newest assessment. After several years in development, it was first given to eighth graders in 2014. The word “literacy” should set off alarm bells. As educational jargon, the term extends far beyond the traditional definition of being able to read and write. When appended to academic subjects other than reading, “literacy” de-emphasizes disciplinary knowledge in favor of applications and problem-solving. NAEP’s Technology and Engineering Literacy assessment is a perfect example.

The test is completely computer-based. It seeks to tap “21st Century learning skills” such as collaboration and communication. Students are presented with scenarios and asked to solve particular problems. NAEP has produced a series of videos to explain TEL. In one scenario, students are asked to play the role of an engineer and help a remote village where the water well is no longer working. Viewers are told, “The student is not expected to have any prior knowledge of wells or hand pumps but is challenged to use the information provided to get to the root of the problem.” The narrator goes on to explain, “The point of a task like this is to measure how students apply real-world trouble-shooting skills.”

The scenario is beautifully crafted and engaging. Students will enjoy the task. But what is really being measured here? Does the item assess key knowledge and skills of engineering? NAEP already has a lot of problem-solving items in its math and science tests, so for the TEL to generate unique information it needs to target the non-mathematical, non-scientific side of technology and engineering. Are engineers who don’t know math and science in demand? Moreover, few eighth graders have actually taken a formal course in technology or engineering, so background information that real engineers would consider elementary must either be provided to the student or avoided altogether.

Questions concerning transfer always come up with items like the water pump problem. For students who do well on the item, does it mean they possess skill at solving a wide variety of engineering problems that students who do poorly do not possess?  Or does it merely mean they could figure out the solution to this particular problem because the right information was given to them? The former may be important; the latter clearly is not—and not worth monitoring with a national test.  Solving problems without content knowledge is a dubious activity.[v]

CONCLUSION

NAGB has postponed the LTT NAEP until 2024 because of budgetary constraints.  In the meantime, it has pursued other projects, among them: making NAEP all-digital by 2017, continuing the TUDA assessment every two years, and launching the TEL assessment.  The LTT NAEP is a national treasure. It is the only test that measures trends in American academic achievement back to the early 1970s. It is also NAEP’s original reason for existence.  But some question its value.[vi]  If NAGB has decided that the LTT NAEP is no longer worthy of funding at all, that the LTT should be abandoned permanently, it has been silent on the rationale for that decision.

The three priorities that have taken precedence over the LTT NAEP incur significant costs and may produce benefits that fall short of justifying the expense. Properly evaluating NAGB’s priorities requires evidence. But NAGB hasn’t released figures on how much the new projects cost, how much the LTT NAEP costs, all of the projects’ expected costs in upcoming fiscal years, or the informational benefits each of the projects is expected to yield. Benefit-cost analyses are a conventional component of policy deliberations.

This is an election year. With a new administration and a new Congress coming to power in January, NAGB can begin to address its budgetary problems by releasing information that Congressional committees will want to see. Why NAEP has abandoned its foundational assessment and embarked on its current agenda should be the central question of NAEP’s request for funding next year.

– Tom Loveless

Tom Loveless is a nonresident senior fellow at Brookings. This first appeared on the Brown Center Chalkboard.


[i] https://nces.ed.gov/nationsreportcard/ltt/interpreting_results.aspx 

[ii] See Kristin Blagg and Matthew M. Chingos, “Varsity Blues: Are High School Students Being Left Behind?” (Urban Institute, 2016) and Nat Malkus, “Taking too long on the NAEP long term trend assessments,” (American Enterprise Institute, 2015).

[iii] Also see the resolution urging increased NAEP funding adopted in August 2015: https://www.nagb.org/content/nagb/assets/documents/policies/NAEP%20Funding%20Resolution%20Approved%208.8.15.pdf

[iv] NCES offers a website devoted to NAEP’s transition to digitally based assessments: https://nces.ed.gov/nationsreportcard/dba/

[v] The question of whether problem solving skills can be divorced from content knowledge has been debated for decades. For a discussion regarding mathematics, see: Jamin Carson, “A Problem with Problem Solving: Teaching Thinking without Teaching Knowledge,” The Mathematics Educator, v. 17, no. 2 (2007), pp. 7-14.

[vi] In an Education Week story on the LTT NAEP, University of Illinois professor Sarah Lubienski described the LTT NAEP as a test of basic skills and dismissed the assessment’s importance.

The post The Strange Case of the Disappearing NAEP appeared first on Education Next.

The NAEP Proficiency Myth https://www.educationnext.org/naep-proficiency-myth/ Mon, 20 Jun 2016 00:00:00 +0000 http://www.educationnext.org/naep-proficiency-myth/ NAEP proficient is not synonymous with grade level. It is a standard set much higher than that.

On May 16, I got into a Twitter argument with Campbell Brown of The 74, an education website. She released a video on Slate giving advice to the next president. The video begins: “Without question, to me, the issue is education. Two out of three eighth graders in this country cannot read or do math at grade level.” I study student achievement and was curious. I know of no valid evidence to make the claim that two out of three eighth graders are below grade level in reading and math. No evidence was cited in the video. I asked Brown for the evidentiary basis of the assertion. She cited the National Assessment of Educational Progress (NAEP).

NAEP does not report the percentage of students performing at grade level. NAEP reports the percentage of students reaching a “proficient” level of performance. Here’s the problem. That’s not grade level.

In this post, I hope to convince readers of two things:

1. Proficient on NAEP does not mean grade level performance. It’s significantly above that.
2. Using NAEP’s proficient level as a basis for education policy is a bad idea.

Before going any further, let’s look at some history.

NAEP history

NAEP was launched nearly five decades ago. The first NAEP test was given in science in 1969, followed by a reading test in 1971 and math in 1973. For the first time, Americans were able to track the academic progress of the nation’s students. That set of assessments, which periodically tests students 9, 13, and 17 years old and was last given in 2012, is now known as the Long Term Trend (LTT) NAEP.

It was joined by another set of NAEP tests in the 1990s. The Main NAEP assesses students by grade level (fourth, eighth, and twelfth) and, unlike the LTT, produces not only national but also state scores. The two tests, LTT and main, continue on parallel tracks today, and they are often confounded by casual NAEP observers. The main NAEP, which was last administered in 2015, is the test relevant to this post and will be the only one discussed hereafter. The NAEP governing board was concerned that the conventional metric for reporting results (scale scores) was meaningless to the public, so achievement standards (also known as performance standards) were introduced. The percentage of students scoring at advanced, proficient, basic, and below basic levels are reported each time the main NAEP is given.

Does NAEP proficient mean grade level?

The National Center for Education Statistics (NCES) states emphatically, “Proficient is not synonymous with grade level performance.” The National Assessment Governing Board has a brochure with information on NAEP, including a section devoted to myths and facts. There, you will find this:

Myth: The NAEP Proficient level is like being on grade level.

Fact: Proficient on NAEP means competency over challenging subject matter. This is not the same thing as being “on grade level,” which refers to performance on local curriculum and standards. NAEP is a general assessment of knowledge and skills in a particular subject.

Equating NAEP proficiency with grade level is bogus. Indeed, the validity of the achievement levels themselves is questionable. They immediately came under fire in reviews by the U.S. Government Accountability Office, the National Academy of Sciences, and the National Academy of Education. [1] The National Academy of Sciences report was particularly scathing, labeling NAEP’s achievement levels as “fundamentally flawed.”

Despite warnings of NAEP authorities and critical reviews from scholars, some commentators, typically from advocacy groups, continue to confound NAEP proficient with grade level. Organizations that support school reform, such as Achieve Inc. and Students First, prominently misuse the term on their websites. Achieve presses states to adopt cut points aligned with NAEP proficient as part of new Common Core-based accountability systems. Achieve argues that this will inform parents whether children “can do grade level work.” No, it will not. That claim is misleading.

How unrealistic is NAEP proficient?

Shortly after NCLB was signed into law, Robert Linn, one of the most prominent psychometricians of the past several decades, called the target of 100% proficient or above according to the NAEP standards “more like wishful thinking than a realistic possibility.” History is on the side of that argument. When the first main NAEP in mathematics was given in 1990, only 13% of eighth graders scored proficient and 2% scored advanced. Imagine using “proficient” as synonymous with grade level—85% scored below grade level!

The 1990 national average in eighth grade scale scores was 263 (see Table 1). In 2015, the average was 282, a gain of 19 scale score points.

Table 1. Main NAEP Eighth Grade Math Score, by achievement levels, 1990-2015

Year | Average Scale Score | Below Basic (%) | Basic (%) | Proficient (%) | Advanced (%) | Proficient and Above (%)
2015 | 282 | 29 | 38 | 25 | 8 | 33
2009 | 283 | 27 | 39 | 26 | 8 | 34
2003 | 278 | 32 | 39 | 23 | 5 | 28
1996 | 270 | 39 | 38 | 20 | 4 | 24
1990 | 263 | 48 | 37 | 13 | 2 | 15

That’s an impressive gain. Analysts who study NAEP often use 10 points on the NAEP scale as a back-of-the-envelope estimate of one year’s worth of learning. By that yardstick, eighth graders have gained almost two years. The percentage of students scoring below basic has dropped from 48% in 1990 to 29% in 2015. The percentage of students scoring proficient or above has more than doubled, from 15% to 33%. That’s not bad news; it’s good news.

But the cut point for NAEP proficient is 299. By that standard, two-thirds of eighth graders are still falling short. Even students in private schools, despite hailing from more socioeconomically advantaged homes and in some cases being selectively admitted by schools, fail miserably at attaining NAEP proficiency. More than half (53 percent) are below proficient.

Today’s eighth graders have made it about halfway to NAEP proficient in 25 years, but they still need to gain almost two more years of math learning (17 points) to reach that level. And don’t forget, that’s just the national average, so even when that lofty goal is achieved, about half of the nation’s students will still fall short of proficient. Advocates of the NAEP proficient standard want it to be met by all students. That is ridiculous. Another way to think about it: proficient for today’s eighth graders reflects approximately what the average twelfth grader knew in mathematics in 1990. Someday the average eighth grader may be able to do that level of mathematics. But it won’t be soon, and it won’t be every student.
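For readers who want to check the arithmetic, here is a small sketch using the 10-points-per-year rule of thumb cited above. The heuristic is rough, not an official NAEP conversion; the scale scores and cut point come from Table 1 and the text.

```python
# A quick check of the arithmetic, using the rough 10-points-per-year heuristic.
POINTS_PER_YEAR = 10  # back-of-the-envelope rule of thumb, not an official conversion

avg_1990 = 263        # 8th-grade math average, 1990 (Table 1)
avg_2015 = 282        # 8th-grade math average, 2015 (Table 1)
proficient_cut = 299  # NAEP proficient cut point for 8th-grade math

gain_years = (avg_2015 - avg_1990) / POINTS_PER_YEAR
remaining_years = (proficient_cut - avg_2015) / POINTS_PER_YEAR

print(f"Gain since 1990: {avg_2015 - avg_1990} points, about {gain_years:.1f} years")
print(f"Still needed to reach proficient: {proficient_cut - avg_2015} points, "
      f"about {remaining_years:.1f} years")
```

The output, roughly 1.9 years gained and 1.7 years still to go, is the basis for the "almost two years" figures in the preceding paragraphs.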

In the 2007 Brown Center Report on American Education, I questioned whether NAEP proficient is a reasonable achievement standard. [2] That year, a study by Gary Phillips of American Institutes for Research was published that projected the 2007 TIMSS scores on the NAEP scale. Phillips posed the question: based on TIMSS, how many students in other countries would score proficient or better on NAEP? The study’s methodology only produces approximations, but they are eye-popping.

Here are just a few countries:

Table 2. Projected Percent NAEP Proficient, Eighth Grade Math

Singapore 73
Hong Kong SAR 66
Korea, Rep. of 65
Chinese Taipei 61
Japan 57
Belgium (Flemish) 40
United States 26
Israel 24
England 22
Italy 17
Norway 9

Singapore was the top scoring nation on TIMSS that year, but even there, more than a quarter of students fail to reach NAEP proficient. Japan is not usually considered a slouch on international math assessments, but 43% of its eighth graders fall short. The U.S. looks weak, with only 26% of students proficient. But England, Israel, and Italy are even weaker. Norway, a wealthy nation with per capita GDP almost twice that of the U.S., can only get 9 out of 100 eighth graders to NAEP proficient.

Finland isn’t shown in the table because it didn’t participate in the 2007 TIMSS. But it did in 2011, with Finland and the U.S. scoring about the same in eighth grade math. Had Finland’s eighth graders taken NAEP in 2011, it’s a good bet that the proportion scoring below NAEP proficient would have been similar to that in the U.S. And yet articles such as “Why Finland Has the Best Schools” appear regularly in the U.S. press. [3]

Why it matters

The National Center for Education Statistics warns that federal law requires that NAEP achievement levels be used on a trial basis until the Commissioner of Education Statistics determines that the achievement levels are “reasonable, valid, and informative to the public.” As the NCES website states, “So far, no Commissioner has made such a determination, and the achievement levels remain in a trial status. The achievement levels should continue to be interpreted and used with caution.”

Confounding NAEP proficient with grade-level is uninformed. Designating NAEP proficient as the achievement benchmark for accountability systems is certainly not cautious use. If high school students are required to meet NAEP proficient to graduate from high school, large numbers will fail. If middle and elementary school students are forced to repeat grades because they fall short of a standard anchored to NAEP proficient, vast numbers will repeat grades.

On NAEP, students are asked the highest level math course they’ve taken. On the 2015 twelfth grade NAEP, 19% of students said they either were taking or had taken calculus. These are the nation’s best and brightest, the crème de la crème of math students. Only one in five students works their way that high up the hierarchy of American math courses. If you are over 45 years old and reading this, fewer than one out of ten of your high school classmates took calculus. In the graduating class of 1990, for instance, only 7% of students had taken calculus. [4]

Unsurprisingly, calculus students are also typically taught by the nation’s most knowledgeable math teachers. The nation’s elite math students paired with the nation’s elite math teachers: if any group can prove NAEP proficient a reasonable goal and succeed in getting all students over the NAEP proficiency bar, this is the group.

But they don’t. A whopping 30% score below proficient on NAEP. For black and Hispanic calculus students, the figures are staggering. Two-thirds of black calculus students score below NAEP proficient. For Hispanics, the figure is 52%. The nation’s pre-calculus students also fare poorly (69% below proficient). Then the success rate falls off a cliff. In the class of 2015, more than nine out of ten students whose highest math course was Trigonometry or Algebra II fail to meet the NAEP proficient standard.

Table 3. 2015 NAEP Twelfth Grade Math, Percentage Below Proficient by Highest Math Course Taken

Highest Math Course Taken | Percentage Below NAEP Proficient
Calculus | 30
Pre-calculus | 69
Trig/Algebra II | 92

Source: NAEP Data Explorer

These data defy reason; they also refute common sense. For years, educators have urged students to take the toughest courses they can possibly take. Taken at face value, the data in Table 3 rip the heart out of that advice. These are the toughest courses, and yet huge numbers of the nation’s star students, by any standard aligned with NAEP proficient, would be told that they have failed. Some parents, misled by the confounding of proficient with grade level, might even mistakenly believe that their kids don’t know grade level math.

Conclusion

NAEP proficient is not synonymous with grade level. NAEP officials urge that proficient not be interpreted as reflecting grade level work. It is a standard set much higher than that. Scholarly panels have reviewed the NAEP achievement standards and found them flawed. The highest scoring nations of the world would appear to be mediocre or poor performers if judged by the NAEP proficient standard. Even large numbers of U.S. calculus students fall short.

As states consider building benchmarks for student performance into accountability systems, they should not use NAEP proficient—or any standard aligned with NAEP proficient—as a benchmark. It is an unreasonable expectation, one that ill serves America’s students, parents, and teachers–and the effort to improve America’s schools.

—Tom Loveless

This post originally appeared on the Brown Center Chalkboard

Chester E. Finn, Jr. has written a response, “The Value of NAEP Achievement Levels.”


Notes:
[1] Shepard, L. A., Glaser, R., Linn, R., & Bohrnstedt, G. (1993) Setting Performance Standards For Student Achievement: Background Studies. Report of the NAE Panel on the Evaluation of the NAEP Trial State Assessment: An Evaluation of the 1992 Achievement Levels. National Academy of Education.
[2] Loveless, Tom. The 2007 Brown Center Report, pages 10-13.
[3] William Doyle, “Why Finland Has The Best Schools,” Los Angeles Times, March 18, 2016.
[4] NCES, America’s High School Graduates: Results of the 2009 NAEP High School Transcript Study. See Table 8, p. 49.

The post The NAEP Proficiency Myth appeared first on Education Next.

Common Core’s Major Political Challenges for the Remainder of 2016 https://www.educationnext.org/common-cores-major-political-challenges-for-the-remainder-of-2016/ Tue, 05 Apr 2016 00:00:00 +0000 http://www.educationnext.org/common-cores-major-political-challenges-for-the-remainder-of-2016/ Common Core is now several years into implementation. Supporters have had a difficult time persuading skeptics that any positive results have occurred. The best evidence has been mixed on that question.

The 2016 Brown Center Report (BCR), which was published last week, presented a study of Common Core State Standards (CCSS). In this post, I’d like to elaborate on a topic touched upon but deserving further attention: what to expect in Common Core’s immediate political future. I discuss four key challenges that CCSS will face between now and the end of the year.

Let’s set the stage for the discussion. The BCR study produced two major findings. First, several changes that CCSS promotes in curriculum and instruction appear to be taking place at the school level. Second, states that adopted CCSS and have been implementing the standards have registered about the same gains and losses on NAEP as states that either adopted and rescinded CCSS or never adopted CCSS in the first place. These are merely associations and cannot be interpreted as saying anything about CCSS’s causal impact. Politically, that doesn’t really matter. The big story is that NAEP scores have been flat for six years, an unprecedented stagnation in national achievement that states have experienced regardless of their stance on CCSS. Yes, it’s unfair, but CCSS is paying a political price for those disappointing NAEP scores. No clear NAEP differences have emerged between CCSS adopters and non-adopters to reverse that political dynamic.

TIMSS and PISA scores in November-December

NAEP has two separate test programs. The scores released in 2015 were for the main NAEP, which began in 1990. The long term trend (LTT) NAEP, a different test that was first given in 1969, has not been administered since 2012. It was scheduled to be given in 2016, but was cancelled due to budgetary constraints. It was next scheduled for 2020, but last fall officials cancelled that round of testing as well, meaning that the LTT NAEP won’t be given again until 2024.

With the LTT NAEP on hold, only two international assessments will soon offer estimates of U.S. achievement that, like the two NAEP tests, are based on scientific sampling: PISA and TIMSS. Both tests were administered in 2015, and the new scores will be released around the Thanksgiving-Christmas period of 2016. If PISA and TIMSS confirm the stagnant trend in U.S. achievement, expect CCSS to take another political hit. America’s performance on international tests engenders a lot of hand wringing anyway, so the reaction to disappointing PISA or TIMSS scores may be even more pronounced than what the disappointing NAEP scores generated.

Is teacher support still declining?

Watch Education Next’s survey on Common Core (usually released in August/September) and pay close attention to teacher support for CCSS. The trend line has been heading steadily south. In 2013, 76 percent of teachers said they supported CCSS and only 12 percent were opposed. In 2014, teacher support fell to 43 percent and opposition grew to 37 percent. In 2015, opponents outnumbered supporters for the first time, 50 percent to 37 percent. Further erosion of teacher support will indicate that Common Core’s implementation is in trouble at the ground level. Don’t forget: teachers are the final implementers of standards.

An effort by Common Core supporters to change NAEP

The 2015 NAEP math scores were disappointing. Watch for an attempt by Common Core supporters to change the NAEP math tests. Michael Cohen, President of Achieve, a prominent pro-CCSS organization, released a statement about the 2015 NAEP scores that included the following: “The National Assessment Governing Board, which oversees NAEP, should carefully review its frameworks and assessments in order to ensure that NAEP is in step with the leadership of the states. It appears that there is a mismatch between NAEP and all states’ math standards, no matter if they are common standards or not.”

Reviewing and potentially revising the NAEP math framework is long overdue. The last adoption was in 2004. The argument for changing NAEP to place greater emphasis on number and operations, revisions that would bring NAEP into closer alignment with Common Core, also has merit. I have a longstanding position on the NAEP math framework. In 2001, I urged the National Assessment Governing Board (NAGB) to reject the draft 2004 framework because it was weak on numbers and operations—and especially weak on assessing student proficiency with whole numbers, fractions, decimals, and percentages.

Common Core’s math standards are right in line with my 2001 complaint. Despite my sympathy for Common Core advocates’ position, a change in NAEP should not be made because of Common Core. In that 2001 testimony, I urged NAGB to end the marriage of NAEP with the 1989 standards of the National Council of Teachers of Mathematics, the math reform document that had guided the main NAEP since its inception. Reform movements come and go, I argued. NAGB’s job is to keep NAEP rigorously neutral. The assessment’s integrity depends upon it. NAEP was originally intended to function as a measuring stick, not as a PR device for one reform or another. If NAEP is changed it must be done very carefully and should be rooted in the mathematics children must learn. The political consequences of it appearing that powerful groups in Washington, DC are changing “The Nation’s Report Card” in order for Common Core to look better will hurt both Common Core and NAEP.

Will Opt Out grow?

Watch the Opt Out movement. In 2015, several organized groups of parents refused to allow their children to take Common Core tests. In New York state alone, about 60,000 opted out in 2014, skyrocketing to 200,000 in 2015. Common Core testing for 2016 begins now and goes through May. It will be important to see whether Opt Out can expand to other states, grow in numbers, and branch out beyond middle- and upper-income neighborhoods.

Conclusion

Common Core is now several years into implementation. Supporters have had a difficult time persuading skeptics that any positive results have occurred. The best evidence has been mixed on that question. CCSS advocates say it is too early to tell, and we’ll just have to wait to see the benefits. That defense won’t work much longer. Time is running out. The political challenges that Common Core faces the remainder of this year may determine whether it survives.

—Tom Loveless

This post originally appeared on the Brown Center Chalkboard.

The post Common Core’s Major Political Challenges for the Remainder of 2016 appeared first on Education Next.

Has Common Core Influenced Instruction? https://www.educationnext.org/has-common-core-influenced-instruction/ Tue, 01 Dec 2015 00:00:00 +0000 http://www.educationnext.org/has-common-core-influenced-instruction/ Advocates of the Common Core hope that the standards will eventually produce long term positive effects as educators learn how to use them. That’s a reasonable hypothesis. But it should now be apparent that a counter-hypothesis has equal standing: any positive effect of adopting Common Core may have already occurred.

The release of 2015 NAEP scores showed national achievement stalling out or falling in reading and mathematics. The poor results triggered speculation about the effect of Common Core State Standards (CCSS), the controversial set of standards adopted by more than 40 states since 2010. Critics of Common Core tended to blame the standards for the disappointing scores. Its defenders said it was too early to assess CCSS’s impact and that implementation would take many years to unfold. William J. Bushaw, executive director of the National Assessment Governing Board, cited “curricular uncertainty” as the culprit. Secretary of Education Arne Duncan argued that new standards typically experience an “implementation dip” in the early days of teachers actually trying to implement them in classrooms.

In the rush to argue whether CCSS has positively or negatively affected American education, these speculations are vague as to how the standards boosted or depressed learning. They don’t provide a description of the mechanisms, the connective tissue, linking standards to learning. Bushaw and Duncan come the closest, arguing that the newness of CCSS has created curriculum confusion, but the explanation falls flat for a couple of reasons. Curriculum in the three states that adopted the standards, rescinded them, then adopted something else should be extremely confused. But the 2013-2015 NAEP changes for Indiana, Oklahoma, and South Carolina were a little bit better than the national figures, not worse.[i] In addition, surveys of math teachers conducted in the first year or two after the standards were adopted found that: a) most teachers liked them, and b) most teachers said they were already teaching in a manner consistent with CCSS.[ii] They didn’t mention uncertainty. Recent polls, however, show those positive sentiments eroding. Mr. Bushaw might be mistaking disenchantment for uncertainty.[iii]

For teachers, the novelty of CCSS should be dissipating. Common Core’s advocates placed great faith in professional development to implement the standards. Well, there’s been a lot of it. Over the past few years, millions of teacher-hours have been devoted to CCSS training. Whether all that activity had a lasting impact is questionable. Randomized control trials have been conducted of two large-scale professional development programs. Interestingly, although they pre-date CCSS, both programs attempted to promote the kind of “instructional shifts” championed by CCSS advocates. The studies found that if teacher behaviors change from such training—and that’s not a certainty—the changes fade after a year or two. Indeed, that’s a pattern evident in many studies of educational change: a pop at the beginning, followed by fade out.

My own work analyzing NAEP scores in 2011 and 2013 led me to conclude that the early implementation of CCSS was producing small, positive changes in NAEP.[iv] I warned that those gains “may be as good as it gets” for CCSS.[v] Advocates of the standards hope that CCSS will eventually produce long term positive effects as educators learn how to use them. That’s a reasonable hypothesis. But it should now be apparent that a counter-hypothesis has equal standing: any positive effect of adopting Common Core may have already occurred. To be precise, the proposition is this: any effects from adopting new standards and attempting to change curriculum and instruction to conform to those standards occur early and are small in magnitude. Policymakers still have a couple of arrows left in the implementation quiver, accountability being the most powerful. Accountability systems have essentially been put on hold as NCLB sputtered to an end and new CCSS tests appeared on the scene. So the CCSS story isn’t over. Both hypotheses remain plausible.

Reading Instruction in 4th and 8th Grades

Back to the mechanisms, the connective tissue binding standards to classrooms. The 2015 Brown Center Report introduced one possible classroom effect that is showing up in NAEP data: the relative emphasis teachers place on fiction and nonfiction in reading instruction. The ink was still drying on new Common Core textbooks when a heated debate broke out about CCSS’s recommendation that informational reading should receive greater attention in classrooms.[vi]

Fiction has long dominated reading instruction. That dominance appears to be waning.

[Figures: NAEP data on the relative emphasis teachers place on fiction and nonfiction in 4th- and 8th-grade reading instruction]

After 2011, something seems to have happened. I am more persuaded that Common Core influenced the recent shift towards nonfiction than I am that Common Core has significantly affected student achievement—for either good or ill. But causality is difficult to confirm or to reject with NAEP data, and trustworthy efforts to do so require a more sophisticated analysis than presented here.

Four lessons from previous education reforms

Nevertheless, the figures above reinforce important lessons that have been learned from previous top-down reforms. Let’s conclude with four:

1. There seems to be evidence that CCSS is having an impact on the content of reading instruction, moving from the dominance of fiction over nonfiction to near parity in emphasis. Unfortunately, as Mark Bauerlein and Sandra Stotsky have pointed out, there is scant evidence that such a shift improves children’s reading.[vii]

2. Reading more nonfiction does not necessarily mean that students will be reading higher quality texts, even if the materials are aligned with CCSS. The Core Knowledge Foundation and the Partnership for 21st Century Learning, both supporters of Common Core, have very different ideas on the texts schools should use with the CCSS.[viii] The two organizations advocate for curricula having almost nothing in common.

3. When it comes to the study of implementing education reforms, analysts tend to focus on the formal channels of implementation and the standard tools of public administration—for example, intergovernmental hand-offs (federal to state to district to school), alignment of curriculum, assessment and other components of the reform, professional development, getting incentives right, and accountability mechanisms. Analysts often ignore informal channels, and some of those avenues funnel directly into schools and classrooms.[ix] Politics and the media are often overlooked. Principals and teachers are aware of the politics swirling around K-12 school reform. Many educators undoubtedly formed their own opinions on CCSS and the fiction vs. nonfiction debate before the standard managerial efforts touched them.

4. Local educators whose jobs are related to curriculum almost certainly have ideas about what constitutes good curriculum. It’s part of the profession. Major top-down reforms such as CCSS provide local proponents with political cover to pursue curricular and instructional changes that may be politically unpopular in the local jurisdiction. Anyone who believes nonfiction should have a more prominent role in the K-12 curriculum was handed a lever for promoting his or her beliefs by CCSS. I’ve previously called these the “dog whistles” of top-down curriculum reform, subtle signals that give local advocates license to promote unpopular positions on controversial issues.

—Tom Loveless

This post originally appeared on the Brown Center Chalkboard.


Notes:
[i] In the four subject-grade combinations assessed by NAEP (reading and math at 4th and 8th grades), IN, SC, and OK all exceeded national gains on at least three out of four tests from 2013-2015. NAEP data can be analyzed using the NAEP Data Explorer: http://nces.ed.gov/nationsreportcard/naepdata/.
[ii] In a Michigan State survey of teachers conducted in 2011, 77 percent of teachers, after being presented with selected CCSS standards for their grade, thought they were the same as their state’s former standards. http://education.msu.edu/epc/publications/documents/WP33ImplementingtheCommonCoreStandardsforMathematicsWhatWeknowaboutTeacherofMathematicsin41S.pdf
[iii] In the Education Next surveys, 76 percent of teachers supported Common Core in 2013 and 12 percent opposed. In 2015, 40 percent supported and 50 percent opposed. https://www.educationnext.org/2015-ednext-poll-school-reform-opt-out-common-core-unions.
[iv] I used variation in state implementation of CCSS to assign the states to three groups and analyzed differences in the groups’ NAEP gains.
[v] http://www.brookings.edu/~/media/research/files/reports/2015/03/bcr/2015-brown-center-report_final.pdf
[vi] http://www.edweek.org/ew/articles/2012/11/14/12cc-nonfiction.h32.html?qs=common+core+fiction
[vii] Mark Bauerlein and Sandra Stotsky (2012). “How Common Core’s ELA Standards Place College Readiness at Risk.” A Pioneer Institute White Paper.
[viii] Compare the P21 Common Core Toolkit (http://www.p21.org/our-work/resources/for-educators/1005-p21-common-core-toolkit) with Core Knowledge ELA Sequence (http://www.coreknowledge.org/ccss). It is hard to believe that they are talking about the same standards in references to CCSS.
[ix] I elaborate on this point in Chapter 8, “The Fate of Reform,” in The Tracking Wars: State Reform Meets School Policy (Brookings Institution Press, 1999).

The post Has Common Core Influenced Instruction? appeared first on Education Next.
