This is the story of educational romanticism in elementary and secondary schools: its rise, its etiology, and, we have reason to hope, its approaching demise.
Educational romanticism consists of the belief that just about all children who are not doing well in school have the potential to do much better. Correlatively, educational romantics believe that the academic achievement of children is determined mainly by the opportunities they receive; that innate intellectual limits (if they exist at all) play a minor role; and that the current K-12 schools have huge room for improvement.
Educational romanticism characterizes reformers of both Left and Right, though in different ways. Educational romantics of the Left focus on race, class, and gender. It is children of color, children of poor parents, and girls whose performance is artificially depressed, and their academic achievement will blossom as soon as they are liberated from the racism, classism, and sexism embedded in American education. Those of the Right see public education as an ineffectual monopoly, and think that educational achievement will blossom when school choice liberates children from politically correct curricula and obdurate teachers’ unions.
In public discourse, the leading symptom of educational romanticism is silence on the role of intellectual limits even when the topic screams for their discussion. Try to think of the last time you encountered a news story that mentioned low intellectual ability as the reason why some students do not perform at grade level. I doubt if you can. Whether analyzed by the news media, school superintendents, or politicians, the problems facing low-performing students are always that they have come from disadvantaged backgrounds, or have gone to bad schools, or grown up in peer cultures that do not value educational achievement. The problem is never that they just aren’t smart enough.
The apotheosis of educational romanticism occurred on January 8, 2002, when a Republican president of the United States, surrounded by approving legislators from both parties, signed into law the No Child Left Behind Act, which had this as the Statement of Purpose for its key title:
The purpose of this title is to ensure that all children have a fair, equal, and significant opportunity to obtain a high-quality education and reach, at a minimum, proficiency on challenging State academic achievement standards and state academic assessments.
I added the italics. All means exactly that: everybody, right down to the bottom level of ability. The language of the 2002 law made no provision for any exclusions. The Act requires that this goal be met “not later than 12 years after the end of the 2001–2002 school year.”
We are not talking about a political speech or a campaign promise. The United States Congress, acting with large bipartisan majorities, at the urging of the President, enacted as the law of the land that all children are to be above average. I do not exaggerate. When No Child Left Behind began in 2002, the nation already possessed operational definitions of proficient in the math and reading tests administered under the National Assessment of Educational Progress (NAEP, pronounced “nape”). NAEP is seen as the gold standard in educational testing. Only about 30 percent of American students were proficient in either reading or math by NAEP’s definitions when No Child Left Behind began. In other words, by NAEP’s standard, all students are not just to be brought to the average that existed when No Child Left Behind was enacted. All of them are to reach the level of students at the seventieth percentile.
Many laws are too optimistic, but the No Child Left Behind Act transcended optimism. It set a goal that was devoid of any contact with reality. How did we get to that point?
I begin by briefly making the case that educational romanticism is in fact out of touch with reality. I will call on some specific bodies of scholarly evidence, but nothing I say will come as a surprise to parents of children who are more than a few years into elementary school. Exceptions exist, but the overwhelmingly common parental experience is that even in preschool our children began to exhibit profiles of abilities. When we observed a strength we tried to build on it, and when we observed a weakness we tried to remediate it or find someone who could. But whatever profiles we observed when our children were still quite young could only be tweaked. Our children with dyslexia, for example, could be taught strategies for coping, but reading never became easy for them. If specific learning disabilities were not involved, then nothing much changed no matter how hard we tried. School performance might have risen or fallen because of other things going on in their lives—emotional problems, peer pressures in either direction, or distractions because of a family crisis, for example—but the underlying profiles of abilities that our children took into elementary school didn’t look much different when they got to middle school and high school.
That common experience of parents conforms to everything that is known scientifically about the nature of intellectual ability. A lively debate continues about the malleability of intellectual ability in infants and toddlers, but few make ambitious claims for the malleability of intellectual ability after children enter elementary school. There are no examples of intensive in-school programs that permanently raise intellectual ability during the K-12 years (minor and temporary practice effects are the most that have been demonstrated).
No one disputes the empirical predictiveness of tests of intellectual ability—IQ tests—for large groups. If a classroom of first-graders is given a full-scale IQ test that requires no literacy and no mathematics, the correlation of those scores with scores on reading and math tests at age seventeen is going to be high. Such correlations will be equally high whether the class consists of rich children or poor, black or white, male or female. They will be high no matter how hard the teachers have worked. Scores on tests of reading and math track with intellectual ability, no matter what.
That brings us to an indispensable tenet of educational romanticism: The public schools are so bad that large gains in student performance are possible even within the constraints of intellectual ability. A large and unrefuted body of evidence says that this indispensable tenet is incorrect. Differences among schools do not have much effect on test scores in reading and mathematics. This finding is not well known by the general public (parents could spend less time fretting over their children’s school if it were), and needs some explanation.
When Congress passed the 1964 Civil Rights Act, it included a mandate for a nationwide study to assess the effects of inequality of educational opportunity on student achievement. The study, led by the sociologist James Coleman, was one of the most ambitious in the history of social science. The sample consisted of 645,000 students. Data were collected not only about the students’ personal school histories, but also about their parents’ socioeconomic backgrounds, their neighborhoods, the curricula and facilities of their schools, and the qualifications of the teachers within those schools.
Before Coleman’s team set to work, everybody expected that the study would document a relationship between the quality of schools and the academic achievement of the students in those schools. To everyone’s shock, the Coleman Report instead found that the quality of schools explains almost nothing about differences in academic achievement. Family background was by far the most important factor in determining student achievement. The Coleman Report came under intense fire, but re-analyses of the Coleman data and the collection of new data in the decades since it appeared support its finding that the quality of public schools doesn’t make much difference in student achievement.
In thinking about the explanation for this counter-intuitive result, it is important not to confuse your idea of a bad public school with the worst-of-the-worst inner-city schools that are the subject of horror stories. When schools are as bad as they are in the inner-city neighborhoods of Detroit, Washington, and a few other large cities, they certainly have a depressing effect on student achievement. Getting students out of those schools should be a top policy priority. But only a few percent of the nation’s students attend such schools. In what might be called a “normally bad” public school, a lot of the slack has been taken out of the room for improvement. The normally bad school maintains a reasonably orderly learning environment and offers a standard range of courses taught with standard textbooks. Most of the teachers aren’t terrible; they’re just mediocre. Those raw materials give students most of the education they are going to absorb regardless of where they go to school. Excellent schools with excellent teachers will augment their learning, and are a better experience for children in many other ways as well. But an excellent school’s effects on mean test scores for the student body as a whole will not be dramatic. Readers who attended normally bad K-12 schools and then went to selective colleges are likely to understand why: Your classmates who had gone to Phillips Exeter had taken much better courses than your school offered, and you may have envied their good luck, but you had read a lot on your own, you weren’t that far behind, and you caught up quickly.
To sum up, a massive body of evidence says that reading and mathematics achievement have strong ties to underlying intellectual ability, that we do not know how to change intellectual ability after children reach school, and that the quality of schooling within the normal range of schools does not have much effect on student achievement. To put it another way, we have every reason to think—and already did when the No Child Left Behind Act was passed—that the notion of making all children proficient in math and reading is ridiculous. Such a feat is not possible even for an experimental school with unlimited funding, let alone for public schools operating in the real world. By NAEP’s definition of proficiency, we probably cannot make even half of the students proficient.
And yet the nation passed legislation to make all children proficient by 2014. And so I return to the question: How did we get to that point?
The first strand in explaining educational romanticism is a mythic image of the good old days when teachers brooked no nonsense and all the children learned their three R’s. You have probably run across tokens of it in occasional editorials that quote examination questions once asked of public school students. Here is an example that The Wall Street Journal gave from the admissions test to Jersey City High School in 1885: “Write a sentence containing a noun as an attribute, a verb in the perfect tense potential mood, and a proper adjective.” Or consider the McGuffey Readers that were standard textbooks in the nineteenth century, filled with literary selections far more difficult than the ones given to today’s students at equivalent ages. That’s the kind of material all children routinely learned, right?
Wrong. American schools have never been able to teach everyone how to read, write, and do arithmetic. The myth that they could has arisen because schools a hundred years ago did not have to educate the least able. When the twentieth century began, about a quarter of all adults had not reached fifth grade and half had not reached eighth grade. The relationship between school dropout and intellectual ability was not perfect, but it was strong. Today’s elementary and middle schools are dealing with 99 percent of all children in the eligible age groups. Let today’s schools not report the test results for the children that schools in 1900 did not have to teach, and NAEP scores would go through the roof.
The second strand in explaining educational romanticism is the periodic discovery of magic bullets for raising classroom performance. The earliest one was the Pygmalion effect, sometimes known as the Rosenthal effect, promulgated in 1968 when Robert Rosenthal and Lenore Jacobson published Pygmalion in the Classroom, reporting large IQ gains for children who, teachers had been told, were potential “late bloomers” intellectually. The designated children had actually been chosen randomly. The gains in IQ were purely a function of the teachers’ expectations, the authors concluded. The implication drawn in the media coverage was that intellectual differences among children are mostly an illusion, and an illusion that can be dispelled if teachers have high expectations for all their students.
It was an appealing story, but it couldn’t withstand examination. Within a few years, other researchers attempting to replicate the Pygmalion effect had determined that it was either nonexistent or very small. But while that conclusion is by now empirically undisputed, you wouldn’t know it. Many people (including influential educators) still think that a large Pygmalion effect is out there, waiting to be tapped if only we can get teachers to shed “the soft bigotry of low expectations,” as the rhetoric of No Child Left Behind puts it.
A year after Pygmalion in the Classroom appeared, Nathaniel Branden published The Psychology of Self-Esteem, introducing another magic bullet and setting off an entire educational movement. Fostering self-esteem as Branden actually described it—an internalized sense of self-responsibility and self-sufficiency—could have been positive. But the movement focused instead on having a favorable opinion of oneself, independently of objective justification for that favorable opinion. From the 1970s through the 1990s, low self-esteem took on the aura of a meta-explanation. California went so far as to establish a task force on self-esteem, which predictably concluded in its 1989 report that “many, if not most, of the major problems plaguing society have roots in the low self-esteem of many of the people who make up society.” And since low self-esteem was the problem, high self-esteem was the solution. The educational romanticists bought into it unreservedly. Children were to be praised, because praise fosters self-esteem. If criticism were unavoidable, the criticism should be cocooned in layers of praise, because criticism undermines self-esteem. Classroom competitions should be avoided, because they damage the self-esteem of the losers.
Once again, an appealing story turned out to be false. The landmark change in the scholarly consensus occurred in 2003 when a comprehensive review of 15,000 studies on the relationship of self-esteem to the development of children, headed by a scholar who formerly had been sympathetic to the self-esteem movement, concluded that there is no empirical evidence that improving self-esteem raises grades, test scores, or, for that matter, has any positive effect whatsoever. Once again, you wouldn’t know it by visiting classrooms. If anything, the assumptions of the self-esteem movement are more firmly embedded in educational practice now than they have ever been.
Still another magic bullet appeared in 1995 when Claude Steele and Joshua Aronson demonstrated experimentally that test performance by academically talented blacks was worse when a test was called an IQ test than when it was innocuously described as a research tool. Steele and Aronson called this phenomenon “stereotype threat.” It has since been extended to stereotypes involving women and math ability. You guessed it: the media interpreted the Steele and Aronson results as meaning that group differences in test scores are illusions that will evaporate if only we can get students to ignore the stereotypes that hold down their performance.
This time, the problem is not that stereotype threat doesn’t really exist. The jury is still out on the magnitude of its effect and the conditions that prompt it, but the reality of the phenomenon is surviving examination. Instead, the problem that gets in the way of this appealing story is that all of the experimental studies have explicitly induced a threat as part of the experiment’s protocol. That threat consists of telling the experimental group that they are about to take a test that measures their innate ability. But tests in K-12 education are never presented that way. The high-stakes tests given in elementary and secondary school are expressly described as measures of what students have learned, not how smart they are. Even tests that do measure innate ability are not presented that way—a case in point being the relentless efforts of the College Board to present the SAT as a measure of acquired skills.
This doesn’t mean that stereotype threat on high stakes tests doesn’t occur in the real world. Many students correctly realize that the SAT is a pretty good measure of innate ability. And many students taking any test worry that they aren’t smart enough to do well, and those worries may stem from stereotypes. But how would the presentation of tests in an ideal educational world look any different from the way it is done now? The overwhelming majority of tests that students take are introduced with statements like “Tomorrow there’s going to be a quiz on the material on pages 16 to 35.” There is no way to say that in a less threatening way. Whatever causes stereotype threat in the larger society, it is not anything that is fixable in the day-to-day conduct of K-12 education.
The third and probably most powerful strand for explaining educational romanticism in the last quarter-century has been Howard Gardner’s theory of multiple intelligences, introduced in Frames of Mind (1983). Gardner had two agendas. One was to topple the word intelligence from its pedestal, and to establish that abilities other than intellectual can be classified as intelligences with equal justification. Discussing that aspect of his theory would take us deep into psychometric issues that are irrelevant to educational romanticism. Gardner’s other agenda was to draw attention to the reality of many different kinds of ability. There were seven in his original count: bodily-kinesthetic, musical, spatial, interpersonal, intrapersonal, logical-mathematical, and linguistic. The message of that agenda is both true and educationally useful: Good schools and good teachers should keep all of these abilities in mind when approaching any individual child. It is also true that high intrapersonal ability, which includes qualities such as persistence and self-discipline, can have great impact on academic achievement, and that low interpersonal ability (severe shyness, for example) can impede classroom performance.
But the existence of different abilities and their relevance to performance in a wide variety of school activities does not mean that they play equal roles in allowing children to learn the material in English lit, American history, chemistry, civics, and advanced algebra. All of the seven abilities can augment or impede learning in academic courses under certain circumstances, but two of them—what Gardner calls linguistic intelligence and logical-mathematical intelligence—are indispensable. Those two abilities are highly correlated with each other, so that it is not the case that children who are below average in one have a fair shot at being above average in the other. On the contrary, a large majority of children who are well below average in one will also be below average in the other. But the extraordinarily widespread embrace of multiple intelligences in the nation’s schools doesn’t dwell on such details. The genuine existence of different kinds of ability has been transmuted into breezy assertions that different children learn in different but equally valid ways and that everything will work out if only we tap the special abilities that reside in every child.
A mythic view of what education used to be able to accomplish, magic bullets for raising academic performance, and sloppy inferences drawn from the theory of multiple intelligences have been enablers of educational romanticism. Perhaps they abetted an inevitable process. The roots of educational romanticism go back to the beginnings of the Progressive Education movement early in the twentieth century. Its flowering in the 1960s and 1970s coincided with a zeitgeist that nurtured wishful thinking of all sorts. But I think we need to come to grips with another important historical force that made educational romanticism dominant. The effects of the triumphant Civil Rights Movement gave a special reason for white elites in the 1960s to start ignoring the implications of intellectual limitations.
It is difficult to convey to readers who came of age in the 1970s or thereafter the emotional power of the Civil Rights Movement of the 1950s and early 1960s. The ambiguities associated with affirmative action and the enforcement of anti-discrimination laws were still in the future. The Civil Rights Movement prior to 1964 created a change in the consciousness of white elites that was felt viscerally, and it included an embarrassing awareness of just how unremittingly whites had violated every American ideal when it came to blacks. With that awareness came elite white guilt: honest, deeply felt, and warranted.
Elite white guilt explains much about all kinds of social policy from the last half of the 1960s onward, but especially about education. Until the 1960s, white educators and politicians could look at a class of white children in which a number of students were doing poorly and shrug. The schools try to teach everyone, but some kids can’t handle the material. That’s just the way things are; it is not a problem that can be fixed. But when the class consisted of black students who were doing poorly, that reaction was not acceptable. These were children growing up in a society where all the odds had been stacked against them, and their failings couldn’t be passed off as “just the way things are.” Elite white guilt made it impossible to say that a lot of black children were going to continue to fail in school and there’s nothing anybody could do about it. Once it could not be said of black children, neither could it be said of white children. In that context, educational romanticism did not just become fashionable during the 1960s. It became emotionally mandatory.
And so, beginning with the Elementary and Secondary Education Act of 1965, the federal government embarked on a series of major efforts to improve education for disadvantaged children that culminated in 2002 with the No Child Left Behind Act. Surveying that history, an analogy occurred to me that I offer as a speculative proposition: America’s federal education policy as of 2008 is at about the same place that the Soviet Union’s economic policy was in 1990.
The parallels between the trajectory of the Soviet Union’s attempt to reform its economy and the trajectory of the federal government’s attempts to reform the public education system are striking. By the mid-1980s, Soviet leaders knew that they had to introduce supply and demand into the economy, but they couldn’t bring themselves to try honest-to-God capitalism, so they tried to decentralize decision-making and permit some elements of a market economy while retaining central price controls and government ownership of the means of production. The reforms were based on premises about human nature that were patently wrong. By the turn of the twenty-first century, the educational romantics (and George W. Bush is the Percy Bysshe Shelley of educational romantics) knew that public school systems everywhere had become bureaucratically top-heavy and that many inner-city schools were no longer functional. They knew that the billions in federal money spent on upgrading education for disadvantaged children had produced no demonstrable improvements. But they thought they could fix the system. Bush’s glasnost was to implement accountability through measurement of results by test scores. Bush’s perestroika was a mishmash of performance standards and fragments of a market economy in schools, while retaining public funding of the schools and government control over the enforcement of the new standards. The reforms were based on premises about intellectual ability that were patently wrong.
Unlike the Soviet economy, American public schools are still in business, but scholarly analyses of the administration of No Child Left Behind are documenting a monumental mess. In the early years, I didn’t need the experts to tell me. I was watching the demoralized teachers in my children’s school, wearied by endless preparation for the exams and frustrated by demands from on high to concentrate on students who were at the cusp of being able to pass the state’s proficiency benchmark at the expense of everyone else. In subsequent years, the demoralization and frustration may have eased—not because No Child Left Behind got better, but because teachers, principals, and state departments of education have learned all the ways that the Act and its compliance requirements can be gamed.
Some of the ways to game the Act are punishing to low-ability students. Scholars have documented that high-stakes testing directly produces higher dropout rates among low-ability students. Others are mendacious—give teachers information that enables them to teach to the test, or just cook the numbers to make them look better than they really are. Still other methods exploit the loopholes that enable states to be in compliance while not actually raising test scores (the language of the law permits states to be in compliance if they do the paperwork right). Two of the most astute observers of American education, Frederick Hess and Chester Finn, summed it up this way:
No Child Left Behind’s dogmatic aspirations and fractured design are producing a compliance-driven regimen that recreates the very pathologies it was intended to solve. It is time to relearn the lessons of the Great Society, when ambitious programs designed to promote justice and opportunity were undone by utopian formulations, unworkable implementation structures, and the stubborn unwillingness of supporters to acknowledge the limitations of federal action in the American system.
What’s happened to test scores? Interpreting the results state-by-state is intricate—the quality of the tests, the interpretability of the results, and the amount of bureaucratic chicanery vary enormously across states. An excellent analysis through the 2006 testing year found no effects attributable to No Child Left Behind. But to assess the nationwide effects, an easier method is available: use the results from NAEP.
If students were progressing at the rate implied by the Act, more than 60 percent of them would have been at the proficient level by 2007. In math, the actual percentages for NAEP were 39 percent for fourth-graders and 32 percent for eighth-graders. Twelfth-graders were last tested in 2005, when only 23 percent were proficient. The scores for fourth-graders and eighth-graders were higher than in previous years, but psychometricians have yet to untangle the degree of improvement attributable to No Child Left Behind (no NAEP math test was given in 2002, when No Child Left Behind began). Whatever their determination, its effect cannot be more than a few percentage points. NAEP did administer a reading test in 2002, so we have a firm baseline for comparison. Thirty-one percent of fourth-graders were proficient in 2002; 33 percent were proficient in 2007. For eighth- and twelfth-graders, the percentage passing the proficient level fell from 2002 to the most recent tests (from 33 percent to 31 percent in 2007 for the eighth-graders; from 36 percent to 35 percent in 2005 for the twelfth-graders)—changes that for practical purposes amount to zero.
Contemplate these results for a moment. A law is passed that, at least in the first few years, convulses educational practice throughout the nation. It is a law explicitly designed to raise test scores, if only because it produces intense drilling on how to take tests. And it produces trivial increases in NAEP’s math scores and no increases in its reading scores. No Child Left Behind has been not just a failure for educational romanticism, but its repudiation.
The good news is that educational romanticism is surely teetering on the edge of collapse. I am optimistic for three reasons. First, the data keep piling up. It takes a while for empiricism to discredit cherished beliefs, but No Child Left Behind may prove to have done us a favor by putting so much emphasis on test scores and thereby focusing attention on how hard it is to budge those scores. Second, we no longer live in a romantic age. Educational romanticism was born of forces that have lost most of their power, and façades collapse when the motives for maintaining those façades weaken. Third, hardly anybody really believes in educational romanticism even now. No one but the most starry-eyed denies in private the reality of differences in intellectual ability that we are powerless to change with K-12 education. People are unwilling to talk about those differences in public, but it is a classic emperor’s-clothes scenario waiting for someone to point out the obvious. Starting that process can be as simple as more articles like this one.
For the good of our children, educati