To write a book with a title like this you need to get everything right (check), explain the math without glossing over anything (check), and do it with words (and maybe a few figures and graphs) but no equations or formulas (check). This is a book to be taken seriously as mathematical exposition for the ordinary intelligent reader. As Ellenberg rightly says, “mathematics is the extension of common sense by other means.” That is a perspective conducive to giving the ordinary reader an insight into the concepts that give mathematics its power, while staying anchored in our ordinary knowledge of how the world works.


At the center of the book are explanations of certain statistical concepts that provide tools for understanding the mass of quantitative data that informs, or should inform, policy decisions in business, government, economics, science, and medicine. The custom of having the country run by people with humanistic skills and law degrees is potentially a recipe for disaster, because interpreting quantitative data is a skill quite unlike constructing a plausible legal or journalistic case.

A typical concept in statistical interpretation is survivorship bias. Smokers look around at other smokers and think, at least subliminally, “All those smokers look healthy enough to me; surely smoking can’t really be as dangerous as they say.” The problem with that reasoning is that the smokers on show are the surviving ones. The ones you see are indeed healthy enough, but the ones you don’t see are distinctly unwell, or worse. That is why you don’t see them. So there is a “survivorship bias” in the evidence: only the survivors are there to be observed.

Ellenberg begins the book with a story from World War II that shows the need to take survivorship bias into account when planning on the basis of data. Abraham Wald was one of the leading members of the Statistical Research Group at Columbia University, which provided advice on military problems. American planes were coming back from missions over Europe covered in bullet holes. The engines were found to have 1.1 bullet holes per square foot, while the outer parts (outside the engine, fuselage, and fuel system) had many more than that: 1.8 bullet holes per square foot. The military asked for advice—since the outer parts were being shot more often, should they be more heavily armored? Wald’s advice was the opposite: put the extra armor on the engines. Fewer bullet holes are seen in the engines because the planes shot in the engines are often not coming back. The data, that is, is subject to a survivorship bias, which has to be understood before it is clear what the data means.
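Wald's reasoning can be checked with a small simulation. The sketch below is only an illustration under invented assumptions: the section names, the share of hits each section receives, and the chance that a single hit brings a plane down are hypothetical numbers, not figures from the Statistical Research Group. Every plane takes hits spread in the same fixed proportions, but hits to the engine are far more likely to be fatal, so the engine is under-represented among the holes counted on the planes that return.

```python
import random

# Hypothetical parameters, chosen only for illustration (not Wald's data).
# name: (share of all hits landing here, chance that one hit here downs the plane)
SECTIONS = {
    "engine": (0.25, 0.60),
    "outer": (0.75, 0.05),
}


def simulate(num_planes=20000, hits_per_plane=4, seed=0):
    """Return hit counts per section for all planes and for survivors only."""
    rng = random.Random(seed)
    names = list(SECTIONS)
    weights = [SECTIONS[name][0] for name in names]
    all_hits = dict.fromkeys(names, 0)
    survivor_hits = dict.fromkeys(names, 0)
    for _ in range(num_planes):
        hits = rng.choices(names, weights=weights, k=hits_per_plane)
        # The plane returns only if every hit it took was non-fatal.
        survived = all(rng.random() >= SECTIONS[h][1] for h in hits)
        for h in hits:
            all_hits[h] += 1
            if survived:
                survivor_hits[h] += 1
    return all_hits, survivor_hits


def engine_share(hits):
    """Fraction of the counted bullet holes that are in the engine."""
    return hits["engine"] / sum(hits.values())


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    print("engine share, all planes:     ", round(engine_share(all_hits), 3))
    print("engine share, returned planes:", round(engine_share(survivor_hits), 3))
```

Under these made-up numbers the engine takes about a quarter of all hits, yet a noticeably smaller share of the holes seen on returning planes. Someone who could inspect only the survivors would conclude, wrongly, that the engine is rarely hit.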

The moral of the story, Ellenberg says, is that “countries don’t win wars just by being braver than the other side, or freer, or slightly preferred by God. The winners are usually the guys who get 5 percent fewer of their planes shot down, or use 5 percent less fuel, or get 5 percent more nutrition into their infantry at 95 percent of the cost. That’s not the stuff war movies are made of, but it’s the stuff wars are made of. And there’s math every step of the way.”

True, but another aspect of the story suggests a qualification to that conclusion. Abraham Wald was not an American. He was the grandson of a rabbi in the eastern part of the Austro-Hungarian Empire. Unemployable locally after a brilliant degree in pure mathematics at the University of Vienna, he was able to emigrate to the United States in 1938. One reason for Allied victory was that our Germans were better than their Germans. Early twentieth-century science was predominantly German, but there was something like a survivorship bias among German (and Austrian) scientists as to where they were located in 1940. The reasons for that were not mathematical. Something to do with which side was freer, surely.

Another crucial statistical concept, of which Ellenberg gives a superb explanation, is regression to the mean. During the Great Depression, Horace Secrist made waves in the emerging field of business studies with his book The Triumph of Mediocrity in Business. After analyzing a mass of data on businesses from 1920 to 1933, he found that the most successful firms in 1920 had mostly fallen towards the average, while the bottom firms had moved up towards the average. What could be the force driving firms towards mediocrity? Secrist had no doubt about the answer. It was free competition, under which “Superior judgement, merchandising sense, and honesty are always at the mercy of the unscrupulous, the unwise, the misinformed and the injudicious.”

That is not correct. As statisticians explained—to little avail—it was an example of the purely statistical phenomenon of regression to the mean, and hence needed no cause to account for it. If you observe any property with a chance component to its expression—such as height of persons from generation to generation, or baseball performance from season to season—and choose the top or bottom performers in a time period, then their performance in a following time period will be closer to average. No cause needs to be postulated to explain that. It is just a result of having chosen best and worst performers in the original time period, which are what they are partly through chance. Of course they don’t all stay lucky. The fallacy of supposing there must be a cause to explain the failure to maintain “winning streaks” is shown by the fact that regression to the mean works in reverse as well: the top performers this year were not, mostly, the top performers last year. There cannot be a cause that works in reverse time.
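The point can be made concrete with a toy model. The sketch below assumes, purely for illustration, that each firm's yearly performance is a fixed "skill" plus independent luck, both drawn from a standard normal distribution; the function names and parameters are hypothetical and have nothing to do with Secrist's data. It picks the top tenth of firms in one year and compares their average with the same firms' average in the other year, first forwards in time and then backwards.

```python
import random
from statistics import mean


def simulate_firms(num_firms=5000, seed=1):
    """Two years of performance: the same skill each year, fresh luck each year."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(num_firms)]
    year1 = [s + rng.gauss(0, 1) for s in skill]
    year2 = [s + rng.gauss(0, 1) for s in skill]
    return year1, year2


def top_decile_means(select_on, other):
    """Pick the top 10% by `select_on`; return their mean there and in `other`."""
    n = len(select_on)
    order = sorted(range(n), key=lambda i: select_on[i], reverse=True)
    top = order[: n // 10]
    return mean(select_on[i] for i in top), mean(other[i] for i in top)


if __name__ == "__main__":
    year1, year2 = simulate_firms()
    fwd_selected, fwd_other = top_decile_means(year1, year2)
    rev_selected, rev_other = top_decile_means(year2, year1)
    print("top firms of year 1: year-1 mean", round(fwd_selected, 2),
          "-> year-2 mean", round(fwd_other, 2))
    print("top firms of year 2: year-2 mean", round(rev_selected, 2),
          "<- year-1 mean", round(rev_other, 2))
```

In both directions the selected group remains above average but sits closer to it in the year on which it was not selected. Nothing in the model pushes firms towards mediocrity; the effect comes entirely from selecting on a quantity that is partly luck.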

That is one tangle about data and causality sorted out, but there are worse ones. One of the most difficult issues in what mathematics tells us about the world is “causation from correlation.” Correlation means the usual co-occurrence of two things, such as horse and carriage, love and marriage, or poverty and ill-health. If there’s correlation, it usually means causation, but it’s very hard to say what causes what. Endless observational studies show that people in poor areas are less healthy than people in rich areas, on average. Usually it is implied that poverty causes ill-health. But surely there is causality in the other direction too: sick people earn less money and cannot afford to live in rich areas. Or again, there may in some cases be a common cause of poverty and ill-health that helps explain the correlation, for example, a mindset that squanders a large percentage of disposable income on alcohol and cigarettes. It is still very likely true that poverty causes ill-health, but sorting out the causality from the observed correlations is extremely difficult, and remains a problem unsolved in theory as well as in practice.

Ellenberg has some sympathy for the statisticians who remained skeptical for a very long time about the evidence that smoking causes cancer. It became clear in the 1950s that many smokers were dying of lung cancer, but it was still possible to think of explanations other than that smoking caused lung cancer. R. A. Fisher, the doyen of the statistics profession, seriously proposed that pre-cancerous lung conditions, likely to lead to cancer in any case, might cause irritations that prompted people to assuage them by taking up smoking. Dr. Joseph Berkson, a severe critic of significance testing in statistics, argued that the secret of the correlation lay in the properties of the minority of non-smokers: “The small group of persons who successfully resist the incessantly applied blandishments and reflex conditioning of the cigaret advertisers are a hardy lot, and if they can withstand these assaults, they should have relatively little difficulty in fending off tuberculosis or even cancer!” H. J. Eysenck, perhaps the best-known psychologist of his generation, suggested that smoking caused not lung cancer, but diagnosis of lung cancer (because doctors performing autopsies on smokers looked first at the lungs). Those ingenious theories turned out to be false. Smoking rats predeceased their controls in droves; heavier smokers developed cancer more frequently, ex-smokers less; the action of tobacco toxins became clearer at the cellular level. But the critics were right at the time to look hard for alternative explanations of the correlations.

The corresponding question today is that of carbon dioxide and climate change, where the correlations are clear enough but inferring the causality is not so easy. Unfortunately this is not in Ellenberg’s area and he does not attempt it. He is a pure mathematician—in number theory, regarded by most pure mathematicians as one of the hardest subfields—and his parents were statisticians. He ranges widely in pure mathematics and statistics, but he avoids the third of the big divisions of mathematics, applied mathematics. Applied mathematics involves modelling—extracting the essential mathematical structure of complex systems in order to understand their behavior. The relation between the mathematical (or computer) model and the reality it models is a tricky one, peculiar to applied mathematics. We need some experts in that area, neither over-credulous nor over-skeptical of the climate scientists’ work, to tell us what to believe.

Other topics that Ellenberg treats with clarity and some literary flair include the reasons why so many published statistical studies are wrong, the perils of determining what public opinion is, how to run a vote so that the candidate the people want wins, when jackpot lotteries are worth investing in, and the right conclusions to reach from Pascal’s Wager. What is not included is mathematics in everyday life, as promised by the subtitle. A successful book aimed at that is Eastaway and Wyndham’s Why Do Buses Come in Threes? The Hidden Mathematics of Everyday Life, which gets down to business with cakes, traffic, and queues. How Not To Be Wrong is more about the skills needed to base decisions rationally on data. If policy-makers choose to read the book, they will be less confident as to what actions the data suggests, and all the better for that when it comes to making decisions.

Finally, Ellenberg has one unusual piece of advice on what we might learn from mathematical reasoning. One of the famous techniques of mathematics is proof by contradiction, or reductio ad absurdum. To prove that the square root of 2 is irrational, one begins by assuming it is rational and derives a contradiction. Anything that implies a contradiction must be false. Therefore the square root of 2 cannot be rational and must be irrational. That sounds far from everyday life, but maybe we could try thinking that way about lots of our beliefs.

“Put pressure on all your beliefs, social, political, scientific, and philosophical. Believe whatever you want by day; but at night, argue against all the propositions you hold most dear. Don’t cheat! To the greatest extent possible you have to think as though you believe what you don’t believe. And if you can’t talk yourself out of your existing beliefs, you’ll know a lot more about why you believe what you believe. You’ll have come a little closer to a proof.”

It could be worth a try.