correlation does not imply causation

205 results


pages: 250 words: 64,011

Everydata: The Misinformation Hidden in the Little Data You Consume Every Day by John H. Johnson

Affordable Care Act / Obamacare, Black Swan, business intelligence, Carmen Reinhart, cognitive bias, correlation does not imply causation, Daniel Kahneman / Amos Tversky, Donald Trump, en.wikipedia.org, Kenneth Rogoff, labor-force participation, lake wobegon effect, Long Term Capital Management, Mercator projection, Mercator projection distort size, especially Greenland and Africa, meta analysis, meta-analysis, Nate Silver, obamacare, p-value, PageRank, pattern recognition, publication bias, QR code, randomized controlled trial, risk-adjusted returns, Ronald Reagan, selection bias, statistical model, The Signal and the Noise by Nate Silver, Thomas Bayes, Tim Cook: Apple, wikimedia commons, Yogi Berra

This book isn’t meant to be a statistics textbook. Unfortunately, we don’t have the space to teach you how to run a perfect statistical analysis, or determine the exact correlation. But that’s okay, because our goal is simply to help you make better decisions by recognizing the difference between correlation and causation, and understanding some of the reasons that people confuse the two—so you can avoid making the same mistakes.

How to Be a Good Consumer of Correlation and Causation

So now, armed with a better understanding of the distinction between correlation and causation, here are some steps to keep in mind when consuming data about a statistical relationship:

1. Ask yourself what is being represented in the news article or research. Does the story actually use the phrase “causal” relationship? More often than not, a headline or article might appear to be implying causality, but if you actually dig deeper, you will find most of the actual research is only a discussion of some type of correlation.

2.

So if you really want a “Proud parent of an honor roll student” bumper sticker for your minivan, apparently all you need to do is get your kids glasses and an iPhone, have them watch a few Ronald Reagan speeches, play some Radiohead, don’t let them fall asleep before midnight, turn them into lefties, and start them drinking (once they reach legal age, of course). Have we lost our minds? No. We’ve just read a lot of studies and media reports that seem to draw the wrong conclusion from statistical analyses—specifically, reports and articles that confuse correlation with causation, and therefore, sometimes unintentionally, mislead the reader about the key takeaways. It’s important to note that there are two issues here: first of all, there are the original scientific studies that sometimes confuse correlation with causation. But what you’re more likely to encounter in your everyday life are newspaper articles and other media accounts that misreport the findings from valid scientific studies. We’ve seen many cases in which a finding is reported in the news as causation, even though the underlying study notes that it is only correlation.

From a statistical perspective, we can find lots of apparent connections between two factors, such as wearing glasses and having a high IQ. These types of connections—when there is some sort of relationship between data—are called correlations. But, as we’ll explore in this chapter, the mere existence of such a statistical relationship between two factors does not imply that there is actually a meaningful link between them. Correlation does not equal causation. It’s actually one of the most common ways that people misinterpret data. But don’t worry—in this chapter, we’ll take a close look at how and why people mistake correlation for causation, and give you the tools to help you understand which everydata you should really believe.

SMARTPHONES = SMART PEOPLE?

So, back to the smart people analysis. We dug a bit deeper into what the actual studies said, and uncovered some interesting caveats, warnings, and facts that might shed some light on these findings.


The Book of Why: The New Science of Cause and Effect by Judea Pearl, Dana Mackenzie

affirmative action, Albert Einstein, Asilomar, Bayesian statistics, computer age, computer vision, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, Edmond Halley, Elon Musk, en.wikipedia.org, experimental subject, Isaac Newton, iterative process, John Snow's cholera map, Loebner Prize, loose coupling, Louis Pasteur, Menlo Park, pattern recognition, Paul Erdős, personalized medicine, Pierre-Simon Laplace, placebo effect, prisoner's dilemma, probability theory / Blaise Pascal / Pierre de Fermat, randomized controlled trial, selection bias, self-driving car, Silicon Valley, speech recognition, statistical model, Stephen Hawking, Steve Jobs, strong AI, The Design of Experiments, the scientific method, Thomas Bayes, Turing test

Despite heroic efforts by the geneticist Sewall Wright (1889–1988), causal vocabulary was virtually prohibited for more than half a century. And when you prohibit speech, you prohibit thought and stifle principles, methods, and tools. Readers do not have to be scientists to witness this prohibition. In Statistics 101, every student learns to chant, “Correlation is not causation.” With good reason! The rooster’s crow is highly correlated with the sunrise; yet it does not cause the sunrise. Unfortunately, statistics has fetishized this commonsense observation. It tells us that correlation is not causation, but it does not tell us what causation is. In vain will you search the index of a statistics textbook for an entry on “cause.” Students are not allowed to say that X is the cause of Y—only that X and Y are “related” or “associated.” Because of this prohibition, mathematical tools to manage causal questions were deemed unnecessary, and statistics focused exclusively on how to summarize data, not on how to interpret it.

It was the first bridge ever built between causality and probability, the first crossing of the barrier between rung two and rung one on the Ladder of Causation. Having built this bridge, Wright could travel backward over it, from the correlations measured in the data (rung one) to the hidden causal quantities, d and h (rung two). He did this by solving algebraic equations. This idea must have seemed simple to Wright but turned out to be revolutionary because it was the first proof that the mantra “Correlation does not imply causation” should give way to “Some correlations do imply causation.”

FIGURE 2.7. Sewall Wright’s first path diagram, illustrating the factors leading to coat color in guinea pigs. D = developmental factors (after conception, before birth), E = environmental factors (after birth), G = genetic factors from each individual parent, H = combined hereditary factors from both parents, O, O′ = offspring. The objective of analysis was to estimate the strength of the effects of D, E, H (written as d, e, h in the diagram).

How is this possible? Did the coins somehow communicate with each other at light speed? Of course not. In reality you conditioned on a collider by censoring all the tails-tails outcomes. In The Direction of Time, published posthumously in 1956, philosopher Hans Reichenbach made a daring conjecture called the “common cause principle.” Rebutting the adage “Correlation does not imply causation,” Reichenbach posited a much stronger idea: “No correlation without causation.” He meant that a correlation between two variables, X and Y, cannot come about by accident. Either one of the variables causes the other, or a third variable, say Z, precedes and causes them both. Our simple coin-flip experiment proves that Reichenbach’s dictum was too strong, because it neglects to account for the process by which observations are selected.
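The coin-flip effect described above can be reproduced with a small simulation. This is a minimal sketch, not code from the book: the helper `pearson`, the seed, and the sample size are illustrative assumptions. It flips two independent fair coins, then discards every tails-tails pair, which is the censoring step the excerpt calls conditioning on a collider.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability

# 1 = heads, 0 = tails; the two coins are flipped independently.
flips = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(100_000)]
r_all = pearson([a for a, _ in flips], [b for _, b in flips])

# Censor the tails-tails outcomes: keep a pair only if at least one coin is heads.
kept = [(a, b) for a, b in flips if a or b]
r_kept = pearson([a for a, _ in kept], [b for _, b in kept])

print(f"all flips:           r = {r_all:+.3f}")
print(f"tails-tails removed: r = {r_kept:+.3f}")
```

With all flips included the correlation should sit near zero. After the censoring, the three surviving outcomes (heads-heads, heads-tails, tails-heads) are equally likely, which works out to a correlation of about -0.5 even though neither coin influences the other.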


pages: 579 words: 76,657

Data Science from Scratch: First Principles with Python by Joel Grus

correlation does not imply causation, natural language processing, Netflix Prize, p-value, Paul Graham, recommendation engine, SpamAssassin, statistical model

That is the sort of relationship that correlation looks for. In addition, correlation tells you nothing about how large the relationship is. The variables:

    x = [-2, -1, 0, 1, 2]
    y = [99.98, 99.99, 100, 100.01, 100.02]

are perfectly correlated, but (depending on what you’re measuring) it’s quite possible that this relationship isn’t all that interesting.

Correlation and Causation

You have probably heard at some point that “correlation is not causation,” most likely by someone looking at data that posed a challenge to parts of his worldview that he was reluctant to question. Nonetheless, this is an important point — if x and y are strongly correlated, that might mean that x causes y, that y causes x, that each causes the other, that some third factor causes both, or it might mean nothing. Consider the relationship between num_friends and daily_minutes.
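The point about correlation ignoring the size of a relationship can be checked directly. The sketch below is an illustration rather than the book's own code: it assumes x runs from -2 to 2 in steps of 1, and the helper name `pearson` and the variable `slope` are introduced here for the example.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

x = [-2, -1, 0, 1, 2]
y = [99.98, 99.99, 100, 100.01, 100.02]

r = pearson(x, y)

# Least-squares slope: how much y changes per unit change in x.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))

print(f"correlation = {r:.6f}")
print(f"slope       = {slope:.6f}")
```

The correlation comes out at essentially 1, while the slope is only about 0.01: y moves in lockstep with x, but barely moves at all.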

closeness centrality, Betweenness Centrality; clustering, Clustering-For Further Exploration; bottom-up hierarchical clustering, Bottom-up Hierarchical Clustering-Bottom-up Hierarchical Clustering; choosing k, Choosing k; example, clustering colors, Example: Clustering Colors; example, meetups, Example: Meetups-Example: Meetups; k-means clustering, The Model; clusters, Rescaling, The Idea; distance between, Bottom-up Hierarchical Clustering; code examples from this book, Using Code Examples; coefficient of determination, The Model; combiners (in MapReduce), An Aside: Combiners; comma-separated values files, Delimited Files; cleaning comma-delimited stock prices, Cleaning and Munging; command line, running Python scripts at, stdin and stdout; conditional probability, Conditional Probability; random variables and, Random Variables; confidence intervals, Confidence Intervals; confounding variables, Simpson’s Paradox; confusion matrix, Correctness; continue statement (Python), Control Flow; continuity correction, Example: Flipping a Coin; continuous distributions, Continuous Distributions; control flow (in Python), Control Flow; correctness, Correctness; correlation, Correlation; and causation, Correlation and Causation; in simple linear regression, The Model; other caveats, Some Other Correlational Caveats; outliers and, Correlation; Simpson's Paradox and, Simpson’s Paradox; correlation function, Simple Linear Regression; cosine similarity, User-Based Collaborative Filtering, Item-Based Collaborative Filtering; Counter (Python), Counter; covariance, Correlation; CREATE TABLE statement (SQL), CREATE TABLE and INSERT; cumulative distribution function (cdf), Continuous Distributions; currying (Python), Functional Tools; curse of dimensionality, The Curse of Dimensionality-The Curse of Dimensionality, User-Based Collaborative Filtering. D: D3.js library, Visualization; data: cleaning and munging, Cleaning and Munging; exploring, Exploring Your Data-Many Dimensions; finding, Find Data; getting, Getting Data-For Further Exploration; reading files, Reading Files-Delimited Files; scraping from web pages, Scraping the Web-Example: O’Reilly Books About Data; using APIs, Using APIs-Using Twython; using stdin and stdout, stdin and stdout; manipulating, Manipulating Data-Manipulating Data; rescaling, Rescaling-Rescaling; data mining, What Is Machine Learning?

Index. A: A/B test, Example: Running an A/B Test; accuracy, Correctness; of model performance, Correctness; all function (Python), Truthiness; Anaconda distribution of Python, Getting Python; any function (Python), Truthiness; APIs, using to get data, Using APIs-Using Twython; example, using Twitter APIs, Example: Using the Twitter APIs-Using Twython; getting credentials, Getting Credentials; using twython, Using Twython; finding APIs, Finding APIs; JSON (and XML), JSON (and XML); unauthenticated API, Using an Unauthenticated API; args and kwargs (Python), args and kwargs; argument unpacking, zip and Argument Unpacking; arithmetic: in Python, Arithmetic; performing on vectors, Vectors; artificial neural networks, Neural Networks (see also neural networks); assignment, multiple, in Python, Tuples. B: backpropagation, Backpropagation; bagging, Random Forests; bar charts, Bar Charts-Line Charts; Bayes's Theorem, Bayes’s Theorem, A Really Dumb Spam Filter; Bayesian Inference, Bayesian Inference; Beautiful Soup library, HTML and the Parsing Thereof, n-gram Models; using with XML data, JSON (and XML); Bernoulli trial, Example: Flipping a Coin; Beta distributions, Bayesian Inference; betweenness centrality, Betweenness Centrality-Betweenness Centrality; bias, The Bias-Variance Trade-off; additional data and, The Bias-Variance Trade-off; bigram model, n-gram Models; binary relationships, representing with matrices, Matrices; binomial random variables, The Central Limit Theorem, Example: Flipping a Coin; Bokeh project, Visualization; booleans (Python), Truthiness; bootstrap aggregating, Random Forests; bootstrapping data, Digression: The Bootstrap; bottom-up hierarchical clustering, Bottom-up Hierarchical Clustering-Bottom-up Hierarchical Clustering; break statement (Python), Control Flow; buckets, grouping data into, Exploring One-Dimensional Data; business models, Modeling. C: CAPTCHA, defeating with a neural network, Example: Defeating a CAPTCHA-Example: Defeating a CAPTCHA; causation, correlation and, Correlation and Causation, The Model; cdf (see cumulative distribution function); central limit theorem, The Central Limit Theorem, Confidence Intervals; central tendencies: mean, Central Tendencies; median, Central Tendencies; mode, Central Tendencies; quantile, Central Tendencies; centrality: betweenness, Betweenness Centrality-Betweenness Centrality; closeness, Betweenness Centrality; degree, Finding Key Connectors, Betweenness Centrality; eigenvector, Eigenvector Centrality-Centrality; classes (Python), Object-Oriented Programming; classification trees, What Is a Decision Tree?


pages: 219 words: 65,532

The Numbers Game: The Commonsense Guide to Understanding Numbers in the News, in Politics, and in Life by Michael Blastland, Andrew Dilnot

Atul Gawande, business climate, correlation does not imply causation, credit crunch, happiness index / gross national happiness, Intergovernmental Panel on Climate Change (IPCC), moral panic, pension reform, pensions crisis, randomized controlled trial, school choice, very high income

During medical trials of new drugs, it used to be customary to record anything that happened to a patient taking an experimental drug and say the drug might have caused it: “side effects,” they were called, as it was noted that someone had a headache or a runny nose and thereafter this “side effect” was printed forever on the side of the packet. Nowadays these are referred to as “adverse events,” making it clear that the cause was unclear and they might have had nothing to do with the medication. Restlessness for the true cause is a constructive habit, an insurance against gullibility. And though correlation does not prove causation, it is often a good hint, but a hint to start asking questions, not to settle for easy answers. There is one caveat. Here and there you will come across a tendency to dismiss almost all statistical findings as correlation-causation fallacy, a rhetorical cudgel, as one careful critic put it, to avoid believing any evidence. But we need to distinguish between casual associations often made for political ends and proper statistical studies. The latter come to their conclusions by trying to eliminate all the other possible causes through careful control of any trial, sample, or experiment, making sure if they can that there is no bias, that samples are random when possible.

Human (and sometimes animal) ability to see how one thing leads to another is prodigious—thank goodness, since it is vital to survival. But it also goes badly wrong. From applying it all the time, people acquire a headstrong tendency to see it everywhere, even where it isn’t. We see how one thing goes with another—and quickly conclude that it causes the other, and never more so than when the numbers or measurements seem to agree. This is the oldest fallacy in the book, that correlation proves causation, and also the most obdurate. And so it has been observed by smart researchers that overweight people live longer than thinner people, and therefore it was concluded that being overweight causes longer life. Does it? We will see. How do we train the instinct that serves us so well most of the time for the occasions when it doesn’t? Not by keeping it in check—it is genius at work—but by refusing to let it sleep.

That is a cheap joke. There are many possible causes of acne, even in lovers of heavy metal, the likelier culprits being teenage hormones and diet. Correlation—the apparent link between two separate things—does not prove causation: just because two things seem to go together doesn’t mean one brings about the other. This shouldn’t need saying, but it does, hourly. Get this wrong—mistake correlation for causation—and we flout one of the most elementary rules of statistics or logic. When we spot a fallacy of this kind lurking behind a claim, we cannot believe anyone could have fallen for it. That is, until tomorrow, when we miss precisely the same kind of fallacy and then see fit to say the claim is supported by compelling evidence. It is frighteningly easy to think in this way. Time and again someone measures a change in A, notes another in B, and declares one the mother of the other.


pages: 475 words: 134,707

The Hype Machine: How Social Media Disrupts Our Elections, Our Economy, and Our Health--And How We Must Adapt by Sinan Aral

Airbnb, Albert Einstein, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, Bernie Sanders, bitcoin, carbon footprint, Cass Sunstein, computer vision, coronavirus, correlation does not imply causation, COVID-19, Covid-19, crowdsourcing, cryptocurrency, death of newspapers, disintermediation, Donald Trump, Drosophila, Edward Snowden, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, experimental subject, facts on the ground, Filter Bubble, global pandemic, hive mind, illegal immigration, income inequality, Kickstarter, knowledge worker, longitudinal study, low skilled workers, Lyft, Mahatma Gandhi, Mark Zuckerberg, Menlo Park, meta analysis, meta-analysis, Metcalfe’s law, mobile money, move fast and break things, multi-sided market, Nate Silver, natural language processing, Network effects, performance metric, phenotype, recommendation engine, Robert Bork, Robert Shiller, Second Machine Age, sentiment analysis, shareholder value, skunkworks, Snapchat, social graph, social intelligence, social software, social web, statistical model, stem cell, Stephen Hawking, Steve Jobs, Telecommunications Act of 1996, The Chicago School, The Wisdom of Crowds, theory of mind, Tim Cook: Apple, Uber and Lyft, uber lyft, WikiLeaks, Yogi Berra

Taking Causality Seriously

I have an xkcd cartoon about the difference between correlation and causation on the door to my office at MIT. It depicts two friends talking. One friend says to the other, “I used to think correlation implied causation. Then I took a statistics class [and] now I don’t.” The other friend says, “Sounds like the class helped,” and the first friend replies, “Well, maybe.” It could be that the class taught the friend about the difference between correlation and causation. It could also just as easily be that the friend who took the class has an interest in and thus a proclivity to understand statistics. So maybe he “selected into” the class. This “selection effect” can explain the correlation between taking the class and understanding the difference between correlation and causation just as easily as the class teaching him about this difference.
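A toy simulation makes the selection effect concrete. Everything in this sketch is an assumption made for illustration: the `interest` variable, the probabilities, the seed, and the sample size are invented, and the class is deliberately given no effect at all on understanding.

```python
import random

rng = random.Random(1)  # fixed seed (an arbitrary choice) for repeatability

takers, others = [], []
for _ in range(50_000):
    # Hypothetical latent trait: interest in statistics, uniform on [0, 1).
    interest = rng.random()
    # More interested people are more likely to select into the class.
    takes_class = rng.random() < interest
    # Understanding depends only on interest; the class itself does nothing.
    understands = rng.random() < interest
    (takers if takes_class else others).append(understands)

rate_takers = sum(takers) / len(takers)
rate_others = sum(others) / len(others)

print(f"understand, took the class: {rate_takers:.2f}")
print(f"understand, did not:        {rate_others:.2f}")
```

Under these assumptions roughly two thirds of class-takers understand the distinction against roughly one third of everyone else, so a naive comparison would credit the class with a large effect it does not have.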

The performance of four predictive models is evaluated by their relative true positive and false positive rates. The area under the ROC curve represents model performance. The greater the area under a model’s curve and above the 45-degree dotted line, the better the model performs.

When Thomas Blake, Chris Nosko, and Steven Tadelis compared the ROI measures eBay was using to experimental measures that distinguished correlation from causation, they found brand search ad effectiveness was overestimated at eBay by up to 4,100 percent. Comparing traditional measures to a large experiment measuring the returns on Web display ads on Yahoo!, Randall Lewis and David Reiley found ROI inflation of 300 percent. In a large-scale experiment testing the effectiveness of retargeting ads compared to industry studies, Garrett Johnson, Randall Lewis, and Elmar Nubbemeyer found overestimates up to 1,600 percent.

So simple correlations in running behavior among friends don’t prove that friends influence each other to exercise. People who choose to run or bike in groups may simply be more committed to running or biking and may, therefore, run and bike longer. To understand whether digital peer effects motivate exercise and whether exercise is contagious, we need some way of distinguishing between correlation and causation. But while randomized experiments are the gold standard of causal inference and useful in a marketing context, we can’t go around randomly cattle-prodding some people to get off their couches to run. So to measure peer effects in running, we had to find another source of as-good-as-random variation in people’s running habits—something that motivates some people to run but has no effect on whether their friends run.


pages: 227 words: 62,177

Numbers Rule Your World: The Hidden Influence of Probability and Statistics on Everything You Do by Kaiser Fung

American Society of Civil Engineers: Report Card, Andrew Wiles, Bernie Madoff, Black Swan, business cycle, call centre, correlation does not imply causation, cross-subsidies, Daniel Kahneman / Amos Tversky, edge city, Emanuel Derman, facts on the ground, fixed income, Gary Taubes, John Snow's cholera map, moral hazard, p-value, pattern recognition, profit motive, Report Card for America’s Infrastructure, statistical model, the scientific method, traveling salesman

But imperfect information does not intimidate them; they seek models that fit the available evidence more tightly than all alternatives. Box’s writings on his experiences in the industry have inspired generations of statisticians; to get a flavor of his engaging style, see the collection Improving Almost Anything, lovingly produced by his former students. More ink than necessary has been spilled on the dichotomy between correlation and causation. Asking for the umpteenth time whether correlation implies causation is pointless (we already know it does not). The question “Can correlation be useful without causation?” is much more worthy of exploration. Forgetting what the textbooks say, most practitioners believe the answer is quite often yes. In the case of credit scoring, correlation-based statistical models have been wildly successful even though they do not yield simple explanations for why one customer is a worse credit risk than another.

It is implausible that something as variable as human behavior can be attributed to simple causes; modelers specializing in stock market investment and consumer behavior have also learned similar lessons. Statisticians in these fields have instead relied on accumulated learning from the past. The standard statistics book grinds to a halt when it comes to the topic of correlation versus causation. As readers, we may feel as if the authors have taken us along for the ride! After having plodded through the mathematics of regression modeling, we reach a section that screams, “Correlation is not causation!” and, “Beware of spurious correlations!” over and over. The bottom line, the writers tell us, is that almost nothing we have studied can prove causation; their motley techniques measure only correlation. The greatest statistician of his generation, Sir Ronald Fisher, famously scoffed at Hill’s technique to link cigarette smoking and lung cancer; he offered that the discovery of a gene that predisposes people to both smoking and cancer would discredit such a link.

As interesting as it would be to know how each step of a touring plan decreased their wait times, Testa’s millions of fans care about only one thing: whether the plan let them visit more rides, enhancing the value of their entry tickets. The legion of satisfied readers is testimony to the usefulness of this correlational model.

Polygraphs rely strictly on correlations between the act of lying and certain physiological metrics. Are correlations useful without causation? In this case, statisticians say no. To avoid falsely imprisoning innocent people based solely on evidence of correlation, they insist that lie detection technology adopt causal modeling of the type practiced in epidemiology. They caution against logical overreach: Liars breathe faster. Adam’s breaths quickened. Therefore, Adam was a liar. Deception, or stress related to it, is only one of many possible causes for the increase in breathing rate, so variations in this or similar measures need not imply lying.


pages: 397 words: 109,631

Mindware: Tools for Smart Thinking by Richard E. Nisbett

affirmative action, Albert Einstein, availability heuristic, big-box store, Cass Sunstein, choice architecture, cognitive dissonance, correlation coefficient, correlation does not imply causation, cosmological constant, Daniel Kahneman / Amos Tversky, dark matter, endowment effect, experimental subject, feminist movement, fixed income, fundamental attribution error, glass ceiling, Henri Poincaré, Intergovernmental Panel on Climate Change (IPCC), Isaac Newton, job satisfaction, Kickstarter, lake wobegon effect, libertarian paternalism, longitudinal study, loss aversion, low skilled workers, Menlo Park, meta analysis, meta-analysis, quantitative easing, Richard Thaler, Ronald Reagan, selection bias, Shai Danziger, Socratic dialogue, Steve Jobs, Steven Levy, the scientific method, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, William of Occam, Zipcar

Many of my fellow psychologists are going to be distressed by my bottom line here: such questions as whether academic success is affected by self-esteem, controlling for depression, or whether the popularity of fraternity brothers is affected by extroversion, controlling for neuroticism, or whether the number of hugs a person receives per day confers resistance to infection, controlling for age, educational attainment, frequency of social interaction, and a dozen other variables, are not answerable by MRA. What nature hath joined together, multiple regression analysis cannot put asunder.

No Correlation Doesn’t Mean No Causation

Correlation doesn’t prove causation. But the problem with correlational studies is worse than that. Lack of correlation doesn’t prove lack of causation—and this mistake is made possibly as often as the converse error. Does diversity training improve rates of hiring women and minorities? One study examined this question by quizzing human resource managers at seven hundred U.S. organizations about whether they had diversity training programs and by checking on the firms’ minority hiring rates filed with the Equal Employment Opportunity Commission. As it happens, having diversity training programs was unrelated to “the share of white women, black women, and black men in management.”
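A small numerical illustration shows how a variable can fully determine another while their correlation is zero. This is a standard textbook example chosen for brevity, not the diversity-training study discussed above; the data and the helper `pearson` are assumptions made for the sketch.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

# y is completely determined by x, but the dependence is not linear:
# negative and positive values of x push y up by the same amount.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson(x, y)
print(f"correlation between x and x**2: {r:.3f}")
```

The two halves of the data cancel each other, so the linear correlation vanishes even though changing x always changes y in a predictable way. Opposing effects in different subgroups can mask a real causal influence in much the same manner.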

The representativeness heuristic underlies many of our prior assumptions about correlation. If A is similar to B in some respect, we’re likely to see a relationship between them. The availability heuristic can also play a role. If the occasions when A is associated with B are more memorable than occasions when it isn’t, we’re particularly likely to overestimate the strength of the relationship. Correlation doesn’t establish causation, but if there’s a plausible reason why A might cause B, we readily assume that correlation does indeed establish causation. A correlation between A and B could be due to A causing B, B causing A, or something else causing both. We too often fail to consider these possibilities. Part of the problem here is that we don’t recognize how easy it is to “explain” correlations in causal terms. Reliability refers to the degree to which a case gets the same score on two occasions or when measured by different means.

A basic problem with MRA is that it typically assumes that the independent variables can be regarded as building blocks, with each variable taken by itself being logically independent of all the others. This is usually not the case, at least for behavioral data. Self-esteem and depression are intrinsically bound up with each other. It’s entirely artificial to ask whether one of those variables has an effect on a dependent variable independent of the effects of the other variable. Just as correlation doesn’t prove causation, absence of correlation fails to prove absence of causation. False-negative findings can occur using MRA just as false-positive findings do—because of the hidden web of causation that we’ve failed to identify.

12. Don’t Ask, Can’t Tell

How many questionnaire and survey results about people’s beliefs, values, or behavior will you read during your lifetime in newspapers, magazines, and business reports? Thousands, surely.


pages: 267 words: 71,123

End This Depression Now! by Paul Krugman

airline deregulation, Asian financial crisis, asset-backed security, bank run, banking crisis, Bretton Woods, business cycle, capital asset pricing model, Carmen Reinhart, centre right, correlation does not imply causation, credit crunch, Credit Default Swap, currency manipulation / currency intervention, debt deflation, Eugene Fama: efficient market hypothesis, financial deregulation, financial innovation, Financial Instability Hypothesis, full employment, German hyperinflation, Gordon Gekko, Hyman Minsky, income inequality, inflation targeting, invisible hand, Joseph Schumpeter, Kenneth Rogoff, liquidationism / Banker’s doctrine / the Treasury view, liquidity trap, Long Term Capital Management, low skilled workers, Mark Zuckerberg, money market fund, moral hazard, mortgage debt, negative equity, paradox of thrift, Paul Samuelson, price stability, quantitative easing, rent-seeking, Robert Gordon, Ronald Reagan, Upton Sinclair, We are the 99%, working poor, Works Progress Administration

Inequality and Crises

Before the financial crisis of 2008 struck, I would often give talks to lay audiences about income inequality, in which I would point out that top income shares had risen to levels not seen since 1929. Invariably there would be questions about whether that meant that we were on the verge of another Great Depression—and I would declare that this wasn’t necessarily so, that there was no reason extreme inequality would necessarily cause economic disaster. Well, whaddya know? Still, correlation is not the same as causation. The fact that a return to pre-Depression levels of inequality was followed by a return to depression economics could be just a coincidence. Or it could reflect common causes of both phenomena. What do we really know here, and what might we suspect? Common causation is almost surely part of the story. There was a major political turn to the right in the United States, the United Kingdom, and to some extent other countries circa 1980.

Before I can answer that question, I have to talk briefly about the pitfalls one needs to avoid.

The Trouble with Correlation

You might think that the way to assess the effects of government spending on the economy is simply to look at the correlation between spending levels and other things, like growth and employment. The truth is that even people who should know better sometimes fall into the trap of equating correlation with causation (see the discussion of debt and growth in chapter 8). But let me try to disabuse you of the notion that this is a useful procedure, by talking about a related question: the effects of tax rates on economic performance. As you surely know, it’s an article of faith on the American right that low taxes are the key to economic success. But suppose we look at the relationship between taxes—specifically, the share of GDP collected in federal taxes—and unemployment over the past dozen years.



pages: 147 words: 39,910

The Great Mental Models: General Thinking Concepts by Shane Parrish

Albert Einstein, Atul Gawande, Barry Marshall: ulcers, bitcoin, Black Swan, colonial rule, correlation coefficient, correlation does not imply causation, cuban missile crisis, Daniel Kahneman / Amos Tversky, dark matter, delayed gratification, feminist movement, index fund, Isaac Newton, Jane Jacobs, mandelbrot fractal, Pierre-Simon Laplace, Ponzi scheme, Richard Feynman, statistical model, stem cell, The Death and Life of Great American Cities, the map is not the territory, the scientific method, Thomas Bayes, Torches of Freedom

The factors correlate—meaning that alcohol consumption in parents has an inverse relationship with academic success in children. It is entirely possible that having parents who consume a lot of alcohol leads to worse academic outcomes for their children. It is also possible, however, that the reverse is true, or even that having kids who do poorly in school causes parents to drink more. Trying to invert the relationship can help you sort through claims to determine if you are dealing with true causation or just correlation. Causation Whenever correlation is imperfect, extremes will soften over time. The best will always appear to get worse and the worst will appear to get better, regardless of any additional action. This is called regression to the mean, and it means we have to be extra careful when diagnosing causation. This is something that the general media and sometimes even trained scientists fail to recognize.
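Regression to the mean can be seen in a small simulation. The sketch below is illustrative only: it assumes a made-up model in which every observed score is a stable "skill" plus independent luck, and the function name, sample size, and noise levels are all hypothetical choices rather than anything taken from the book.

```python
import random

def simulate_regression_to_mean(n=10_000, top_fraction=0.1, seed=0):
    """Return (mean test-1 score, mean test-2 score) for the top scorers on test 1."""
    rng = random.Random(seed)
    # Assumed model: observed score = stable skill + independent luck on each test.
    skill = [rng.gauss(0, 1) for _ in range(n)]
    first = [s + rng.gauss(0, 1) for s in skill]
    second = [s + rng.gauss(0, 1) for s in skill]
    # Pick the best performers using the first test only.
    k = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: first[i], reverse=True)[:k]
    mean_first = sum(first[i] for i in top) / k
    mean_second = sum(second[i] for i in top) / k
    return mean_first, mean_second

mean_first, mean_second = simulate_regression_to_mean()
print(f"top group, test 1: {mean_first:.2f}; same group, test 2: {mean_second:.2f}")
```

Nothing is done to the top group between the two tests, yet its average falls on the second one, because part of what put those people at the top was luck that does not repeat. Any intervention applied between the tests would appear to have made them worse.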


Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth by Stuart Ritchie

Albert Einstein, anesthesia awareness, Bayesian statistics, Carmen Reinhart, Cass Sunstein, citation needed, Climatic Research Unit, cognitive dissonance, complexity theory, coronavirus, correlation does not imply causation, COVID-19, Covid-19, crowdsourcing, deindustrialization, Donald Trump, double helix, en.wikipedia.org, epigenetics, Estimating the Reproducibility of Psychological Science, Growth in a Time of Debt, Kenneth Rogoff, l'esprit de l'escalier, meta analysis, meta-analysis, microbiome, Milgram experiment, mouse model, New Journalism, p-value, phenotype, placebo effect, profit motive, publication bias, publish or perish, race to the bottom, randomized controlled trial, recommendation engine, rent-seeking, replication crisis, Richard Thaler, risk tolerance, Ronald Reagan, Scientific racism, selection bias, Silicon Valley, Silicon Valley startup, Stanford prison experiment, statistical model, stem cell, Steven Pinker, Thomas Bayes, twin studies, University of East Anglia

Incidentally, we shouldn’t forget what we saw in the previous chapter, about the low quality of much of the research on animals like mice. 22. You often see a slightly different version: ‘correlation does not imply causation’. There’s an ambiguity here due to the multiple meanings of the word ‘imply’. In its strong definition (thing A logically involves thing B, in the same way that the existence of, say, a dance implies that there’s a dancer), it’s certainly true. But in its weak version (thing A suggests thing B without explicitly saying thing B, in the same way that receiving a slightly terse email from your boss might imply that they’re unhappy with you), then it’s not correct. In that weak sense, a correlation does sometimes imply causation, even if there’s no causation there at all. Put it this way: if correlation didn’t imply causation in this latter sense, there wouldn’t be so much confusion between the two.

The psychophysiologist James Heathers has set up a novelty Twitter account that exists solely to retweet misleading news headlines from translational studies, such as ‘Scientists Develop Jab that Stops Craving for Junk Food’ or ‘Compounds in Carrots Reverse Alzheimer’s-Like Symptoms’ with a simple but accurate addition: ‘… IN MICE’.21 The third kind of hype found by the Cardiff team was possibly the most embarrassing. Everyone, especially scientists, is supposed to know that correlation is not causation.22 This basic insight is taught in every elementary statistics course and is a perennial feature of public debates about science, education, economics and more. When scientists look at an observational dataset, where data have been gathered without any randomised experimental intervention – say, a study charting the growth in children’s vocabulary as they get older – they’re generally just looking at correlations.

The statistician Matthew Hankins has amassed a collection of genuine quotes from published papers where p-values remained stubbornly above that threshold, but whose authors clearly had a strong desire for significant results:

• ‘a trend that approached significance’ (for a result reported as ‘p < 0.06’)
• ‘fairly significant’ (p = 0.09)
• ‘significantly significant’ (p = 0.065)
• ‘narrowly eluded statistical significance’ (p = 0.0789)
• ‘hovered around significance’ (p = 0.061)
• ‘very closely brushed the limit of statistical significance’ (p = 0.051)
• ‘not absolutely significant but very probably so’ (p > 0.05)62

There’s a whole literature of studies by scientific spin-watchers, each of them highlighting spin in their own fields. 15 per cent of trials in obstetrics and gynaecology spun their non-significant results as if they showed benefits of the treatment.63 35 per cent of studies of prognostic tests for cancer used spin to obfuscate the non-significant nature of their findings.64 47 per cent of trials of obesity treatments published in top journals were spun in some way.65 83 per cent of papers reporting trials of antidepressant and anxiety medication failed to discuss important limitations of their study design.66 A review of brain-imaging studies concluded that hyping up correlation into causation was ‘rampant’.67 Some forms of spin shade into fraud, or at least gross incompetence: a 2009 review showed that, of a sample of studies published in Chinese medical journals that claimed to be randomised controlled trials, only 7 per cent actually used randomisation.68 Even meta-analyses aren’t safe, as we’ve seen before. A 2017 review of meta-analyses on diagnostic tests (for example, blood tests for Alzheimer’s disease) found that 50 per cent of them drew a positive conclusion about how well the test worked despite finding trivial, statistically non-significant effects in their analyses.


pages: 624 words: 127,987

The Personal MBA: A World-Class Business Education in a Single Volume by Josh Kaufman

Albert Einstein, Atul Gawande, Black Swan, business cycle, business process, buy low sell high, capital asset pricing model, Checklist Manifesto, cognitive bias, correlation does not imply causation, Credit Default Swap, Daniel Kahneman / Amos Tversky, David Heinemeier Hansson, David Ricardo: comparative advantage, Dean Kamen, delayed gratification, discounted cash flows, Donald Knuth, double entry bookkeeping, Douglas Hofstadter, en.wikipedia.org, Frederick Winslow Taylor, George Santayana, Gödel, Escher, Bach, high net worth, hindsight bias, index card, inventory management, iterative process, job satisfaction, Johann Wolfgang von Goethe, Kevin Kelly, Kickstarter, Lao Tzu, lateral thinking, loose coupling, loss aversion, Marc Andreessen, market bubble, Network effects, Parkinson's law, Paul Buchheit, Paul Graham, place-making, premature optimization, Ralph Waldo Emerson, rent control, side project, statistical model, stealth mode startup, Steve Jobs, Steve Wozniak, subscription business, telemarketer, the scientific method, time value of money, Toyota Production System, tulip mania, Upton Sinclair, Vilfredo Pareto, Walter Mischel, Y Combinator, Yogi Berra

Midranges are best used for quick estimates—they’re fast, and you only need to know two data points, but they can be easily skewed by outliers that are abnormally high or low, like Bill Gates’s bank balance. Means, Medians, Modes, and Midranges are useful analytical tools that can indicate typical results—provided you’re careful enough to use the right tool for the job. SHARE THIS CONCEPT: http://book.personalmba.com/mean-median-mode-midrange/ Correlation and Causation Correlation isn’t causation, but it sure is a hint. —EDWARD TUFTE, STATISTICIAN, INFORMATION DESIGN EXPERT, AND PROFESSOR AT YALE UNIVERSITY Imagine a billiards table: if you know the exact position of every ball on the table and the details of the forces applied to the cue ball (impact vector, impact force, location of impact, table friction, and air resistance), you can calculate exactly how the cue ball will travel and how it will affect other balls it hits along the way.

Here’s another thought experiment, using hypothetical data: people who suffer heart attacks eat, on average, 57 bacon double cheeseburgers every year. Does eating bacon double cheeseburgers cause heart attacks? Not necessarily. People who suffer heart attacks typically take 365 showers a year and blink their eyes 5.6 million times a year. Do taking showers and blinking your eyes cause heart attacks as well? Correlation is not Causation. Even if you notice that one measurement is highly associated with another, that does not prove that one thing caused the other. Imagine you own a pizza parlor, and you create a thirty-second advertisement to air on local television. Shortly after the commercial goes live, you notice a 30 percent increase in sales. Did the advertisement cause the increase? Not necessarily—the increase could be due to any number of factors.

For example, if you know that families go out to celebrate the end of school or that an annual convention is coming up, you can adjust for that seasonality by using historical data. The more you can isolate the change you made in the system from other factors, the more confidence you can have that the change you made intentionally actually caused the results you see. SHARE THIS CONCEPT: http://book.personalmba.com/correlation-causation/ Norms Those who cannot remember the past are condemned to repeat it. —GEORGE SANTAYANA, PHILOSOPHER, ESSAYIST, AND APHORIST If you want to compare the effectiveness of something in the present, it’s often useful to learn from the past. Norms are measures that use historical data as a tool to provide Context for current Measurements. For example, by looking at past data you may discover trends in your sales data directly related to the date the sale was made, which is called seasonality.


pages: 347 words: 99,969

Through the Language Glass: Why the World Looks Different in Other Languages by Guy Deutscher

Alfred Russel Wallace, correlation does not imply causation, Kickstarter, offshore financial centre, pattern recognition, Ralph Waldo Emerson, Sapir-Whorf hypothesis, Silicon Valley, Steven Pinker

Since the rooms face each other (rather like rooms 1 and 2 in the picture shown here), and since they have been arranged to look the same from the egocentric perspective, they are actually north-side-south. In his room the bed was in the north, in yours it is in the south; the telephone that in his room was in the west is now in the east. So while you will see and remember the same room twice, the Guugu Yimithirr speaker will see and remember two different rooms. CORRELATION OR CAUSATION? One of the most tempting and most common of all logical fallacies is to jump from correlation to causation: to assume that just because two facts correlate, one of them was the cause of the other. To reduce this kind of logic ad absurdum, I could advance the brilliant new theory that language can affect your hair color. In particular, I claim that speaking Swedish makes your hair go blond and speaking Italian makes your hair go dark. My proof?

There is no evidence of formal tuition in geographic coordinates at an early age (although there is evidence from Bali of some geographically relevant religious practices, such as putting children to bed with the head pointing in a particular geographic direction). So the only imaginable mechanism that could provide such intense drilling in orientation at such a young age is the spoken language—the need to know the directions in order to be able to communicate about the simplest aspects of everyday life. There is thus a compelling case that the relation between language and spatial thinking is not just correlation but causation, and that one’s mother tongue affects how one thinks about space. In particular, a language like Guugu Yimithirr, which forces its speakers to use geographic coordinates at all times, must be a crucial factor in bringing about the perfect pitch for directions and the corresponding patterns of memory that seem so weird and unattainable to us. Two centuries after Guugu Yimithirr bequeathed “kangaroo” to the world, its last remaining speakers gave the world a harsh lesson in philosophy and psychology.

And while this is as much as we can say with absolute certainty, it is plausible to go one step further and make the following inference: since people tend to react more quickly to color recognition tasks the farther apart the two colors appear to them, and since Russians react more quickly to shades across the siniy-goluboy border than what the objective distance between the hues would imply, it is plausible to conclude that neighboring hues around the border actually appear farther apart to Russian speakers than they are in objective terms. Of course, even if differences between the behavior of Russian and English speakers have been demonstrated objectively, it is always dangerous to jump automatically from correlation to causation. How can we be sure that the Russian language in particular—rather than anything else in the Russians’ background and upbringing—had any causal role in producing their response to colors near the border? Maybe the real cause of their quicker reaction time lies in the habit of Russians to spend hours on end gazing intently at the vast expanses of Russian sky? Or in years of close study of blue vodka?


pages: 337 words: 86,320

Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz

affirmative action, AltaVista, Amazon Mechanical Turk, Asian financial crisis, Bernie Sanders, big data - Walmart - Pop Tarts, Cass Sunstein, computer vision, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, desegregation, Donald Trump, Edward Glaeser, Filter Bubble, game design, happiness index / gross national happiness, income inequality, Jeff Bezos, John Snow's cholera map, longitudinal study, Mark Zuckerberg, Nate Silver, peer-to-peer lending, Peter Thiel, price discrimination, quantitative hedge fund, Ronald Reagan, Rosa Parks, sentiment analysis, Silicon Valley, statistical model, Steve Jobs, Steven Levy, Steven Pinker, TaskRabbit, The Signal and the Noise by Nate Silver, working poor

And the last minute of this game will do something that, for an economist, is far more profound: the last sixty seconds will help finally tell us, once and for all, Do advertisements work? The notion that ads improve sales is obviously crucial to our economy. But it is maddeningly hard to prove. In fact, this is a textbook example of exactly how difficult it is to distinguish between correlation and causation. There’s no doubt that products that advertise the most also have the highest sales. Twentieth Century Fox spent $150 million marketing the movie Avatar, which became the highest-grossing film of all time. But how much of the $2.7 billion in Avatar ticket sales was due to the heavy marketing? Part of the reason 20th Century Fox spent so much money on promotion was presumably that they knew they had a desirable product.

If we did this, we would find that students who went to Stuyvesant score much higher on standardized tests and get accepted to substantially better universities. But as we’ve seen already in this chapter, this kind of evidence, by itself, is not convincing. Maybe the reason Stuyvesant students perform so much better is that Stuy attracts much better students in the first place. Correlation here does not prove causation. To test the causal effects of Stuyvesant High School, we need to compare two groups that are almost identical: one that got the Stuy treatment and one that did not. We need a natural experiment. But where can we find it? The answer: students, like Yilmaz, who scored very, very close to the cutoff necessary to attend Stuyvesant.* Students who just missed the cutoff are the control group; students who just made the cut are the treatment group.
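The logic of comparing students just above and just below the cutoff can be sketched with simulated data. Everything here is an assumption for illustration: the score model, the cutoff of 1.0, the band width, and the choice to set the school's true effect to zero are all hypothetical, and none of it reflects real Stuyvesant data.

```python
import random

def simulate_cutoff_comparison(n=100_000, cutoff=1.0, band=0.05,
                               true_effect=0.0, seed=1):
    """Return (naive gap, near-cutoff gap) in later outcomes, admitted minus rejected."""
    rng = random.Random(seed)
    everyone = {True: [], False: []}
    near = {True: [], False: []}
    for _ in range(n):
        ability = rng.gauss(0, 1)
        entrance = ability + rng.gauss(0, 0.5)      # noisy entrance exam score
        admitted = entrance >= cutoff
        # Later outcome depends on ability, plus the school's effect if admitted.
        later = ability + rng.gauss(0, 0.5) + (true_effect if admitted else 0.0)
        everyone[admitted].append(later)
        if abs(entrance - cutoff) <= band:          # students right at the margin
            near[admitted].append(later)

    def mean(xs):
        return sum(xs) / len(xs)

    naive_gap = mean(everyone[True]) - mean(everyone[False])
    near_gap = mean(near[True]) - mean(near[False])
    return naive_gap, near_gap

naive_gap, near_gap = simulate_cutoff_comparison()
print(f"all admitted vs all rejected: {naive_gap:.2f}")
print(f"just above vs just below the cutoff: {near_gap:.2f}")
```

With the true effect set to zero, the naive comparison still shows a large gap, because admitted students were stronger to begin with. The near-cutoff comparison shrinks toward the true effect, since students on either side of the line are almost identical in ability.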

Take college. Does it matter if you go to one of the best universities in the world, such as Harvard, or a solid school such as Penn State? Once again, there is a clear correlation between the ranking of one’s school and how much money people make. Ten years into their careers, the average graduate of Harvard makes $123,000. The average graduate of Penn State makes $87,800. But this correlation does not imply causation. Two economists, Stacy Dale and Alan B. Krueger, thought of an ingenious way to test the causal role of elite universities on the future earning potential of their graduates. They had a large dataset that tracked a whole host of information on high school students, including where they applied to college, where they were accepted to college, where they attended college, their family background, and their income as adults.


pages: 829 words: 186,976

The Signal and the Noise: Why So Many Predictions Fail-But Some Don't by Nate Silver

"Robert Solow", airport security, availability heuristic, Bayesian statistics, Benoit Mandelbrot, Berlin Wall, Bernie Madoff, big-box store, Black Swan, Broken windows theory, business cycle, buy and hold, Carmen Reinhart, Claude Shannon: information theory, Climategate, Climatic Research Unit, cognitive dissonance, collapse of Lehman Brothers, collateralized debt obligation, complexity theory, computer age, correlation does not imply causation, Credit Default Swap, credit default swaps / collateralized debt obligations, cuban missile crisis, Daniel Kahneman / Amos Tversky, diversification, Donald Trump, Edmond Halley, Edward Lorenz: Chaos theory, en.wikipedia.org, equity premium, Eugene Fama: efficient market hypothesis, everywhere but in the productivity statistics, fear of failure, Fellow of the Royal Society, Freestyle chess, fudge factor, George Akerlof, global pandemic, haute cuisine, Henri Poincaré, high batting average, housing crisis, income per capita, index fund, information asymmetry, Intergovernmental Panel on Climate Change (IPCC), Internet Archive, invention of the printing press, invisible hand, Isaac Newton, James Watt: steam engine, John Nash: game theory, John von Neumann, Kenneth Rogoff, knowledge economy, Laplace demon, locking in a profit, Loma Prieta earthquake, market bubble, Mikhail Gorbachev, Moneyball by Michael Lewis explains big data, Monroe Doctrine, mortgage debt, Nate Silver, negative equity, new economy, Norbert Wiener, PageRank, pattern recognition, pets.com, Pierre-Simon Laplace, prediction markets, Productivity paradox, random walk, Richard Thaler, Robert Shiller, Robert Shiller, Rodney Brooks, Ronald Reagan, Saturday Night Live, savings glut, security theater, short selling, Skype, statistical model, Steven Pinker, The Great Moderation, The Market for Lemons, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, too big to fail, 
transaction costs, transfer pricing, University of East Anglia, Watson beat the top human players on Jeopardy!, wikimedia commons

However, in order to achieve that purity, it denies the need for Bayesian priors or any other sort of messy real-world context. These methods neither require nor encourage us to think about the plausibility of our hypothesis: the idea that cigarettes cause lung cancer competes on a level playing field with the idea that toads predict earthquakes. It is, I suppose, to Fisher’s credit that he recognized that correlation does not always imply causation. However, the Fisherian statistical methods do not encourage us to think about which correlations imply causations and which ones do not. It is perhaps no surprise that after a lifetime of thinking this way, Fisher lost the ability to tell the difference. Bob the Bayesian In the Bayesian worldview, prediction is the yardstick by which we measure progress. We can perhaps never know the truth with 100 percent certainty, but making correct predictions is the way to tell if we’re getting closer.

As Hatzius sees it, economic forecasters face three fundamental challenges. First, it is very hard to determine cause and effect from economic statistics alone. Second, the economy is always changing, so explanations of economic behavior that hold in one business cycle may not apply to future ones. And third, as bad as their forecasts have been, the data that economists have to work with isn’t much good either. Correlations Without Causation The government produces data on literally 45,000 economic indicators each year.24 Private data providers track as many as four million statistics.25 The temptation that some economists succumb to is to put all this data into a blender and claim that the resulting gruel is haute cuisine. There have been only eleven recessions since the end of World War II.26 If you have a statistical model that seeks to explain eleven outputs but has to choose from among four million inputs to do so, many of the relationships it identifies are going to be spurious.
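The arithmetic behind this warning can be illustrated with pure noise. The sketch below is a toy, not a model of any real indicator: it assumes eleven random "outcomes" and searches 20,000 random candidate series (far fewer than four million, to keep it quick) for the one that correlates best. The function names and counts are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def best_spurious_fit(n_outcomes=11, n_candidates=20_000, seed=2):
    """Strongest absolute correlation found among unrelated random series."""
    rng = random.Random(seed)
    outcomes = [rng.gauss(0, 1) for _ in range(n_outcomes)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_outcomes)]
        best = max(best, abs(pearson(candidate, outcomes)))
    return best

best = best_spurious_fit()
print(f"strongest |r| among pure-noise indicators: {best:.2f}")
```

Every candidate is unrelated to the outcomes by construction, yet the best of them fits very well. With so few data points and so many candidates, a strong correlation is close to guaranteed and says nothing about cause.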

But it has given roughly as many false alarms—including most infamously in 1984, when it sharply declined for three straight months,34 signaling a recession, but the economy continued to zoom upward at a 6 percent rate of growth. Some studies have even claimed that the Leading Economic Index has no predictive power at all when applied in real time.35 “There’s very little that’s really predictive,” Hatzius told me. “Figuring out what’s truly causal and what’s correlation is very difficult to do.” Most of you will have heard the maxim “correlation does not imply causation.” Just because two variables have a statistical relationship with each other does not mean that one is responsible for the other. For instance, ice cream sales and forest fires are correlated because both occur more often in the summer heat. But there is no causation; you don’t light a patch of the Montana brush on fire when you buy a pint of Häagen-Dazs. If this concept is easily expressed, however, it can be hard to apply in practice, particularly when it comes to understanding the causal relationships in the economy.
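The ice cream example is a case of a common cause, and it can be sketched numerically. The data below are simulated under assumed, made-up coefficients in which temperature drives both series and neither affects the other. The adjustment shown is the standard first-order partial correlation, used here as one simple way to control for a third variable.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear influence of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(3)
# Hypothetical daily data: temperature is the only shared driver.
temperature = [rng.gauss(20, 8) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
fires = [0.5 * t + rng.gauss(0, 4) for t in temperature]

raw = pearson(ice_cream, fires)
adjusted = partial_correlation(
    raw, pearson(ice_cream, temperature), pearson(fires, temperature)
)
print(f"raw correlation: {raw:.2f}; holding temperature fixed: {adjusted:.2f}")
```

The raw correlation is substantial, but it collapses toward zero once temperature is accounted for. In real economic data the difficulty is that the confounders are rarely this obvious or this easy to measure.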


pages: 442 words: 94,734

The Art of Statistics: Learning From Data by David Spiegelhalter

Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Carmen Reinhart, complexity theory, computer vision, correlation coefficient, correlation does not imply causation, dark matter, Edmond Halley, Estimating the Reproducibility of Psychological Science, Hans Rosling, Kenneth Rogoff, meta analysis, meta-analysis, Nate Silver, Netflix Prize, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, speech recognition, statistical model, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus

In other words, wealthy people with higher education are more likely to be diagnosed and get their tumour registered, an example of what is known as ascertainment bias in epidemiology. ‘Correlation Does Not Imply Causation’ We saw in the last chapter how Pearson’s correlation coefficient measures how close the points on a scatter-plot are to a straight line. When considering English hospitals conducting children’s heart surgery in the 1990s, and plotting the number of cases against their survival, the high correlation showed that bigger hospitals were associated with lower mortality. But we could not conclude that bigger hospitals caused the lower mortality. This cautious attitude has a long pedigree. When Karl Pearson’s newly developed correlation coefficient was being discussed in the journal Nature in 1900, a commentator warned that ‘correlation does not imply causation’. In the succeeding century this phrase has been a mantra repeatedly uttered by statisticians when confronted by claims based on simply observing that two things tend to vary together.
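For reference, the usual textbook definition of Pearson's correlation coefficient for paired observations is given below. The notation is an assumption added here, not taken from the excerpt: $x_i$ and $y_i$ are the paired values and $\bar{x}$, $\bar{y}$ their means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The value lies between -1 and 1, reaching either extreme only when the points fall exactly on a straight line. The formula is symmetric in $x$ and $y$, so it cannot say which variable, if either, is driving the other.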

It is easy for researchers to randomize if all they have to do is change a website: there is no effort to recruit participants since they don’t even know they are the subjects of an experiment, and there is no need to get ethical approval to use them as guinea pigs. But randomization is often difficult and sometimes impossible: we can’t test the effect of our habits by randomizing people to smoke or eat unhealthy diets (even though such experiments are performed on animals). When the data does not arise from an experiment, it is said to be observational. So often we are left with trying as best we can to sort out correlation from causation by using good design and statistical principles applied to observational data, combined with a healthy dose of scepticism. The issue of old men’s ears might be rather less important than some of the topics in this book, but illustrates the need for choosing study designs that are appropriate for answering questions. Taking a problem-solving approach based on the PPDAC cycle, the Problem is that, certainly based on my personal observation, old men often seem to have big ears.

So a form of regression has been developed for proportions, called logistic regression, which ensures a curve which cannot go above 100% or below 0%. Even without taking Bristol into account, hospitals with more patients had better survival rates, and the logistic regression coefficient (0.001) means the mortality rate is expected to be around 10% lower (relatively) for each additional 100 operations that a hospital conducts on under-1s over a four-year period. Of course, to use what is now rather a cliché, correlation does not mean causation, and we cannot conclude that bigger throughput is the reason for the better performance: as we mentioned previously, there could even be reverse causation, with hospitals with a good reputation attracting more patients. Figure 5.2: Fitted logistic regression model for child heart surgery data for under-1s in UK hospitals between 1991 and 1995. Hospitals treating more patients have better survival.
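The "around 10% lower" figure in the excerpt above can be checked with a little arithmetic. The sketch below assumes the quoted coefficient (0.001) is the change in the log-odds of survival per additional operation; the function name is made up for illustration, and the odds ratio only approximates a relative change in the mortality rate when mortality is fairly low.

```python
import math

# Assumption: 0.001 is the change in log-odds of survival per extra operation,
# as the excerpt's wording suggests. The excerpt does not spell this out.
COEF_PER_OPERATION = 0.001


def mortality_odds_ratio(extra_operations: float,
                         coef: float = COEF_PER_OPERATION) -> float:
    """Multiplicative change in the odds of death for a given rise in caseload."""
    # Higher log-odds of survival means lower log-odds of death, hence the minus sign.
    return math.exp(-coef * extra_operations)


ratio = mortality_odds_ratio(100)
print(f"odds ratio for death per 100 extra operations: {ratio:.3f}")
print(f"relative reduction in the odds of death: {(1 - ratio) * 100:.1f}%")
```

The calculation describes an association only. As the excerpt says, it cannot tell us whether higher throughput improves survival or whether well-regarded hospitals simply attract more patients.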


pages: 566 words: 153,259

The Panic Virus: The True Story Behind the Vaccine-Autism Controversy by Seth Mnookin

Albert Einstein, British Empire, Cass Sunstein, cognitive dissonance, correlation does not imply causation, Daniel Kahneman / Amos Tversky, en.wikipedia.org, illegal immigration, index card, Isaac Newton, loss aversion, meta analysis, meta-analysis, mouse model, neurotypical, pattern recognition, placebo effect, Richard Thaler, Saturday Night Live, selection bias, Solar eclipse in 1919, Stephen Hawking, Steven Pinker, the scientific method, Thomas Kuhn: the structure of scientific revolutions

Before continuing with the nationwide effort, his aides said, public health officials needed to promise that there would not be any more children who were diagnosed with polio after being vaccinated. That, as anyone with an elementary understanding of immunology knew, was an impossible guarantee to provide, and so, instead of trusting people to understand and accept that there are risks with every medical procedure and that correlation does not equal causation, or trying to explain that the problems appeared to be related to the specific conditions under which the infected batches had been produced and not with the safety of the vaccine generally, the government took the one step guaranteed to undermine public confidence: On May 7, Scheele announced that the polio vaccine program was being shut down so that the government, “with the help of the manufacturers,” could undertake “a reappraisal of all of their tests and procedures.”13 “The Public Health Service believes that every single step in the interest of safety must be taken,” he said.

., which was delivered under the protection of four policemen. Proving that the media’s frenzy for beating competitors by mere minutes is not a product of the Internet age, NBC immediately broke the embargo, and was just as quickly denounced by its competitors as forever tainting the sanctity of agreements made between reporters and their sources. 13 The difficulty in determining whether correlation equals causation causes an enormous number of misapprehensions. Until a specific mechanism demonstrating how A causes B is identified, it’s best to assume that any correlation is incidental, or that both A and B relate independently to some third factor. An example that highlights this is the correlation between drinking milk and cancer rates, which some advocacy groups (including People for the Ethical Treatment of Animals) use to argue that drinking milk causes cancer.

In order to square that circle, Kirby likened the dispute to a political campaign in which an “insurgent candidate” comes under “heavy fire from an entrenched opponent . . . the vitriol demonstrates that the challenge is being taken seriously, that it poses a realistic threat to the status quo.” In this political battle, Kirby employed a time-honored tactic of push pollers and ward politicians: He used an ominous-sounding claim—“Curiously, the first case of autism was not recorded until the early 1940s, a few years after thimerosal was introduced in vaccines”—to make his accusation sound as if it was idle speculation. In this case, Kirby both blurred the difference between correlation and causation and conflated the first time a disease is given a particular label with the first time it appears in a population. (It was a little like saying, “Curiously, schizophrenia was not identified as a disorder until the late 1880s, a few years after Alexander Graham Bell invented the telephone.”) He also larded his writing with conditional statements and passive constructions: Eli Lilly “reportedly earn[ed] a profit” by licensing thimerosal to other drug companies; “the American health establishment . . . understandably has an interest in proving the unpleasant [thimerosal] theory wrong.”


pages: 531 words: 125,069

The Coddling of the American Mind: How Good Intentions and Bad Ideas Are Setting Up a Generation for Failure by Greg Lukianoff, Jonathan Haidt

AltaVista, Bernie Sanders, bitcoin, Black Swan, cognitive dissonance, correlation does not imply causation, demographic transition, Donald Trump, Ferguson, Missouri, Filter Bubble, helicopter parent, hygiene hypothesis, income inequality, Internet Archive, Isaac Newton, low skilled workers, Mahatma Gandhi, mass immigration, mass incarceration, means of production, moral panic, Nelson Mandela, Ralph Nader, risk tolerance, Silicon Valley, Snapchat, Steven Pinker, The Bell Curve by Richard Herrnstein and Charles Murray, Unsafe at Any Speed

Some institutions or companies make it harder for members of one group to succeed, as can be seen in recent books and articles about the toxic “bro culture” of Silicon Valley,37 which violates the dignity and rights of women (procedural injustice) while denying them the status, promotions, and pay that they deserve based on the quality of their work (distributive injustice). When you see a situation in which some groups are underrepresented, it is an invitation to investigate and find out whether there are obstacles, a hostile climate, or systemic factors that have a disparate impact on members of those groups. But how can you know whether unequal outcomes truly reveal a violation of justice? Correlation Does Not Imply Causation All social scientists know that correlation does not imply causation. If A and B seem to be linked—that is, they change together over time or are found together in a population at levels higher than chance would predict—then it is certainly possible that A caused B. But it’s also possible that B caused A (reverse causation) or that a third variable, C, caused both A and B and there is no direct relationship between A and B.

An article about the study that was published at Gawker.com featured this headline: MORE BUCK FOR YOUR BANG: PEOPLE WHO HAVE MORE SEX MAKE THE MOST MONEY.38 The headline suggested that A (sex) causes B (money), which is surely the best causal path to choose if your goal is to entice people to click on your article. But any social scientist presented with that correlation would instantly wonder about reverse causation (does having more money cause people to have more sex?) and would then move on to a third-variable explanation, which in this case seems to be the correct one.39 The Gawker story itself noted that people who are more extraverted have more sex and also make more money. In this case, a third variable, C (extraversion, or high sociability) may cause both A (more sex) and B (more money). Social scientists analyze correlations like this constantly (to the great annoyance of friends and family). They are self-appointed conversation referees, throwing a yellow penalty flag when anyone tries to interpret a correlation as evidence of causation. But a funny thing has been happening in recent years on campus.
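The third-variable explanation in the excerpt above can be illustrated with a small simulation. This is a minimal sketch on made-up data: C stands in for extraversion, A and B each depend on C plus independent noise, and neither affects the other. The helper functions `pearson` and `residuals` are written here for illustration and do not come from the book.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares fit on xs (i.e. 'controlling for' xs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
c = [rng.gauss(0, 1) for _ in range(n)]   # C: hypothetical extraversion scores
a = [ci + rng.gauss(0, 1) for ci in c]    # A is driven by C only
b = [ci + rng.gauss(0, 1) for ci in c]    # B is driven by C only

raw = pearson(a, b)                                   # clearly positive
partial = pearson(residuals(a, c), residuals(b, c))   # close to zero

print(f"corr(A, B) ignoring C:        {raw:.2f}")
print(f"corr(A, B) controlling for C: {partial:.2f}")
```

A and B are visibly correlated even though neither causes the other, and the association largely disappears once C is held fixed. Real data rarely separate this cleanly, because the relevant third variable is often unmeasured.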

Such efforts generally aim to remove barriers to equality of opportunity and also to ensure that everyone is treated with dignity. But when social justice efforts aim to achieve equality of outcomes by group, and when social justice activists are willing to violate distributive or procedural fairness for some individuals along the way, these efforts violate many people’s sense of intuitive justice. We call this equal-outcomes social justice. Correlation does not imply causation. Yet in many discussions in universities these days, the correlation of a demographic trait or identity group membership with an outcome gap is taken as evidence that discrimination (structural or individual) caused the outcome gap. Sometimes it did, sometimes it didn’t, but if people can’t raise alternative possible causal explanations without eliciting negative consequences, then the community is unlikely to arrive at an accurate understanding of the problem.


pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt

Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bioinformatics, computer vision, correlation does not imply causation, crowdsourcing, distributed generation, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize

So, for example, your model for predicting or recommending a book on Amazon could include a feature “whether or not you’ve read Wes McKinney’s O’Reilly book Python for Data Analysis.” We wouldn’t say that reading his book caused you to read this book. It just might be a good predictor, which would have been discovered and come out as such in the process of optimizing for accuracy. We wish to emphasize here that it’s not simply the familiar correlation-causation trade-off you’ve perhaps had drilled into your head already, but rather that your intent when building such a model or system was not even to understand causality at all, but rather to predict. And that if your intent were to build a model that helps you get at causality, you would go about that in a different way. A whole different set of real-world problems that actually use the same statistical methods (logistic regression, linear regression) as part of the building blocks of the solution are situations where you do want to understand causality, when you want to be able to say that a certain type of behavior causes a certain outcome.

Madigan’s bio will be in the next chapter and requires this chapter as background. We’ll start instead with Ori, who is currently a data scientist at Wells Fargo. He got his PhD in biostatistics from UC Berkeley after working at a litigation consulting firm. As part of his job, he needed to create stories from data for experts to testify at trial, and he thus developed what he calls “data intuition” from being exposed to tons of different datasets. Correlation Doesn’t Imply Causation One of the biggest statistical challenges, from both a theoretical and practical perspective, is establishing a causal relationship between two variables. When does one thing cause another? It’s even trickier than it sounds. Let’s say we discover a correlation between ice cream sales and bathing suit sales, which we display by plotting ice cream sales and bathing suit sales over time in Figure 11-1.

bipartite graphs, Kyle Teague and GetGlue, Terminology from Social Networks black box algorithms, Machine Learning Algorithms bootstrap aggregating, Random Forests bootstrap samples, Random Forests Bruner, Jon, Data Journalism–Writing Technical Journalism: Advice from an Expert C caret packages, Code readability and reusability case attribute vs. social network data, Case-Attribute Data versus Social Network Data causal effect, Definition: The Causal Effect causal graphs, Visualizing Causality causal inference, The Rubin Causal Model causal models, In-Sample, Out-of-Sample, and Causality–In-Sample, Out-of-Sample, and Causality, In-Sample, Out-of-Sample, and Causality causal questions, Asking Causal Questions causality, Causality–Three Pieces of Advice A/B testing for evaluation, A/B Tests–A/B Tests causal questions, Asking Causal Questions clinical trials to determine, The Gold Standard: Randomized Clinical Trials–The Gold Standard: Randomized Clinical Trials correlation vs., Correlation Doesn’t Imply Causation–Confounders: A Dating Example observational studies and, Second Best: Observational Studies–Definition: The Causal Effect OK Cupid example, OK Cupid’s Attempt–OK Cupid’s Attempt unit level, The Rubin Causal Model visualizing, Visualizing Causality–Visualizing Causality centrality measures, Centrality Measures–The Industry of Centrality Measures eigenvalue centrality, Representations of Networks and Eigenvalue Centrality usefulness of, The Industry of Centrality Measures channels, Word Frequency Problem problems with, Word Frequency Problem chaos, Thought Experiment: How Would You Simulate Chaos?


The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences by Rob Kitchin

Bayesian statistics, business intelligence, business process, cellular automata, Celtic Tiger, cloud computing, collateralized debt obligation, conceptual framework, congestion charging, corporate governance, correlation does not imply causation, crowdsourcing, discrete time, disruptive innovation, George Gilder, Google Earth, Infrastructure as a Service, Internet Archive, Internet of things, invisible hand, knowledge economy, late capitalism, lifelogging, linked data, longitudinal study, Masdar, means of production, Nate Silver, natural language processing, openstreetmap, pattern recognition, platform as a service, recommendation engine, RFID, semantic web, sentiment analysis, slashdot, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart grid, smart meter, software as a service, statistical model, supply-chain management, the scientific method, The Signal and the Noise by Nate Silver, transaction costs

However, while pattern recognition might identify potentially interesting relationships, the veracity of these needs to be further tested on other datasets to ensure their reliability and validity. In other words, the relationships should form the basis for hypotheses that are more widely tested, which in turn are used to build and refine a theory that explains them. Thus correlations do not supersede causation, but rather should form the basis for additional research to establish if such correlations are indicative of causation. Only then can we get a sense as to how meaningful are the causes of the correlation. While the idea that data can speak for themselves free of bias or framing may seem like an attractive one, the reality is somewhat different. As Gould (1981: 166) notes, ‘inanimate data can never speak for themselves, and we always bring to bear some conceptual framework, either intuitive and ill-formed, or tightly and formally structured, to the task of investigation, analysis, and interpretation’.

In a provocative piece, Anderson argues that ‘the data deluge makes the scientific method obsolete’; that the patterns and relationships contained within big data inherently produce meaningful and insightful knowledge about social, political and economic processes and complex phenomena. He argues: There is now a better way. Petabytes allow us to say: ‘Correlation is enough.’ We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot... Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all. There’s no reason to cling to our old ways. (my emphasis) Similarly, Prensky (2009) argues: ‘scientists no longer have to make educated guesses, construct hypotheses and models, and test them with data-based experiments and examples. Instead, they can mine the complete set of data for patterns that reveal effects, producing scientific conclusions without further experimentation’ (my emphasis).

As argued in Chapter 1, data are not simply natural and essential elements that are abstracted from the world in neutral and objective ways and can be accepted at face value. Data do not pre-exist their generation and arise from nowhere. Rather data are created within a complex data assemblage that actively shapes its constitution. Data then can never just speak for themselves, but are always, inherently, speaking from a particular position (Crawford 2013). Further, Anderson’s (2008) claim that ‘[c]orrelation supersedes causation’, suggests that patterns found within a dataset are inherently meaningful. This is an assumption that all trained statisticians know is dangerous and false. Correlations between variables within a dataset can be random in nature and have no or little causal association (see Chapter 9). Interpreting every correlation as meaningful can therefore lead to serious ecological fallacies. This can be exacerbated in the case of big data because the empiricist position appears to promote the practice of data dredging – hunting for every correlation – thus increasing the likelihood of discovering random associations.
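The point about data dredging can be made concrete with a short simulation. This is a sketch under stated assumptions: 40 columns of pure noise with 30 observations each, and a cut-off of |r| > 0.361, which is roughly the two-sided 5% critical value for 30 observations. The numbers are chosen for illustration and are not taken from the book.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


N_VARIABLES = 40      # illustrative: number of unrelated variables
N_OBSERVATIONS = 30   # illustrative: rows in the dataset
R_THRESHOLD = 0.361   # approx. two-sided 5% critical value of r for n = 30

rng = random.Random(42)  # fixed seed so the run is repeatable
columns = [[rng.gauss(0, 1) for _ in range(N_OBSERVATIONS)]
           for _ in range(N_VARIABLES)]

# Test every pair of columns, exactly as a dredging exercise would.
pairs = list(combinations(range(N_VARIABLES), 2))
hits = [(i, j) for i, j in pairs
        if abs(pearson(columns[i], columns[j])) > R_THRESHOLD]

print(f"{len(hits)} of {len(pairs)} unrelated pairs pass the 5% cut-off")
```

Every variable here is independent noise, so each pair that passes the cut-off is a false discovery. Testing more pairs yields more of them, which is why a correlation found by exhaustive search needs checking on fresh data.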


pages: 543 words: 153,550

Model Thinker: What You Need to Know to Make Data Work for You by Scott E. Page

"Robert Solow", Airbnb, Albert Einstein, Alfred Russel Wallace, algorithmic trading, Alvin Roth, assortative mating, Bernie Madoff, bitcoin, Black Swan, blockchain, business cycle, Capital in the Twenty-First Century by Thomas Piketty, Checklist Manifesto, computer age, corporate governance, correlation does not imply causation, cuban missile crisis, deliberate practice, discrete time, distributed ledger, en.wikipedia.org, Estimating the Reproducibility of Psychological Science, Everything should be made as simple as possible, experimental economics, first-price auction, Flash crash, Geoffrey West, Santa Fe Institute, germ theory of disease, Gini coefficient, High speed trading, impulse control, income inequality, Isaac Newton, John von Neumann, Kenneth Rogoff, knowledge economy, knowledge worker, Long Term Capital Management, loss aversion, low skilled workers, Mark Zuckerberg, market design, meta analysis, meta-analysis, money market fund, Nash equilibrium, natural language processing, Network effects, p-value, Pareto efficiency, pattern recognition, Paul Erdős, Paul Samuelson, phenotype, pre–internet, prisoner's dilemma, race to the bottom, random walk, randomized controlled trial, Richard Feynman, Richard Thaler, school choice, sealed-bid auction, second-price auction, selection bias, six sigma, social graph, spectrum auction, statistical model, Stephen Hawking, Supply of New York City Cabdrivers, The Bell Curve by Richard Herrnstein and Charles Murray, The Great Moderation, The Rise and Fall of American Growth, the rule of 72, the scientific method, The Spirit Level, The Wisdom of Crowds, Thomas Malthus, Thorstein Veblen, urban sprawl, value at risk, web application, winner-take-all economy, zero-sum game

A regression estimating the number of orders shipped per eight-hour shift as a function of the number of years an employee has worked produces the following: Orders Filled = 200 + 20** × Years. The coefficient on years, 20, is significant at the 1% level. We can be confident it is positive. If the relationship is causal (see below), the model can be used to predict the number of orders that each employee can fill per shift as a function of years of work and we can use the model to project how many orders the current employees will fill next year. Here we have an instance of a model both making a prediction and guiding an action. Correlation vs. Causation Regression only reveals correlation among variables, not causality.3 If we first construct a model and then use regression to test if the model’s results are supported by data, we do not prove causality either. However, writing models first is far better than running regressions in search of a significant correlate, a technique known as data mining. Data mining runs the risk of identifying a variable that correlates with other causal variables.
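The fitted line in the excerpt above can be turned directly into a projection. The sketch below uses the quoted intercept (200) and slope (20); the staff list is hypothetical, and the projection is only meaningful if, as the excerpt cautions, the relationship is causal and not merely correlational.

```python
INTERCEPT = 200        # orders per shift at zero years of experience (from the excerpt)
SLOPE_PER_YEAR = 20    # extra orders per shift per year worked (from the excerpt)


def predict_orders(years_worked: float) -> float:
    """Predicted orders filled per eight-hour shift under the fitted line."""
    return INTERCEPT + SLOPE_PER_YEAR * years_worked


# Hypothetical staff: years of experience for three current employees.
current_years = [1, 3, 5]

this_year = sum(predict_orders(y) for y in current_years)
# Next year every current employee has one more year of experience.
next_year = sum(predict_orders(y + 1) for y in current_years)

print(f"predicted orders per shift now:       {this_year}")
print(f"predicted orders per shift next year: {next_year}")
```

If tenure merely correlates with something else, such as which employees chose to stay, the projected gain may not materialize.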

First, inequality is one of the most important policy issues of our time. Income and wealth correlate with human flourishing. Higher-income individuals enjoy better health, longer life expectancy, and higher life satisfaction and happiness. Those at the bottom of the income distribution experience higher rates of homicide, divorce, mental illness, and anxiety.4 We must be careful not to confuse correlation with causation: a substantial part of this correlation can be explained by the fact that healthier, happier people earn more money. Nevertheless, almost all studies show a connection between income and flourishing. No one prefers to be poorer. Second, we have a plethora of models of inequality written by a diversely tooled collection of economists, sociologists, political scientists, and even physicists and biologists.

.), 233 conservatism, 89–90 consistency rational choice and, 50 in rational-actor model, 51 conspicuous consumption, 297 consumption, rational-actor model of, 48 consumption-investment equation, 101 content, 3 continuity, in rational-actor model, 49 continuous action games, 247–248 continuous function, entropy and, 146 continuous signals, 300 separation with, 301 convexity, 95–98 risk-loving and, 99 cooperation, 255 clustering bootstraps, 265 group selection and, 266 repetition and, 256–259 reputation and, 256–259 cooperative action model, 262–267 defining, 263 cooperative advantage, ratio of, 263 cooperative games, 108–110 defining, 108 coordination model, 241 paradoxes of, 174, 175 correlation, causation and, 86–87 costless signals, 298 craft traditions, 303 Craigslist, 105 Crime and Punishment (Dostoyevsky), 8 critical states, 74 Croson, Rachel, 243 crowded markets, 239 price competition in, 239 (fig.) crowds, wisdom of, 30 Cuban missile crisis, 9, 10 culture/strategy game, 318–319 cyclic, 147 data, 12 binary classifications of, 92–93 broadcast model and, 134 categories and, 34 dimensionality of, 31 in identification problem, 250 interpretation of, 2 many-model thinking and, 3–4 mining, 86 organization of, 2 overfitting, 41 on piece-rate work, 3–4 in wisdom hierarchy, 7 death rule, 176 decision problems, public project, 292 decisions, 47 Markov models, 199 trees, 93 decomposability, of entropy, 146 defection, 255 degeneration, 120 degree average, 118 of network formation, 123 in network structure, 118 of node, 118 degree squaring, 139–140 Deloitte, 4 Dennett, Daniel, 177–178 dependent variables, 84 depreciation rate, 101 design, 15 in REDCAPE, 20 Dewey, John, 305 Diamond, Jared, 269, 278 diffusion model, 134–137 diffusion probability, 135 dimensionality, of data, 31 diminishing returns, 98 discounting, hyperbolic, 52 discrete dynamical systems, 182 discrete signals, 298–300 Disney World, 185 distribution defining, 59 exponential, 149, 150 functions and, 63–66 
lognormal, 60, 66–67 long-tailed, 59, 75–79 normal, 59–61, 61 (fig.), 65–66, 150 power-law, 69–73, 71 (fig.)


pages: 1,034 words: 241,773

Enlightenment Now: The Case for Reason, Science, Humanism, and Progress by Steven Pinker

3D printing, access to a mobile phone, affirmative action, Affordable Care Act / Obamacare, agricultural Revolution, Albert Einstein, Alfred Russel Wallace, anti-communist, Anton Chekhov, Arthur Eddington, artificial general intelligence, availability heuristic, Ayatollah Khomeini, basic income, Berlin Wall, Bernie Sanders, Black Swan, Bonfire of the Vanities, business cycle, capital controls, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, clean water, clockwork universe, cognitive bias, cognitive dissonance, Columbine, conceptual framework, correlation does not imply causation, creative destruction, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, dark matter, decarbonisation, deindustrialization, dematerialisation, demographic transition, Deng Xiaoping, distributed generation, diversified portfolio, Donald Trump, Doomsday Clock, double helix, effective altruism, Elon Musk, en.wikipedia.org, end world poverty, endogenous growth, energy transition, European colonialism, experimental subject, Exxon Valdez, facts on the ground, Fall of the Berlin Wall, first-past-the-post, Flynn Effect, food miles, Francis Fukuyama: the end of history, frictionless, frictionless market, germ theory of disease, Gini coefficient, Hans Rosling, hedonic treadmill, helicopter parent, Hobbesian trap, humanitarian revolution, Ignaz Semmelweis: hand washing, income inequality, income per capita, Indoor air pollution, Intergovernmental Panel on Climate Change (IPCC), invention of writing, Jaron Lanier, Joan Didion, job automation, Johannes Kepler, John Snow's cholera map, Kevin Kelly, Khan Academy, knowledge economy, l'esprit de l'escalier, Laplace demon, life extension, long peace, longitudinal study, Louis Pasteur, Martin Wolf, mass incarceration, meta analysis, meta-analysis, Mikhail Gorbachev, minimum wage unemployment, moral hazard, mutually assured destruction, Naomi Klein, Nate Silver, Nathan Meyer Rothschild: antibiotics, Nelson Mandela, 
New Journalism, Norman Mailer, nuclear winter, obamacare, open economy, Paul Graham, peak oil, Peter Singer: altruism, Peter Thiel, precision agriculture, prediction markets, purchasing power parity, Ralph Nader, randomized controlled trial, Ray Kurzweil, rent control, Republic of Letters, Richard Feynman, road to serfdom, Robert Gordon, Rodney Brooks, rolodex, Ronald Reagan, Rory Sutherland, Saturday Night Live, science of happiness, Scientific racism, Second Machine Age, secular stagnation, self-driving car, sharing economy, Silicon Valley, Silicon Valley ideology, Simon Kuznets, Skype, smart grid, sovereign wealth fund, stem cell, Stephen Hawking, Steven Pinker, Stewart Brand, Stuxnet, supervolcano, technological singularity, Ted Kaczynski, The Rise and Fall of American Growth, the scientific method, The Signal and the Noise by Nate Silver, The Spirit Level, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, total factor productivity, union organizing, universal basic income, University of East Anglia, Unsafe at Any Speed, Upton Sinclair, uranium enrichment, urban renewal, War on Poverty, We wanted flying cars, instead we got 140 characters, women in the workforce, working poor, World Values Survey, Y2K

The citizens of richer countries have greater respect for “emancipative” or liberal values such as women’s equality, free speech, gay rights, participatory democracy, and protection of the environment (chapters 10 and 15). Not surprisingly, as countries get richer they get happier (chapter 18); more surprisingly, as countries get richer they get smarter (chapter 16).59 In explaining this Somalia-to-Sweden continuum, with poor violent repressive unhappy countries at one end and rich peaceful liberal happy ones at the other, correlation is not causation, and other factors like education, geography, history, and culture may play roles.60 But when the quants try to tease them apart, they find that economic development does seem to be a major mover of human welfare.61 In an old academic joke, a dean is presiding over a faculty meeting when a genie appears and offers him one of three wishes—money, fame, or wisdom. The dean replies, “That’s easy.

That creates an opening for politicians to rouse the rabble by singling out cheaters who take more than their fair share: welfare queens, immigrants, foreign countries, bankers, or the rich, sometimes identified with ethnic minorities.18 In addition to effects on individual psychology, inequality has been linked to several kinds of society-wide dysfunction, including economic stagnation, financial instability, intergenerational immobility, and political influence-peddling. These harms must be taken seriously, but here too the leap from correlation to causation has been contested.19 Either way, I suspect that it’s less effective to aim at the Gini index as a deeply buried root cause of many social ills than to zero in on solutions to each problem: investment in research and infrastructure to escape economic stagnation, regulation of the finance sector to reduce instability, broader access to education and job training to facilitate economic mobility, electoral transparency and finance reform to eliminate illicit influence, and so on.

In the developing world a young woman can’t even work as a household servant if she is unable to read a note or count out supplies, and higher rungs of the occupational ladder require ever-increasing abilities to understand technical material. The first countries that made the Great Escape from universal poverty in the 19th century, and the countries that have grown the fastest ever since, are the countries that educated their children most intensely.5 As with every question in social science, correlation is not causation. Do better-educated countries get richer, or can richer countries afford more education? One way to cut the knot is to take advantage of the fact that a cause must precede its effect. Studies that assess education at Time 1 and wealth at Time 2, holding all else constant, suggest that investing in education really does make countries richer. At least it does if the education is secular and rationalistic.


pages: 303 words: 75,192

10% Less Democracy: Why You Should Trust Elites a Little More and the Masses a Little Less by Garett Jones

"Robert Solow", Andrei Shleifer, Asian financial crisis, business cycle, central bank independence, clean water, corporate governance, correlation does not imply causation, creative destruction, Edward Glaeser, financial independence, game design, German hyperinflation, hive mind, invisible hand, Jean Tirole, Kenneth Rogoff, Mark Zuckerberg, mass incarceration, minimum wage unemployment, Mohammed Bouazizi, open economy, Pareto efficiency, Paul Samuelson, price stability, rent control, The Wealth of Nations by Adam Smith, trade liberalization

If, as neoclassical Nobel-winning economists like Lucas, Sargent, and Friedman said, the main job of a central bank is to maintain a low average rate of inflation, then it’s clear that the politically disconnected central banks are the ones that are getting the job done. FIGURE 3.1. Central Bank Independence and Inflation: A Negative Relationship. Source: Alesina and Summers (1993). Of course, any time you see data plotted out like this, with a strong correlation like this one, you should remind yourself that correlation isn’t causation—that having a chandelier in your house doesn’t make you rich (even though it’s a sign you’re rich), that buying a baby stroller won’t make you a parent (though it’s a sign a baby is on the way). But then what is causation? How can we know whether it’s the legal independence of the central banks of the United States, Switzerland, and Germany that is getting the job done? For instance, maybe instead it’s “German culture” that makes German inflation low.

The fear that such a prospect creates in the human heart spurs us to strengthen our arguments, find the data, check and see if we’re actually correct or if we’re just living in a dream world of our own creation. I’m taking that approach here, though in a nontechnical way: noting that multiple kinds of evidence, multiple measures of central bank independence, point toward the same prediction. The less political, the less democratic, the more insider driven the nation’s central bank is, the better the outcomes. A bundle of correlations tied together with some suggestions of causation. Oh, and here’s a small bonus in this tire-kicking line of research. Cukierman, like others, doesn’t find any noticeable evidence from the rich countries that central bank independence makes your country richer. You get lower, more stable inflation (which voters love), plus a more stable unemployment rate and a more stable economy overall, and those are great benefits.

Federal Reserve System, wrote in his excellent, short, blunt, not-too-technical book Central Banking in Theory and Practice that a common, but not universal, finding is that countries with more independent central banks have enjoyed lower average inflation rates without suffering lower average growth rates. . . . However at least two qualifications need to be entered. First, the notably negative correlation between central bank independence and inflation . . . is not very robust. For example, it does not hold up . . . when other variables are considered. . . . Second, some recent studies have questioned whether correlation implies causation.¹⁵ Despite the muddled empirical evidence, Blinder, who has been a central banker himself and has met many more central bankers the world over, still takes it as pretty obvious that central banks should be insulated, at least to some degree, from the political process. That said, he is no full-throated critic of democracy. Much of his essay is a discussion of the importance of being accountable to the citizenry, and he doesn’t see democratic politicians as uniquely shortsighted.


pages: 360 words: 85,321

The Perfect Bet: How Science and Math Are Taking the Luck Out of Gambling by Adam Kucharski

Ada Lovelace, Albert Einstein, Antoine Gombaud: Chevalier de Méré, beat the dealer, Benoit Mandelbrot, butterfly effect, call centre, Chance favours the prepared mind, Claude Shannon: information theory, collateralized debt obligation, correlation does not imply causation, diversification, Edward Lorenz: Chaos theory, Edward Thorp, Everything should be made as simple as possible, Flash crash, Gerolamo Cardano, Henri Poincaré, Hibernia Atlantic: Project Express, if you build it, they will come, invention of the telegraph, Isaac Newton, Johannes Kepler, John Nash: game theory, John von Neumann, locking in a profit, Louis Pasteur, Nash equilibrium, Norbert Wiener, p-value, performance metric, Pierre-Simon Laplace, probability theory / Blaise Pascal / Pierre de Fermat, quantitative trading / quantitative finance, random walk, Richard Feynman, Ronald Reagan, Rubik’s Cube, statistical model, The Design of Experiments, Watson beat the top human players on Jeopardy!, zero-sum game

By creating an explanation, we are assuming that one process has directly caused another. Horses in Hong Kong win because they are familiar with the terrain, and they are familiar with it because they have run lots of races. But just because two things are apparently related—like probability of winning and number of races run—it doesn’t mean that one directly causes the other. An oft-quoted mantra in the world of statistics is that “correlation does not imply causation.” Take the wine budget of Cambridge colleges. It turns out that the amount of money each Cambridge college spent on wine in the 2012–2013 academic year was positively correlated with students’ exam results during the same period. The more the colleges spent on wine, the better the results generally were. (King’s College, once home to Karl Pearson and Alan Turing, topped the wine list with a spend of £338,559, or about £850 per student.)

“Use of Performance Metrics to Forecast Success in the National Hockey League” (paper presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Prague, September 23–27, 2013). 205 England had the lowest PDO: Burn-Murdoch, John. “Were England the Unluckiest Team in the World Cup Group Stages?” FT Data Blog. 29 June 2014. http://blogs.ft.com/ftdata/2014/06/29/were-england-the-unluckiest-team-in-the-world-cup-group-stages/. 206 Cambridge college spent on wine: “In Vino Veritas, Redux.” The Economist, February 5, 2014. http://www.economist.com/blogs/freeexchange/2014/02/correlation-and-causation-0. 207 topped the wine list with a spend of £338,559: Simons, John. “Wages Not Wine: Booze Hound Colleges Spend £3 million on Wine.” Tab (Cambridge, England), January 22, 2014. http://thetab.com/uk/cambridge/2014/01/22/booze-hound-colleges-spend-3-million-on-wine-32441. 207 Countries that consume lots of chocolate: Messerli, F. H. “Chocolate Consumption, Cognitive Function, and Nobel Laureates.”

See robots (bots) computerized prediction in blackjack, 42 in checkers, 156, 157 in horse racing, 46, 51, 57, 68 and the Monte Carlo method, 61 in roulette, 2, 13, 14, 15–20, 22 in sports, 80–82, 87, 88, 89–90, 97, 105, 217 “Computing Machinery and Intelligence” (Turing), 175 Connect Four, 158–159 control over events, 199 controlled randomness, 25–26, 28 cooperative relationships, 129, 136 copycats, 132 Coram, Marc, 63, 64 correlation and causation, issue of, 206–207 Corsi rating, 85 Cosmopolitan, Las Vegas 87 countermeasures, 21, 86, 195, 214 counting cards. See card counting Crick, Francis, 23 cricket, 90 curiosity, following, 218 Dahl, Fredrik, 172–173, 175, 176, 177, 182–183, 184, 185 Darwin, Charles, 46 data access to, 142 availability of, 54, 55, 68, 73, 86, 102, 174, 209 better, sports analysis methods and access to, 207, 217 binary, 116 collecting as much as possible, 4–5, 103 enough, to test strategies, 131 faster transatlantic travel of, 113 juggling, 166 limited, 84 new, testing strategies against, 53, 54 statistics and, importance of, in sports, 79, 80 storage and communication of, 11 data chunks, memory capacity and size of, 179–180 Deceptive Interaction Task, 190–191 decision making, chaotic, 162 decision-making layers, 173–174 Deep Blue chess computer, 166, 167, 171, 176 Deep Thought chess computer, 167 DeepFace Facebook algorithm, 174–175 DeRosa, David, 198–199, 200 Design of Experiments, The (Fisher), 24 deterministic game, 156 Diaconis, Persi, 62–63 DiCristina, Lawrence, 198, 199, 200, 201 Dixon, Mark, 74, 75, 76–78, 82, 97–98, 107, 218 Djokovic, Novak, 110 Dobson, Andrew, 129 Dodds, Peter Sheridan, 203 dogma, avoiding, 218 dovetail shuffle, 41–42, 62 Dow Jones Industrial Average, 96, 121, 122 Drug Enforcement Administration, 214 eBay, 94 Econometrica (journal), 148 economic theory, 153 ecosystems, 125–129, 130–131, 133 Einstein, Albert, 210 endgame database, 159–160 English Draughts Association, 156 English Premier League, 209 Enigma machines, 
169–170 Eslami, Ali, 185–186, 187 Ethier, Stewart, 7–8 Eudaemonic Pie, The (Bass), 14, 15 Eudaemonic prediction method, 14, 15–20, 22, 124, 208 Euro 2008 soccer tournament, 76 European Championship (soccer tournament), 111 European currency union, 129 every-day gamblers, 102, 107 exchange rate, 110 exchanges.


pages: 1,380 words: 190,710

Building Secure and Reliable Systems: Best Practices for Designing, Implementing, and Maintaining Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Ana Oprea, Piotr Lewandowski, Adam Stubblefield

anti-pattern, barriers to entry, bash_history, business continuity plan, business process, Cass Sunstein, cloud computing, continuous integration, correlation does not imply causation, create, read, update, delete, cryptocurrency, cyber-physical system, database schema, Debian, defense in depth, DevOps, Edward Snowden, fault tolerance, fear of failure, general-purpose programming language, Google Chrome, Internet of things, Kubernetes, load shedding, margin call, microservices, MITM: man-in-the-middle, performance metric, pull request, ransomware, revision control, Richard Thaler, risk tolerance, self-driving car, Skype, slashdot, software as a service, source of truth, Stuxnet, Turing test, undersea cable, uranium enrichment, Valgrind, web application, Y2K, zero day

A trivial example based around glibc’s pthread_create created thread stacks of the same size, so we could rule out the kernel and glibc as the sources of the different sizes. We then examined the code that started threads, and discovered that many libraries just picked a thread size at random, explaining the variation of sizes. This understanding enabled us to save memory by focusing on the few threads with large stacks. Be mindful of correlation versus causation Sometimes debuggers assume that two events that start at the same time, or that exhibit similar symptoms, have the same root cause. However, correlation does not always imply causation. Two mundane problems might occur at the same time but have different root causes. Some correlations are trivial. For example, an increase in latency might lead to a reduction in user requests, simply because users are waiting longer for the system to respond. If a team repeatedly discovers correlations that in retrospect are trivial, there might be a gap in their understanding of how the system is supposed to work.

-Triaging the Incident when tools try to be helpful, Operational Security cross-site scripting (XSS), Preventing XSS: SafeHtml cryptographic code, Example: Secure cryptographic APIs and the Tink crypto framework-Example: Secure cryptographic APIs and the Tink crypto framework cryptographic keys, Credential and Secret Rotation cryptography, The Roles of Specialists CSRs (Certificate Signing Requests), Programming Language Choice culture, A Note About Culture, Building a Culture of Security and Reliability-Conclusion aligning goals and participant incentives, Align Project Goals and Participant Incentives balancing accountability and risk taking, Culture of Yes building a case for change, Build a Case for Change building a culture of security and reliability, Building a Culture of Security and Reliability-Conclusion building empathy, Build Empathy changing through good practice, Changing Culture Through Good Practice-Build Empathy culture of awareness, Culture of Awareness-Culture of Awareness culture of inevitability, Culture of Inevitably culture of review, Culture of Review-Culture of Review culture of sustainability, Culture of Sustainability-Culture of Sustainability culture of yes, Culture of Yes defining a healthy security/reliability culture, Defining a Healthy Security and Reliability Culture-Culture of Sustainability escalations and problem resolution, Escalations and Problem Resolution increasing productivity and usability, Increase Productivity and Usability-Increase Productivity and Usability leadership buy-in for security/reliability changes, Convincing Leadership-Escalations and Problem Resolution least privilege's impact on, Impact on Collaboration and Company Culture overcommunication and transparency, Overcommunicate and Be Transparent picking your battles, Pick Your Battles reducing fear with risk-reduction mechanisms, Reduce Fear with Risk-Reduction Mechanisms-Reduce Fear with Risk-Reduction Mechanisms safety nets as norm, Make Safety Nets the Norm 
security and reliability as default condition, Culture of Security and Reliability by Default culture of no, Culture of Yes culture of yes, Culture of Yes CVD (coordinated vulnerability disclosure), Compromises Versus Bugs Cyber Grand Challenge, Automation and Artificial Intelligence Cyber Kill Chain, Cyber Kill Chains™ cyber warfare, Military purposes D Dapper, Improve observability DARPA (Defense Advanced Research Projects Agency), Automation and Artificial Intelligence data corruption, Distinguish horses from zebras data integrity, Integrity data isolation, Data isolation data plane, Example: Google’s frontend design data sanitization, Data Sanitization data summarization, Budget for Logging DDoS attacks (see distributed denial-of-service attacks) debugging, Debugging Techniques-Practice! cleaning up code, Clean up code collaborative debugging, Collaborative Debugging: A Way to Teach correlation-versus-causation problem, Be mindful of correlation versus causation data corruption and checksums, Distinguish horses from zebras deleting legacy systems, Delete it! distinguishing common from uncommon bugs, Distinguish horses from zebras filtering out normal events from bugs, Know what’s normal for your system hypothesis testing with actual data, Test your hypotheses with actual data importance of regular practice, Practice!

The data from this analysis is usually fed back to teams doing forensic and detection work to provide better insight about potential indications that a system has been compromised. In digital forensics, the relationships between events are as important as the events themselves. Much of the work a forensic analyst does to obtain artifacts contributes to the goal of building a forensic timeline.5 By collecting a chronologically ordered list of events, an analyst can determine correlation and causation of attacker activity, proving why these events happened. Example: An Email Attack Let’s consider a fictional scenario: an unknown attacker has successfully compromised an engineer’s workstation by sending a malicious attachment via email to a developer, who unwittingly opened it. This attachment installed a malicious browser extension onto the developer’s workstation. The attacker, leveraging the malicious extension, then stole the developer’s credentials and logged in to a file server.
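The forensic timeline described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the book's tooling: the `Event` type, the `build_timeline` helper, the log names, and every timestamp are hypothetical, chosen only to mirror the fictional email scenario.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One artifact recovered from a log source (hypothetical structure)."""
    timestamp: datetime
    source: str
    description: str


def build_timeline(*logs):
    """Merge events from several log sources into one chronologically ordered list."""
    merged = [event for log in logs for event in log]
    return sorted(merged, key=lambda event: event.timestamp)


# Hypothetical artifacts; all timestamps are invented for illustration.
mail_log = [
    Event(datetime(2020, 1, 6, 9, 2), "mail", "malicious attachment opened"),
]
workstation_log = [
    # Deliberately out of order, as artifacts often arrive from separate tools.
    Event(datetime(2020, 1, 6, 9, 15), "workstation", "credentials read by extension"),
    Event(datetime(2020, 1, 6, 9, 3), "workstation", "browser extension installed"),
]
file_server_log = [
    Event(datetime(2020, 1, 6, 9, 40), "file server", "login with developer credentials"),
]

timeline = build_timeline(mail_log, workstation_log, file_server_log)
for event in timeline:
    print(event.timestamp.isoformat(), event.source, event.description)
```

Ordering alone only shows which events could have led to which, since a cause must precede its effect. Arguing that one event actually caused another still takes corroborating artifacts.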


pages: 348 words: 99,383

The Financial Crisis and the Free Market Cure: Why Pure Capitalism Is the World Economy's Only Hope by John A. Allison

Affordable Care Act / Obamacare, American ideology, bank run, banking crisis, Bernie Madoff, business cycle, clean water, collateralized debt obligation, correlation does not imply causation, Credit Default Swap, credit default swaps / collateralized debt obligations, crony capitalism, disintermediation, fiat currency, financial innovation, Fractional reserve banking, full employment, high net worth, housing crisis, invisible hand, life extension, low skilled workers, market bubble, market clearing, minimum wage unemployment, money market fund, moral hazard, negative equity, obamacare, Paul Samuelson, price mechanism, price stability, profit maximization, quantitative easing, race to the bottom, reserve currency, risk/return, Robert Shiller, The Bell Curve by Richard Herrnstein and Charles Murray, too big to fail, transaction costs, yield curve, zero-sum game

A clear example of the proper use of mathematical models is physics. However, the models used in physics capture causal relationships and are properly evaluated based on the predictive power of these causal relationships. However, in economics, practically all mathematical models capture correlations, not causations. There is a difference in kind between correlation and causation. Also, the models are based on a multitude of assumptions. The danger lies in placing far too much confidence in models based on correlation rather than causation. Economists and government regulators often fall into the trap of believing that these models are objective. However, there are important economic factors, such as human behavior, that cannot be clearly mathematized. Taking these models as “gospel” is dangerous. There is also a tendency, in developing the models, to assume normal distributions with small “tails.”

In reality, the tails often turn out to be “fat,” that is, to have a greater chance of occurring than the model suggests. The tails typically represent very positive and very negative outcomes. In the case of the financial crisis, the negative fat tails (improbable events) became reality. These tails were magnified by the effect of panic on human behavior under stress. All the correlations (which were not based on causation) fell apart when human beings, who make decisions, started reacting to negative news. In addition, it is easy to underestimate the likelihood of unlikely events. For example, if you build a house in a 100-year flood plain, you will at some point experience a flood. It may be 90 years from now, or it may be next week. Eventually (or soon), a flood will affect your house. The mathematical models used by economists today are often floating abstractions that are not attached to reality.


pages: 578 words: 168,350

Scale: The Universal Laws of Growth, Innovation, Sustainability, and the Pace of Life in Organisms, Cities, Economies, and Companies by Geoffrey West

Alfred Russel Wallace, Anton Chekhov, Benoit Mandelbrot, Black Swan, British Empire, butterfly effect, carbon footprint, Cesare Marchetti: Marchetti’s constant, clean water, complexity theory, computer age, conceptual framework, continuous integration, corporate social responsibility, correlation does not imply causation, creative destruction, dark matter, Deng Xiaoping, double helix, Edward Glaeser, endogenous growth, Ernest Rutherford, first square of the chessboard, first square of the chessboard / second half of the chessboard, Frank Gehry, Geoffrey West, Santa Fe Institute, Guggenheim Bilbao, housing crisis, Index librorum prohibitorum, invention of agriculture, invention of the telephone, Isaac Newton, Jane Jacobs, Jeff Bezos, Johann Wolfgang von Goethe, John von Neumann, Kenneth Arrow, laissez-faire capitalism, life extension, Mahatma Gandhi, mandelbrot fractal, Marchetti’s constant, Masdar, megacity, Murano, Venice glass, Murray Gell-Mann, New Urbanism, Peter Thiel, profit motive, publish or perish, Ray Kurzweil, Richard Feynman, Richard Florida, Silicon Valley, smart cities, Stephen Hawking, Steve Jobs, Stewart Brand, technological singularity, The Coming Technological Singularity, The Death and Life of Great American Cities, the scientific method, too big to fail, transaction costs, urban planning, urban renewal, Vernor Vinge, Vilfredo Pareto, Von Neumann architecture, Whole Earth Catalog, Whole Earth Review, wikimedia commons, working poor

Data for data’s sake, or the mindless gathering of big data, without any conceptual framework for organizing and understanding it, may actually be bad or even dangerous. Just relying on data alone, or even mathematical fits to data, without having some deeper understanding of the underlying mechanism is potentially deceiving and may well lead to erroneous conclusions and unintended consequences. This admonition is closely related to the classic warning that “correlation does not imply causation.” Just because two sets of data are closely correlated does not imply that one is the cause of the other. There are many bizarre examples that illustrate this point.4 For instance, over the eleven-year period from 1999 to 2010 the variation in the total spending on science, space, and technology in the United States almost exactly followed the variation in the number of suicides by hanging, strangulation, and suffocation.
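A small numeric sketch makes the warning concrete. The two series below are made up for illustration (they are not the spending or mortality figures the excerpt mentions). They have nothing to do with each other except that both drift upward over eleven periods. Comparing period-to-period changes instead of levels is one common sanity check. It is an addition here, not something the excerpt describes.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def changes(values):
    """Differences between consecutive values, which removes a shared trend."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Invented numbers: two unrelated quantities that both happen to grow.
series_a = [10, 12, 13, 15, 18, 19, 21, 24, 25, 27, 30]
series_b = [5, 5.5, 6.5, 7, 7.5, 9, 9.5, 10, 11.5, 12, 12.5]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))
print(f"levels: {r_levels:.2f}, period-to-period changes: {r_changes:.2f}")
```

With these numbers the levels are almost perfectly correlated, while the period-to-period changes move in opposite directions. A shared trend can produce a high correlation by itself, and neither figure says anything about mechanism.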

With the advent of big data this classic view is being challenged. In a highly provocative article published in Wired magazine in 2008 titled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” its then editor, Chris Anderson, wrote: The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all . . . faced with massive data, this approach to science—hypothesize, model, test—is becoming obsolete. . . . Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity.

There are many versions of these, but all of them are based on the idea that we can design and program computers and algorithms to evolve and adapt based on data input to solve problems, reveal insights, and make predictions. They all rely on iterative procedures for finding and building upon correlations in data without concern for why such relationships exist and implicitly presume that “correlation supersedes causation.” This approach has become a huge area of interest and has already had a big impact on our lives. For instance, it is central to how search engines like Google operate, how strategies for investment or operating an organization are devised, and it provides the foundational basis for driverless cars. It also brings up the classic philosophical question as to what extent these machines are “thinking.”


pages: 414 words: 119,116

The Health Gap: The Challenge of an Unequal World by Michael Marmot

active measures, active transport: walking or cycling, Affordable Care Act / Obamacare, Atul Gawande, Bonfire of the Vanities, Broken windows theory, Capital in the Twenty-First Century by Thomas Piketty, Carmen Reinhart, Celtic Tiger, centre right, clean water, congestion charging, correlation does not imply causation, Doha Development Round, epigenetics, financial independence, future of work, Gini coefficient, Growth in a Time of Debt, illegal immigration, income inequality, Indoor air pollution, Kenneth Rogoff, Kibera, labour market flexibility, longitudinal study, lump of labour, Mahatma Gandhi, meta analysis, meta-analysis, microcredit, New Urbanism, obamacare, paradox of thrift, race to the bottom, Rana Plaza, RAND corporation, road to serfdom, Simon Kuznets, Socratic dialogue, structural adjustment programs, the built environment, The Spirit Level, trickle-down economics, twin studies, urban planning, Washington Consensus, Winter of Discontent, working poor

Should we really assume that these dark satanic mills and airless places, rather than causing terrible illness and shortened lives, selectively employed and attracted as residents sick people and those whose backgrounds accounted for all their subsequent illness? That subsequent improvement in living and working conditions, thus abating Victorian squalor, and associated improvements in health, were correlation, not causation? That while medical care improved health, housing also got better, and an intellectually slack public health profession mistook the improvement in housing and working conditions for causes of improved health? If proponents of this set of assumptions dropped their guard for a moment and accepted the evidence that air pollution, crowded living space, ghastly working conditions and poor nutrition were causes of ill-health in Victorian times why, a priori, do they start from the position that living and working conditions are not a cause of ill-health in the twenty-first century?

As well as having health insurance, 94 per cent had graduated from high school, and 43 per cent were college graduates. The ACE study was not a one-off. A review of 124 studies confirmed that child physical abuse, emotional abuse and neglect (they did not study sexual abuse) are linked to adult mental disorders, suicide attempts, drug use, sexually transmitted infections and risky sexual behaviour.9 The authors of the review concluded that this is more than simple correlation but represents causation. The graded nature of the relation between abuse and adult mental, and perhaps physical, ill-health – the more types of abuse the worse the adult health – suggests that we should not be looking only at exceptional episodes of abuse but, more generally, at quality of early child development. Indeed, further evidence supports this. Britain has been blessed by a series of long-term studies of people born at a particular moment and followed through their lives.

FIGURE 6.3: GETTING INTO WORK IN SWANSEA AND WREXHAM By focusing on the problem in a strategic way, working with young people, giving them access to information, and perhaps above all, caring, authorities in these towns lowered the toll of young people not in employment, education or training. There was an unexpected benefit. Youth offending in Swansea fell from over 1,000 incidents a year to fewer than 400.33 Correlation is not causation. One cannot say that the reduction in NEETs was responsible for the reduction in youth offending, but it is certainly possible. Unemployment harms health and work is vital. When work is of ‘good’ quality it is empowering. It provides power, money and resources – all essential for a healthy life. The ‘good’ characteristics of work tend to follow the social gradient: greater empowerment and better conditions go with higher status.


pages: 533

Future Politics: Living Together in a World Transformed by Tech by Jamie Susskind

3D printing, additive manufacturing, affirmative action, agricultural Revolution, Airbnb, airport security, Andrew Keen, artificial general intelligence, augmented reality, automated trading system, autonomous vehicles, basic income, Bertrand Russell: In Praise of Idleness, bitcoin, blockchain, brain emulation, British Empire, business process, Capital in the Twenty-First Century by Thomas Piketty, cashless society, Cass Sunstein, cellular automata, cloud computing, computer age, computer vision, continuation of politics by other means, correlation does not imply causation, crowdsourcing, cryptocurrency, digital map, distributed ledger, Donald Trump, easy for humans, difficult for computers, Edward Snowden, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Ethereum, ethereum blockchain, Filter Bubble, future of work, Google bus, Google X / Alphabet X, Googley, industrial robot, informal economy, intangible asset, Internet of things, invention of the printing press, invention of writing, Isaac Newton, Jaron Lanier, John Markoff, Joseph Schumpeter, Kevin Kelly, knowledge economy, lifelogging, Metcalfe’s law, mittelstand, more computing power than Apollo, move fast and break things, natural language processing, Network effects, new economy, night-watchman state, Oculus Rift, Panopticon Jeremy Bentham, pattern recognition, payday loans, price discrimination, price mechanism, RAND corporation, ransomware, Ray Kurzweil, Richard Stallman, ride hailing / ride sharing, road to serfdom, Robert Mercer, Satoshi Nakamoto, Second Machine Age, selection bias, self-driving car, sexual politics, sharing economy, Silicon Valley, Silicon Valley startup, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, Snapchat, speech recognition, Steve Jobs, Steve Wozniak, Steven Levy, technological singularity, the built environment, The Structural Transformation of the Public Sphere, The Wisdom of Crowds, Thomas L Friedman, universal basic income, urban planning, Watson beat the top human players on Jeopardy!, working-age population

Or they fall foul of the group membership fallacy: the fact that I am a member of a group that tends to have a particular characteristic does not necessarily mean that I share that characteristic (a point sometimes lost on probabilistic machine learning approaches). There’s the entrenchment problem: it may well be true that students from higher-income families are more likely to get better grades at university, but using family income as an admission criterion would obviously entrench the educational inequality that already exists.8 There’s the correlation/causation fallacy: the data may tell you that people who play golf tend to do better in business, but that does not mean that business success is caused by playing golf (and to hire on that basis might contradict a principle of justice which says that hiring should be done on merit). These are just a few examples—but given what we know of human ignorance and prejudice, we can be sure they aren’t the only ones.

Gabriella 180, 181, 404 collaborative democracy see Wiki Democracy Collini, Stefan 407 Collins,Victor 134 command economy 265, 329 commons 331–4, 335 commons-based peer production 244 communication liberty and private power 190–1 perception-control 148, 150–1, 229 communism 12 ‘fully automated luxury’ 328 community, freedom of see republican freedom companionship, as robot function 55 Compas 174 competition law 357 competitive elitism 217–19, 221, 240, 242, 253, 254 Computerscience.org 423 computing power, growth in 37–41 Comte, Auguste 170, 175, 177, 250, 403, 417 concentration camp inmates 131 495 concentration of wealth and power 318–22, 329–30 sharing economy 336 conceptual analysis 81–3, 84–5 Condliffe, Jamie 375 Condorcet, Marquis de 224 Conger, Krista 372 connectivity of technology 44–8 Connolly, William E. 390 consent principle 351–2, 353, 355, 357 Constant, Benjamin 128, 395 constitutive nature of technology 53–7 contextual analysis 84–5 contracts, smart see smart contracts Cooper, Daniel 402 Copernicus, Nicolaus 14 copyright 324, 332, 333 infringement 156 Cornell University 57 correlation/causation fallacy, rule-based injustice 284 corruption 82, 84, 225, 329, 361 Costeja González, Mario 138 Couldry, Nick 421 counters (democracy theorists) 224–5 Crawford, Kate 418 Creative Commons 45 credit scores 267 Crete-Nishihata, Masashi 399 Crick, Bernard 72, 389, 408 criminal justice 259 Cronologics 319 Cross, Tim 375 Crossley, Rob 388 crowdocracy see Wiki Democracy Crowdpac 417 crowds, wisdom of see wisdom of crowds crowdsourcing 244 cryptography 182–4 Cukier, Kenneth 387, 388, 395, 397, 403, 427, 433 data 62, 65 forgetting versus remembering 137 cultural oppression 273 OUP CORRECTED PROOF – FINAL, 28/05/18, SPi РЕЛИЗ ПОДГОТОВИЛА ГРУППА "What's News" VK.COM/WSNWS 496 Index Dabate, Connie and Richard 134–5 Dahir, Abdi Latif 405 Dahl, Robert A. 
91, 390, 391, 411, 430 Daily Stormer 236 D’Ancona, Matthew 239, 412, 415 Dandeker, Christopher 391 Darknessbot 234 data as capital 317 increasingly quantified society 61–7 data-based injustice 282 Data Deal 66, 336–40, 358 Data Democracy 212, 246–50, 254, 348 datafication 62–7 data storage digitization 62 nanotechnology 56 usufructuary rights 330 data unions 340 Dayen, David 427 Dean, Sarah 402 Decentralised Autonomous Organisations (DAOs) 47 Deep Knowledge Ventures 31, 251 Defense Advanced Research Projects Agency (DARPA) 47, 178 De Filippi, Primavera 120, 378, 392, 393, 394 Delaney, Kevin J. 425, 430 delegation AI Democracy 252 Direct Democracy 242 Deleuze, Gilles 395 Delft University 56 Deliberative Democracy 212, 227–39, 254, 348 democracy 3, 10, 23–4, 346, 359–60 after the internet 219–21 AI Democracy 212, 213, 250–4, 348 arguments for 222–6 classical 214–16, 254 competitive elitism 217–19, 221, 240, 242, 253, 254 concept 74–6 conceptual analysis 81, 82, 84–5 contextual analysis 84–5 Data Democracy 212, 246–50, 254, 348 Deliberative Democracy 212, 227–39, 254, 348 Direct Democracy 212, 239–43, 254, 348 dream of 211–26 epistemic superiority 223–4, 234 in the future 227–54 liberal 216–17, 246, 254 and liberty 207–8, 222, 225, 249 liquid 242 nature of 213 normative analysis 84–5 representative 218, 240, 248 stability 240 story of 213–21 supercharged state, power of the 348 Wiki Democracy 212, 243–6, 254, 348 DemocracyOS 242, 415 Democratic Party (US) 229 desert, justice as 260–1 Desert Wolf 404 Desrosières, Alain 369, 370 Devlin, Patrick 202, 203, 204, 407, 408 Diamandis, Peter H. 
374, 435 Dickens, Charles 211 dictatorship 71 Digital Confederalism 193, 205–6, 341 structural regulation 357, 358 digital disrespect 276 digital dissent 179–84 digital filtering see filtering digital law 100–14 Digital Liberalism 205 Digital Libertarianism 205 digital liberty 205–7 digital lifeworld algorithmic injustice 279, 285, 290, 292–4 code’s empire 97 democracy 212–13, 222, 227–54 distributive justice 266, 269 force 100–1, 103, 107–8, 113, 116, 118–19, 121 freedom and the tech firm 188–90, 193–4, 196, 198, 200, 208 increasingly capable systems 29–41 increasingly integrated technology 42–60 increasingly quantified society 61–8 individual responsibility 346–7 justice in recognition 276–8 liberty 168–72, 180, 183, 185, 187 limits 360–1 perception-control 146–52 post-politics 362, 366 power 98–9, 345–6 property 314–17, 320, 322–3, 328–31, 334–6, 340–1 public and private power 154, 156, 158, 160 regulation 354, 357 scrutiny 123, 127–41 social justice 258–9 technological unemployment 295, 304, 306, 311 thinking like a theorist 69–86 Digital Millennium Copyright Act 1988 430 Digital Moralism 206 Digital Paternalism 198, 199, 206 digital ranking 276–8 Digital Republicanism 206–7, 347 structural regulation 357 Digital Rights Management (DRM) technology 96, 102, 105, 172, 333 digital storage 129 digitization 62 of force 100, 101–14 Direct Democracy 212, 239–43, 254, 348 disabilities, people with digital liberation 169 as victims of violence crimes 273 Discord 236 discrimination algorithmic 281–2 rule-based injustice 284, 287–8 disrespect, digital 276 dissent, digital 179–84 distributed computing see smart devices distributive justice 257–70, 274, 278 Data Deal 337 Private Property Paradigm 326 Divine Rule 349 DNA 64, 362 Dodge, Martin 391 Domesday Book 16–17, 369 dominant goods 154 Domingos, Pedro 373, 374, 410, 417, 432 computing power, growth in 38 data unions 
340 machine learning 34–5 Drahos, Peter 431 Dredge, Stuart 384, 385 driverless vehicles see self-driving vehicles drones force 106 hacking 183 increasingly integrated technology 54, 55 productive technologies 316 sharing economy 335 totalitarianism 179 utility analogy 158 Dryzek, John S. 368 Dunn, John 408, 409 Durkheim, Émile 61 Dvorsky, George 384 Dworkin, Gerald 171, 352, 401, 402, 432 Dwoskin, Elizabeth 433 Dwyer, Paula 428 Ebay 102 Economist 378, 379, 380, 381, 397, 422 Edelman, Benjamin 423 e-Democracia Wikilegis 244 Edwards, Cory 371 egalitarianism 259, 261–5 egalitarian plateau 259 e-government 220 Egypt 19, 183 Eisenstein, Elizabeth 62, 387 Ekbia, Hamid R. 431 Electrick spray paint 51 Electronic Frontier Foundation 406 Eliot, T.


pages: 327 words: 103,336

Everything Is Obvious: *Once You Know the Answer by Duncan J. Watts

active measures, affirmative action, Albert Einstein, Amazon Mechanical Turk, Black Swan, business cycle, butterfly effect, Carmen Reinhart, Cass Sunstein, clockwork universe, cognitive dissonance, coherent worldview, collapse of Lehman Brothers, complexity theory, correlation does not imply causation, crowdsourcing, death of newspapers, discovery of DNA, East Village, easy for humans, difficult for computers, edge city, en.wikipedia.org, Erik Brynjolfsson, framing effect, Geoffrey West, Santa Fe Institute, George Santayana, happiness index / gross national happiness, high batting average, hindsight bias, illegal immigration, industrial cluster, interest rate swap, invention of the printing press, invention of the telescope, invisible hand, Isaac Newton, Jane Jacobs, Jeff Bezos, Joseph Schumpeter, Kenneth Rogoff, lake wobegon effect, Laplace demon, Long Term Capital Management, loss aversion, medical malpractice, meta analysis, meta-analysis, Milgram experiment, natural language processing, Netflix Prize, Network effects, oil shock, packet switching, pattern recognition, performance metric, phenotype, Pierre-Simon Laplace, planetary scale, prediction markets, pre–internet, RAND corporation, random walk, RFID, school choice, Silicon Valley, social intelligence, statistical model, Steve Ballmer, Steve Jobs, Steve Wozniak, supply-chain management, The Death and Life of Great American Cities, the scientific method, The Wisdom of Crowds, too big to fail, Toyota Production System, ultimatum game, urban planning, Vincenzo Peruggia: Mona Lisa, Watson beat the top human players on Jeopardy!, X Prize

With their own electronic sales databases, third-party ratings agencies like Nielsen and comScore, and the recent tidal wave of clickstream data online, advertisers can measure many more variables, and at far greater resolution, than Wanamaker could. Arguably, in fact, the advertising world has more data than it knows what to do with. No, the real problem is that what advertisers want to know is whether their advertising is causing increased sales; yet almost always what they measure is the correlation between the two. In theory, of course, everyone “knows” that correlation and causation are different, but it’s so easy to get the two mixed up in practice that we do it all the time. If we go on a diet and then subsequently lose weight, it’s all too tempting to conclude that the diet caused the weight loss. Yet often when people go on diets, they change other aspects of their lives as well—like exercising more or sleeping more or simply paying more attention to what they’re eating.

Both these strategies will have the effect that sales and advertising will tend to be correlated whether or not the advertising is causing anything at all. But as with the diet, it is the advertising effort on which the business focuses its attention; thus if sales or some other metric of interest subsequently increases, it’s tempting to conclude that it was the advertising, and not something else, that caused the increase. Differentiating correlation from causation can be extremely tricky in general. But one simple solution, at least in principle, is to run an experiment in which the “treatment” (whether the diet or the ad campaign) is applied in some cases and not in others. If the effect of interest (weight loss, increased sales, etc.) happens significantly more in the presence of the treatment than it does in the “control” group, we can conclude that the treatment is in fact causing the effect.
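The treatment-versus-control idea in this excerpt can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not anything taken from the book: a hypothetical `holiday` season raises sales, advertising has no true effect at all, and the function name `simulate` and every number in it are made up for illustration.

```python
import random
from statistics import mean


def simulate(randomized, n=20000, seed=0):
    """Return mean sales with ads minus mean sales without ads.

    Toy model (all values are assumptions): a holiday season adds 50 to
    sales, and advertising itself adds nothing.
    """
    rng = random.Random(seed)
    with_ads, without_ads = [], []
    for _ in range(n):
        holiday = rng.random() < 0.5
        if randomized:
            # Experiment: a coin flip decides who gets the "treatment".
            advertise = rng.random() < 0.5
        else:
            # Observation: the business advertises mostly in the holidays.
            advertise = rng.random() < (0.9 if holiday else 0.1)
        sales = 100 + (50 if holiday else 0) + rng.gauss(0, 10)
        (with_ads if advertise else without_ads).append(sales)
    return mean(with_ads) - mean(without_ads)


observational_gap = simulate(randomized=False)
experimental_gap = simulate(randomized=True)
print(f"observational gap: {observational_gap:.1f}")
print(f"randomized gap:    {experimental_gap:.1f}")
```

In the observational run the gap is large only because ads are bought mostly during the holiday season; with coin-flip assignment the gap shrinks toward zero, which matches the true (zero) effect built into the toy model.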

Part of the problem is also that social scientists, like everyone else, participate in social life and so feel as if they can understand why people do what they do simply by thinking about it. It is not surprising, therefore, that many social scientific explanations suffer from the same weaknesses (ex post facto assertions of rationality, representative individuals, special people, and correlation substituting for causation) that pervade our commonsense explanations as well. MEASURING THE UNMEASURABLE One response to this problem, as Lazarsfeld’s colleague Samuel Stouffer noted more than sixty years ago, is for sociologists to depend less on their common sense, not more, and instead try to cultivate uncommon sense. But getting away from commonsense reasoning in sociology is easier said than done.


pages: 245 words: 64,288

Robots Will Steal Your Job, But That's OK: How to Survive the Economic Collapse and Be Happy by Federico Pistono

3D printing, Albert Einstein, autonomous vehicles, bioinformatics, Buckminster Fuller, cloud computing, computer vision, correlation does not imply causation, en.wikipedia.org, epigenetics, Erik Brynjolfsson, Firefox, future of work, George Santayana, global village, Google Chrome, happiness index / gross national happiness, hedonic treadmill, illegal immigration, income inequality, information retrieval, Internet of things, invention of the printing press, jimmy wales, job automation, John Markoff, Kevin Kelly, Khan Academy, Kickstarter, knowledge worker, labor-force participation, Lao Tzu, Law of Accelerating Returns, life extension, Loebner Prize, longitudinal study, means of production, Narrative Science, natural language processing, new economy, Occupy movement, patent troll, pattern recognition, peak oil, post scarcity, QR code, race to the bottom, Ray Kurzweil, recommendation engine, RFID, Rodney Brooks, selection bias, self-driving car, slashdot, smart cities, software as a service, software is eating the world, speech recognition, Steven Pinker, strong AI, technological singularity, Turing test, Vernor Vinge, women in the workforce

Using this tool, we can check how our culture has developed over time with regards to our areas of interest. * * * Figure 13.2: Comparing ‘happiness’ and ‘growth’ over time with n-grams. Courtesy of Google. * * * We can see in Figure 13.2 how ‘happiness’ and ‘growth’, between 1800 and 2008 have a negative correlation: as ‘growth’ rises, ‘happiness’ declines. Around 1830, authors started to talk more about growth than happiness. Again, to be fair, correlation does not imply causation, and the mere fact of writing about something does not tell you the whole story. This data only shows the occurrences of such words in books, not their context, nor their meaning. Authors could well have been talking about the ‘loss of happiness’, or something even more subtle. But it does show that the interest in growth has been, well, growing, whereas writers cared less to talk about being happy.

Courtesy of Google. * * * Figure 13.3 shows how the correlation is even stronger. I took the specific term ‘economic growth’, to rule out other possible disturbances in context. ‘Happiness’ declines from 1950 to 1995, while ‘economic growth’ and ‘GDP’ rise. After that we observe the reverse effect: both ‘GDP’ and ‘economic growth’ fall, while happiness increases considerably. Again, correlation does not mean causation, but it surely is remarkable what this data shows. For more than half a century, our culture has been fuelling the idea that the pursuit of growth, work, and economic expansion should be one of our primary goals in life, if not the highest of all. But that assumption is being challenged and it is slowly beginning to crumble. This very book that you are reading now did not come out of the blue.

That must be it. OK, let us be serious now. While I enjoy picking on the self-help idiocy wave that has invaded the United States and the UK these last five years, there are some suggestions that might actually help you, if you approach them with a bit of scientific rigour. I imagine you must be pretty tired of reading about things that do not work, scientific analyses with no clear distinction between correlation and causation, and plain old common sense masquerading as hidden truth. How about some practical suggestions, things that you can apply in your daily life, that you would not already know? You know my position regarding self-help. I think it is mostly a pseudoscientific scam that greedy people play on the desperate and the gullible. However, if taken seriously, there are some things you could try, and that might actually help you live a happier life.


pages: 204 words: 58,565

Keeping Up With the Quants: Your Guide to Understanding and Using Analytics by Thomas H. Davenport, Jinho Kim

Black-Scholes formula, business intelligence, business process, call centre, computer age, correlation coefficient, correlation does not imply causation, Credit Default Swap, en.wikipedia.org, feminist movement, Florence Nightingale: pie chart, forensic accounting, global supply chain, Hans Rosling, hypertext link, invention of the telescope, inventory management, Jeff Bezos, Johannes Kepler, longitudinal study, margin call, Moneyball by Michael Lewis explains big data, Myron Scholes, Netflix Prize, p-value, performance metric, publish or perish, quantitative hedge fund, random walk, Renaissance Technologies, Robert Shiller, self-driving car, sentiment analysis, six sigma, Skype, statistical model, supply-chain management, text mining, the scientific method, Thomas Davenport

The degree of relatedness is expressed as a correlation coefficient, which ranges from −1.0 to +1.0. Correlation = +1 (Perfect positive correlation, meaning that both variables always move in the same direction together) Correlation = 0 (No relationship between the variables) Correlation = −1 (Perfect negative correlation, meaning that as one variable goes up, the other always trends downward) Correlation does not imply causation. Correlation is a necessary but insufficient condition for causal conclusions. Dependent variable: The variable whose value is unknown that you would like to predict or explain. For example, if you wish to predict the quality of a vintage wine using average growing season temperature, harvest rainfall, winter rainfall, and the age of the vintage, the quality of a vintage wine would be the dependent variable.
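The three reference values of the correlation coefficient listed in this glossary entry can be reproduced with the usual Pearson formula. The sketch below is illustrative only: the helper name `pearson_r` and the tiny data sets are assumptions chosen so that the results land on +1, −1 and 0.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
same_direction = pearson_r(xs, [2 * x + 1 for x in xs])       # rises with x
opposite_direction = pearson_r(xs, [10 - 3 * x for x in xs])  # falls as x rises
no_linear_relation = pearson_r(xs, [2, 1, 3, 1, 2])           # no linear trend

print(round(same_direction, 6), round(opposite_direction, 6), round(no_linear_relation, 6))
```

Note that a coefficient near zero only rules out a linear relationship in the sample; it says nothing about cause in either direction.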

As we mentioned earlier in describing mad scientist experiments, if you create test and control groups and randomly assign people to them, if there turns out to be a difference in outcomes between the two groups, you can usually attribute it to being caused by the test condition. But if you simply find a statistical relationship between two factors, it’s unlikely to be a causal relationship. You may have heard the phrase, “correlation is not causation,” and it’s important to remember. Cognitive psychologists Christopher Chabris and Daniel Simons suggest a useful technique for checking on the causality issue in their book The Invisible Gorilla and Other Ways Our Intuitions Deceive Us: “When you hear or read about an association between two factors, think about whether people could have been assigned randomly to conditions for one of them.


pages: 877 words: 182,093

Wealth, Poverty and Politics by Thomas Sowell

affirmative action, Albert Einstein, British Empire, Capital in the Twenty-First Century by Thomas Piketty, colonial exploitation, colonial rule, correlation does not imply causation, Deng Xiaoping, desegregation, European colonialism, full employment, Gunnar Myrdal, income inequality, income per capita, invention of the sewing machine, invisible hand, low skilled workers, mass immigration, means of production, minimum wage unemployment, New Urbanism, profit motive, rent control, Scramble for Africa, Simon Kuznets, Steve Jobs, The Bell Curve by Richard Herrnstein and Charles Murray, The Wealth of Nations by Adam Smith, transatlantic slave trade, transcontinental railway, trickle-down economics, very high income, War on Poverty

Culture also does not lend itself to quantification, as a genetic determinist has complained, and therefore cannot produce statistical analyses, such as that showing a high correlation between nations’ IQ scores and their per capita incomes. Such correlations may lend an air of scientific precision, but so did earlier correlations between climate and prosperity by a geographic determinist. Both sets of correlations are from data taken in an extremely thin slice of time, compared to the many millennia of human history, during which various peoples’ and nations’ relative achievements have changed greatly. Moreover, as statisticians have often pointed out, correlation is not causation, and, as was said years ago: “It is better to be roughly right than precisely wrong.” Whether considering cultural, geographic, political or other factors, interactions of these various factors are part of the reason why understanding influences is very different from claiming determinism. a Daron Acemoglu and James A. Robinson, Why Nations Fail: The Origins of Power, Prosperity, and Poverty (New York: Crown Business, 2012).

Geography alone is enough to prevent equal prospects for all, but “No one can be praised or blamed for the temperature of the air, or the volume and timing of rainfall, or the lay of the land.” THE DIRECTION OF CAUSATION Differences in geography, demography, culture and other factors can make economic and other prospects or outcomes unequal for different individuals and groups, even if particular institutions or societies were to treat everyone the same. Nevertheless, many people blame statistical inequalities on the institutions where the statistics that convey these inequalities happened to be collected. Others blame some factor with which negative outcomes are correlated, blaming crime on poverty, for example. Statisticians have long warned against confusing correlation with causation, but too often those warnings have been ignored. Even when there is in fact a causal relationship between two things, that by itself does not tell us the direction of causation, that is, whether X caused Y or Y caused X, or whether both were caused by some other factor Z. It is possible that poverty causes crime, but it is also possible that the same set of attitudes and behavior (or the same lack of human capital) that lead to poverty can also lead to crime.

One way to tell which of these possibilities fits the facts is to compare academic test results between students in groups which each have both low-income families and high-income families. In 1981 and in 1995, for example, the average SAT score of black high school students on the mathematics portion of the test was lower than the average score of either white or Asian high school students. Since black students come from families with lower average incomes than either white or Asian students, this establishes correlation but does not help us determine causation, much less the direction of causation. However, when in 1981 black students from families with incomes of $50,000 or more scored slightly below white students from families with incomes under $6,000, and even further below Asian students with incomes under $6,000, clearly the cause of the test score differences was not differences in income. A very similar pattern appeared in 1995.


pages: 654 words: 191,864

Thinking, Fast and Slow by Daniel Kahneman

Albert Einstein, Atul Gawande, availability heuristic, Bayesian statistics, Black Swan, Cass Sunstein, Checklist Manifesto, choice architecture, cognitive bias, complexity theory, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, delayed gratification, demand response, endowment effect, experimental economics, experimental subject, Exxon Valdez, feminist movement, framing effect, hedonic treadmill, hindsight bias, index card, information asymmetry, job satisfaction, John von Neumann, Kenneth Arrow, libertarian paternalism, loss aversion, medical residency, mental accounting, meta analysis, meta-analysis, nudge unit, pattern recognition, Paul Samuelson, pre–internet, price anchoring, quantitative trading / quantitative finance, random walk, Richard Thaler, risk tolerance, Robert Metcalfe, Ronald Reagan, Shai Danziger, Supply of New York City Cabdrivers, The Chicago School, The Wisdom of Crowds, Thomas Bayes, transaction costs, union organizing, Walter Mischel, Yom Kippur War

The control group is expected to improve by regression alone, and the aim of the experiment is to determine whether the treated patients improve more than regression can explain. Incorrect causal interpretations of regression effects are not restricted to readers of the popular press. The statistician Howard Wainer has drawn up a long list of eminent researchers who have made the same mistake—confusing mere correlation with causation. Regression effects are a common source of trouble in research, and experienced scientists develop a healthy fear of the trap of unwarranted causal inference. One of my favorite examples of the errors of intuitive prediction is adapted from Max Bazerman’s excellent text Judgment in Managerial Decision Making: You are the sales forecaster for a department store chain. All stores are similar in size and merchandise selection, but their sales differ because of location, competition, and random factors.
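Why an untreated control group improves "by regression alone" can be seen in a short simulation. This is a rough sketch under assumed numbers, not a model from the book: each score is a stable component plus independent luck, the worst-scoring tenth on a first measurement is selected, and nobody receives any treatment before the second measurement.

```python
import random
from statistics import mean

rng = random.Random(1)
n = 10000

# Assumption: a score is a stable trait plus independent luck on each occasion.
trait = [rng.gauss(0, 1) for _ in range(n)]
first = [t + rng.gauss(0, 1) for t in trait]
second = [t + rng.gauss(0, 1) for t in trait]

# Select the worst 10% on the first measurement, like patients enrolled
# because they currently feel worst. No treatment is applied to anyone.
cutoff = sorted(first)[n // 10]
selected = [i for i in range(n) if first[i] < cutoff]

before = mean(first[i] for i in selected)
after = mean(second[i] for i in selected)
print(f"selected group, first measurement:  {before:.2f}")
print(f"selected group, second measurement: {after:.2f}")
```

The selected group moves back toward the overall average of zero without any intervention, so a treated group has to improve by more than this amount before the treatment can be credited.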

income and education: The correlation appears impressive, but I was surprised to learn many years ago from the sociologist Christopher Jencks that if everyone had the same education, the inequality of income (measured by standard deviation) would be reduced only by about 9%. The relevant formula is $\sqrt{1 - r^2}$, where r is the correlation. correlation and regression: This is true when both variables are measured in standard scores, that is, where each score is transformed by removing the mean and dividing the result by the standard deviation. confusing mere correlation with causation: Howard Wainer, “The Most Dangerous Equation,” American Scientist 95 (2007): 249–56. 18: Taming Intuitive Predictions far more moderate: The proof of the standard regression as the optimal solution to the prediction problem assumes that errors are weighted by the squared deviation from the correct value. This is the least-squares criterion, which is commonly accepted. Other loss functions lead to different solutions. 19: The Illusion of Understanding narrative fallacy: Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable (New York: Random House, 2007).
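The arithmetic behind the first note can be checked directly, reading its formula as the square root of 1 − r². The note does not state the correlation it has in mind, so the value computed below is only a back-calculation from the quoted 9% figure, and the function names are hypothetical.

```python
from math import sqrt


def sd_reduction(r):
    """Fractional drop in standard deviation once a predictor with
    correlation r is held constant: 1 - sqrt(1 - r**2)."""
    return 1 - sqrt(1 - r * r)


def r_for_reduction(fraction):
    """Correlation that would produce the given fractional drop in SD."""
    return sqrt(1 - (1 - fraction) ** 2)


implied_r = r_for_reduction(0.09)
print(f"r implied by a 9% reduction: {implied_r:.3f}")
print(f"reduction when r = 0.5:      {sd_reduction(0.5):.3f}")
```

Under this reading, a correlation of roughly 0.4 leaves about 91% of the original spread in place, which is why a correlation that looks impressive can still account for little of the inequality.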

Statistical Prediction: A Theoretical Analysis and a Review of the Evidence (Meehl) Clinton, Bill Coelho, Marta coffee mug experiments cognitive busyness cognitive ease; in basic assessments; and illusions of remembering; and illusions of truth; mood and; and writing persuasive messages; WYSIATI (what you see is all there is) and cognitive illusions; confusing experiences with memories; of pundits; of remembering; of skill; of stock-picking skill; of truth; of understanding; of validity Cognitive Reflection Test (CRT) cognitive strain Cohen, David coherence; see also associative coherence Cohn, Beruria coincidence coin-on-the-machine experiment cold-hand experiment Collins, Jim colonoscopies colostomy patients competence, judging of competition neglect complex vs. simple language concentration “Conditions for Intuitive Expertise: A Failure to Disagree” (Kahneman and Klein) confidence; bias of, over doubt; overconfidence; WYSIATI (what you see is all there is) and confirmation bias conjunction fallacy conjunctive events, evaluation of “Consequences of Erudite Vernacular Utilized Irrespective of Necessity: Problems with Using Long Words Needlessly” (Oppenheimer) contiguity in time and place control cookie experiment correlation; causation and; illusory; regression and; shared factors and correlation coefficient cost-benefit correlation costs creativity; associative memory and credibility Csikszentmihalyi, Mihaly curriculum team Damasio, Antonio dating question Dawes, Robyn Day Reconstruction Method (DRM) death: causes of; life stories and; organ donation and; reminders of Deaton, Angus decisions, decision making; broad framing in; and choice from description; and choice from experience; emotions and vividness in; expectation principle in; in gambles, see gambles; global impressions and; hindsight bias and; narrow framing in; optimistic bias in; planning fallacy and; poverty and; premortem and; reference points in; regret and; risk and, see risk 
assessment decision utility decision weights; overweighting; unlikely events and; in utility theory vs. prospect theory; vivid outcomes and; vivid probabilities and decorrelated errors default options denominator neglect depression Detroit/Michigan problem Diener, Ed die roll problem dinnerware problem disclosures disease threats disgust disjunctive events, evaluation of disposition effect DNA evidence dolphins Dosi, Giovanni doubt; bias of confidence over; premortem and; suppression of Duke University Duluth, Minn., bridge in duration neglect duration weighting earthquakes eating eBay Econometrica economics; behavioral; Chicago school of; neuroeconomics; preference reversals and; rational-agent model in economic transactions, fairness in Econs and Humans Edge Edgeworth, Francis education effectiveness of search sets effort; least, law of; in self-control ego depletion electricity electric shocks emotional coherence, see halo effect emotional learning emotions and mood: activities and; affect heuristic; availability biases and; in basic assessments; cognitive ease and; in decision making; in framing; mood heuristic for happiness; negative, measuring; and outcomes produced by action vs. inaction; paraplegics and; perception of; substitution of question on; in vivid outcomes; in vivid probabilities; weather and; work and employers, fairness rules and endangered species endowment effect; and thinking like a trader energy, mental engagement Enquiry Concerning Human Understanding, An (Hume) entrepreneurs; competition neglect by Epley, Nick Epstein, Seymour equal-weighting schemes Erev, Ido evaluability hypothesis evaluations: joint; joint vs. 
single; single evidence: one-sided; of witnesses executive control expectation principle expectations expected utility theory, see utility theory experienced utility experience sampling experiencing self; well-being of; see also well-being expert intuition; evaluating; illusions of validity of; overconfidence and; as recognition; risk assessment and; vs. statistical predictions; trust in expertise, see skill Expert Political Judgment: How Good Is It?


pages: 321 words: 97,661

How to Read a Paper: The Basics of Evidence-Based Medicine by Trisha Greenhalgh

call centre, complexity theory, conceptual framework, correlation coefficient, correlation does not imply causation, deskilling, knowledge worker, longitudinal study, meta analysis, meta-analysis, microbiome, New Journalism, p-value, personalized medicine, placebo effect, publication bias, randomized controlled trial, selection bias, the scientific method

Whom is the study about? Was the design of the study sensible? Was systematic bias avoided or minimised? Was assessment ‘blind’? Were preliminary statistical questions addressed? Summing up References Chapter 5: Statistics for the non-statistician How can non-statisticians evaluate statistical tests? Have the authors set the scene correctly? Paired data, tails and outliers Correlation, regression and causation Probability and confidence The bottom line Summary References Chapter 6: Papers that report trials of drug treatments and other simple interventions ‘Evidence’ and marketing Making decisions about therapy Surrogate endpoints What information to expect in a paper describing a randomised controlled trial: the CONSORT statement Getting worthwhile evidence out of a pharmaceutical representative References Chapter 7: Papers that report trials of complex interventions Complex interventions Ten questions to ask about a paper describing a complex intervention References Chapter 8: Papers that report diagnostic or screening tests Ten men in the dock Validating diagnostic tests against a gold standard Ten questions to ask about a paper that claims to validate a diagnostic or screening test Likelihood ratios Clinical prediction rules References Chapter 9: Papers that summarise other papers (systematic reviews and meta-analyses) When is a review systematic?

Non-normal (skewed) data can sometimes be transformed to give a normal-shape graph by plotting the logarithm of the skewed variable or performing some other mathematical transformation (such as square root or reciprocal). Some data, however, cannot be transformed into a smooth pattern, and the significance of this is discussed subsequently. Deciding whether data are normally distributed is not an academic exercise, because it will determine what type of statistical tests to use. For example, linear regression (see section ‘Correlation, regression and causation’) will give misleading results unless the points on the scatter graph form a particular distribution about the regression line—that is, the residuals (the perpendicular distance from each point to the line) should themselves be normally distributed. Transforming data to achieve a normal distribution (if this is indeed achievable) is not cheating. It simply ensures that data values are given appropriate emphasis in assessing the overall effect.
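The effect of a logarithmic transformation on skewed data can be sketched in a few lines. The data here are simulated from a log-normal distribution purely as an assumption (it is a convenient example of right-skewed values); the `skewness` helper is a plain moment-based estimate, not a formal normality test.

```python
import random
from math import log
from statistics import mean, pstdev


def skewness(values):
    """Moment-based sample skewness: about 0 for a symmetric distribution."""
    centre = mean(values)
    spread = pstdev(values)
    return mean(((v - centre) / spread) ** 3 for v in values)


rng = random.Random(4)
# Assumed example data: strongly right-skewed, all values positive.
raw = [rng.lognormvariate(0, 1) for _ in range(5000)]
logged = [log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print(f"skewness before log transform: {skew_raw:.2f}")
print(f"skewness after log transform:  {skew_logged:.2f}")
```

A transformation like this only helps when the skew has roughly this shape; as the excerpt notes, some data cannot be transformed into a smooth pattern at all.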

I assumed this was a transcription error, so I moved the decimal point two places to the left. Some weeks later, I met the technician who had analysed the specimens and he asked ‘Whatever happened to that chap with acromegaly?’ Statistically correcting for outliers (e.g. to modify their effect on the overall result) is quite a sophisticated statistical manoeuvre. If you are interested, try the relevant section in your favourite statistics textbook. Correlation, regression and causation Has correlation been distinguished from regression, and has the correlation coefficient (‘r-value’) been calculated and interpreted correctly? For many non-statisticians, the terms correlation and regression are synonymous, and refer vaguely to a mental image of a scatter graph with dots sprinkled messily along a diagonal line sprouting from the intercept of the axes. You would be right in assuming that if two things are not correlated, it will be meaningless to attempt a regression.


pages: 304 words: 82,395

Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schonberger, Kenneth Cukier

23andMe, Affordable Care Act / Obamacare, airport security, barriers to entry, Berlin Wall, big data - Walmart - Pop Tarts, Black Swan, book scanning, business intelligence, business process, call centre, cloud computing, computer age, correlation does not imply causation, dark matter, double entry bookkeeping, Eratosthenes, Erik Brynjolfsson, game design, IBM and the Holocaust, index card, informal economy, intangible asset, Internet of things, invention of the printing press, Jeff Bezos, Joi Ito, lifelogging, Louis Pasteur, Mark Zuckerberg, Menlo Park, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, obamacare, optical character recognition, PageRank, paypal mafia, performance metric, Peter Thiel, post-materialism, random walk, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, smart grid, smart meter, social graph, speech recognition, Steve Jobs, Steven Levy, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!

So the quarantine applies only to the individual Internet users whose searches were most highly correlated with having the flu. Here we have the data on whom to pick up. Federal agents, armed with lists of Internet Protocol addresses and mobile GPS information, herd the individual web searchers into quarantine centers. But as reasonable as this scenario might sound to some, it is just plain wrong. Correlations do not imply causation. These people may or may not have the flu. They’d have to be tested. They’d be prisoners of a prediction, but more important, they’d be victims of a view of data that lacks an appreciation for what the information actually means. The point of the actual Google Flu Trends study is that certain search terms are correlated with the outbreak—but the correlation may exist because of circumstances like healthy co-workers hearing sneezes in the office and going online to learn how to protect themselves, not because the searchers are ill themselves.

They were able to achieve their accomplishments because so many features of the city had been datafied (however inconsistently), allowing them to process the information. The inklings of experts had to take a backseat to the data-driven approach. At the same time, Flowers and his kids continually tested their system with veteran inspectors, drawing on their experience to make the system perform better. Yet the most important reason for the program’s success was that it dispensed with a reliance on causation in favor of correlation. “I am not interested in causation except as it speaks to action,” explains Flowers. “Causation is for other people, and frankly it is very dicey when you start talking about causation. I don’t think there is any cause whatsoever between the day that someone files a foreclosure proceeding against a property and whether or not that place has a historic risk for a structural fire. I think it would be obtuse to think so.


pages: 278 words: 70,416

Super Thinking: The Big Book of Mental Models by Gabriel Weinberg, Lauren McCann

affirmative action, Affordable Care Act / Obamacare, Airbnb, Albert Einstein, anti-pattern, Anton Chekhov, autonomous vehicles, bank run, barriers to entry, Bayesian statistics, Bernie Madoff, Bernie Sanders, Black Swan, Broken windows theory, business process, butterfly effect, Cal Newport, Clayton Christensen, cognitive dissonance, commoditize, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, David Attenborough, delayed gratification, deliberate practice, discounted cash flows, disruptive innovation, Donald Trump, Douglas Hofstadter, Edward Lorenz: Chaos theory, Edward Snowden, effective altruism, Elon Musk, en.wikipedia.org, experimental subject, fear of failure, feminist movement, Filter Bubble, framing effect, friendly fire, fundamental attribution error, Gödel, Escher, Bach, hindsight bias, housing crisis, Ignaz Semmelweis: hand washing, illegal immigration, income inequality, information asymmetry, Isaac Newton, Jeff Bezos, John Nash: game theory, lateral thinking, loss aversion, Louis Pasteur, Lyft, mail merge, Mark Zuckerberg, meta analysis, meta-analysis, Metcalfe’s law, Milgram experiment, minimum viable product, moral hazard, mutually assured destruction, Nash equilibrium, Network effects, nuclear winter, offshore financial centre, p-value, Parkinson's law, Paul Graham, peak oil, Peter Thiel, phenotype, Pierre-Simon Laplace, placebo effect, Potemkin village, prediction markets, premature optimization, price anchoring, principal–agent problem, publication bias, recommendation engine, remote working, replication crisis, Richard Feynman, Richard Feynman: Challenger O-ring, Richard Thaler, ride hailing / ride sharing, Robert Metcalfe, Ronald Coase, Ronald Reagan, school choice, Schrödinger's Cat, selection bias, Shai Danziger, side project, Silicon Valley, Silicon Valley startup, speech recognition, statistical model, Steve Jobs, Steve Wozniak, Steven Pinker, survivorship bias, The Present Situation in Quantum Mechanics, 
the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, transaction costs, uber lyft, ultimatum game, uranium enrichment, urban planning, Vilfredo Pareto, wikimedia commons

If everyone who ever smoked got lung cancer and everyone who didn’t smoke never got lung cancer, the data would be a lot more convincing. Unfortunately, the real world is rarely that simple. You may have heard anecdotes about people who happened to get cold and flu symptoms around the time that they got the flu vaccine and blame their illness on the vaccine. Just because two events happened in succession, or are correlated, doesn’t mean that the first actually caused the second. Statisticians use the phrase correlation does not imply causation to describe this fallacy. What is often overlooked when this fallacy arises is a confounding factor, a third, possibly non-obvious factor that influences both the assumed cause and the observed effect, confounding the ability to draw a correct conclusion. In the case of the flu vaccine, the cold and flu season is that confounding factor. People get the flu vaccine during the time of year when they are more likely to get sick, whether they have received the vaccine or not.
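The confounding-factor argument in this excerpt can be made concrete with a small simulation. The sketch below is illustrative only: every rate in it is invented, and it assumes by construction that the vaccine has no effect on illness, so any gap between groups comes from the season alone. The names `simulate` and `sick_rate` are hypothetical helpers, not anything from the book.

```python
import random

def simulate(n=200_000, seed=0):
    """Generate (flu_season, vaccinated, sick) records with made-up rates."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        flu_season = rng.random() < 0.5
        # Assumption: both events depend only on the season, never on each other.
        vaccinated = rng.random() < (0.6 if flu_season else 0.1)
        sick = rng.random() < (0.30 if flu_season else 0.05)
        rows.append((flu_season, vaccinated, sick))
    return rows

def sick_rate(rows, vaccinated, season=None):
    """Share of people who got sick, optionally restricted to one season."""
    picked = [s for fs, v, s in rows
              if v == vaccinated and (season is None or fs == season)]
    return sum(picked) / len(picked)

rows = simulate()
# Pooled over the whole year, vaccinated people look sicker.
overall_gap = sick_rate(rows, True) - sick_rate(rows, False)
# Holding the season fixed, the gap shrinks to roughly zero.
in_season_gap = sick_rate(rows, True, season=True) - sick_rate(rows, False, season=True)
off_season_gap = sick_rate(rows, True, season=False) - sick_rate(rows, False, season=False)

print(f"overall gap:    {overall_gap:+.3f}")
print(f"in-season gap:  {in_season_gap:+.3f}")
print(f"off-season gap: {off_season_gap:+.3f}")
```

Comparing like with like (same season) is one simple way to check whether a suspected third factor explains an association.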

In other instances, a correlation can occur by random chance. It’s easier than ever to test the correlation between all sorts of information, so many spurious correlations are bound to be discovered. In fact, there is a hilarious site (and book) called Spurious Correlations, chock-full of these silly results. The graph below shows one such correlation, between cheese consumption and deaths due to bedsheet tanglings. Correlation Does Not Imply Causation One time when Lauren was in high school, she started feeling like a cold was coming on, and her dad told her to drink plenty of fluids to help her get better. She proceeded to drink half a case of raspberry Snapple that day, and, surprisingly, the next day she felt a lot better! Was this clear evidence that raspberry Snapple is a miracle cure for the common cold? No. She probably just experienced a coincidental recovery due to the body’s natural healing ability after also drinking a whole bunch of raspberry Snapple.
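The claim that testing enough pairs of variables is bound to turn up spurious correlations can be checked with pure noise. This is a minimal sketch under stated assumptions: the 200 series of 10 points each are random numbers standing in for unrelated yearly statistics, and `pearson` is a hand-written helper rather than a library call.

```python
import itertools
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)
# 200 unrelated "indicators", each observed for 10 "years" (independent noise).
series = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]

# Test every pair and keep the strongest correlation found.
pairs = list(itertools.combinations(range(200), 2))
strongest = max(abs(pearson(series[i], series[j])) for i, j in pairs)

print(len(pairs), "pairs tested; strongest |r| =", round(strongest, 3))
```

With short series and thousands of comparisons, some pair will look strongly related even though nothing connects the data.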

There are certainly lots of pitfalls to watch out for, but we hope you also take away the fact that research and data are more useful for navigating uncertainty than hunches and opinions. KEY TAKEAWAYS Avoid succumbing to the gambler’s fallacy or the base rate fallacy. Anecdotal evidence and correlations you see in data are good hypothesis generators, but correlation does not imply causation—you still need to rely on well-designed experiments to draw strong conclusions. Look for tried-and-true experimental designs, such as randomized controlled experiments or A/B testing, that show statistical significance. The normal distribution is particularly useful in experimental analysis due to the central limit theorem. Recall that in a normal distribution, about 68 percent of values fall within one standard deviation, and 95 percent within two.
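The 68 and 95 percent figures quoted above follow from the normal distribution's cumulative function. A short check, using the standard identity that the share within k standard deviations equals erf(k / sqrt(2)); the function name `share_within` is just an illustrative choice:

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

one_sd = share_within(1)  # about 0.683
two_sd = share_within(2)  # about 0.954

print(f"within 1 sd: {one_sd:.4f}")
print(f"within 2 sd: {two_sd:.4f}")
```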


pages: 698 words: 198,203

The Stuff of Thought: Language as a Window Into Human Nature by Steven Pinker

airport security, Albert Einstein, Bob Geldof, colonial rule, conceptual framework, correlation does not imply causation, Daniel Kahneman / Amos Tversky, David Brooks, Douglas Hofstadter, en.wikipedia.org, experimental subject, fudge factor, George Santayana, Laplace demon, loss aversion, luminiferous ether, Norman Mailer, Richard Feynman, Ronald Reagan, Sapir-Whorf hypothesis, science of happiness, social intelligence, speech recognition, stem cell, Steven Pinker, Thomas Bayes, Thorstein Veblen, traffic fines, urban renewal, Yogi Berra

But as we will see in chapter 5, people have the cognitive means to evaluate whether a framing is faithful to reality; the framing does not lock their minds into one way of construing the world. 3. The stock of words in a language reflects the kinds of things its speakers deal with in their lives and hence think about. This, of course, is the obvious non-Whorfian interpretation of the Eskimo-snow factoid. The Whorfian interpretation is a classic example of the fallacy of confusing correlation with causation. In the case of varieties of snow and words for snow, not only did the snow come first, but when people change their attention to snow, they change their words as a result. That’s how meteorologists, skiers, and New Englanders coin new expressions for the stuff, whether in circumlocutions (wet snow, sticky snow) or in neologisms (hardpack, powder, dusting, flurries). Presumably it didn’t happen the other way around—that vocabulary show-offs coined new words for snow, then took up skiing or weather forecasting because they were intrigued by their own coinages.

(All of these are signatures of the analogue estimation system, which reinforces the notion that this component of the number sense exists independently of number words.) Gordon concluded that the lack of precise number thoughts among the Pirahã is caused by their lack of precise number words: the “rare and perhaps unique case for strong linguistic determinism.” But as the cognitive scientist Daniel Casasanto put it, this is a case of “crying Whorf”: it depends on a dubious leap from correlation to causation.94 It can’t be a coincidence that the Pirahã language just happens to lack big number words (unlike the English language) and the Pirahã speakers just happen to hunt and gather in remote stone-age villages (unlike English speakers). A more plausible interpretation is that the lifestyle, history, and culture of a technologically undeveloped hunter-gatherer people will cause it to lack both number words and numerical reasoning.

It follows, then, that all reasonings concerning cause and effect are founded on experience, and that all reasonings from experience are founded on the supposition that the course of nature will continue uniformly the same.109 Tucked into this analysis of whether we can justify our causal attributions is an offhand theory of the psychology of causality called constant conjunction: that our intuitions of cause and effect are nothing but an expectation that if one thing followed another many times in the past, it will continue to do so in the future. It’s not terribly different from what happens when a dog is conditioned to anticipate food when a bell is rung, or a pigeon learns to peck a key in the expectation of food. The story that began the chapter, about the two alarms that go off in succession, raises an obvious problem for the theory. People understand (even if they don’t always apply) the principle that correlation does not imply causation. The rooster’s cock-a-doodle-doo does not cause the sun to rise, thunder doesn’t cause forest fires, and the flashing lights on the top of a printer don’t cause it to spit out a document. These are perceived to be epiphenomena: byproducts of the real causes. I called Hume’s theory “offhand” because he didn’t consistently embrace it himself. The very example of “causation” he adduced in his summary—“when we think of the son, we are apt to carry our attention to the father”—could not be a more ruinous counterexample.


pages: 249 words: 81,217

The Art of Rest: How to Find Respite in the Modern Age by Claudia Hammond

Anton Chekhov, conceptual framework, correlation does not imply causation, Desert Island Discs, Donald Trump, El Camino Real, iterative process, Kickstarter, lifelogging, longitudinal study, Menlo Park, meta analysis, meta-analysis, Milgram experiment, moral panic, Stephen Hawking, The Spirit Level, The Wisdom of Crowds, theory of mind, Thorstein Veblen

To take just one example, a major study of more than 60,000 Brazilian adults found that watching more than five hours of TV a day was associated with a higher risk of depression.14 But, of course, this does not prove that TV per se is the problem. It won’t surprise you to learn that people who are unemployed or stuck at home because they’re unwell tend to watch more TV on average.15 It’s cheap, ever-changing, doesn’t require physical fitness and can provide hours of distraction. Those same people also have lower levels of well-being than people in good health or with jobs. So we are left with a perennial research issue – correlation versus causation. We don’t know which came first – the unhappiness or the TV watching. Staying in all day watching television might well isolate people and make them feel worse. Alternatively, they may already be feeling unhappy and are using TV to cope, like the lonely people we heard about earlier who binge watch box sets. If this dependence on TV becomes habitual then it may store up other problems for the future.

It seems a curious result, but it is almost certainly other factors that were to blame, not the lack of television. Perhaps these people were so poor they couldn’t afford a television or so busy working and caring for others that they never had time to rest and watch it, in which case, of course, it was the lack of any free time and the overwhelming stress of their lives making them unhappy rather than the lack of an hour’s television watching. To get around the correlation versus causation issue, researchers in the US examined data from 50,000 nurses, who were followed over a ten-year period. Did long hours spent in front of the television precede depression several years later? For many of the nurses it did. And the reason? Watching lots of TV meant they did less exercise, and the authors think it’s that rather than anything about watching TV per se that was the main problem.17 It is obvious that watching a lot of TV is not good for us physically because it generally involves a lot of sitting down.


Trading Risk: Enhanced Profitability Through Risk Control by Kenneth L. Grant

backtesting, business cycle, buy and hold, commodity trading advisor, correlation coefficient, correlation does not imply causation, delta neutral, diversification, diversified portfolio, fixed income, frictionless, frictionless market, George Santayana, implied volatility, interest rate swap, invisible hand, Isaac Newton, John Meriwether, Long Term Capital Management, market design, Myron Scholes, performance metric, price mechanism, price stability, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, South Sea Bubble, Stephen Hawking, the scientific method, The Wealth of Nations by Adam Smith, transaction costs, two-sided market, value at risk, volatility arbitrage, yield curve, zero-coupon bond

Correlation analysis thus helped him make better use of his resources. I was less successful in convincing him that the Beach Boys’ “Pet Sounds” is the most overrated album in the history of popular music and that Brian Eno’s “Here Come the Warm Jets” is the most underrated. “Baby’s on fire, better throw her in the water.” I can’t close this discussion without referencing an old platitude that cautions against confusing “correlation with causation.” While correlation analysis can be extremely useful in understanding patterns, providing insights, and offering clues as to what is driving performance, it is crucial to resist the temptation of reading too much into associated outcomes. Ideally, like all other statistics discussed in this chapter, the calculation of correlations will evoke as many questions as answers. It will then be up to you to derive the applicable inferences and to make the appropriate adjustments with respect to your trading.

Think carefully about the implications of the results as in most cases intuition will be your best guide as to what to make of them. Final Word on Correlation I caution you, yet again, against reading too much into the implications of the results. Correlation analysis is a very useful descriptive statistic, but it is an imprecise predictive mechanism. As such, I can’t stress strongly enough the age-old adage admonishing us not to confuse correlation with causation. The goal here is to gain insight into those elements of your routine trading program that are most likely to bring you success in your quest for risk-adjusted return and into those that are causing inefficiencies that can at least be managed, if not altogether corrected. Remember that, due to the extraordinary amount of complexity that is involved in the portfolio management process, any change you make to your program designed to address an anomaly uncovered by these types of statistical analyses may very well have implications for other elements of your methodologies that could offset the potential benefits you seek through the change.


pages: 340 words: 94,464

Randomistas: How Radical Researchers Changed Our World by Andrew Leigh

Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, Atul Gawande, basic income, Black Swan, correlation does not imply causation, crowdsourcing, David Brooks, Donald Trump, ending welfare as we know it, Estimating the Reproducibility of Psychological Science, experimental economics, Flynn Effect, germ theory of disease, Ignaz Semmelweis: hand washing, Indoor air pollution, Isaac Newton, Kickstarter, longitudinal study, loss aversion, Lyft, Marshall McLuhan, meta analysis, meta-analysis, microcredit, Netflix Prize, nudge unit, offshore financial centre, p-value, placebo effect, price mechanism, publication bias, RAND corporation, randomized controlled trial, recommendation engine, Richard Feynman, ride hailing / ride sharing, Robert Metcalfe, Ronald Reagan, statistical model, Steven Pinker, uber lyft, universal basic income, War on Poverty

Suppose I told you that surveys typically find that people who slumber longer are happier. You might reasonably respond that this is because happiness causes sleep – good-tempered people tend to hit the pillow early. Or you might argue that both happiness and sleep are products of something else – like being in a stable relationship. Either way, an observational study falls prey to the old critique: correlation doesn’t imply causation. Misleading correlations are all around us.32 Ice-cream sales are correlated with shark attacks, but that doesn’t mean you should boycott Mr Whippy. Shoe size is correlated with exam performance, but buying adult shoes for kindergarteners isn’t going to help. Countries with higher chocolate consumption win more Nobel prizes, but chomping Cadbury won’t make you a genius.33 By contrast, a randomised trial uses the power of chance to assign the groups.
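Why random assignment sidesteps the problem can be sketched with the sleep-and-happiness example. The simulation below is a toy under loud assumptions: a hidden factor (a stable relationship) raises both sleep and happiness, sleep itself has no effect on happiness by construction, and all numbers are invented. The function `happiness_gap` is a hypothetical helper.

```python
import random
from statistics import mean

def happiness_gap(randomized, n=100_000, seed=1):
    """Mean happiness of long sleepers minus short sleepers."""
    rng = random.Random(seed)
    long_sleep, short_sleep = [], []
    for _ in range(n):
        stable = rng.random() < 0.5  # hidden common cause
        if randomized:
            # Trial: a coin flip decides who sleeps longer.
            sleeps_long = rng.random() < 0.5
        else:
            # Survey: people in stable relationships tend to sleep longer.
            sleeps_long = rng.random() < (0.8 if stable else 0.3)
        # Assumption: happiness depends only on the hidden factor, not on sleep.
        happiness = rng.gauss(6.0 if stable else 4.0, 1.0)
        (long_sleep if sleeps_long else short_sleep).append(happiness)
    return mean(long_sleep) - mean(short_sleep)

observational_gap = happiness_gap(randomized=False)
randomized_gap = happiness_gap(randomized=True)

print(f"observational gap: {observational_gap:+.2f}")
print(f"randomized gap:    {randomized_gap:+.2f}")
```

In the survey version the hidden factor is unevenly spread between the two groups, so a gap appears; the coin flip balances it on average, and the gap shrinks toward the true effect, which is zero here.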

A pair of supporters wrote: ‘If a social evangelist had a choice of picking one tool, one movement with the goal of emancipating the poorest women on earth, the microcredit phenomenon wins without serious competition.’10 As US president, Bill Clinton provided development assistance to microcredit programs and championed Muhammad Yunus for the Nobel Peace Prize.11 Awarding the prize to Yunus in 2006, the Nobel committee praised him for developing ‘micro-credit into an ever more important instrument in the struggle against poverty’. U2’s Bono wrote: ‘Give a man a fish, he’ll eat for a day. Give a woman microcredit, she, her husband, her children, and her extended family will eat for a lifetime.’12 Yet it turned out that the bold claims for microcredit were largely based on anecdotes and evaluations that failed to distinguish correlation from causation. By the 2000s, researchers had begun carrying out randomised trials of microcredit programs in Bosnia, Ethiopia, India, Mexico, Morocco and Mongolia. Summarising these six experiments, a team of leading development economists concluded that microcredit had no impact on raising household income, getting children to stay in school, or empowering women.13 Microcredit schemes did provide more financial freedom, and led people to invest more money in their businesses, but it didn’t make them more profitable.


pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists by Gary Marcus, Jeremy Freeman

23andMe, Albert Einstein, bioinformatics, bitcoin, brain emulation, cloud computing, complexity theory, computer age, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data acquisition, Drosophila, epigenetics, global pandemic, Google Glasses, iterative process, linked data, mouse model, optical character recognition, pattern recognition, personalized medicine, phenotype, race to the bottom, Richard Feynman, Ronald Reagan, semantic web, speech recognition, stem cell, Steven Pinker, supply-chain management, Turing machine, twin studies, web application

Furthermore, experiments with appropriately modified viruses to stain, mark, turn on, or turn off molecularly identified subpopulations of neurons permit unprecedented control of mouse brain circuitry. This cannot be emphasized enough. The exploding use of opto- and pharmacogenetics methods that delicately, transiently, reversibly, and invasively control defined events in defined cell types at defined times constitute a suite of interventionist tools that allows neuroscience to move from correlation to causation, from observing that this circuit is activated whenever the subject is contemplating a decision to inferring that this circuit is necessary for decision making. Second, the human brain is more than three orders of magnitude larger than the mouse brain—1.4 kg weight versus 0.4 g; a 1-liter volume versus a sugar cube; eighty-six billion nerve cells versus seventy-one million for the entire brain and sixteen billion versus fourteen million nerve cells for the neocortex.

The blurriness of these instruments was mirrored by the primitive and edentate tools used to safely perturb the human brain—electrical stimulation in patients, and extracranial electromagnetic fields and drugs in volunteers. The other major advance fifty years ago was the birth of opto- and pharmaco-genetics, methods that delicately, transiently, reversibly, and invasively control defined events in defined cell types at defined times, initially in a few model organisms—the worm, the fly, and the mouse. Equipped with these tools for perturbing the brain, scientists systematically moved from correlation to causation, from observing that this circuit is activated whenever the subject is contemplating a decision to inferring that this circuit is necessary for decision making or that those neurons mark a particular memory. By the early 2020s, the complete logic of thalamo-cortical circuits could be manipulated, in hindsight a tipping point in our ability to bridge the gap between cortex and theories of its universal and particular functions.


Humble Pi: A Comedy of Maths Errors by Matt Parker

8-hour work day, Affordable Care Act / Obamacare, bitcoin, British Empire, Brownian motion, Chuck Templeton: OpenTable:, collateralized debt obligation, computer age, correlation does not imply causation, crowdsourcing, Donald Trump, Flash crash, forensic accounting, game design, High speed trading, Julian Assange, millennium bug, Minecraft, obamacare, orbital mechanics / astrodynamics, publication bias, Richard Feynman, Richard Feynman: Challenger O-ring, selection bias, Tacoma Narrows Bridge, Therac-25, value at risk, WikiLeaks, Y2K

Multibillionaire investor Warren Buffett is a big fan of non-transitive dice and brought them out when he met also-multibillionaire computer guy Bill Gates. The story goes that Gates’s suspicion was aroused when Buffett insisted he pick his dice first and, upon a closer inspection of the numbers, he in turn insisted Buffett choose first. The link between people who like non-transitive dice and billionaires may be only correlation and not causation. James Grime’s contribution to the non-transitive world was to make it so that his dice have two different possible cycles of non-transitiveness but with only one of them reversing when you double the dice. By renaming the green dice ‘olive’, the second cycle can be remembered as the alphabetical order of the colours. Using both cycles, in theory, you can let two other people choose their dice colours and, as long as you can then choose the one- or two-dice version of the game, you can beat both opponents simultaneously more often than not.

And decades of studies have revealed no biological impact from mobile-phone masts. In this case, both factors were dependent on a third variable: population size. Both the number of mobile-phone masts in an area and the number of births depend on how many people live there. I should make it very clear: in the article I explained that the correlation was because of population size. I explained in great detail that this was an exercise in showing that correlation does not mean causation. But it ended up also being an exercise in how people don’t read the article properly before commenting underneath. The correlation was too alluring and people could not help but put forward their own reasons. More than one person suggested that expensive neighbourhoods have fewer masts and young families with loads of kids cannot afford to live there, proving once again that there is no topic that Guardian readers cannot make out to be about house prices.
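The population-size explanation can be illustrated with a partial correlation, which measures how two variables move together once a third has been accounted for. This is a sketch with invented numbers, not the data from the article: masts and births are each generated from population plus independent noise, and `pearson` and `partial_corr` are hand-written helpers using the standard first-order partial correlation formula.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear influence of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(7)
# Hypothetical areas: both counts scale with population, plus unrelated noise.
population = [rng.uniform(1_000, 100_000) for _ in range(2_000)]
masts = [p / 5_000 + rng.gauss(0, 2) for p in population]
births = [p * 0.012 + rng.gauss(0, 100) for p in population]

raw_r = pearson(masts, births)
partial_r = partial_corr(masts, births, population)

print(f"raw correlation:            {raw_r:.3f}")
print(f"controlling for population: {partial_r:.3f}")
```

The raw figure is high because larger areas have more of everything; once population is held fixed, little association remains in this toy data.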


pages: 576 words: 105,655

Austerity: The History of a Dangerous Idea by Mark Blyth

"Robert Solow", accounting loophole / creative accounting, balance sheet recession, bank run, banking crisis, Black Swan, Bretton Woods, business cycle, buy and hold, capital controls, Carmen Reinhart, Celtic Tiger, central bank independence, centre right, collateralized debt obligation, correlation does not imply causation, creative destruction, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, currency peg, debt deflation, deindustrialization, disintermediation, diversification, en.wikipedia.org, ending welfare as we know it, Eugene Fama: efficient market hypothesis, eurozone crisis, financial repression, fixed income, floating exchange rates, Fractional reserve banking, full employment, German hyperinflation, Gini coefficient, global reserve currency, Growth in a Time of Debt, Hyman Minsky, income inequality, information asymmetry, interest rate swap, invisible hand, Irish property bubble, Joseph Schumpeter, Kenneth Rogoff, liberal capitalism, liquidationism / Banker’s doctrine / the Treasury view, Long Term Capital Management, market bubble, market clearing, Martin Wolf, money market fund, moral hazard, mortgage debt, mortgage tax deduction, Occupy movement, offshore financial centre, paradox of thrift, Philip Mirowski, price stability, quantitative easing, rent-seeking, reserve currency, road to serfdom, savings glut, short selling, structural adjustment programs, The Great Moderation, The Myth of the Rational Market, The Wealth of Nations by Adam Smith, Tobin tax, too big to fail, unorthodox policies, value at risk, Washington Consensus, zero-sum game

Rather, in both cases, what was once seen as sustainable suddenly became seen as unsustainable once the possibility of a contagion-led fire sale through the European bond markets was factored into a slow-moving growth crisis. As usual, it’s the perception of risk that matters. And again, just as we saw in the US case, there was no orgy of government spending behind all this. Why, then, keep up the fiction that the bond market crisis is a crisis of spendthrift governments? Confusing Correlation and Causation: Austerity’s Moment in the Sun With yields spiking to unsustainable levels in Greece, Ireland, and Portugal, each country received a bailout from the EU, ECB, and the IMF, as well as bilateral loans, on the condition that it accept and implement an austerity package to right its fiscal ship. Cut spending, raise taxes—but cut spending more than you raise taxes—and all will be well, the story went.

Growth rates and foreign investment both soared.105 Key to all this, as before, was the large expenditure-based cut plus wage moderation and devaluation.106 Stephen Kinsella offers a rather different version of events in his recent study of Ireland’s twin experiments with austerity: in the late 1980s and today in the aftermath of the banking crisis of 2008.107 Kinsella emphasizes that Ireland did have an expansion following a consolidation, as the literature claims, but notes that correlation is not causation in this case. Instead, he notes another correlation; that Ireland’s consolidation “coincided with a period of growth in the international economy, with the presence of fiscal transfers from the European Union, the opening up of the single market and a well-timed devaluation in August 1986.”108 An earlier paper by John Considine and James Duffy makes a similar point, namely, that it’s the boom in British imports—the so-called Lawson boom—that combined with the 1986 devaluation to make the difference.109 This is backed up by a piece by Roberto Perotti, who argues that in the Irish case “the concomitant depreciation of Sterling and the expansion in the UK … boosted Irish exports.”110 Kinsella also notes that the adjustment was considerably eased by an income tax amnesty that raised the equivalent of 2 percent of GDP.111 The part that stands out in Kinsella’s account is, however, something completely absent in other retellings of these events.


pages: 294 words: 77,356

Automating Inequality by Virginia Eubanks

autonomous vehicles, basic income, business process, call centre, cognitive dissonance, collective bargaining, correlation does not imply causation, deindustrialization, disruptive innovation, Donald Trump, Elon Musk, ending welfare as we know it, experimental subject, housing crisis, IBM and the Holocaust, income inequality, job automation, mandatory minimum, Mark Zuckerberg, mass incarceration, minimum wage unemployment, mortgage tax deduction, new economy, New Urbanism, payday loans, performance metric, Ronald Reagan, self-driving car, statistical model, strikebreaker, underbanked, universal basic income, urban renewal, War on Poverty, working poor, Works Progress Administration, young professional, zero-sum game

In other words, it searches through all available information to pluck out any variables that vary along with the thing you are trying to measure—which leads to charges that the method is a kind of “data dredging,” or a statistical fishing expedition. For the AFST, the Vaithianathan team tested 287 variables available in Cherna’s data warehouse. The regression knocked out 156 of them, leaving 131 factors that the team believes predict child harm.9 Even if a regression finds factors that predictably rise and fall together, correlation is not causation. In a classic example, shark attacks and ice cream consumption are highly correlated. But that doesn’t mean that eating ice cream makes swimmers too slow to avoid aquatic predators, or that sharks are attracted to soft-serve. There is a third variable that influences both shark attacks and ice cream consumption: summer. Both ice cream eating and shark attacks go up when the weather is warmer.


pages: 377 words: 97,144

Singularity Rising: Surviving and Thriving in a Smarter, Richer, and More Dangerous World by James D. Miller

23andMe, affirmative action, Albert Einstein, artificial general intelligence, Asperger Syndrome, barriers to entry, brain emulation, cloud computing, cognitive bias, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, David Brooks, David Ricardo: comparative advantage, Deng Xiaoping, en.wikipedia.org, feminist movement, Flynn Effect, friendly AI, hive mind, impulse control, indoor plumbing, invention of agriculture, Isaac Newton, John von Neumann, knowledge worker, Long Term Capital Management, low skilled workers, Netflix Prize, neurotypical, Norman Macrae, pattern recognition, Peter Thiel, phenotype, placebo effect, prisoner's dilemma, profit maximization, Ray Kurzweil, recommendation engine, reversible computing, Richard Feynman, Rodney Brooks, Silicon Valley, Singularitarianism, Skype, statistical model, Stephen Hawking, Steve Jobs, supervolcano, technological singularity, The Coming Technological Singularity, the scientific method, Thomas Malthus, transaction costs, Turing test, twin studies, Vernor Vinge, Von Neumann architecture

If Gates is correct, then the best way to help the world’s destitute might be to figure out how to raise the IQs of the world’s poorest people. GENES VS. ENVIRONMENT Genetics determines between 50 and 80 percent of your IQ. To be more precise: intelligence researchers disagree over how much of the variation in people’s IQs is caused by genetics, with estimates ranging from about 50 to 80 percent.162 Researchers don’t agree on the relative importance of genetics in determining IQ because of the challenge of separating correlation from causation. To understand this difficulty, suppose we know that parents who read a lot to their children tend to have children with high IQs. This correlation might occur because reading to a child increases her IQ. But here are some other possible causes, and if any one of them is the correct explanation, reading will do absolutely nothing to boost a child’s intelligence: •The higher a parent’s IQ, the more she enjoys reading to her child, and so the more she will read to her child.

Researchers have some decent evidence that brain training can reduce the risk of an elderly person developing dementia.278 Given the huge economic burden that dementia imposes on the United States, if brain training proved effective, it could reduce the rate of increase of Medicare costs. A child’s working memory has been found to be a key predictor of his success in kindergarten as measured by teacher evaluations, perhaps indicating that parents should provide brain training to their toddlers.279 Of course, the relationship between these two indicators might be due merely to correlation, not causation, and so using brain fitness software to improve a four-year-old’s working memory might not help him in kindergarten. If computer brain training proved effective, educators could continually improve it using massive data analysis. Brain-training programs could easily keep track of students’ performances. Researchers could use this data to figure out what types of exercises worked best for different categories of students.


pages: 343 words: 101,563

The Uninhabitable Earth: Life After Warming by David Wallace-Wells

"Robert Solow", agricultural Revolution, Albert Einstein, anthropic principle, Asian financial crisis, augmented reality, basic income, Berlin Wall, bitcoin, British Empire, Buckminster Fuller, Burning Man, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, carbon-based life, cognitive bias, computer age, correlation does not imply causation, cryptocurrency, cuban missile crisis, decarbonisation, Donald Trump, effective altruism, Elon Musk, endowment effect, energy transition, everywhere but in the productivity statistics, failed state, fiat currency, global pandemic, global supply chain, income inequality, Intergovernmental Panel on Climate Change (IPCC), invention of agriculture, Joan Didion, John Maynard Keynes: Economic Possibilities for our Grandchildren, labor-force participation, life extension, longitudinal study, Mark Zuckerberg, mass immigration, megacity, megastructure, mutually assured destruction, Naomi Klein, nuclear winter, Pearl River Delta, Peter Thiel, plutocrats, Plutocrats, postindustrial economy, quantitative easing, Ray Kurzweil, rent-seeking, ride hailing / ride sharing, Sam Altman, Silicon Valley, Skype, South China Sea, South Sea Bubble, Steven Pinker, Stewart Brand, the built environment, the scientific method, Thomas Malthus, too big to fail, universal basic income, University of East Anglia, Whole Earth Catalog, William Langewiesche, Y Combinator

But by laundering all conflict and competition through the market, neoliberalism also proffered a new model of doing business, so to speak, on the world stage—one that didn’t emerge from, or point toward, endless nation-state rivalry. One should not confuse correlation with causation, especially since there was so much tumult coming out of World War II that it is hard to isolate the single cause of just about anything. But the international cooperative order that has since presided, establishing or at least emerging in parallel with relative peace and abundant prosperity, is very neatly historically coincident with the reign of globalization and the empire of financial capital we now group together as neoliberalism. And if one were inclined to confuse correlation with causation, there is a quite intuitive and plausible theory connecting them. Markets may be problematic, shall we say, but they also value security and stability and, all else being equal, reliable economic growth.


pages: 364 words: 102,926

What the F: What Swearing Reveals About Our Language, Our Brains, and Ourselves by Benjamin K. Bergen

correlation does not imply causation, information retrieval, pre–internet, Ronald Reagan, statistical model, Steven Pinker

For example, here’s a graph of the age at which one particular child first used each of his nouns (his age is on the x-axis) plotted against how frequent that word was in the child-directed speech he heard (it’s actually the log of word frequency because frequency effects in language have logarithmic effects).21 You can see that within nouns, the child learns more frequent ones earlier, on average, and then moves on to learn less frequent ones as well. Each dot represents the first time the child produced a particular noun; more frequent nouns tended to be learned earlier than less frequent ones. Image reproduced from B. C. Roy et al. (2009), used with permission. Of course, a reasonable person could object to studies like this one. Correlation does not imply causation. So the fact that children tend to learn more frequent words earlier doesn’t entail that frequency is the reason for earlier word learning. Other factors might be in play. For instance, more frequent words are shorter, all things being equal. And children learn shorter words earlier. Maybe frequency plays no causal role. To know for sure, you’d need to run an experiment: you’d have to manipulate how often children heard particular words and see whether this factor alone, holding all other possible causes constant, affected children’s learning of the words.
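The objection raised in this excerpt can be sketched as a small simulation. This is a toy model with invented numbers, not data from the Roy et al. study: a hypothetical word length drives both how often a word is heard and how early it is first used, so frequency and age of first use correlate even though frequency has no causal effect in this simulated world. Randomly assigning exposure, as the proposed experiment would, removes the correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(randomize_frequency, n_words=5000, seed=0):
    """Correlation between log frequency and age of first use (toy world)."""
    rng = random.Random(seed)
    freqs, ages = [], []
    for _ in range(n_words):
        length = rng.randint(2, 10)  # hypothetical word length in letters
        if randomize_frequency:
            # Experiment: exposure is assigned at random, independent of length.
            log_freq = rng.uniform(0, 10)
        else:
            # Observation: shorter words happen to be heard more often.
            log_freq = 12 - length + rng.gauss(0, 1)
        # Assumption of this toy world: only length affects age (in months).
        age = 12 + 1.5 * length + rng.gauss(0, 2)
        freqs.append(log_freq)
        ages.append(age)
    return pearson(freqs, ages)


observational_r = simulate(randomize_frequency=False)
experimental_r = simulate(randomize_frequency=True)
print(f"observational r = {observational_r:.2f}")
print(f"experimental  r = {experimental_r:.2f}")
```

In the observational run the correlation is strongly negative (frequent words appear earlier); in the randomized run it sits near zero, because the simulated learner never responded to frequency in the first place.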

The study states that adolescents who reported watching shows and playing games with more profanity in them also reported finding profanity more acceptable and using more profanity themselves. Does this answer the question about frequency? Does this mean that exposure to more profanity leads to more use of profanity? We don’t know, because the study was correlational. It’s not always obvious why correlation doesn’t imply causation, so let me just remind you here. (If this is old hat to you, by all means, skip to the next paragraph.) Here’s a nice example of why you can’t infer causation from correlation.24 Suppose you want to know whether religious faith causes an increase in alcohol consumption. You might try to find an answer by counting the number of bars and the number of churches in each of a large number of US cities.


Work in the Future: The Automation Revolution (Palgrave Macmillan, 2019) by Robert Skidelsky, Nan Craig

3D printing, Airbnb, algorithmic trading, Amazon Web Services, anti-work, artificial general intelligence, autonomous vehicles, basic income, business cycle, cloud computing, collective bargaining, correlation does not imply causation, creative destruction, data is the new oil, David Graeber, David Ricardo: comparative advantage, deindustrialization, deskilling, disintermediation, Donald Trump, Erik Brynjolfsson, feminist movement, Frederick Winslow Taylor, future of work, gig economy, global supply chain, income inequality, informal economy, Internet of things, Jarndyce and Jarndyce, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, Joseph Schumpeter, knowledge economy, Loebner Prize, low skilled workers, Lyft, Mark Zuckerberg, means of production, moral panic, Network effects, new economy, off grid, pattern recognition, post-work, Ronald Coase, Second Machine Age, self-driving car, sharing economy, Steve Jobs, strong AI, technoutopianism, The Chicago School, The Future of Employment, the market place, The Nature of the Firm, The Wealth of Nations by Adam Smith, Thorstein Veblen, Turing test, Uber for X, uber lyft, universal basic income, wealth creators, working poor

In particular a person, or a community of persons, deemed unlikely to pay back that loan will systematically be prevented from getting loans, even if the prediction is wrong. And even more importantly, even if that prediction was wrong initially, that prediction ends up being correct if enough predictive algorithms agree that “people like them” don’t look likely to pay back loans, and they are systematically shut out of the banking system. In this sense predictions become truth, and correlations become causations. There are plenty of real examples where this touches quite demonstrably on ethics and the public good, but I’ll indulge in a fictional example taken from an episode of Star Trek: Voyager’s seventh season, called “Critical Care.” The Voyager doctor, which is an AI, runs from a holographic emitter, which is stolen by an alien. The doctor is initially forced to work in a chaotic, under-resourced alien hospital filled to the brim with dying people in desperate need of life-saving medicine, but he eventually gets moved to a higher floor, which is beautifully run and caters to well-off folks getting the sci-fi version of botox treatment with the same medicine that was desperately needed only a few floors below.


Bulletproof Problem Solving by Charles Conn, Robert McLean

active transport: walking or cycling, Airbnb, Amazon Mechanical Turk, asset allocation, availability heuristic, Bayesian statistics, Black Swan, blockchain, business process, call centre, carbon footprint, cloud computing, correlation does not imply causation, Credit Default Swap, crowdsourcing, David Brooks, Donald Trump, Elon Musk, endowment effect, future of work, Hyperloop, Innovator's Dilemma, inventory management, iterative process, loss aversion, meta analysis, meta-analysis, Nate Silver, nudge unit, Occam's razor, pattern recognition, pets.com, prediction markets, principal–agent problem, RAND corporation, randomized controlled trial, risk tolerance, Silicon Valley, smart contracts, stem cell, the rule of 72, the scientific method, The Signal and the Noise by Nate Silver, time value of money, transfer pricing, Vilfredo Pareto, walkable city, WikiLeaks

We looked at London using air quality data and asthma hospital admissions by postcode for the year 2015. The heat map that results shows the neighborhoods with the highest level of risk. As a first cut, it suggests exploring the issue further is warranted, even though the correlations aren't especially high between particulate matter and hospital admissions for yearlong data. And as we know, correlations do not prove causation; there could be an underlying factor causing both PM 2.5 hotspots and asthma hospital admissions. Experiments, more granular data analysis, and large‐scale models are the next step for this work.
EXHIBIT 6.2. Source: Q. Di et al., “Air Pollution and Mortality in the Medicare Population,” New England Journal of Medicine 376 (June 29, 2017), 2513–2522.
Regression Models to Understand Obesity
Obesity is a genuinely wicked problem to which we will return in Chapter 9.

However, when we put both walkability and a comfort score together using multi‐variable regression, we see that walkability is significantly correlated with obesity after controlling for weather. This example is just a simple one to show how regression analysis can help you begin to understand the drivers of your problem, and perhaps to craft strategies for positive intervention at the city level. As useful as regression is in exploring our understanding, there are some pitfalls to consider: Be careful with correlation and causation. Walkable cities seem to almost always have far lower obesity rates than less walkable cities. However, we have no way of knowing from statistics alone whether city walkability is the true cause of lower obesity. Perhaps walkable cities are more expensive to live in and the real driver is higher socioeconomic status. Or perhaps healthier people move to more walkable communities. Regression models can be misleading if there are variables that we may not have accounted for in our model but that may be very important.
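The omitted-variable pitfall described in this excerpt can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the coefficients are invented, and socioeconomic status (`ses`) is deliberately built to drive both walkability and obesity while walkability itself has no effect. The code compares a regression that leaves `ses` out with one that includes it.

```python
import random


def center(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slope_one(x, y):
    """OLS slope of y on a single predictor x (with intercept)."""
    xc, yc = center(x), center(y)
    return dot(xc, yc) / dot(xc, xc)


def slopes_two(x1, x2, y):
    """OLS slopes of y on two predictors (with intercept), via normal equations."""
    a, b, c = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(42)
ses, walkability, obesity = [], [], []
for _ in range(5000):  # 5000 hypothetical cities
    s = rng.gauss(0, 1)
    ses.append(s)
    walkability.append(0.8 * s + rng.gauss(0, 0.6))  # richer cities are more walkable
    obesity.append(-1.0 * s + rng.gauss(0, 0.5))     # obesity depends on ses only

naive_slope = slope_one(walkability, obesity)
controlled_slope, ses_slope = slopes_two(walkability, ses, obesity)
print(f"walkability slope, ses omitted:  {naive_slope:.2f}")
print(f"walkability slope, ses included: {controlled_slope:.2f}")
```

With `ses` omitted, walkability appears to lower obesity substantially; once `ses` enters the model, the walkability coefficient falls to roughly zero. A real analysis cannot know in advance which variables it has failed to include, which is the point the excerpt makes.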


pages: 515 words: 142,354

The Euro: How a Common Currency Threatens the Future of Europe by Joseph E. Stiglitz, Alex Hyde-White

bank run, banking crisis, barriers to entry, battle of ideas, Berlin Wall, Bretton Woods, business cycle, buy and hold, capital controls, Carmen Reinhart, cashless society, central bank independence, centre right, cognitive dissonance, collapse of Lehman Brothers, collective bargaining, corporate governance, correlation does not imply causation, credit crunch, Credit Default Swap, currency peg, dark matter, David Ricardo: comparative advantage, disintermediation, diversified portfolio, eurozone crisis, Fall of the Berlin Wall, fiat currency, financial innovation, full employment, George Akerlof, Gini coefficient, global supply chain, Growth in a Time of Debt, housing crisis, income inequality, incomplete markets, inflation targeting, information asymmetry, investor state dispute settlement, invisible hand, Kenneth Arrow, Kenneth Rogoff, knowledge economy, light touch regulation, manufacturing employment, market bubble, market friction, market fundamentalism, Martin Wolf, Mexican peso crisis / tequila crisis, money market fund, moral hazard, mortgage debt, neoliberal agenda, new economy, open economy, paradox of thrift, pension reform, pensions crisis, price stability, profit maximization, purchasing power parity, quantitative easing, race to the bottom, risk-adjusted returns, Robert Shiller, Ronald Reagan, savings glut, secular stagnation, Silicon Valley, sovereign wealth fund, the payments system, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, too big to fail, transaction costs, transfer pricing, trickle-down economics, Washington Consensus, working-age population

Supporters of the euro might respond by pointing out that if Greece owed money to, say, Germany in Germany’s currency, the weakening of Greece’s exchange rate would increase the real indebtedness of Greece. True, but that is precisely what is happening now, as Troika policies have lowered Greek incomes by more than a quarter. More relevant, Greece would likely not have borrowed in German currency, precisely because it (and presumably its lenders) should have been aware of the risk that that entailed.30
CORRELATION AND CAUSATION
The poor performance of the eurozone, both absolutely and relative to others, might, of course, be due to some factor other than the euro. And there have been changes in the global economy that have affected the eurozone and, more particularly, one group of countries within the eurozone relative to others. That’s why Germany’s suggestion that the failures of the countries in the eurozone are due to their profligacy seems so out of touch with economic reality, so demonstrative of a total lack of analysis.

They argued that there were important instances where when governments had contracted government spending, the result was that the overall economy grew. The notion that there could be expansionary contractions was a chimera. A series of papers showed major flaws in their analysis.57 The IMF, which had supported austerity-style policies in the past, in fact reversed itself. It pointed out that when governments contract spending, the economy contracts.58 The big flaw in the pro-austerity study was confusing correlation with causation. There were a few countries, small economies with flexible exchange rates, where a contraction in government spending was associated with growth; but in these cases the hole in demand created by the government contraction was filled in with exports. Canada in the early 1990s was lucky because the United States was going through a rapid expansion, the recovery from the 1991 recession. Canada benefited, too, from a flexible exchange rate that enabled it to sell its goods more cheaply to the United States.


pages: 416 words: 106,582

This Will Make You Smarter: 150 New Scientific Concepts to Improve Your Thinking by John Brockman

23andMe, Albert Einstein, Alfred Russel Wallace, banking crisis, Barry Marshall: ulcers, Benoit Mandelbrot, Berlin Wall, biofilm, Black Swan, butterfly effect, Cass Sunstein, cloud computing, congestion charging, correlation does not imply causation, Daniel Kahneman / Amos Tversky, dark matter, data acquisition, David Brooks, delayed gratification, Emanuel Derman, epigenetics, Exxon Valdez, Flash crash, Flynn Effect, hive mind, impulse control, information retrieval, Intergovernmental Panel on Climate Change (IPCC), Isaac Newton, Jaron Lanier, Johannes Kepler, John von Neumann, Kevin Kelly, lifelogging, mandelbrot fractal, market design, Mars Rover, Marshall McLuhan, microbiome, Murray Gell-Mann, Nicholas Carr, open economy, Pierre-Simon Laplace, place-making, placebo effect, pre–internet, QWERTY keyboard, random walk, randomized controlled trial, rent control, Richard Feynman, Richard Feynman: Challenger O-ring, Richard Thaler, Satyajit Das, Schrödinger's Cat, security theater, selection bias, Silicon Valley, Stanford marshmallow experiment, stem cell, Steve Jobs, Steven Pinker, Stewart Brand, the scientific method, Thorstein Veblen, Turing complete, Turing machine, twin studies, Vilfredo Pareto, Walter Mischel, Whole Earth Catalog, WikiLeaks, zero-sum game

If, for most reasonable sets of priors, information about A would allow us to update our estimate of B, then it would seem there is some sort of causal connection between the two. But the form of the causal connection is unspecified—a principle often stated as “correlation does not imply causation.” The reason for this is that the essence of causation as a concept rests on our tendency to have information about earlier events before we have information about later events. (The full implications of this concept for human consciousness, the second law of thermodynamics, and the nature of time are interesting, but sadly outside the scope of this essay.) If information about all events always came in the order in which the events occurred, then correlation would indeed imply causation. But in the real world, not only are we limited to observing events in the past but also we may discover information about those events out of order.



pages: 825 words: 228,141

MONEY Master the Game: 7 Simple Steps to Financial Freedom by Tony Robbins

3D printing, active measures, activist fund / activist shareholder / activist investor, addicted to oil, affirmative action, Affordable Care Act / Obamacare, Albert Einstein, asset allocation, backtesting, bitcoin, buy and hold, clean water, cloud computing, corporate governance, corporate raider, correlation does not imply causation, Credit Default Swap, Dean Kamen, declining real wages, diversification, diversified portfolio, Donald Trump, estate planning, fear of failure, fiat currency, financial independence, fixed income, forensic accounting, high net worth, index fund, Internet of things, invention of the wheel, Jeff Bezos, Kenneth Rogoff, lake wobegon effect, Lao Tzu, London Interbank Offered Rate, market bubble, money market fund, mortgage debt, new economy, obamacare, offshore financial centre, oil shock, optical character recognition, Own Your Own Home, passive investing, profit motive, Ralph Waldo Emerson, random walk, Ray Kurzweil, Richard Thaler, risk tolerance, riskless arbitrage, Robert Shiller, self-driving car, shareholder value, Silicon Valley, Skype, Snapchat, sovereign wealth fund, stem cell, Steve Jobs, survivorship bias, telerobotics, the rule of 72, thinkpad, transaction costs, Upton Sinclair, Vanguard fund, World Values Survey, X Prize, Yogi Berra, young professional, zero-sum game

Ray says most of the big institutions, with hundreds of billions of dollars, are making the same mistake!
RAINMAKER
Ray was now on a roll and was systematically dissecting everything I had been taught or sold over the years! “Tony, there is another major problem with the balanced portfolio ‘theory.’ It’s based around a giant and, unfortunately, inaccurate assumption. It’s the difference between correlation and causation.” Correlation is a fancy investment word for when things move together. In primitive cultures, they would dance in an attempt to make it rain. Sometimes it actually worked! Or so they thought. They confused causation with correlation. In other words, they thought their jumping up and down caused the rain, but it was actually just coincidence. And if it happened more and more often, they would build some false confidence around their ability to predict the correlation between their dancing and the rain.



pages: 586 words: 186,548

Architects of Intelligence by Martin Ford

3D printing, agricultural Revolution, AI winter, Apple II, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, barriers to entry, basic income, Baxter: Rethink Robotics, Bayesian statistics, bitcoin, business intelligence, business process, call centre, cloud computing, cognitive bias, Colonization of Mars, computer vision, correlation does not imply causation, crowdsourcing, DARPA: Urban Challenge, deskilling, disruptive innovation, Donald Trump, Douglas Hofstadter, Elon Musk, Erik Brynjolfsson, Ernest Rutherford, Fellow of the Royal Society, Flash crash, future of work, gig economy, Google X / Alphabet X, Gödel, Escher, Bach, Hans Rosling, ImageNet competition, income inequality, industrial robot, information retrieval, job automation, John von Neumann, Law of Accelerating Returns, life extension, Loebner Prize, Mark Zuckerberg, Mars Rover, means of production, Mitch Kapor, natural language processing, new economy, optical character recognition, pattern recognition, phenotype, Productivity paradox, Ray Kurzweil, recommendation engine, Robert Gordon, Rodney Brooks, Sam Altman, self-driving car, sensor fusion, sentiment analysis, Silicon Valley, smart cities, social intelligence, speech recognition, statistical model, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Pinker, strong AI, superintelligent machines, Ted Kaczynski, The Rise and Fall of American Growth, theory of mind, Thomas Bayes, Travis Kalanick, Turing test, universal basic income, Wall-E, Watson beat the top human players on Jeopardy!, women in the workforce, working-age population, zero-sum game, Zipcar

You can show a young child their first giraffe, and now they know what a giraffe looks like; you can show them a new gesture or dance move, or how you use a new tool, and right away they’ve got it; they may not be able to make that move themselves, or use that tool, but they start to grasp what’s going on. Or think about learning causality, for example. We learn in basic statistics classes that correlation and causation are not the same thing, and correlation doesn’t always imply causation. You can take a dataset, and you can measure that the two variables are correlated, but it doesn’t mean that one causes the other. It could be that A causes B, B causes A, or some third variable causes both. The fact that correlation doesn’t uniquely imply causation is often cited to show how difficult it is to take observational data and infer the underlying causal structure of the world, and yet humans do this. In fact, we solve a much harder version of this problem. Even young children can often infer a new causal relation from just one or a few examples—they don’t even need to see enough data to detect a statistically significant correlation.
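The three possibilities named in this excerpt (A causes B, B causes A, or a third variable causes both) can be sketched as three toy data generators. The noise levels below are assumptions, chosen only so that all three structures yield about the same correlation; the point of the sketch is that the correlation coefficient alone cannot tell them apart.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample_correlation(structure, n=20000, seed=0):
    """Generate (A, B) pairs under one causal structure and return their correlation."""
    rng = random.Random(seed)
    a_vals, b_vals = [], []
    for _ in range(n):
        if structure == "A causes B":
            a = rng.gauss(0, 1)
            b = a + rng.gauss(0, 1)
        elif structure == "B causes A":
            b = rng.gauss(0, 1)
            a = b + rng.gauss(0, 1)
        elif structure == "C causes both":
            c = rng.gauss(0, 1)  # unobserved third variable
            # Noise scale picked so the A-B correlation matches the other two cases.
            a = c + rng.gauss(0, 0.6436)
            b = c + rng.gauss(0, 0.6436)
        else:
            raise ValueError(f"unknown structure: {structure}")
        a_vals.append(a)
        b_vals.append(b)
    return pearson(a_vals, b_vals)


structures = ("A causes B", "B causes A", "C causes both")
correlations = {s: sample_correlation(s) for s in structures}
for name, r in correlations.items():
    print(f"{name:>14}: r = {r:.2f}")
```

All three runs print a correlation close to 0.7, so an analyst who sees only the (A, B) pairs has no statistical basis for choosing among the three stories.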

When we did, I realized that it is causality that gives us this modularity, and when we lose causality, we lose modularity, and we enter into no-man’s land. That means that we lose transparency, we lose reconfigurability, and other nice features that we like. By the time that I published my book on Bayesian networks in 1988, though, I already felt like an apostate because I knew already that the next step would be to model causality, and my love was already on a different endeavor. MARTIN FORD: We always hear people saying that “correlation is not causation,” and so you can never get causation from the data. Bayesian networks do not offer a way to understand causation, right? JUDEA PEARL: No, Bayesian networks could work in either mode. It depends on what you think about when you construct it. MARTIN FORD: The Bayesian idea is that you update probabilities based on new evidence so that your estimate should get more accurate over time.

However, in practice, people noticed that if you structure the network in the causal direction, things are much easier. The question was why. Now we understand that we were craving for features of causality that we didn’t even know come from causality. These were: modularity, reconfigurability, transferability, and more. By the time I looked into causality, I had realized that the mantra “correlation does not imply causation” is much more profound than we thought. You need to have causal assumptions before you can get causal conclusions, which you cannot get from data alone. Worse yet, even if you are willing to make causal assumptions, you cannot express them. There was no language in science in which you can express a simple sentence like “mud does not cause rain,” or “the rooster does not cause the sun to rise.”


pages: 252 words: 72,473

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O'Neil

Affordable Care Act / Obamacare, Bernie Madoff, big data - Walmart - Pop Tarts, call centre, carried interest, cloud computing, collateralized debt obligation, correlation does not imply causation, Credit Default Swap, credit default swaps / collateralized debt obligations, crowdsourcing, Emanuel Derman, housing crisis, I will remember that I didn’t make the world, and it doesn’t satisfy my equations, illegal immigration, Internet of things, late fees, mass incarceration, medical bankruptcy, Moneyball by Michael Lewis explains big data, new economy, obamacare, Occupy movement, offshore financial centre, payday loans, peer-to-peer lending, Peter Thiel, Ponzi scheme, prediction markets, price discrimination, quantitative hedge fund, Ralph Nader, RAND corporation, recommendation engine, Rubik’s Cube, Sharpe ratio, statistical model, Tim Cook: Apple, too big to fail, Unsafe at Any Speed, Upton Sinclair, Watson beat the top human players on Jeopardy!, working poor

Insurance companies as well as bankers delineated neighborhoods where they would not invest. This cruel practice, known as redlining, has been outlawed by various pieces of legislation, including the Fair Housing Act of 1968. Nearly a half century later, however, redlining is still with us, though in far more subtle forms. It’s coded into the latest generation of WMDs. Like Hoffman, the creators of these new models confuse correlation with causation. They punish the poor, and especially racial and ethnic minorities. And they back up their analysis with reams of statistics, which give them the studied air of evenhanded science. On this algorithmic voyage through life, we’ve clawed our way through education and we’ve landed a job (even if it is one that runs us on a chaotic schedule). We’ve taken out loans and seen how our creditworthiness is a stand-in for other virtues or vices.


pages: 836 words: 158,284

The 4-Hour Body: An Uncommon Guide to Rapid Fat-Loss, Incredible Sex, and Becoming Superhuman by Timothy Ferriss

23andMe, airport security, Albert Einstein, Black Swan, Buckminster Fuller, carbon footprint, cognitive dissonance, Columbine, correlation does not imply causation, Dean Kamen, game design, Gary Taubes, index card, Kevin Kelly, knowledge economy, life extension, lifelogging, Mahatma Gandhi, microbiome, p-value, Parkinson's law, Paul Buchheit, placebo effect, Productivity paradox, publish or perish, Ralph Waldo Emerson, Ray Kurzweil, Richard Feynman, selective serotonin reuptake inhibitor (SSRI), Silicon Valley, Silicon Valley startup, Skype, stem cell, Steve Jobs, survivorship bias, Thorstein Veblen, Vilfredo Pareto, wage slave, William of Occam

The point isn’t to speculate about hundreds of possible explanations. The point is to be skeptical, especially of sensationalist headlines. Most “new studies” in the media are observational studies that can, at best, establish correlation (A happens while B happens), but not causality (A causes B to happen). If I pick my nose when the Super Bowl cuts to a commercial, did I cause that? This isn’t a haiku. It’s a summary: correlation doesn’t prove causation. Be skeptical when people tell you that A causes B. They’re wrong much more than 50% of the time. USE THE YO-YO: EMBRACE CYCLING Yo-yo dieting gets a bad rap. Instead of beating yourself up, going to the shrink, or eating an entire cheesecake because you ruined your diet with one cookie, allow me to deliver a message: it’s normal. Eating more, then less, then more, and so on in a continuous sine wave is an impulse we can leverage to reach goals faster.

Here is the most important paragraph in this chapter: Observational studies cannot control or even document all of the variables involved. Observational studies can only show correlation: A and B both exist at the same time in one group. They cannot show cause and effect.4 In contrast, randomized and controlled experiments control variables and can therefore show cause and effect (causation): A causes B to happen. The satirical religion Pastafarianism purposely confuses correlation and causation: With a decrease in the number of pirates, there has been an increase in global warming over the same period. Therefore, global warming is caused by a lack of pirates. Even more compelling: Somalia has the highest number of Pirates AND the lowest Carbon emissions of any country. Coincidence? Drawing unwarranted cause-and-effect conclusions from observational studies is the bread-and-butter of media and cause- or financially-driven scientists blind to their own lack of ethics.

Then try to bundle all the data up together, so that your negative data is swallowed up by some mediocre positive results. Or you could get really serious and start to manipulate the statistics. For two pages only, this will now get quite nerdy. Here are the classic tricks to play in your statistical analysis to make sure your trial has a positive result. Ignore the protocol entirely Always assume that any correlation proves causation. Throw all your data into a spreadsheet programme and report—as significant—any relationship between anything and everything if it helps your case. If you measure enough, some things are bound to be positive just by sheer luck. Play with the baseline Sometimes, when you start a trial, quite by chance the treatment group is already doing better than the placebo group. If so, then leave it like that.
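The remark above that "if you measure enough, some things are bound to be positive just by sheer luck" can be checked with a short simulation. This is a hedged sketch, not the author's analysis: the number of variable pairs, the sample size of 30, the seed, and the cutoff of 0.361 (roughly the two-sided 5% critical value of Pearson's r for 30 observations) are assumptions made for illustration. Every pair is pure independent noise, so any "significant" relationship it reports is a false positive.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_chance_findings(n_pairs=200, n_obs=30, threshold=0.361, seed=1):
    """Count unrelated noise pairs whose |r| exceeds the assumed cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        xs = [rng.gauss(0, 1) for _ in range(n_obs)]
        ys = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(xs, ys)) > threshold:
            hits += 1
    return hits


hits = count_chance_findings()
print(f"{hits} of 200 unrelated pairs looked 'significant'")
```

With a 5% cutoff, roughly one comparison in twenty is expected to clear the bar by chance, so on the order of ten spurious findings out of 200 would be unsurprising. Reporting only those hits, without mentioning the other comparisons, is the trick the passage describes.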


pages: 307 words: 96,543

Tightrope: Americans Reaching for Hope by Nicholas D. Kristof, Sheryl Wudunn

Affordable Care Act / Obamacare, basic income, Bernie Sanders, carried interest, correlation does not imply causation, creative destruction, David Brooks, Donald Trump, dumpster diving, Edward Glaeser, Elon Musk, epigenetics, full employment, Home mortgage interest deduction, housing crisis, impulse control, income inequality, Jeff Bezos, job automation, jobless men, knowledge economy, labor-force participation, low skilled workers, mandatory minimum, Martin Wolf, mass incarceration, Mikhail Gorbachev, offshore financial centre, randomized controlled trial, rent control, Robert Shiller, Robert Shiller, Ronald Reagan, Shai Danziger, single-payer health, Steven Pinker, The Spirit Level, universal basic income, upwardly mobile, Vanguard fund, War on Poverty, working poor

“A father’s absence increases antisocial behavior, such as aggression, rule-breaking, delinquency and illegal drug use,” with the effects greater for boys than for girls, Sara McLanahan of Princeton University and Christopher Jencks of Harvard University concluded after assessing the evidence. Yet there’s a danger of drawing too sweeping a conclusion here, for it’s difficult to untangle correlation from causation, and in any case many single moms do brilliantly. In addition, most of the data is driven by low-income households, where a single parent means a constant financial struggle; more affluent single-parent households are much more likely to succeed. In any case, what matters isn’t a traditional family structure so much as stability. In principle, it probably doesn’t matter to the child whether the parents are formally married, but there is a difference in practice.

“it makes it harder for me to get a job”: Resentment of Latino immigrants was rooted not only in lost jobs but also in frustration that the social status of white working-class men had plummeted, with demographic and cultural changes making them feel a little like, in Arlie Russell Hochschild’s phrase, “strangers in their own land.” 16. THE MARRIAGE OF TRUE MINDS success for black men was marriage: W. Bradford Wilcox, Wendy R. Wang and Ronald B. Mincy, “Black Men Making It in America,” American Enterprise Institute, 2018. Of course, that is correlation rather than causation, and some of the unmarried men had risk factors that also made them less marriageable. two-parent households have more social capital: Consider low-income black men growing up in two different neighborhoods in Los Angeles. Of young black men who grew up in the lowest-income families in Watts, 44 percent ended up incarcerated on a single day (the day of the 2010 census). But of young black men who grew up similarly poor in Compton, two miles to the south, only 6 percent were incarcerated that day.


pages: 1,351 words: 385,579

The Better Angels of Our Nature: Why Violence Has Declined by Steven Pinker

1960s counterculture, affirmative action, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, availability heuristic, Berlin Wall, Bonfire of the Vanities, British Empire, Broken windows theory, business cycle, California gold rush, Cass Sunstein, citation needed, clean water, cognitive dissonance, colonial rule, Columbine, computer age, conceptual framework, correlation coefficient, correlation does not imply causation, crack epidemic, cuban missile crisis, Daniel Kahneman / Amos Tversky, David Brooks, delayed gratification, demographic transition, desegregation, Doomsday Clock, Douglas Hofstadter, Edward Glaeser, en.wikipedia.org, European colonialism, experimental subject, facts on the ground, failed state, first-past-the-post, Flynn Effect, food miles, Francis Fukuyama: the end of history, fudge factor, full employment, George Santayana, ghettoisation, Gini coefficient, global village, Henri Poincaré, Hobbesian trap, humanitarian revolution, impulse control, income inequality, informal economy, Intergovernmental Panel on Climate Change (IPCC), invention of the printing press, Isaac Newton, lake wobegon effect, libertarian paternalism, long peace, longitudinal study, loss aversion, Marshall McLuhan, mass incarceration, McMansion, means of production, mental accounting, meta analysis, meta-analysis, Mikhail Gorbachev, moral panic, mutually assured destruction, Nelson Mandela, open economy, Peace of Westphalia, Peter Singer: altruism, QWERTY keyboard, race to the bottom, Ralph Waldo Emerson, random walk, Republic of Letters, Richard Thaler, Ronald Reagan, Rosa Parks, Saturday Night Live, security theater, Skype, Slavoj Žižek, South China Sea, Stanford marshmallow experiment, Stanford prison experiment, statistical model, stem cell, Steven Levy, Steven Pinker, The Bell Curve by Richard Herrnstein and Charles Murray, The Wealth of Nations by Adam Smith, theory of mind, transatlantic slave trade, Turing machine, twin 
studies, ultimatum game, uranium enrichment, Vilfredo Pareto, Walter Mischel, WikiLeaks, women in the workforce, zero-sum game

When rock music burst onto the scene in the 1950s, politicians and clergymen vilified it for corrupting morals and encouraging lawlessness. (An amusing video reel of fulminating fogies can be seen in Cleveland’s Rock and Roll Hall of Fame and Museum.) Do we now have to—gulp—admit they were right? Can we connect the values of 1960s popular culture to the actual rise in violent crimes that accompanied them? Not directly, of course. Correlation is not causation, and a third factor, the pushback against the values of the Civilizing Process, presumably caused both the changes in popular culture and the increase in violent behavior. Also, the overwhelming majority of baby boomers committed no violence whatsoever. Still, attitudes and popular culture surely reinforce each other, and at the margins, where susceptible individuals and subcultures can be buffeted one way or another, there are plausible causal arrows from the decivilizing mindset to the facilitation of actual violence.

Archer found that countries in which women are better represented in government and the professions, and in which they earn a larger proportion of earned income, are less likely to have women at the receiving end of spousal abuse. Also, cultures that are classified as more individualistic, where people feel they are individuals with the right to pursue their own goals, have relatively less domestic violence against women than the cultures classified as collectivist, where people feel they are part of a community whose interests take precedence over their own.94 These correlations don’t prove causation, but they are consistent with the suggestion that the decline of violence against women in the West has been pushed along by a humanist mindset that elevates the rights of individual people over the traditions of the community, and that increasingly embraces the vantage point of women. Though elsewhere I have been chary about making predictions, I think it’s extremely likely that in the coming decades violence against women will decrease throughout the world.

On the contrary, “they must be permitted . . . the foolish and childish actions suitable to their years.”168 The idea that the way children are treated determines the kinds of adults they grow into is conventional wisdom today, but it was news at the time. Several of Locke’s contemporaries and successors turned to metaphor to remind people about the formative years of life. John Milton wrote, “The childhood shows the man as morning shows the day.” Alexander Pope elevated the correlation to causation: “Just as the twig is bent, the tree’s inclined.” And William Wordsworth inverted the metaphor of childhood itself: “The child is father of the man.” The new understanding required people to rethink the moral and practical implications of the treatment of children. Beating a child was no longer an exorcism of malign forces possessing a child, or even a technique of behavior modification designed to reduce the frequency of bratty behavior in the present.


pages: 105 words: 18,832

The Collapse of Western Civilization: A View From the Future by Naomi Oreskes, Erik M. Conway

anti-communist, correlation does not imply causation, creative destruction, en.wikipedia.org, energy transition, Intergovernmental Panel on Climate Change (IPCC), invisible hand, laissez-faire capitalism, market fundamentalism, mass immigration, means of production, oil shale / tar sands, Pierre-Simon Laplace, road to serfdom, Ronald Reagan, stochastic process, the built environment, the market place

fisherian statistics A form of mathematical analysis developed in the early twentieth century and designed to help distinguish between causal and accidental relationships between phenomena. Its originator, R. A. Fisher, was one of the founders of the science of population genetics, and also an advocate of racially-based eugenics programs. Fisher also rejected the evidence that tobacco use caused cancer, and his argument that “correlation is not causation” was later used as a mantra by neoliberals rejecting the scientific evidence of various forms of adverse environmental and health effects from industrial products (see statistical significance). fugitive emissions Leakage from wellheads, pipelines, refineries, etc. Considered “fugitive” because the releases were supposedly unintentional, at least some of them (e.g., methane venting at oil wells) were in fact entirely deliberate.


pages: 586 words: 159,901

Wall Street: How It Works And for Whom by Doug Henwood

accounting loophole / creative accounting, activist fund / activist shareholder / activist investor, affirmative action, Andrei Shleifer, asset allocation, asset-backed security, bank run, banking crisis, barriers to entry, borderless world, Bretton Woods, British Empire, business cycle, capital asset pricing model, capital controls, central bank independence, computerized trading, corporate governance, corporate raider, correlation coefficient, correlation does not imply causation, credit crunch, currency manipulation / currency intervention, David Ricardo: comparative advantage, debt deflation, declining real wages, deindustrialization, dematerialisation, diversification, diversified portfolio, Donald Trump, equity premium, Eugene Fama: efficient market hypothesis, experimental subject, facts on the ground, financial deregulation, financial innovation, Financial Instability Hypothesis, floating exchange rates, full employment, George Akerlof, George Gilder, hiring and firing, Hyman Minsky, implied volatility, index arbitrage, index fund, information asymmetry, interest rate swap, Internet Archive, invisible hand, Irwin Jacobs, Isaac Newton, joint-stock company, Joseph Schumpeter, kremlinology, labor-force participation, late capitalism, law of one price, liberal capitalism, liquidationism / Banker’s doctrine / the Treasury view, London Interbank Offered Rate, Louis Bachelier, market bubble, Mexican peso crisis / tequila crisis, microcredit, minimum wage unemployment, money market fund, moral hazard, mortgage debt, mortgage tax deduction, Myron Scholes, oil shock, Paul Samuelson, payday loans, pension reform, plutocrats, Plutocrats, price mechanism, price stability, prisoner's dilemma, profit maximization, publication bias, Ralph Nader, random walk, reserve currency, Richard Thaler, risk tolerance, Robert Gordon, Robert Shiller, Robert Shiller, selection bias, shareholder value, short selling, Slavoj Žižek, South Sea Bubble, The inhabitant of London could order by 
telephone, sipping his morning tea in bed, the various products of the whole earth, The Market for Lemons, The Nature of the Firm, The Predators' Ball, The Wealth of Nations by Adam Smith, transaction costs, transcontinental railway, women in the workforce, yield curve, zero-coupon bond

A policy-led decline in interest rates pushes up stock prices, and stockholding households spend more. Recently, however, that relationship seems to have broken down. Why this should be isn't clear; it could be that both the stock market and consumer spending were independently responding to lower interest rates, and that the conclusion that stocks were "causing" the spending changes is a classic example of confusing correlation with causation. Or it may be that the increasing institutionalization of the market has reduced the effect of stock prices on personal spending. Or it may have been that household balance sheets were in such terrible shape in the early 1990s that a bull market was of little help (Steindel 1992). But whatever the reason, this household application of q theory isn't quite as impressive as it once was.

., 249-251, 297 Colby, William, 104 Colgate-Palmolive, 113 College Retirement Equities Fund (CREF), 289 Collier, Sophia, 310 Columbia Savings, 88 commercial banks, 81-84 commodity prices, futures markets and, 33; see also futures markets common stock, 12 Community Capital Bank, 311 community development banks, 311-314 community development organizations, co-optation of, 315 community land trusts, 314-315 compensatory borrowing, 65 competition managerialist view of, 260 return of, 1970s, 260 Comstock, Lyndon, 311 Conant, Charles, 94-95 Conference Board, 136, 291 consciousness credit and, 236-237 rentier, profit with passage of time, 238 consensus, 133 consumer credit, 64-66, 77 in a Marxian light, 234 in 1930s depression, 156-157 rare in Keynes's day, 242 see also households Consumer Expenditure Survey, 70 consumption, 189 contracts, 249; see also transactions-cost economics control. See corporations, governance cooperatives, 321 managers hired by workers, 239 weaknesses of ownership structure, 88 corporate control, market for, 277-282 Manne on, 278 corporations debt distribution of, 1980s, 159 and early 1990s slump, 158-161 development, and stock market, 14 emergence, and Federal Reserve, 92-96 emergence of complex ownership, 188 evolution, 253 form as presaging worker control (Marx), 239-240 importance of railroads in emergence, 188 localist critique of, 241 managers' concern for stock price, 171 multinational evolution, and financial markets, 112-113 investment clusters, 111-112 nonfinancial, 72-76 finances (table), 75 financial interests, 262 refinancing in early 1990s, 161 role in economic analysis, 248 shareholders contribute nothing or less, 238 soulful, 258, 263; see also social investing stock markets' role in constitution of, 254 transforming, 320-321 virtues of size, 282 corporations, governance, 246-294 Baran and Sweezy on, 258 Berle and Means on, 252-258 abuse of owners by managers, 254 interest-group model, 257-258 Berle on collective capitalism, 253-254 boards of directors, 27-29, 246, 257, 259, 263, 272 financial representatives on, 265 keiretsu, 275 of a "Morganized" firm, 264 rentier agenda, 290 structure, 299 competition's obsolescence/return, 260 debt and equity, differences, 247 EM theory and Jensenism as unified field theory, 276 financial control meaning, 264 theories of, rebirth in 1970s, 260-263 financial interests asserted in crisis, 265 financial upsurge since 1980s, 263-265 Fitch/Oppenheimer controversy, 261-262 Galbraith on, 258-260 Golden Age managerialism, 258-260 Herman on, 260 influence vs. ownership, 264-265 international comparisons, 248 Jensenism. See Jensen, Michael market for corporate control, 277-282 narrowness of debate, 246 Rathenau on, 256 shareholder activism of 1990s, 288-291 Smith on, 255-256 Spencer on, 256-257 stockholder-bondholder conflicts, 248 theoretical taxonomy, 251-252 transactions cost economics, 248-251 transformation, 320-321 useless shareholders, 291-294 correlation coefficient, 116 correlation vs. causation, 145 cost of capital, 184, 298 Council of Institutional Investors, 290 Cowles, Alfred, 164 crack spread, 31 Cramer, James, 103 crank, 243 credit/credit markets assets, holders of, 59-61 as barrier to growth, 237 as boundary-smasher (Marx), 235 centrality of, 118-121 and consciousness, 236-237 European vs. U.S. theories of, 137 function, 59 as "fundamental" (Marx), 244 information asymmetry, 172 market share by lending institution, 81 structure, 58-62 subordination to production (Marx), 237 U.S. international position, 61 see also bond markets; debt; money, psychology of credit crunch (1989-92), 158-161 credit gratuit (Proudhon), 302 credit rationing, 172 in Keynes's Treatise, 193-194 crime, business, 252 crises, corporate, financial interests assert power during, 265 crises, financial, 265 financiers' political uses of, 294-297 increasing prominence starting in 1970s, 222 money and, 93-94 Keynes, 202-205 Marx, 232-236 Third World, 110, 294-295 see also bailouts Crotty, James, 229 crowd psychology, 176-177, 185 currency markets, 41-49 crises, economic causes, 44 gold, 46-49 history, 41-44 mechanics and trading volume, 45-46 during trading week, 130-131 underlying values, 44-45 currency swaps, 35 Dale, James Davidson, 104 Davidson, Paul, 242, 243 Debreu, Gerard, 139 debt appropriate underlying assets, 247 as conservatizing force, 66 ideal level, pre-MM, 150 and 1930s depression, 155-158 and political power, 4, 23 reasons to shun, 149 by sector, 58-59 by type (table), 60 see also credit/credit markets; specific sectors debt deflation (Fisher), 157 modern absence of, 234-235 why there was none in early 1990s, 158-161 deficit financing, 297 deflations.


pages: 863 words: 159,091

A Manual for Writers of Research Papers, Theses, and Dissertations, Eighth Edition: Chicago Style for Students and Researchers by Kate L. Turabian

Bretton Woods, conceptual framework, correlation does not imply causation, illegal immigration, Menlo Park, meta analysis, meta-analysis, Steven Pinker, Telecommunications Act of 1996, yellow journalism, Zeno's paradox

Note whether it's an important claim, a minor point, a qualification or concession, and so on. Such distinctions help you avoid mistakes like this: Original by Jones: We cannot conclude that one event causes another because the second follows the first. Nor can statistical correlation prove causation. But no one who has studied the data doubts that smoking is a causal factor in lung cancer. Misleading report: Jones claims “we cannot conclude that one event causes another because the second follows the first. Nor can statistical correlation prove causation.” Therefore, statistical evidence is not a reliable indicator that smoking causes lung cancer. 4.3.4 Categorize Your Notes for Sorting Finally, a conceptually demanding task: as you take notes, categorize the content of each one under two or more different keywords (see the upper right corner of the note card in fig. 4.1).


pages: 554 words: 149,489

The Content Trap: A Strategist's Guide to Digital Change by Bharat Anand

Airbnb, Benjamin Mako Hill, Bernie Sanders, Clayton Christensen, cloud computing, commoditize, correlation does not imply causation, creative destruction, crowdsourcing, death of newspapers, disruptive innovation, Donald Trump, Google Glasses, Google X / Alphabet X, information asymmetry, Internet of things, inventory management, Jean Tirole, Jeff Bezos, John Markoff, Just-in-time delivery, Khan Academy, Kickstarter, late fees, Mark Zuckerberg, market design, Minecraft, multi-sided market, Network effects, post-work, price discrimination, publish or perish, QR code, recommendation engine, ride hailing / ride sharing, selection bias, self-driving car, shareholder value, Shenzhen was a fishing village, Silicon Valley, Silicon Valley startup, Skype, social graph, social web, special economic zone, Stephen Hawking, Steve Jobs, Steven Levy, Thomas L Friedman, transaction costs, two-sided market, ubercab, WikiLeaks, winner-take-all economy, zero-sum game

Figure 14: Impact of Format Changes on Music Sales, 1973–2013 (Peak unit sales normalized to 100 for all formats) Diagnosing the music industry problem is not simply a question of seeing that CD declines are coincident with trends in file sharing. It requires separating cause from effect. The problem with the diagnosis stems from an age-old problem in statistical inference: separating correlation from causation. We see it everywhere. Does TV viewing increase obesity, or are obese individuals more inclined to watch TV? Are Asians innately better at math, or do they work harder at it? Simple correlations would lead you to infer that there’s some causal relation between two variables, when in fact there might be none. The most common approach to uncovering causation between two variables is to look for a third variable that correlates with only one of them—an “instrumental” variable, in the language of economic statisticians.
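The instrumental-variable idea mentioned at the end of the excerpt above can be sketched numerically. What follows is an illustrative assumption, not the book's analysis: it reads "a third variable that correlates with only one of them" as an instrument z that moves x directly and reaches y only through x, and it uses the simple ratio-of-covariances (Wald) estimator on simulated data. The variable names, the coefficients, the sample size, and the seed are all hypothetical.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate_iv(n=20000, true_effect=2.0, seed=2):
    """Compare a naive regression slope with an instrumental-variable estimate."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: affects x, not y directly
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [true_effect * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    # Naive slope of y on x absorbs the confounder's influence and is biased.
    naive = covariance(x, y) / covariance(x, x)
    # IV (Wald) estimate uses only the variation in x that is induced by z.
    iv = covariance(z, y) / covariance(z, x)
    return naive, iv


naive, iv = simulate_iv()
print(f"naive slope = {naive:.2f}, IV estimate = {iv:.2f}, true effect = 2.00")
```

In this assumed setup the naive slope is pulled upward by the hidden confounder (toward 3 in theory), while the instrument-based ratio should sit close to the true effect of 2. The estimate is only as good as the assumption that z has no path to y except through x, which cannot be verified from the data alone.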

But most reactions to the paper were generally of a different ilk, Tadelis recalled: Several bloggers in the business of Internet marketing analytics argued that “of course, paid search didn’t work for eBay—eBay’s a stupid company, and it doesn’t know how to spend its money.” Was eBay stupid in that it was using the wrong keywords? Absolutely not—by then eBay had learned a lot about keyword bidding from prediction models developed by a very sophisticated group of Ph.D. computer scientists. But the models fell under the category of machine learning, where all you care about is correlation, not causation. Was eBay wasting a lot of money? Yes—just like any company that’s not aware. And that’s practically all companies using the industry’s best practices, which are flawed because of the endogeneity problem. eBay hadn’t set out to understand how the endogeneity problem affected the returns on paid search. And Tadelis would never have thought about the question if he hadn’t been at eBay.


pages: 433 words: 129,636

Dreamland: The True Tale of America's Opiate Epidemic by Sam Quinones

1960s counterculture, Affordable Care Act / Obamacare, Albert Einstein, British Empire, call centre, centralized clearinghouse, correlation does not imply causation, crack epidemic, deindustrialization, feminist movement, illegal immigration, mass immigration, Maui Hawaii, McMansion, obamacare, zero-sum game

This was preposterous. Never in thirty years of statistical mechanics had Orman Hall heard of a correlation that close to 1.0, which was almost as if the charts were saying that dispensing prescription painkillers was the same thing as people dying. Gay couldn’t believe it either. He ran the DOH numbers again. Each time, 0.979 appeared on his computer screen. Every statistician knows correlation does not mean causation. But to Gay the correlations did mean that Ohio could all but predict one overdose death for roughly every two months’ worth of prescription opiates dispensed. A Pro Wrestler’s Legacy Seattle, Washington In 2007, Alex Cahana opened the door to what had been John Bonica’s Center for Pain Relief at the University of Washington and found a cobwebbed relic. The pathbreaking clinic was now in a windowless basement.

In fact, fewer medical students were going into primary care—repelled by long hours, the modest money, and the lack of respect. One study estimated the country would need fifty-two thousand more primary care docs by 2025. A commentary by four doctors and researchers in the American Journal of Public Health in September 2014 insisted that “It is difficult to believe that the parallel rise in prescriptions and associated harms is mere correlation without causation. [Also] it is difficult to believe that the problem is solely attributable to patients with already existing substance use disorders.” They went on, “Appropriate medical use of prescription opioids can, in some unknown proportion of cases, initiate a progression toward misuse and ultimately addiction . . . Even if an initial exposure is insufficient to cause addiction directly, perhaps it is sufficient to trigger initial misuse that could ultimately lead to addiction.”


pages: 502 words: 107,657

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel

Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, backtesting, Black Swan, book scanning, bounce rate, business intelligence, business process, butter production in bangladesh, call centre, Charles Lindbergh, commoditize, computer age, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data is the new oil, en.wikipedia.org, Erik Brynjolfsson, Everything should be made as simple as possible, experimental subject, Google Glasses, happiness index / gross national happiness, job satisfaction, Johann Wolfgang von Goethe, lifelogging, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mass immigration, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, Norbert Wiener, personalized medicine, placebo effect, prediction markets, Ray Kurzweil, recommendation engine, risk-adjusted returns, Ronald Coase, Search for Extraterrestrial Intelligence, self-driving car, sentiment analysis, Shai Danziger, software as a service, speech recognition, statistical model, Steven Levy, text mining, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra, zero-sum game

Public health offices in the UK Band members benefit from peer support and solo artists exhibit even riskier behaviour. Correlation Does Not Imply Causation Satisfaction came in the chain reaction. —From the song “Disco Inferno,” by The Trammps The preceding tables, packed with fun-filled facts, do not explain a single thing. Take note, the third column is headed “Suggested Explanation.” The left column’s discoveries are real, validated by data, but the reasons behind them are unknown. Every explanation put forth, each entry in the rightmost column, is pure conjecture with absolutely no hard facts to back it up. The dilemma is, as it is often said, correlation does not imply causation.5 The discovery of a predictive relationship between A and B does not mean one causes the other, not even indirectly.

Insights: The Factors behind Quitting Delivering Dynamite Don’t Quit While You’re Ahead Predicting Crime to Stop It Before It Happens The Data of Crime and the Crime of Data Machine Risk without Measure The Cyclicity of Prejudice Good Prediction, Bad Prediction The Source of Power Chapter 3: The Data Effect (data) The Data of Feelings and the Feelings of Data Predicting the Mood of Blog Posts The Anxiety Index Visualizing a Moody World Put Your Money Where Your Mouth Is Inspiration and Perspiration Sifting Through the Data Dump The Instrumentation of Everything We Do Batten Down the Hatches: T.M.I. The Big Bad Wolf The End of the Rainbow Prediction Juice Far Out, Bizarre, and Surprising Insights Correlation Does Not Imply Causation The Cause and Effect of Emotions A Picture Is Worth a Thousand Diamonds Validating Feelings and Feeling Validated Serendipity and Innovation Investment Advice from the Blogosphere Money Makes the World Go ‘Round Putting It All Together Chapter 4: The Machine That Learns (modeling) Boy Meets Bank Bank Faces Risk Prediction Battles Risk Risky Business The Learning Machine Building the Learning Machine Learning from Bad Experiences How Machine Learning Works Decision Trees Grow on You Computer, Program Thyself Learn Baby Learn Bigger Is Better Overlearning: Assuming Too Much The Conundrum of Induction The Art and Science of Machine Learning Feeling Validated: Test Data Carving Out a Work of Art Putting Decision Trees to Work for Chase Money Grows on Trees The Recession—Why Microscopes Can’t Detect Asteroid Collisions After Math Chapter 5: The Ensemble Effect (ensembles) Casual Rocket Scientists Dark Horses Mindsourced: Wealth in Diversity Crowdsourcing Gone Wild Your Adversary Is Your Amigo United Nations Meta-Learning A Big Fish at the Big Finish Collective Intelligence The Wisdom of Crowds . . . 
of Models A Bag of Models Ensemble Models in Action The Generalization Paradox: More Is Less The Sky’s the Limit Chapter 6: Watson and the Jeopardy!


pages: 523 words: 61,179

Human + Machine: Reimagining Work in the Age of AI by Paul R. Daugherty, H. James Wilson

3D printing, AI winter, algorithmic trading, Amazon Mechanical Turk, augmented reality, autonomous vehicles, blockchain, business process, call centre, carbon footprint, cloud computing, computer vision, correlation does not imply causation, crowdsourcing, digital twin, disintermediation, Douglas Hofstadter, en.wikipedia.org, Erik Brynjolfsson, friendly AI, future of work, industrial robot, Internet of things, inventory management, iterative process, Jeff Bezos, job automation, job satisfaction, knowledge worker, Lyft, natural language processing, personalized medicine, precision agriculture, Ray Kurzweil, recommendation engine, RFID, ride hailing / ride sharing, risk tolerance, Rodney Brooks, Second Machine Age, self-driving car, sensor fusion, sentiment analysis, Shoshana Zuboff, Silicon Valley, software as a service, speech recognition, telepresence, telepresence robot, text mining, the scientific method, uber lyft

There, in their own records, was validation of a causal connection, hidden in plain sight. “It was the first time that I know of that machines discovered new medical knowledge,” says Hill. “Straight from the data. There was no human involved in this discovery.”7 GNS Healthcare is showing that it’s possible, when AI is injected into the hypothesis phase of the scientific method, to find previously hidden correlations and causations. Moreover, use of the technology can result in dramatic cost savings. In one recent success, GNS was able to reverse-engineer—without using a hypothesis or preexisting assumptions—PCSK9, a class of drug that reduces bad cholesterol in the bloodstream. It took seventy years to discover PCSK9 and tens of billions of dollars over decades. But using the same starting data only, GNS’s machine-learning models were able to recreate all the known LDL biology in less than ten months for less than $1 million.


Logically Fallacious: The Ultimate Collection of Over 300 Logical Fallacies (Academic Edition) by Bo Bennett

Black Swan, butterfly effect, clean water, cognitive bias, correlation does not imply causation, Donald Trump, equal pay for equal work, Richard Feynman, side project, statistical model, the scientific method

Exception: Making a scientific claim about quantum physics, using the scientific method, is not fallacious. Tip: Pick up an introductory book on quantum physics; it is not only a fascinating subject, but it will also prepare you to ask the right questions and expose this fallacy when it is used. Questionable Cause: cum hoc ergo propter hoc (also known as: ignoring a common cause, neglecting a common cause, confusing correlation and causation, confusing cause and effect, false cause, third cause, juxtaposition [form of], reversing causality/wrong direction [form of]) Description: Concluding that one thing caused another simply because they are regularly associated. Logical Form: A is regularly associated with B; therefore, A causes B. Example #1: Every time I go to sleep, the sun goes down. Therefore, my going to sleep causes the sun to set.


pages: 337 words: 103,522

The Creativity Code: How AI Is Learning to Write, Paint and Think by Marcus Du Sautoy

3D printing, Ada Lovelace, Albert Einstein, Alvin Roth, Andrew Wiles, Automated Insights, Benoit Mandelbrot, Claude Shannon: information theory, computer vision, correlation does not imply causation, crowdsourcing, data is the new oil, Donald Trump, double helix, Douglas Hofstadter, Elon Musk, Erik Brynjolfsson, Fellow of the Royal Society, Flash crash, Gödel, Escher, Bach, Henri Poincaré, Jacquard loom, John Conway, Kickstarter, Loebner Prize, mandelbrot fractal, Minecraft, music of the spheres, Narrative Science, natural language processing, Netflix Prize, PageRank, pattern recognition, Paul Erdős, Peter Thiel, random walk, Ray Kurzweil, recommendation engine, Rubik’s Cube, Second Machine Age, Silicon Valley, speech recognition, Turing test, Watson beat the top human players on Jeopardy!, wikimedia commons

The algorithm was programmed to calculate the effect on the score of moving left or right given the current state of the screen. The impact of a move could be several seconds down the line, so you have to calculate the delayed impact. This is quite tricky because it isn’t always clear what causes a certain effect. This is one of the shortcomings of machine learning: it sometimes picks up correlation and believes it to be causation. Animals suffer from the same problem. This is rather beautifully illustrated by an experiment that revealed pigeons to be superstitious. A number of pigeons were filmed in their cages and, at certain moments during the day, a food dispenser was moved into the cage. The door to the dispenser was on a delay so the pigeons, although excited by the arrival of the food dispenser, would have to wait to get the food.

E. 139 Beveridge, Andrew 56 Beyond the Fence (musical) 290–1 Białystok University 236 biases and blind spots, algorithmic 91–5 Birtwistle, Harrison 193 Blake, William 279 Blombos Cave, South Africa 103 Bloom (app) 229 BOB (artificial life form) 146–8 Boden, Margaret 9, 10, 11, 16, 39, 209, 222 Boeing 114 Bonaparte, Napoleon 158 bone carvings 104–5 booksellers 62–5 bordeebook 62–5 Borges, Jorge Luis: ‘The Library of Babel’ 241–4, 253, 304 Botnik 284–6 Boulanger, Nadia 186, 189, 205, 209 Boulez, Pierre 11, 223 brachistochrone 244 Braff, Zach 284 brain: biases and blind spots 91–2; consciousness and 274, 304–5; fractals and 124–5; mathematics and 155, 156, 160–1, 171, 174, 177, 178; musical composition and 187, 189, 193, 203, 205, 231; neural networks and 68–71, 68, 70; pattern recognition and 6, 20–1, 99–101, 155; stroke and 133–4; visual recognition and 76, 79, 143–4 Breakout (game) 26–8, 91, 92, 210 Brew, Jamie 284 Brin, Sergey 48–9, 51–2, 57 Bronowski, Jacob 104 Brown, Glenn 141 Bruner, Jerome 303 Buolamwini, Joy 94 Cage, John 106, 206 Calculus of Constructions (CoC) 173–4 see also Coq Cambridge Analytica 296 Cambridge University 18–19, 23–4, 43, 72, 81, 150, 225, 240, 278, 290 Carpenter, Loren 114, 115 Carré, Benoit 224 cars, driverless 6, 29–30, 79, 91 Cartesian geometry 110–11 Catmull, Ed 115 cave art, ancient 103–4, 105, 156, 230 Cawelti, John: Adventure, Mystery and Romance 252–3 Chang, Alex 23 chaos theory 124 Cheng, Ian 146–8 chess 16, 18–20, 21, 22–3, 29, 32–3, 34, 97, 151, 153, 162, 163, 246, 260–1, 304 child pornography 77 Chilvers, Peter 229 Chinese Room experiment 164, 273–5 Chomsky, Noam 260 Chopin, Frédéric 13, 197, 200, 202, 204, 206–7, 304 Christie’s 141 classemes 138 Classical era of music 10, 12–13, 190, 199, 207 Classification of Finite Simple Groups 18, 172, 175, 177, 244 Coelho, Paulo 302 Cohen, Harold 116–17, 118, 121 Coleridge, Samuel Taylor: ‘Kubla Khan’ 14 Colton, Simon 119, 120, 121–2, 291, 292, 293 Coltrane, John 223 Commodore Amiga 23 
Congo (chimp) 107 consciousness 107, 231, 232, 270, 274, 283, 300, 302–6 Continuator, The 218–21, 286 Conway, John 18–19 Cope, David 195–203, 207, 208, 210, 304 copyright ownership 108–9 Coq 173–6, 177, 184 Coquand, Thierry 173 correlation as causation, mistaking 92–4 Corresponding Society of Musical Sciences 193, 208 Coulom, Rémi 31 Crazy Stone 31 Creative Adversarial Networks 140–1 creativity: algorithmic and rule-based, as 5; animals and 107–9; art, definition of and 103–7; audiences and 303; coder to code, shifting from 7, 102–3, 116–22, 132–42, 219–20; combinational 10–11, 16, 181, 222, 299; commercial incentive and 131–2; competition and 132–42; consciousness and 301–2, 303–5; death and 304; definition of 3–5, 9–13, 301–2; drugs and 181–2; exploratory 9–10, 40, 181, 219, 299; failure as component part of 17; feedback from others and 132; flow and 221–4, 222; Go and see Go; human lives as act of 303–4; Lovelace Test and see Lovelace Test; mathematics and 3, 150–1, 153, 161, 167–8, 170, 181–2, 185, 245–8, 253, 279–80; mechanical nature of 298; music and see music; new/novelty and 3, 4, 7–8, 12, 13, 16, 17, 40–3, 102–3, 109, 138–41, 140, 167–8, 238–9, 291–3, 299, 301; origins of our obsession with 301; political role of 303; randomness and 117–18; romanticising 14–15; self-reflection and 300; storytelling and see storytelling; surprise and 4, 8, 40, 65, 66, 102–3, 148, 168, 202, 241, 248–9; teaching 13–17; three types of 9–13; transformational 11–13, 17, 39, 41, 181, 209, 299; value and 4, 8, 12, 16, 17, 40–1, 102–3, 167–8, 238–9, 301, 304 Csikszentmihalyi, Mihaly 221 Cubism 11, 138, 139 Cybernetic Poet 280–2 Cybernetic Serendipity (ICA exhibition, 1968) 118–19 Dahl, Roald: Tales of the Unexpected 276–7; ‘The Great Automatic Grammatizator’ 276–7, 297 dating/matching 57–61, 58, 59, 60 da Vinci, Leonardo 106, 118, 128; Treatise on Painting 117 Davis, Miles: Kind of Blue 214 Debussy, Claude 1 DeepBach 210–12, 232 DeepBlue 29, 214, 260–1 DeepMind 25–43, 65, 95, 97, 
98, 131, 132, 151, 210, 233–9, 241, 266 Deep Watch 224 Delft University of Technology 127 democracy 165–6 Dennett, Daniel 147 Descartes, René 12, 110–11 Disney 289–90 Duchamp, Marcel 106 du Sautoy, Marcus: attempts to fake a Jackson Pollock 123–5; composes music 186–8; The Music of the Primes 285–6; uses AI to write section of this book 297 Dylan, Bob 223 EEG 125 Egyptians, Ancient 157, 165 eigenvectors of matrices 53 Eisen, Michael 62, 64 Elgammal, Ahmed 132–3, 134, 135, 139, 140, 141 Eliot, George 302 Eliot, T.


pages: 293 words: 81,183

Doing Good Better: How Effective Altruism Can Help You Make a Difference by William MacAskill

barriers to entry, basic income, Black Swan, Branko Milanovic, Cal Newport, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, clean water, corporate social responsibility, correlation does not imply causation, Daniel Kahneman / Amos Tversky, David Brooks, effective altruism, en.wikipedia.org, end world poverty, experimental subject, follow your passion, food miles, immigration reform, income inequality, index fund, Intergovernmental Panel on Climate Change (IPCC), Isaac Newton, job automation, job satisfaction, Lean Startup, M-Pesa, mass immigration, meta analysis, meta-analysis, microcredit, Nate Silver, Peter Singer: altruism, purchasing power parity, quantitative trading / quantitative finance, randomized controlled trial, self-driving car, Skype, Stanislav Petrov, Steve Jobs, Steve Wozniak, Steven Pinker, The Future of Employment, The Wealth of Nations by Adam Smith, universal basic income, women in the workforce

Even among the “bottom billion”—the population of countries that have experienced the weakest economic growth over the last few decades—quality of life has increased dramatically. In 1950, life expectancy in sub-Saharan Africa was just 36.7 years. Now it’s 56 years, a gain of almost 50 percent. The picture that Dambisa Moyo paints is inaccurate. In reality, a tiny amount of aid has been spent, and there have been dramatic increases in the welfare of the world’s poorest people. Of course, correlation is not causation. Merely showing that the people’s welfare has improved at the same time the West has been offering aid does not prove that aid caused the improvement. It could be that aid is entirely incidental, or even harmful, holding back even greater progress that would have happened anyway or otherwise. But in fact there’s good reason to think that, on average, international aid spending has been incredibly beneficial.

Robustness of evidence is very important for the simple reason that many programs don’t work, and it’s hard to distinguish the programs that don’t work from the programs that do. If we’d assessed Scared Straight by looking just at before-and-after delinquency rates for individuals who went through the program, we would have concluded it was a great program. Only after looking at randomized controlled trials could we tell that correlation did not indicate causation in this case and that Scared Straight programs were actually doing more harm than good. One of the most damning examples of low-quality evidence concerns microcredit (that is, lending small amounts of money to the very poor, a form of microfinance most famously associated with Muhammad Yunus and the Grameen Bank). Intuitively, microcredit seems like it would be very cost-effective, and there were many anecdotes of people who’d received microloans and used them to start businesses that, in turn, helped them escape poverty.


pages: 332 words: 104,587

Half the Sky: Turning Oppression Into Opportunity for Women Worldwide by Nicholas D. Kristof, Sheryl Wudunn

agricultural Revolution, correlation does not imply causation, demographic dividend, feminist movement, Flynn Effect, illegal immigration, Mahatma Gandhi, microcredit, paper trading, rolodex, Ronald Reagan, Rosa Parks, school choice, special economic zone, transatlantic slave trade, women in the workforce

The methodology of such studies is typically weak, and it doesn’t adequately account for cause and effect. “The evidence, in most cases, suffers from obvious biases: educated girls come from richer families and marry richer, more educated, more progressive husbands,” notes Esther Duflo of MIT, one of the most careful scholars of gender and development. “As such, it is, in general, difficult to account for all these factors, and few of the studies have tried to do so.” Correlation, in short, is not causation.* Advocates also undermine the trustworthiness of their cause by cherry-picking evidence. While we argue that schooling girls does stimulate economic growth and foster stability, for example, it is also true that one of the most educated parts of rural India is the state of Kerala, which has stagnated economically. Likewise, two of the places in the Arab world that have given girls the most education were Lebanon and Saudi Arabia, yet the former has been a vortex of conflict and the latter a breeding ground for violent fundamentalists.

This was something that we didn’t expect at all. It shows the power of education.” Speaking of role models and the power of education, Camfed Zimbabwe has a new and dynamic executive director. She’s a young woman who knows something about overcoming long odds and the impact a few dollars in tuition assistance can make in a girl’s life. It’s Angeline. * Larry Summers offers an example to emphasize the distinction between correlation and causation. He notes that there is an almost perfect correlation between literacy and ownership of dictionaries. But handing out more dictionaries will not raise literacy. CHAPTER ELEVEN Microcredit: The Financial Revolution It is impossible to realize our goals while discriminating against half the human race. As study after study has taught us, there is no tool for development more effective than the empowerment of women.


pages: 295 words: 66,824

A Mathematician Plays the Stock Market by John Allen Paulos

Benoit Mandelbrot, Black-Scholes formula, Brownian motion, business climate, business cycle, butter production in bangladesh, butterfly effect, capital asset pricing model, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, diversified portfolio, dogs of the Dow, Donald Trump, double entry bookkeeping, Elliott wave, endowment effect, Erdős number, Eugene Fama: efficient market hypothesis, four colour theorem, George Gilder, global village, greed is good, index fund, intangible asset, invisible hand, Isaac Newton, John Nash: game theory, Long Term Capital Management, loss aversion, Louis Bachelier, mandelbrot fractal, margin call, mental accounting, Myron Scholes, Nash equilibrium, Network effects, passive investing, Paul Erdős, Paul Samuelson, Ponzi scheme, price anchoring, Ralph Nelson Elliott, random walk, Richard Thaler, Robert Shiller, short selling, six sigma, Stephen Hawking, stocks for the long run, survivorship bias, transaction costs, ultimatum game, Vanguard fund, Yogi Berra

To find the volatility of a portfolio in general, we need what is called the “covariance” (closely related to the correlation coefficient) between any pair of stocks X and Y in the portfolio. The covariance between two stocks is roughly the degree to which they vary together—the degree, that is, to which a change in one is proportional to a change in the other. Note that unlike many other contexts in which the distinction between covariance (or, more familiarly, correlation) and causation is underlined, the market generally doesn’t care much about it. If an increase in the price of ice cream stocks is correlated to an increase in the price of lawn mower stocks, few ask whether the association is causal or not. The aim is to use the association, not understand it—to be right about the market, not necessarily to be right for the right reasons. Given the above distinction, some of you may wish to skip the next three paragraphs on the calculation of covariance.
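The covariance described in this excerpt can be made concrete with a small sketch. The return series are made-up numbers and the function names are hypothetical, not taken from the book; the formulas are the standard sample covariance, its rescaling into a correlation coefficient, and the usual two-asset portfolio variance.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance: summed products of deviations from the means, over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance rescaled by both standard deviations, so it lies in [-1, 1]."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


def two_stock_volatility(xs, ys, weight_x):
    """Standard deviation of a portfolio holding weight_x in X and the rest in Y."""
    weight_y = 1.0 - weight_x
    variance = (
        weight_x ** 2 * covariance(xs, xs)
        + weight_y ** 2 * covariance(ys, ys)
        + 2 * weight_x * weight_y * covariance(xs, ys)
    )
    return sqrt(variance)


# Made-up monthly returns, purely for illustration.
ice_cream = [0.01, 0.03, -0.02, 0.04, 0.00]
lawn_mower = [0.02, 0.06, -0.04, 0.08, 0.00]  # exactly twice the ice cream returns

cov_xy = covariance(ice_cream, lawn_mower)
corr_xy = correlation(ice_cream, lawn_mower)
vol_50_50 = two_stock_volatility(ice_cream, lawn_mower, 0.5)
print(cov_xy, corr_xy, vol_50_50)
```

Because the second series is an exact multiple of the first, the correlation comes out at essentially 1 and the two stocks offer no diversification. Nothing in the calculation says whether one stock's movement causes the other's, which is the excerpt's point about the market using the association without explaining it.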


The Data Journalism Handbook by Jonathan Gray, Lucy Chambers, Liliana Bounegru

Amazon Web Services, barriers to entry, bioinformatics, business intelligence, carbon footprint, citizen journalism, correlation does not imply causation, crowdsourcing, David Heinemeier Hansson, eurozone crisis, Firefox, Florence Nightingale: pie chart, game design, Google Earth, Hans Rosling, information asymmetry, Internet Archive, John Snow's cholera map, Julian Assange, linked data, moral hazard, MVC pattern, New Journalism, openstreetmap, Ronald Reagan, Ruby on Rails, Silicon Valley, social graph, SPARQL, text mining, web application, WikiLeaks

Or you can divide the data subjects into groups: Analysis by categories “Councils run by the Purple Party spend 50% more on paper clips than those controlled by the Yellow Party.” Or you can relate factors numerically: Association “Councils run by politicians who have received donations from stationery companies spend more on paper clips, with spending increasing on average by £100 for each pound donated.” But, of course, always remember that correlation and causation are not the same thing. So if you’re investigating paper clip spending, are you also getting the following figures? Total spending to provide context? Geographical/historical/other breakdowns to provide comparative data? The additional data you need to ensure comparisons are fair, such as population size? Other data that might provide interesting analysis to compare or relate the spending to?


pages: 220 words: 66,518

The Biology of Belief: Unleashing the Power of Consciousness, Matter & Miracles by Bruce H. Lipton

Albert Einstein, Benoit Mandelbrot, correlation does not imply causation, discovery of DNA, double helix, Drosophila, epigenetics, Isaac Newton, Mahatma Gandhi, mandelbrot fractal, Mars Rover, On the Revolutions of the Heavenly Spheres, phenotype, placebo effect, randomized controlled trial, selective serotonin reuptake inhibitor (SSRI), stem cell

What about all those headlines trumpeting the discovery of a gene for everything from depression to schizophrenia? Read those articles closely and you’ll see that behind the breathless headline is a more sober truth. Scientists have linked lots of genes to lots of different diseases and traits, but scientists have rarely found that one gene causes a trait or a disease. The confusion occurs when the media repeatedly distort the meaning of two words: correlation and causation. It’s one thing to be linked to a disease; it’s quite another to cause a disease, which implies a directing, controlling action. If I show you my keys and say that a particular key “controls” my car, you at first might think that makes sense because you know you need that key to turn on the ignition. But does the key actually “control” the car? If it did, you couldn’t leave the key in the car alone because it might just borrow your car for a joy ride when you are not paying attention.


pages: 719 words: 181,090

Site Reliability Engineering: How Google Runs Production Systems by Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy

Air France Flight 447, anti-pattern, barriers to entry, business intelligence, business process, Checklist Manifesto, cloud computing, combinatorial explosion, continuous integration, correlation does not imply causation, crowdsourcing, database schema, defense in depth, DevOps, en.wikipedia.org, fault tolerance, Flash crash, George Santayana, Google Chrome, Google Earth, information asymmetry, job automation, job satisfaction, Kubernetes, linear programming, load shedding, loose coupling, meta analysis, meta-analysis, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, revision control, risk tolerance, side project, six sigma, the scientific method, Toyota Production System, trickle-down economics, web application, zero day

Fixing the first and second common pitfalls is a matter of learning the system in question and becoming experienced with the common patterns used in distributed systems. The third trap is a set of logical fallacies that can be avoided by remembering that not all failures are equally probable—as doctors are taught, “when you hear hoofbeats, think of horses not zebras.”4 Also remember that, all things being equal, we should prefer simpler explanations.5 Finally, we should remember that correlation is not causation:6 some correlated events, say packet loss within a cluster and failed hard drives in the cluster, share common causes—in this case, a power outage, though network failure clearly doesn’t cause the hard drive failures nor vice versa. Even worse, as systems grow in size and complexity and as more metrics are monitored, it’s inevitable that there will be events that happen to correlate well with other events, purely by coincidence.7 Understanding failures in our reasoning process is the first step to avoiding them and becoming more effective in solving problems.
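The point about coincidental correlations among many monitored metrics can be illustrated with a short simulation. This is a sketch under stated assumptions: the "metrics" are independent Gaussian noise generated here, and the sizes (200 metrics, 10 samples each), the 0.8 threshold, and the function names are arbitrary illustrative choices rather than anything from the book.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def count_strong_pairs(num_metrics, num_samples, threshold, seed=0):
    """Count pairs of independent noise series whose |r| meets the threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    metrics = [
        [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
        for _ in range(num_metrics)
    ]
    total = strong = 0
    for a, b in combinations(metrics, 2):
        total += 1
        if abs(pearson(a, b)) >= threshold:
            strong += 1
    return strong, total


strong_pairs, total_pairs = count_strong_pairs(200, 10, 0.8)
print(f"{strong_pairs} of {total_pairs} unrelated pairs have |r| >= 0.8")
```

The number of pairs grows quadratically with the number of metrics, so even a small per-pair chance of a strong accidental correlation yields some apparently related series once enough metrics are monitored.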

Diskerase example, Recommendations focus on reliability, Reliability Is the Fundamental Feature Google's approach to, The Value for Google SRE hierarchy of automation classes, A Hierarchy of Automation Classes recommendations for enacting, Recommendations specialized application of, The Inclination to Specialize use cases for, The Use Cases for Automation-A Hierarchy of Automation Classes automation tools, Testing Scalable Tools autonomous systems, The Evolution of Automation at Google Auxon case study, Auxon Case Study: Project Background and Problem Space-Our Solution: Intent-Based Capacity Planning, Introduction to Auxon-Introduction to Auxon availability, Indicators, Choosing a Strategy for Superior Data Integrity(see also service availability) availability table, Availability Table B B4 network, Hardware backend servers, Our Software Infrastructure, Load Balancing in the Datacenter backends, fake, Production Probes backups (see data integrity) Bandwidth Enforcer (BwE), Networking barrier tools, Testing Scalable Tools, Testing Disaster, Distributed Coordination and Locking Services batch processing pipelines, First Layer: Soft Deletion batching, Eliminate Batch Load, Batching, Drawbacks of Periodic Pipelines in Distributed Environments Bazel, Building best practicescapacity planning, Capacity Planning for change management, Change Management error budgets, Error Budgets failures, Fail Sanely feedback, Introducing a Postmortem Culture for incident management, In Summary monitoring, Monitoring overloads and failure, Overloads and Failure postmortems, Google’s Postmortem Philosophy-Collaborate and Share Knowledge, Postmortems reward systems, Introducing a Postmortem Culture role of release engineers in, The Role of a Release Engineer rollouts, Progressive Rollouts service level objectives, Define SLOs Like a User team building, SRE Teams bibliography, Bibliography Big Data, Origin of the Pipeline Design Pattern Bigtable, Storage, Target level of availability, 
Bigtable SRE: A Tale of Over-Alerting bimodal latency, Bimodal latency black-box monitoring, Definitions, Black-Box Versus White-Box, Black-Box Monitoring blameless cultures, Google’s Postmortem Philosophy Blaze build tool, Building Blobstore, Storage, Choosing a Strategy for Superior Data Integrity Borg, Hardware-Managing Machines, Borg: Birth of the Warehouse-Scale Computer-Borg: Birth of the Warehouse-Scale Computer, Drawbacks of Periodic Pipelines in Distributed Environments Borg Naming Service (BNS), Managing Machines Borgmon, The Rise of Borgmon-Ten Years On…(see also time-series monitoring) alerting, Monitoring and Alerting, Alerting configuration, Maintaining the Configuration rate() function, Rule Evaluation rules, Rule Evaluation-Rule Evaluation sharding, Sharding the Monitoring Topology timeseries arena, Storage in the Time-Series Arena vectors, Labels and Vectors-Labels and Vectors break-glass mechanisms, Expect Testing Fail build environments, Creating a Test and Build Environment business continuity, Ensuring Business Continuity Byzantine failures, How Distributed Consensus Works, Number of Replicas C campuses, Hardware canarying, Motivation for Error Budgets, What we learned, Canary test, Gradual and Staged Rollouts CAP theorem, Managing Critical State: Distributed Consensus for Reliability CAPA (corrective and preventative action), Postmortem Culture capacity planningapproaches to, Practices best practices for, Capacity Planning Diskerase example, Recommendations distributed consensus systems and, Capacity and Load Balancing drawbacks of "queries per second", The Pitfalls of “Queries per Second” drawbacks of traditional plans, Brittle by nature further reading on, Practices intent-based (see intent-based capacity planning) mandatory steps for, Demand Forecasting and Capacity Planning preventing server overload with, Preventing Server Overload product launches and, Capacity Planning traditional approach to, Traditional Capacity Planning cascading 
failuresaddressing, Immediate Steps to Address Cascading Failures-Eliminate Bad Traffic causes of, Causes of Cascading Failures and Designing to Avoid Them-Service Unavailability defined, Addressing Cascading Failures, Capacity and Load Balancing factors triggering, Triggering Conditions for Cascading Failures overview of, Closing Remarks preventing server overload, Preventing Server Overload-Always Go Downward in the Stack testing for, Testing for Cascading Failures-Test Noncritical Backends(see also overload handling) change management, Change Management(see also automation) change-induced emergencies, Change-Induced Emergency-What we learned changelists (CLs), Our Development Environment Chaos Monkey, Testing Disaster checkpoint state, Testing Disaster cherry picking tactic, Hermetic Builds Chubby lock service, Lock Service, System Architecture Patterns for Distributed Consensusplanned outage, Objectives, SLOs Set Expectations client tasks, Load Balancing in the Datacenter client-side throttling, Client-Side Throttling clients, Our Software Infrastructure clock drift, Managing Critical State: Distributed Consensus for Reliability Clos network fabric, Hardware cloud environmentdata integrity strategies, Choosing a Strategy for Superior Data Integrity, Challenges faced by cloud developers definition of data integrity in, Data Integrity’s Strict Requirements evolution of applications in, Choosing a Strategy for Superior Data Integrity technical challenges of, Requirements of the Cloud Environment in Perspective clustersapplying automation to turnups, Soothing the Pain: Applying Automation to Cluster Turnups-Service-Oriented Cluster-Turnup cluster management solution, Drawbacks of Periodic Pipelines in Distributed Environments defined, Hardware code samples, Using Code Examples cognitive flow state, Cognitive Flow State cold caching, Slow Startup and Cold Caching colocation facilities (colos), Recommendations Colossus, Storage command posts, A Recognized Command 
Post communication and collaborationblameless postmortems, Collaborate and Share Knowledge case studies, Case Study of Collaboration in SRE: Viceroy-Case Study: Migrating DFP to F1 importance of, Conclusion with Outalator, Reporting and communication outside SRE team, Collaboration Outside SRE position of SRE in Google, Communication and Collaboration in SRE production meetings (see production meetings) within SRE team, Collaboration within SRE company-wide resilience testing, Practices compensation structure, Compensation computational optimization, Our Solution: Intent-Based Capacity Planning configuration management, Configuration Management, Change-Induced Emergency, Integration, Process Updates configuration tests, Configuration test consensus algorithmsEgalitarian Paxos, Stable Leaders Fast Paxos, Reasoning About Performance: Fast Paxos, The Use of Paxos improving performance of, Distributed Consensus Performance Multi-Paxos, Disk Access Paxos, How Distributed Consensus Works, Disk Access Raft, Multi-Paxos: Detailed Message Flow, Stable Leaders Zab, Stable Leaders(see also distributed consensus systems) consistencyeventual, Managing Critical State: Distributed Consensus for Reliability through automation, Consistency consistent hashing, Load Balancing at the Virtual IP Address constraints, Laborious and imprecise Consul, System Architecture Patterns for Distributed Consensus consumer services, identifying risk tolerance of, Identifying the Risk Tolerance of Consumer Services-Other service metrics continuous build and deploymentBlaze build tool, Building branching, Branching build targets, Building configuration management, Configuration Management deployment, Deployment packaging, Packaging Rapid release system, Continuous Build and Deployment, Rapid testing, Testing typical release process, Rapid contributors, Acknowledgments-Acknowledgments coroutines, Origin of the Pipeline Design Pattern corporate network security, Practices correctness guarantees, 
Workflow Correctness Guarantees correlation vs. causation, Theory costsavailability targets and, Cost, Cost direct, The Sysadmin Approach to Service Management of failing to embrace risk, Managing Risk indirect, The Sysadmin Approach to Service Management of sysadmin management approach, The Sysadmin Approach to Service Management CPU consumption, The Pitfalls of “Queries per Second”, CPU, Overload Behavior and Load Tests crash-fail vs. crash-recover algorithms, How Distributed Consensus Works cronat large scale, Running Large Cron building at Google, Building Cron at Google-Running Large Cron idempotency, Cron Jobs and Idempotency large-scale deployment of, Cron at Large Scale leader and followers, The leader overview of, Summary Paxos algorithm and, The Use of Paxos-Storing the State purpose of, Distributed Periodic Scheduling with Cron reliability applications of, Reliability Perspective resolving partial failures, Resolving partial failures storing state, Storing the State tracking cron job state, Tracking the State of Cron Jobs uses for, Cron cross-industry lessonsApollo 8, Preface comparative questions presented, Lessons Learned from Other Industries decision-making skills, Structured and Rational Decision Making-Structured and Rational Decision Making Google's application of, Conclusions industry leaders contributing, Meet Our Industry Veterans key themes addressed, Lessons Learned from Other Industries postmortem culture, Postmortem Culture-Postmortem Culture preparedness and disaster testing, Preparedness and Disaster Testing-Defense in Depth and Breadth repetitive work/operational overhead, Automating Away Repetitive Work and Operational Overhead current state, exposing, Examine D D storage layer, Storage dashboardsbenefits of, Why Monitor?


pages: 270 words: 73,485

Hubris: Why Economists Failed to Predict the Crisis and How to Avoid the Next One by Meghnad Desai

"Robert Solow", 3D printing, bank run, banking crisis, Berlin Wall, Big bang: deregulation of the City of London, Bretton Woods, BRICs, British Empire, business cycle, Capital in the Twenty-First Century by Thomas Piketty, Carmen Reinhart, central bank independence, collapse of Lehman Brothers, collateralized debt obligation, correlation coefficient, correlation does not imply causation, creative destruction, Credit Default Swap, credit default swaps / collateralized debt obligations, David Ricardo: comparative advantage, deindustrialization, demographic dividend, Eugene Fama: efficient market hypothesis, eurozone crisis, experimental economics, Fall of the Berlin Wall, financial innovation, Financial Instability Hypothesis, floating exchange rates, full employment, German hyperinflation, Gunnar Myrdal, Home mortgage interest deduction, imperial preference, income inequality, inflation targeting, invisible hand, Isaac Newton, Joseph Schumpeter, Kenneth Arrow, Kenneth Rogoff, laissez-faire capitalism, liquidity trap, Long Term Capital Management, market bubble, market clearing, means of production, Mexican peso crisis / tequila crisis, mortgage debt, Myron Scholes, negative equity, Northern Rock, oil shale / tar sands, oil shock, open economy, Paul Samuelson, price stability, purchasing power parity, pushing on a string, quantitative easing, reserve currency, rising living standards, risk/return, Robert Shiller, Robert Shiller, Ronald Reagan, savings glut, secular stagnation, seigniorage, Silicon Valley, Simon Kuznets, The Chicago School, The Great Moderation, The inhabitant of London could order by telephone, sipping his morning tea in bed, the various products of the whole earth, The Wealth of Nations by Adam Smith, Tobin tax, too big to fail, women in the workforce

(i) Clayton Act (i) Clinton Administration (i) closed economy (i), (ii), (iii), (iv), (v), (vi) Cobb, Charles (i) Cobb-Douglas Production Function (i), (ii) coincidence, vs. causation (i) Cold War (i) collateralized debt obligations (CDO) (i) colonization (i) Combinations (trade unions), as harmful (i) Committee on the Bank of England Charter (i) commodity markets price rises (i) regulation (i) Common Market (i) communications, advances in (i), (ii) companies, collapse of (i) comparative advantage (i) compatibility microeconomics/macroeconomics (i), (ii), (iii) unique static equilibrium/moving data (i) competition and efficiency (i) imperfect (i) theory of (Marshall) (i) computer technology development of (i), (ii); see also technological innovations stock markets (i) confidence, rise and fall (i) conflicting interests (i), (ii) Connally, John (i) consols (i) consumer credit (i) consumption function (i), (ii) contagion (i), (ii) control of money supply (i) convertibility (i) cooperation (i) correlation/coincidence, vs. 
causation (i) corruption (i) Countrywide Financial (i) Cournot, Antoine Augustin (i) Cowles, Alfred (i) Cowles Foundation (i) creative destruction (i) credit business dependence (i) cheap (i) as driver of investment (i) credit cards (i) credit default swaps (CDS) (i) crises beginnings of (i) developing countries (i) Juglar’s theory (i) Mexican (i) proliferation (i) as recurrent (i), (ii) as regular occurrences (i) ten year pattern (i) unpredictability (i) crisis of 1825 (i) crisis of profitability (i) Crosland, Anthony (i) The Future of Socialism (i) currency, convertibility (i) depreciation (i) pegging (i), (ii) cycles (i) banking system as root (i) combinations of (i) Goodwin (i), (ii) Juglar’s study (i) Keynes on (i) long (i) loss of interest in (i) Marx’s theories (i), (ii) measuring (i) origins (i) random events (i) reproduction by Keynesian models (i) rocking horse analogy (i) short (i) Wicksell’s theory (i) see also Frisch; Kondratieff cycles debit cards (i) Debreu, Gerard (i), (ii) debt crises (i) easy availability (i) levels (i) see also government debt debt-fueled boom (i) debts brokers (i) farmers’ (i) post-World War II (i) purchase of (i) decisions, patterns (i) deficits, endemic (i) deflation (i) deindustrialization (i), (ii) Deism (i) demand, factors in (i) demographics (i) demutualization (i) depreciation (i) advocacy of (i) Ricardo’s theory (i) value of goods (i) deregulation, banking (i) derivatives (i), (ii) Deserted Village, The (Oliver Goldsmith) (i) deutschmark (i) developing countries, Wicksellian boom (i) disequilibrium dynamic (i), (ii), (iii), (iv) stock (i) system, capitalism as (i) tradition (i) displacement effect, technological innovations (i) division of knowledge (i) division of labor (i), (ii) dollar purchasing power (i) as reserve currency (i), (ii) dollar exchange standard (i), (ii) dot.com boom (i) double deficits (i) Douglas, Paul (i), (ii) Dow Jones (i) Duménil, Gerard (i) durable goods (i) Dutch Disease (i) dynamic stochastic 
general equilibrium (DSGE) models (i), (ii) econometric modeling (i), (ii) Econometric Society (i), (ii) econometrics (i), (ii) economic activity, shift (i) economic analysis, applicability (i) economic cycles (i) Marx/Engels (i) see also Kondratieff cycles economic data, proliferation (i) economic growth, problems of (i) economic policy, activism (i) economic sectors, conflicting interests (i), (ii) economic slump, post-World War I (i) economic stagnation (i) economic theory (i) and individual lives (i) economic trajectories (i) economic vocabulary (i), (ii), (iii) economics background to (i) celebrated (i) changing scope of (i) as dismal science (i) professionalization (i) teaching of (i) “Economics and Knowledge” (Hayek) (i) economies, interconnections (i) economies of scale (i) economists, research methods (i) economy changing nature of (i) equilibrium/disequilibrium (i) visions of (i) efficiency, use of term (i) efficient market hypothesis (EMH) (i), (ii), (iii) Eisenhower, Dwight D.


pages: 50 words: 13,399

The Elements of Data Analytic Style by Jeff Leek

correlation does not imply causation, Netflix Prize, p-value, pattern recognition, Ronald Coase, statistical model

A mechanistic data analysis seeks to demonstrate that changing one measurement always and exclusively leads to a specific, deterministic behavior in another. The goal is to not only understand that there is an effect, but how that effect operates. An example of a mechanistic analysis is analyzing data on how wing design changes air flow over a wing, leading to decreased drag. Outside of engineering, mechanistic data analysis is extremely challenging and rarely undertaken.

2.8 Common mistakes

2.8.1 Correlation does not imply causation

Interpreting an inferential analysis as causal. Most data analyses involve inference or prediction. Unless a randomized study is performed, it is difficult to infer why there is a relationship between two variables. Some great examples of correlations that can be calculated but are clearly not causally related appear at http://tylervigen.com/ (Figure 2.2).

Figure 2.2 A spurious correlation

Particular caution should be used when applying words such as “cause” and “effect” when performing inferential analysis.


pages: 685 words: 203,949

The Organized Mind: Thinking Straight in the Age of Information Overload by Daniel J. Levitin

airport security, Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, Bayesian statistics, big-box store, business process, call centre, Claude Shannon: information theory, cloud computing, cognitive bias, complexity theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, delayed gratification, Donald Trump, en.wikipedia.org, epigenetics, Eratosthenes, Exxon Valdez, framing effect, friendly fire, fundamental attribution error, Golden Gate Park, Google Glasses, haute cuisine, impulse control, index card, indoor plumbing, information retrieval, invention of writing, iterative process, jimmy wales, job satisfaction, Kickstarter, life extension, longitudinal study, meta analysis, meta-analysis, more computing power than Apollo, Network effects, new economy, Nicholas Carr, optical character recognition, Pareto efficiency, pattern recognition, phenotype, placebo effect, pre–internet, profit motive, randomized controlled trial, Rubik’s Cube, shared worldview, Skype, Snapchat, social intelligence, statistical model, Steve Jobs, supply-chain management, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Thomas Bayes, Turing test, ultimatum game, zero-sum game

The results of Harvard’s salary survey are no doubt intended to lead the average person to infer that a Harvard education is responsible for the high salaries of recent graduates. This may be the case, but it’s also possible that the kinds of people who go to Harvard in the first place come from wealthy and supportive families and therefore might have been likely to obtain higher-paying jobs regardless of where they went to college. Childhood socioeconomic status has been shown to be a major quantity correlated with adult salaries. Correlation is not causation. Proving causation requires carefully controlled scientific experiments. Then there are truly spurious correlations—odd pairings of facts that have no relationship to each other and no third factor x linking them. For example, we could plot the relationship between the global average temperature over the past four hundred years and the number of pirates in the world and conclude that the drop in the number of pirates is caused by global warming.

The Gricean maxim of relevance implies that no one would construct such a graph (below) unless they felt these two were related, but this is where critical thinking comes in. The graph shows that they are correlated, but not that one causes the other. You could spin an ad hoc theory—pirates can’t stand heat, and so, as the oceans became warmer, they sought other employment. Examples such as this demonstrate the folly of failing to separate correlation from causation. It is easy to confuse cause and effect when encountering correlations. There is often that third factor x that ties together correlative observations. In the case of the decline in pirates being related to the increase in global warming, factor x might plausibly be claimed to be industrialization. With industrialization came air travel and air cargo; larger, better fortified ships; and improved security and policing practices.
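The role of a third factor x can be illustrated with a small simulation. This is a minimal sketch and not something taken from the book: the linear model, the coefficients, the noise levels, and every variable name are assumptions chosen for illustration. Two series that never influence each other both depend on a hidden factor; they come out strongly correlated, and the correlation largely disappears once the hidden factor is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
n = 5000

# Hypothetical hidden factor x (think "industrialization").
x = [rng.gauss(0, 1) for _ in range(n)]

# Both observed series depend on x plus independent noise.
# Neither series appears in the formula for the other.
a = [2.0 * xi + rng.gauss(0, 1) for xi in x]   # stand-in for a warming measure
b = [-1.5 * xi + rng.gauss(0, 1) for xi in x]  # stand-in for a pirate count

raw_r = pearson(a, b)
# Partial correlation: correlate the parts of a and b that x does not explain.
partial_r = pearson(residuals(a, x), residuals(b, x))

print(f"raw correlation of a and b:   {raw_r:.2f}")
print(f"after controlling for x:      {partial_r:.2f}")
```

In real data the third factor is usually unmeasured or unknown, which is why controlling for it after the fact is much harder than this sketch suggests.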


pages: 346 words: 89,180

Capitalism Without Capital: The Rise of the Intangible Economy by Jonathan Haskel, Stian Westlake

"Robert Solow", 23andMe, activist fund / activist shareholder / activist investor, Airbnb, Albert Einstein, Andrei Shleifer, bank run, banking crisis, Bernie Sanders, business climate, business process, buy and hold, Capital in the Twenty-First Century by Thomas Piketty, cloud computing, cognitive bias, computer age, corporate governance, corporate raider, correlation does not imply causation, creative destruction, dark matter, Diane Coyle, Donald Trump, Douglas Engelbart, Douglas Engelbart, Edward Glaeser, Elon Musk, endogenous growth, Erik Brynjolfsson, everywhere but in the productivity statistics, Fellow of the Royal Society, financial innovation, full employment, fundamental attribution error, future of work, Gini coefficient, Hernando de Soto, hiring and firing, income inequality, index card, indoor plumbing, intangible asset, Internet of things, Jane Jacobs, Jaron Lanier, job automation, Kenneth Arrow, Kickstarter, knowledge economy, knowledge worker, laissez-faire capitalism, liquidity trap, low skilled workers, Marc Andreessen, Mother of all demos, Network effects, new economy, open economy, patent troll, paypal mafia, Peter Thiel, pets.com, place-making, post-industrial society, Productivity paradox, quantitative hedge fund, rent-seeking, revision control, Richard Florida, ride hailing / ride sharing, Robert Gordon, Ronald Coase, Sand Hill Road, Second Machine Age, secular stagnation, self-driving car, shareholder value, sharing economy, Silicon Valley, six sigma, Skype, software patent, sovereign wealth fund, spinning jenny, Steve Jobs, survivorship bias, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Tim Cook: Apple, total factor productivity, Tyler Cowen: Great Stagnation, urban planning, Vanguard fund, walkable city, X Prize, zero-sum game

The fact that there is a long-term increase in price suggests that, while good management practices improve firm performance (hence the long-term share price increase), equity markets undervalue the benefits of this type of intangible (since equity analysts should be able to recognize good management at the time the award is given, rather than waiting for its results to show up on the income statement). But, of course, correlation is not causation: just because a publicly listed firm invests less in R&D, training, or other intangibles does not mean it is being led astray by equity markets. Managers might choose to invest less because they know the investments available to them are unlikely to be profitable or, more narrowly, that they might be profitable for someone, but not necessarily for them. The business pages are full of companies that have launched new products or set up new service lines only to regret their overoptimism.

Research by one of the authors together with Alan Hughes, Peter Goodridge, and Gavin Wallis suggests that extra investment by the UK government in research in universities increases national productivity by 20 percent (Haskel et al. 2015). (There were substantial swings in government support for universities over the 1990s and 2000s, and those ups and downs are well correlated with productivity ups and downs, with around a three-year lag.) As we have pointed out, correlation does not prove causation. For example, many universities are in economically fortunate areas. But does this mean having a good university raises local economic fortunes? Or do rich areas open universities? One needs a strategy to identify the causal link, if there is one, from university spending to local prosperity. One clever way to get at the answer to this question of linkage is by studying more or less an experiment arising from a unique custom of US university finance.
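A lagged correlation of the kind described above (funding swings matching productivity swings roughly three years later) can be sketched as follows. The data here is synthetic and constructed so that the second series echoes the first with a three-year delay; it is not the data from Haskel et al., and the function names and noise level are assumptions for illustration only.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def lagged_correlation(first, second, lag):
    """Correlate first[t] with second[t + lag] over the overlapping years."""
    if lag == 0:
        return pearson(first, second)
    return pearson(first[:-lag], second[lag:])


rng = random.Random(1)  # fixed seed so repeated runs give the same numbers
years = 40

# Synthetic yearly swings in funding (hypothetical, unitless).
funding = [rng.gauss(0, 1) for _ in range(years)]

# Synthetic productivity: funding from three years earlier plus noise.
# The first three years have no earlier funding value, so they are noise only.
productivity = [
    (funding[t - 3] if t >= 3 else 0.0) + rng.gauss(0, 0.3)
    for t in range(years)
]

by_lag = {lag: lagged_correlation(funding, productivity, lag) for lag in range(6)}
best_lag = max(by_lag, key=by_lag.get)

for lag, r in by_lag.items():
    print(f"lag {lag}: r = {r:+.2f}")
print("strongest correlation at lag", best_lag)
```

Finding the lag with the strongest correlation only describes the timing of the association. As the passage goes on to say, a separate identification strategy is still needed before reading it as a causal effect.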


pages: 295 words: 89,430

Small Data: The Tiny Clues That Uncover Huge Trends by Martin Lindstrom

autonomous vehicles, Berlin Wall, big-box store, correlation does not imply causation, Edward Snowden, Fall of the Berlin Wall, land reform, Mikhail Gorbachev, Murano, Venice glass, Richard Florida, rolodex, self-driving car, Skype, Snapchat, Steve Jobs, Steven Pinker, too big to fail, urban sprawl

A source who works at Google once confessed to me that despite the almost 3 billion humans who are online, and the 70 percent of online shoppers who go onto Facebook daily, and the 300 hours of videos on YouTube (which is owned by Google) uploaded every minute, and the fact that 90 percent of all the world’s data has been generated over the last two years, Google ultimately has only limited information about consumers. Yes, search engines can detect unusual correlations (as opposed to causations). With 70 percent accuracy, my source tells me, software can assess how people feel based on the way they type, and the number of typos they make. With 79 percent precision, software can determine a user’s credit rating based on the degree to which they write in ALL CAPS. Yet even with all these stats, Google has come to realize it knows almost nothing about humans and what really drives us, and it is now bringing in consultants to do what small data researchers have been doing for decades.


pages: 660 words: 141,595

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost, Tom Fawcett

Albert Einstein, Amazon Mechanical Turk, big data - Walmart - Pop Tarts, bioinformatics, business process, call centre, chief data officer, Claude Shannon: information theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, data acquisition, David Brooks, en.wikipedia.org, Erik Brynjolfsson, Gini coefficient, information retrieval, intangible asset, iterative process, Johann Wolfgang von Goethe, Louis Pasteur, Menlo Park, Nate Silver, Netflix Prize, new economy, p-value, pattern recognition, placebo effect, price discrimination, recommendation engine, Ronald Coase, selection bias, Silicon Valley, Skype, speech recognition, Steve Jobs, supply-chain management, text mining, The Signal and the Noise by Nate Silver, Thomas Bayes, transaction costs, WikiLeaks

The contents of these New Briefs varied, but because of their very similar form they clustered together:

BRIEF-Apple releases Safari 3.1
BRIEF-Apple introduces ilife 2009
BRIEF-Apple announces iPhone 2.0 software beta
BRIEF-Apple to offer movies on iTunes same day as DVD release
BRIEF-Apple says sold one million iPhone 3G's in first weekend

As we can see, some of these clusters are interesting and thematically consistent while others are not. Some are just collections of superficially similar text. There is an old cliché in statistics: Correlation is not causation, meaning that just because two things co-occur doesn’t mean that one causes another. A similar caveat in clustering could be: Syntactic similarity is not semantic similarity. Just because two things—particularly text passages—have common surface characteristics doesn’t mean they’re necessarily related semantically. We shouldn’t expect every cluster to be meaningful and interesting. Nevertheless, clustering is often a useful tool to uncover structure in our data that we did not foresee.
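The point about surface similarity can be made concrete with the five headlines quoted above. The sketch below uses Jaccard similarity over word tokens, which is an assumption made for illustration; the excerpt does not say which similarity measure or clustering method the authors used.

```python
import re
from itertools import combinations

headlines = [
    "BRIEF-Apple releases Safari 3.1",
    "BRIEF-Apple introduces ilife 2009",
    "BRIEF-Apple announces iPhone 2.0 software beta",
    "BRIEF-Apple to offer movies on iTunes same day as DVD release",
    "BRIEF-Apple says sold one million iPhone 3G's in first weekend",
]


def tokens(text):
    """Lowercased runs of letters and digits; punctuation such as '-' splits words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Number of shared tokens divided by the number of distinct tokens in both texts."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


for a, b in combinations(headlines, 2):
    shared = sorted(tokens(a) & tokens(b))
    print(f"{jaccard(a, b):.2f}  shared={shared}")
```

Every pair shares at least the tokens "brief" and "apple", which come from the common template rather than from the subject of the story. A measure like this sees the template, which is one way a cluster can end up syntactically tight without being semantically coherent.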

Check out Wikipedia to find out more about them. Data-Driven Causal Explanation and a Viral Marketing Example One important topic that we have only touched on in this book (in Chapter 2 and Chapter 11) is causal explanation from data. Predictive modeling is extremely useful for many business problems. However, the sort of predictive modeling that we have discussed so far is based on correlations rather than on knowledge of causation. We often want to look more deeply into a phenomenon and ask what influences what. We may want to do this simply to understand our business better, or we may want to use data to improve decisions about how to intervene to cause a desired outcome. Consider a detailed example. Recently there has been much attention paid to “viral” marketing. One common interpretation of viral marketing is that consumers can be helped to influence each other to purchase a product, and so a marketer can get significant benefit by “seeding” certain consumers (e.g., by giving them the product for free), and they then will be “influencers”— they will cause an increase in the likelihood that the people they know will purchase the product.


pages: 527 words: 147,690

Terms of Service: Social Media and the Price of Constant Connection by Jacob Silverman

23andMe, 4chan, A Declaration of the Independence of Cyberspace, Airbnb, airport security, Amazon Mechanical Turk, augmented reality, basic income, Brian Krebs, California gold rush, call centre, cloud computing, cognitive dissonance, commoditize, correlation does not imply causation, Credit Default Swap, crowdsourcing, don't be evil, drone strike, Edward Snowden, feminist movement, Filter Bubble, Firefox, Flash crash, game design, global village, Google Chrome, Google Glasses, hive mind, income inequality, informal economy, information retrieval, Internet of things, Jaron Lanier, jimmy wales, Kevin Kelly, Kickstarter, knowledge economy, knowledge worker, late capitalism, license plate recognition, life extension, lifelogging, Lyft, Mark Zuckerberg, Mars Rover, Marshall McLuhan, mass incarceration, meta analysis, meta-analysis, Minecraft, move fast and break things, national security letter, Network effects, new economy, Nicholas Carr, Occupy movement, optical character recognition, payday loans, Peter Thiel, postindustrial economy, prediction markets, pre–internet, price discrimination, price stability, profit motive, quantitative hedge fund, race to the bottom, Ray Kurzweil, recommendation engine, rent control, RFID, ride hailing / ride sharing, self-driving car, sentiment analysis, shareholder value, sharing economy, Silicon Valley, Silicon Valley ideology, Snapchat, social graph, social intelligence, social web, sorting algorithm, Steve Ballmer, Steve Jobs, Steven Levy, TaskRabbit, technoutopianism, telemarketer, transportation-network company, Travis Kalanick, Turing test, Uber and Lyft, Uber for X, uber lyft, universal basic income, unpaid internship, women in the workforce, Y Combinator, Zipcar

Both are in the data collection and targeting business, and Silicon Valley collects heaps of data which the NSA would love to have.* Silicon Valley is merely targeting consumers with ads and prompts and nudges that might get them to click or to buy something. They are bound together by common interests, philosophies, and methods. One of the main problems with Big Data is that it produces correlations but not causations. We learn that two things seem to be related—for example, that people with a specific set of personal characteristics are prone to depression or bad driving—but we don’t learn why. This is ironic given that Big Data is the ultimate fact-producing discipline: it promises answers, actionable ones. But data itself can be messy and often must be smoothed over, interpreted, supplemented.

., 21 banality problem on Facebook, 45–50 Barbrook, Richard, 1–3, 4, 250–51 Barlow, John Perry, 251–52 Beacon advertising platform, Facebook, 287 “Bed Intruder Song” (Gregory Brothers), 71 BehaviorMatrix, 39 Beliebers, 147–48 Bellow, Saul, 59 Benjamin, Walter, 267 Berger, John, 24 Bergus, Nick, 31–32 Berlusconi, Silvio, 211 Beyond Verbal, 40–41 Bieber, Justin, 147–48 Big Brother (reality TV show), 135 Big Data overview, 232, 313–14, 316 correlations without causations, 315 and ethics, 325–26 future of, 329–32 as information harvesting, 297 need for, 323 and patterns, 315 uses for, 316, 327–28 See also informational appetite Bilton, Nick, 34 Binder, Matt, 170–72, 173 Bing search engine, 195 biometric targeting tools, 305–6 Blanchard, Nathalie, 308–9 Bleacher Report, 125–28 BlinkLink app, 358 Blodget, Henry, 125 Bogost, Ian, 264 Booker, Cory, 104–5 BookVibe, 34 Boorstin, Daniel J., 67, 104 Boston Marathon bombing, 110–11, 113 Bosworth, Andrew, 25 bots overview, 38–39 and fraudulent ad companies, 97–98 influence scores, 194 recognizing a trend, 89–90 remote personal assistants, 42–43 as substitutes for individuals, 151 botting, 85–87, 88–89 Boyd, Danah, 168, 274, 284, 291, 315 Boy Kings, The (Losse), 6, 129, 142n Bradbury, Ray, 339–40 Brady, Tom, 126 Brandeis, Louis, 288–90 Brand Yourself, 213 Breaking Bad hashtag, 94 “Breaking News Consumer’s Handbook” (On the Media), 109 BRICKiPhone, 359–60 Britain, 144, 306, 314 Bucher, Taina, 200, 201 Burberry, 96–97 businesses.


pages: 470 words: 148,730

Good Economics for Hard Times: Better Answers to Our Biggest Problems by Abhijit V. Banerjee, Esther Duflo

"Robert Solow", 3D printing, affirmative action, Affordable Care Act / Obamacare, Airbnb, basic income, Bernie Sanders, business cycle, call centre, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, charter city, correlation does not imply causation, creative destruction, Daniel Kahneman / Amos Tversky, David Ricardo: comparative advantage, decarbonisation, Deng Xiaoping, Donald Trump, Edward Glaeser, en.wikipedia.org, endowment effect, energy transition, Erik Brynjolfsson, experimental economics, experimental subject, facts on the ground, fear of failure, financial innovation, George Akerlof, high net worth, immigration reform, income inequality, Indoor air pollution, industrial cluster, industrial robot, information asymmetry, Intergovernmental Panel on Climate Change (IPCC), Jane Jacobs, Jean Tirole, Jeff Bezos, job automation, Joseph Schumpeter, labor-force participation, land reform, loss aversion, low skilled workers, manufacturing employment, Mark Zuckerberg, mass immigration, Network effects, new economy, New Urbanism, non-tariff barriers, obamacare, offshore financial centre, open economy, Paul Samuelson, place-making, price stability, profit maximization, purchasing power parity, race to the bottom, RAND corporation, randomized controlled trial, Richard Thaler, ride hailing / ride sharing, Robert Gordon, Ronald Reagan, school choice, Second Machine Age, secular stagnation, self-driving car, shareholder value, short selling, Silicon Valley, smart meter, social graph, spinning jenny, Steve Jobs, technology bubble, The Chicago School, The Future of Employment, The Market for Lemons, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, total factor productivity, trade liberalization, transaction costs, trickle-down economics, universal basic income, urban sprawl, very high income, War on Poverty, women in the workforce, working-age population, Y2K

According to the World Inequality Database team, in 1978 the bottom 50 percent and the top 10 percent of the population both took home the same share of Chinese income (27 percent). The two shares started diverging in 1978, with the poorest 50 percent taking less and less and the richest 10 percent taking more and more. By 2015, the top 10 percent received 41 percent of Chinese income, while the bottom 50 percent received 15 percent. Of course, correlation is not causation. Perhaps globalization per se did not cause the increase in inequality. Trade liberalizations almost never take place in a vacuum; in all these countries, trade reforms were part of a broader reform package. For example, the most drastic trade policy liberalization in Colombia in 1990 and 1991 coincided with changes in labor market regulation meant to substantially increase labor market flexibility.
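The income shares quoted above can be turned into a per-person comparison with a little arithmetic. This is a sketch added for illustration: the function name is hypothetical, and the only inputs are the percentages given in the passage. Dividing each group's income share by its population share gives that group's average income relative to the national average, and the ratio of the two shows how far apart a typical member of each group is.

```python
def per_capita_ratio(top_share, top_pop, bottom_share, bottom_pop):
    """Average income of a top-group member relative to a bottom-group member.

    Shares and population sizes are percentages of the national total.
    """
    return (top_share / top_pop) / (bottom_share / bottom_pop)


# 1978: top 10 percent and bottom 50 percent each received 27 percent of income.
ratio_1978 = per_capita_ratio(27, 10, 27, 50)

# 2015: top 10 percent received 41 percent, bottom 50 percent received 15 percent.
ratio_2015 = per_capita_ratio(41, 10, 15, 50)

print(f"1978: {ratio_1978:.1f}x, 2015: {ratio_2015:.1f}x")
```

On these figures the average person in the top 10 percent earned about 5 times as much as the average person in the bottom 50 percent in 1978, and about 13.7 times as much in 2015. The calculation describes the size of the gap only; it says nothing about what caused it.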

There is no evidence the Reagan tax cuts, or the Clinton top marginal rate increase, or the Bush tax cuts, did anything to change the long-run growth rate. Of course, as the Republican Paul Ryan, former Speaker of the House of Representatives, pointed out, there is no evidence that they did not. Many other things were happening at the same time. Ryan painstakingly explained to a journalist why all of these things lined up to make tax increases look good and tax decreases look bad: I wouldn’t say that correlation is causation. I would say Clinton had the tech-productivity boom, which was enormous. Trade barriers were going down in the Clinton years. He had the peace dividend he was enjoying.… The economy in the Bush years, by contrast, had to cope with the popping of the technology bubble, 9/11, a couple of wars and the financial meltdown.… Some of this is just the timing, not the person.… Just as the Keynesians say the economy would have been worse without the stimulus [that Mr.


pages: 56 words: 16,788

The New Kingmakers by Stephen O'Grady

AltaVista, Amazon Web Services, barriers to entry, cloud computing, correlation does not imply causation, crowdsourcing, David Heinemeier Hansson, DevOps, Jeff Bezos, Khan Academy, Kickstarter, Marc Andreessen, Mark Zuckerberg, Netflix Prize, Paul Graham, Ruby on Rails, Silicon Valley, Skype, software as a service, software is eating the world, Steve Ballmer, Steve Jobs, Tim Cook: Apple, Y Combinator

Seventeen months into its existence, Android was an interesting project, but an also-ran next to Apple’s iPhone OS (it was not renamed iOS until June 2010). Understanding that developers are more likely to build for themselves (what’s referred to in the industry as “scratching their own itch”), Google made sure that several thousand developers motivated enough to attend its conference had an Android device to use for themselves. The statistics axiom that correlation does not prove causation certainly applies here, but it’s impossible not to notice the timing of that handset giveaway. On the day that Google sent all of those I/O attendees home happy, the number of Android devices being activated per day was likely in the low tens of thousands (Google hasn’t made this data available). By the time the conference rolled around again a year later, the number was around 100,000.


pages: 265 words: 74,000

The Numerati by Stephen Baker

Berlin Wall, Black Swan, business process, call centre, correlation does not imply causation, Drosophila, full employment, illegal immigration, index card, Isaac Newton, job automation, job satisfaction, McMansion, Myron Scholes, natural language processing, PageRank, personalized medicine, recommendation engine, RFID, Silicon Valley, Skype, statistical model, Watson beat the top human players on Jeopardy!

A man can write like a woman, he says, but does he buy like a woman? Sifry goes on at length about the dangers of predicting people's behavior based on statistical correlations. "Let's say that according to my analytics, you said that Mission Impossible III was no good and that you can't wait to see Prairie Home Companion," he says. "I can't assume from that that you're an NPR listener. That's where you get into trouble." That's mistaking correlation for causation, he says. It's common among data miners—and most other humans. How many times have you heard people say, "They always do that..."? For Kaushansky, putting his skateboarding friend and a few others in the wrong tribes may not turn out to be too serious. That's why advertising and marketing are such wonderful testing grounds for the Numerati. If they screw up, the only harm is that we see the wrong ad or receive irrelevant coupons.


pages: 231 words: 73,818

The Achievement Habit: Stop Wishing, Start Doing, and Take Command of Your Life by Bernard Roth

Albert Einstein, Build a better mousetrap, Burning Man, cognitive bias, correlation does not imply causation, deskilling, fear of failure, functional fixedness, Mahatma Gandhi, Mark Zuckerberg, school choice, Silicon Valley, The Wealth of Nations by Adam Smith, zero-sum game

A woman comes up to him after some time and says, “Pardon me, sir, why are you snapping your fingers?” He replies, “I am keeping the tigers away.” She says, “Sir, except for the zoo, there’s not a tiger for thousands of miles.” “Pretty effective, isn’t it?” he says. This joke uses what is called a causal fallacy. The fallacy comes because the finger snapper mistakenly believes that correlation implies causation. This is just one of several logical fallacies in which two events that occur at the same time are taken to have a cause-and-effect relationship. This version of the fallacy is also known as cum hoc ergo propter hoc (Latin for “with this, therefore because of this”) or, simply, false cause. A similar fallacy—that an event that follows another was a consequence of the first—is described as post hoc ergo propter hoc (Latin for “after this, therefore because of this”).


pages: 589 words: 69,193

Mastering Pandas by Femi Anthony

Amazon Web Services, Bayesian statistics, correlation coefficient, correlation does not imply causation, Debian, en.wikipedia.org, Internet of things, natural language processing, p-value, random walk, side project, statistical model, Thomas Bayes

Correlation is the general term used in statistics for variables that exhibit dependence on each other. We can use this relationship to try to predict the value of one set of variables from the other; this is termed regression.

Correlation

The statistical dependence expressed in a correlation relationship does not imply a causal relationship between the two variables; the famous line on this is "Correlation does not imply Causation". Thus, a correlation between two variables or datasets indicates only a statistical association, not necessarily a causal relationship or dependence. For example, there is a correlation between the amount of ice cream purchased on a given day and the weather. For more information on correlation and dependency, refer to http://en.wikipedia.org/wiki/Correlation_and_dependence. The correlation measure, known as the correlation coefficient, is a number that captures the size and direction of the relationship between the two variables.
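To make "size and direction" concrete, here is a minimal sketch of the Pearson correlation coefficient written in plain Python. The temperature and sales figures are made up for illustration and do not come from the book. In pandas, `Series.corr` with its default Pearson method computes the same quantity.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient: covariance scaled by both spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Hypothetical daily observations (invented numbers).
temperature_c = [18, 21, 24, 27, 30, 33]
ice_cream_sold = [120, 150, 170, 210, 240, 260]   # rises with temperature
hot_drinks_sold = [200, 180, 150, 120, 100, 70]   # falls with temperature

r_ice = pearson(temperature_c, ice_cream_sold)
r_hot = pearson(temperature_c, hot_drinks_sold)

print(f"temperature vs ice cream:  r = {r_ice:+.3f}")
print(f"temperature vs hot drinks: r = {r_hot:+.3f}")
```

The sign of the coefficient gives the direction of the relationship and its magnitude (between 0 and 1) gives the strength of the linear association. A value near +1 or -1 still says nothing about which variable, if either, drives the other.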


pages: 233 words: 75,712

In Defense of Global Capitalism by Johan Norberg

anti-globalists, Asian financial crisis, capital controls, clean water, correlation does not imply causation, creative destruction, Deng Xiaoping, Edward Glaeser, Gini coefficient, half of the world's population has never made a phone call, Hernando de Soto, illegal immigration, income inequality, income per capita, informal economy, Joseph Schumpeter, Kenneth Rogoff, land reform, Lao Tzu, liberal capitalism, market fundamentalism, Mexican peso crisis / tequila crisis, Naomi Klein, new economy, open economy, prediction markets, profit motive, race to the bottom, rising living standards, Silicon Valley, Simon Kuznets, structural adjustment programs, The Wealth of Nations by Adam Smith, Tobin tax, trade liberalization, trade route, transaction costs, trickle-down economics, union organizing, zero-sum game

Criticism has been leveled at this type of regression analysis, which is based on statistics from many economies and tries to control for other factors that can affect economic outcomes, because of the many problems of measurement that such analysis involves. Coping with enormous masses of data is always a problem. Where exactly is the line between open and closed economies? How does one distinguish between correlation and causation? How can the direction of causation be established? Consider, after all, that it is common for countries implementing free trade to also introduce other liberal reforms, such as protection for property rights, reduced inflation, and balanced budgets. That makes it hard to separate the effects of one policy from the effects of another.8 The problems of measurement are real ones, and results of this kind always have to be taken with a grain of salt, but it remains interesting that, with so very few exceptions, those studies point to great advantages with free trade.


pages: 290 words: 76,216

What's Wrong with Economics? by Robert Skidelsky

"Robert Solow", additive manufacturing, agricultural Revolution, Black Swan, Bretton Woods, business cycle, Cass Sunstein, central bank independence, cognitive bias, conceptual framework, Corn Laws, corporate social responsibility, correlation does not imply causation, creative destruction, Daniel Kahneman / Amos Tversky, David Ricardo: comparative advantage, disruptive innovation, Donald Trump, full employment, George Akerlof, George Santayana, global supply chain, global village, Gunnar Myrdal, happiness index / gross national happiness, hindsight bias, Hyman Minsky, income inequality, index fund, inflation targeting, information asymmetry, Internet Archive, invisible hand, John Maynard Keynes: Economic Possibilities for our Grandchildren, Joseph Schumpeter, Kenneth Arrow, knowledge economy, labour market flexibility, loss aversion, Mark Zuckerberg, market clearing, market friction, market fundamentalism, Martin Wolf, means of production, moral hazard, paradox of thrift, Pareto efficiency, Paul Samuelson, Philip Mirowski, precariat, price anchoring, principal–agent problem, rent-seeking, Richard Thaler, road to serfdom, Robert Shiller, Robert Shiller, Ronald Coase, shareholder value, Silicon Valley, Simon Kuznets, survivorship bias, technoutopianism, The Chicago School, The Market for Lemons, The Nature of the Firm, the scientific method, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Thorstein Veblen, transaction costs, transfer pricing, Vilfredo Pareto, Washington Consensus, Wolfgang Streeck, zero-sum game

Some countries, regions and people do better than others. The result is growing inequality. To regret that is to regret the growth itself. It is to hold, in effect, that it is better for everyone . . . to remain equally poor. [This] seems to me . . . morally indefensible and practically untenable . . . This debate illustrates very well why economics is not a hard science. At issue is correlation versus causation (if two or more events run in parallel, which, if either, causes the other?), reliability of the data (how much trust can you put in official statistics?), the ideological complexion of economic models (is the world economy best understood as a unitary or binary system?), universal versus contingent truths (do different economic structures have the same laws of development?), the role of power (are market transactions spontaneous or induced?)


pages: 309 words: 81,975

Brave New Work: Are You Ready to Reinvent Your Organization? by Aaron Dignan

"side hustle", activist fund / activist shareholder / activist investor, Airbnb, Albert Einstein, autonomous vehicles, basic income, Bertrand Russell: In Praise of Idleness, bitcoin, Black Swan, blockchain, Buckminster Fuller, Burning Man, butterfly effect, cashless society, Clayton Christensen, clean water, cognitive bias, cognitive dissonance, corporate governance, corporate social responsibility, correlation does not imply causation, creative destruction, crony capitalism, crowdsourcing, cryptocurrency, David Heinemeier Hansson, deliberate practice, DevOps, disruptive innovation, don't be evil, Elon Musk, endowment effect, Ethereum, ethereum blockchain, Frederick Winslow Taylor, future of work, gender pay gap, Geoffrey West, Santa Fe Institute, gig economy, Google X / Alphabet X, hiring and firing, hive mind, income inequality, information asymmetry, Internet of things, Jeff Bezos, job satisfaction, Kevin Kelly, Kickstarter, Lean Startup, loose coupling, loss aversion, Lyft, Marc Andreessen, Mark Zuckerberg, minimum viable product, new economy, Paul Graham, race to the bottom, remote working, Richard Thaler, shareholder value, Silicon Valley, six sigma, smart contracts, Social Responsibility of Business Is to Increase Its Profits, software is eating the world, source of truth, Stanford marshmallow experiment, Steve Jobs, TaskRabbit, the High Line, too big to fail, Toyota Production System, uber lyft, universal basic income, Y Combinator, zero-sum game

Rather than hold these “steps” lightly, executives tend to view them as best practice and attempt to implement them linearly and authoritatively, often with lackluster results. Kotter himself updated the process in 2014, acknowledging that the steps should really be done concurrently and continuously. The problem isn’t that we use models and frameworks to better understand change (although we need to be careful about correlation versus causation when it comes to defining what “works”). The problem is that we mistake the organization for an ordered system. And so we oversimplify. As a result, we tend to force whatever is happening in the system—in the hearts and minds of our colleagues—into that framework. We start saying things such as “You’re frozen right now, Michael, and we’re in the ‘unfreeze’ part of our change process, so . . . we’re going to need you to change your attitude.”


pages: 276 words: 81,153

Outnumbered: From Facebook and Google to Fake News and Filter-Bubbles – the Algorithms That Control Our Lives by David Sumpter

affirmative action, Bernie Sanders, correlation does not imply causation, crowdsourcing, don't be evil, Donald Trump, Elon Musk, Filter Bubble, Google Glasses, illegal immigration, Jeff Bezos, job automation, Kenneth Arrow, Loebner Prize, Mark Zuckerberg, meta analysis, meta-analysis, Minecraft, Nate Silver, natural language processing, Nelson Mandela, p-value, prediction markets, random walk, Ray Kurzweil, Robert Mercer, selection bias, self-driving car, Silicon Valley, Skype, Snapchat, speech recognition, statistical model, Stephen Hawking, Steven Pinker, The Signal and the Noise by Nate Silver, traveling salesman, Turing test

In fact, the regression model I fitted to Facebook data does not reveal anything about the 76 per cent of people who didn’t register their political allegiance. While the data shows us that Democrats tend to like Harry Potter, it doesn’t necessarily tell us that other Harry Potter fans like the Democrats. This is the classic problem inherent to all statistical analyses: that of potentially confusing correlation with causation. A second limitation relates to the number of ‘likes’ needed to make predictions. The regression model only works when a person has made more than 50 ‘likes’ and, to make really reliable predictions, a few hundred ‘likes’ are required. In the Facebook data set, only 18 per cent of users ‘liked’ more than 50 sites. Since this data was collected, Facebook has succeeded in increasing the number of sites its users ‘like’, precisely so that it can better target advertising.
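The asymmetry in this excerpt (Democrats tending to like Harry Potter does not mean Harry Potter fans tend to be Democrats) can be made concrete with a small worked calculation. The counts below are entirely hypothetical and are not taken from the Facebook data set discussed in the book; they are chosen only to show that the two conditional probabilities can differ widely.

```python
# Hypothetical counts, invented for illustration only.
democrats_total = 300
democrats_like_hp = 210   # Democrats who 'like' Harry Potter
others_total = 700
others_like_hp = 420      # everyone else who 'likes' Harry Potter

# Share of Democrats who like Harry Potter.
p_hp_given_dem = democrats_like_hp / democrats_total

# Share of non-Democrats who like Harry Potter.
p_hp_given_other = others_like_hp / others_total

# Share of Harry Potter fans who are Democrats.
p_dem_given_hp = democrats_like_hp / (democrats_like_hp + others_like_hp)

print(f"P(likes HP | Democrat)     = {p_hp_given_dem:.2f}")
print(f"P(likes HP | not Democrat) = {p_hp_given_other:.2f}")
print(f"P(Democrat | likes HP)     = {p_dem_given_hp:.2f}")
```

Under these assumed counts, Democrats are more likely than others to like Harry Potter, yet only about a third of Harry Potter fans are Democrats, because the other group is larger.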


pages: 451 words: 103,606

Machine Learning for Hackers by Drew Conway, John Myles White

call centre, centre right, correlation does not imply causation, Debian, Erdős number, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, Paul Erdős, recommendation engine, social graph, SpamAssassin, statistical model, text mining, the scientific method, traveling salesman

You can perform scaling in R using the scale function:

    coef(lm(scale(y) ~ scale(x)))
    #   (Intercept)      scale(x)
    # -1.386469e-16  9.745586e-01

As you can see, in this case the correlation between x and y is exactly equal to the coefficient relating the two in linear regression after scaling both of them. This is a general fact about how correlations work, so you can always use linear regression to help you envision exactly what it means for two variables to be correlated. Because correlation is just a measure of how linear the relationship between two variables is, it tells us nothing about causality. This leads to the maxim that “correlation is not causation.” Nevertheless, it’s very important to know whether two things are correlated if you want to use one of them to make predictions about the other. That concludes our introduction to linear regression and the concept of correlation. In the next chapter, we’ll show how to run much more sophisticated regression models that can handle nonlinear patterns in data and simultaneously prevent overfitting.
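The fact stated in this excerpt, that the regression slope equals the correlation once both variables are scaled, can be checked numerically. The sketch below uses plain Python rather than the book's R, and the data are simulated under assumed parameters; the helper names are hypothetical and chosen for the illustration.

```python
import math
import random

def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)

# Simulated data with an assumed linear relationship plus noise.
x = [random.gauss(0, 2) for _ in range(500)]
y = [1.5 * xi + random.gauss(0, 1) for xi in x]

raw_slope = ols_slope(x, y)                                # depends on units
scaled_slope = ols_slope(standardize(x), standardize(y))   # unit-free
r = pearson_r(x, y)

print(f"slope on raw data:    {raw_slope:.4f}")
print(f"slope after scaling:  {scaled_slope:.4f}")
print(f"correlation:          {r:.4f}")
```

The raw slope depends on the units of x and y, while the slope after scaling matches r up to floating-point error. Neither number says anything about which variable, if either, drives the other.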


The Autistic Brain: Thinking Across the Spectrum by Temple Grandin, Richard Panek

Asperger Syndrome, correlation does not imply causation, dark matter, David Brooks, deliberate practice, double helix, ghettoisation, if you see hoof prints, think horses—not zebras, impulse control, Khan Academy, Mark Zuckerberg, meta analysis, meta-analysis, mouse model, neurotypical, pattern recognition, phenotype, Richard Feynman, selective serotonin reuptake inhibitor (SSRI), Silicon Valley, Steve Jobs, theory of mind, twin studies

They can transform a life merely being lived into a life worth living. So women who are pregnant or are thinking about becoming pregnant and who take antidepressants should consult a doctor and weigh the risks and benefits. In any case, we have to be very careful about looking for cause-and-effect relationships between environmental factors and genetics. As every scientist knows, correlation does not imply causation. An observed correlation—two events happening around the same time—might just be coincidence. Let’s use the now infamous vaccination controversy as a way to look at the logical complexity of a causation-versus-coincidence argument. The story goes like this. Parents routinely have their children vaccinated around age eighteen months. Some parents note that their children begin exhibiting signs of autism around age eighteen months—withdrawing into themselves, reversing the gains they’d made in learning language, engaging in repetitive behaviors.


pages: 483 words: 134,377

The Tyranny of Experts: Economists, Dictators, and the Forgotten Rights of the Poor by William Easterly

"Robert Solow", air freight, Andrei Shleifer, battle of ideas, Bretton Woods, British Empire, business process, business process outsourcing, Carmen Reinhart, clean water, colonial rule, correlation does not imply causation, creative destruction, Daniel Kahneman / Amos Tversky, Deng Xiaoping, desegregation, discovery of the americas, Edward Glaeser, en.wikipedia.org, European colonialism, Francisco Pizarro, fundamental attribution error, germ theory of disease, greed is good, Gunnar Myrdal, income per capita, invisible hand, James Watt: steam engine, Jane Jacobs, John Snow's cholera map, Joseph Schumpeter, Kenneth Arrow, Kenneth Rogoff, M-Pesa, microcredit, Monroe Doctrine, oil shock, place-making, Ponzi scheme, risk/return, road to serfdom, Silicon Valley, Steve Jobs, The Death and Life of Great American Cities, The Wealth of Nations by Adam Smith, Thomas L Friedman, urban planning, urban renewal, Washington Consensus, WikiLeaks, World Values Survey, young professional

42 As already mentioned, this survey does not meet the standards of hard evidence (which does not exist on the autocracy-versus-freedom issue either way). The survey does provide a rare opportunity for poor people to speak for themselves. The authors of the survey found a lot of poor people who contradicted the common assumption that poor people don’t care about their rights and care only about their material needs. EVIDENCE AND DEBATE The patterns discussed here do not prove that autocracy and collectivist values cause poverty—correlation is not causation. It could be that people who get rich for some other reason desire more individualism and democracy and are able to get it. Some studies cited here use some formal statistical methods to argue that a history of autocracy causes collectivist values, and both autocracy and collectivist values in turn cause poverty, but most economists find the methods used not very convincing. Some who favor technocratic approaches disqualify any discussion of rights because the evidence for positive consequences of rights is not rigorous enough.

And we also saw already what we see more evidence for here: a history of autocracy and violence breeds more lack of trust. The slave trade’s disastrous effects help explain a result that Nathan Nunn had already found in his doctoral dissertation—that among today’s African nations, those where Europeans had seized the most slaves were poorer than nations that had largely escaped slavery. Benin today is one of the poorest African nations.13 EVIDENCE WITH A CAUSE But once again correlation is not causation. It is plausible that the correlation could also run in reverse: poverty caused enslavement. Poorer people are less able to defend themselves because they cannot afford as many weapons as richer people. Also pre-existing lack of trust could have caused more enslavement. People who were already less trusting and less trustworthy are more likely to help the slavers by betraying their neighbors.


pages: 313 words: 94,490

Made to Stick: Why Some Ideas Survive and Others Die by Chip Heath, Dan Heath

affirmative action, availability heuristic, Barry Marshall: ulcers, correlation does not imply causation, desegregation, low cost airline, Menlo Park, Pepto Bismol, Ronald Reagan, Rosa Parks, shareholder value, Silicon Valley, Stephen Hawking, telemarketer

Marshall and Warren could not even get their research paper accepted by a medical journal. When Marshall presented their findings at a professional conference, the scientists snickered. One of the researchers who heard one of his presentations commented that he “simply didn’t have the demeanor of a scientist.” To be fair to the skeptics, they had a reasonable argument: Marshall and Warren’s evidence was based on correlation, not causation. Almost all of the ulcer patients seemed to have H. pylori. Unfortunately, there were also people who had H. pylori but no ulcer. And, as for proving causation, the researchers couldn’t very well dose a bunch of innocent people with bacteria to see whether they sprouted ulcers. By 1984, Marshall’s patience had run out. One morning he skipped breakfast and asked his colleagues to meet him in the lab.


Beautiful Data: The Stories Behind Elegant Data Solutions by Toby Segaran, Jeff Hammerbacher

23andMe, airport security, Amazon Mechanical Turk, bioinformatics, Black Swan, business intelligence, card file, cloud computing, computer vision, correlation coefficient, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, DARPA: Urban Challenge, data acquisition, database schema, double helix, en.wikipedia.org, epigenetics, fault tolerance, Firefox, Hans Rosling, housing crisis, information retrieval, lake wobegon effect, longitudinal study, Mars Rover, natural language processing, openstreetmap, prediction markets, profit motive, semantic web, sentiment analysis, Simon Singh, social graph, SPARQL, speech recognition, statistical model, supply-chain management, text mining, Vernor Vinge, web application

This idea has been used to describe a system that, due to an arms race of external pressures, must continue to co-evolve. . . . correlated with higher rates of divorce, unwed couples could avoid living together in order to improve their chances of staying together after marriage. The research described never suggested a causal link, but the journalist offered her own advice to couples based on the “data.” The substitution of correlation with causality need not be so explicit. When a scientific research project is undertaken, there exists the assumption that correlation, if discovered, would imply causation, albeit unknown. Else, why seek to answer a research question at all: large-scale search for correlation without causation is aleatory computation, not science. Even with so-called big data, science remains an intensely hypothesis-driven process. The limits of empirical research are not grounds to throw up our hands, only to be careful to push discovery forward without getting rosy-eyed about causality. Creating stories about data is only human: it’s the ability to revise consistently that makes a story sound.


pages: 281 words: 95,852

The Googlization of Everything by Siva Vaidhyanathan

1960s counterculture, activist fund / activist shareholder / activist investor, AltaVista, barriers to entry, Berlin Wall, borderless world, Burning Man, Cass Sunstein, choice architecture, cloud computing, computer age, corporate social responsibility, correlation does not imply causation, creative destruction, data acquisition, death of newspapers, don't be evil, Firefox, Francis Fukuyama: the end of history, full text search, global pandemic, global village, Google Earth, Howard Rheingold, informal economy, information retrieval, John Markoff, Joseph Schumpeter, Kevin Kelly, knowledge worker, libertarian paternalism, market fundamentalism, Marshall McLuhan, means of production, Mikhail Gorbachev, moral panic, Naomi Klein, Network effects, new economy, Nicholas Carr, PageRank, Panopticon Jeremy Bentham, pirate software, Ray Kurzweil, Richard Thaler, Ronald Reagan, side project, Silicon Valley, Silicon Valley ideology, single-payer health, Skype, Social Responsibility of Business Is to Increase Its Profits, social web, Steven Levy, Stewart Brand, technoutopianism, The Nature of the Firm, The Structural Transformation of the Public Sphere, Thorstein Veblen, urban decay, web application, zero-sum game

And Google was right.40 Needless to say, Anderson’s techno-fundamentalist hyperbole belies a vested interest in the narrative of the revolutionary and transformational power of computing. But here Anderson has stepped out even beyond the pop sociology and economics that usually dominate the magazine. Anderson claims “correlation is enough.”41 In other words, the entire process of generating scientific (or, for that matter, social-scientific) theories and modestly limiting claims to correlation without causation is obsolete and quaint: given enough data and enough computing power, you can draw strong enough correlations to claim with confidence that what you have discovered is indisputably true. The risk here is more than one of intellectual hubris: the academy has no dearth of that. Given the passionate promotion of such computational models for science of all types, we run the risk of diverting precious research funding and initiatives away from the hard, expensive, painstaking laboratory science that has worked so brilliantly for three centuries.


pages: 571 words: 105,054

Advances in Financial Machine Learning by Marcos Lopez de Prado

algorithmic trading, Amazon Web Services, asset allocation, backtesting, bioinformatics, Brownian motion, business process, Claude Shannon: information theory, cloud computing, complexity theory, correlation coefficient, correlation does not imply causation, diversification, diversified portfolio, en.wikipedia.org, fixed income, Flash crash, G4S, implied volatility, information asymmetry, latency arbitrage, margin call, market fragmentation, market microstructure, martingale, NP-complete, P = NP, p-value, paper trading, pattern recognition, performance metric, profit maximization, quantitative trading / quantitative finance, RAND corporation, random walk, risk-adjusted returns, risk/return, selection bias, Sharpe ratio, short selling, Silicon Valley, smart cities, smart meter, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, traveling salesman

The cost of lending and the amount available is generally unknown, and depends on relations, inventory, relative demand, etc. These are just a few basic errors that most papers published in journals make routinely. Other common errors include computing performance using a non-standard method (Chapter 14); ignoring hidden risks; focusing only on returns while ignoring other metrics; confusing correlation with causation; selecting an unrepresentative time period; failing to expect the unexpected; ignoring the existence of stop-out limits or margin calls; ignoring funding costs; and forgetting practical aspects (Sarfati [2015]). There are many more, but really, there is no point in listing them, because of the title of the next section. 11.3 Even If Your Backtest Is Flawless, It Is Probably Wrong Congratulations!


Data and the City by Rob Kitchin, Tracey P. Lauriault, Gavin McArdle

A Declaration of the Independence of Cyberspace, bike sharing scheme, bitcoin, blockchain, Bretton Woods, Chelsea Manning, citizen journalism, Claude Shannon: information theory, clean water, cloud computing, complexity theory, conceptual framework, corporate governance, correlation does not imply causation, create, read, update, delete, crowdsourcing, cryptocurrency, dematerialisation, digital map, distributed ledger, fault tolerance, fiat currency, Filter Bubble, floating exchange rates, global value chain, Google Earth, hive mind, Internet of things, Kickstarter, knowledge economy, lifelogging, linked data, loose coupling, new economy, New Urbanism, Nicholas Carr, open economy, openstreetmap, packet switching, pattern recognition, performance metric, place-making, RAND corporation, RFID, Richard Florida, ride hailing / ride sharing, semantic web, sentiment analysis, sharing economy, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, smart grid, smart meter, social graph, software studies, statistical model, TaskRabbit, text mining, The Chicago School, The Death and Life of Great American Cities, the market place, the medium is the message, the scientific method, Toyota Production System, urban planning, urban sprawl, web application

Rob Kitchin (2014a) has described how a new empiricist school of thought has emerged that takes the data these computer systems are generating at face value to produce direct insights in (amongst others) urban patterns. As one of their protagonists, Chris Anderson (2008), claims: We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot . . . Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all. In their vision, Batty’s observation about the use of computers in the city has come full circle: computer systems produce data about the city that allegedly give us a transparent look into the city’s dynamics. In turn, these data can be used to analyse the city in order to optimize that system.


pages: 181 words: 52,147

The Driver in the Driverless Car: How Our Technology Choices Will Create the Future by Vivek Wadhwa, Alex Salkever

23andMe, 3D printing, Airbnb, artificial general intelligence, augmented reality, autonomous vehicles, barriers to entry, Bernie Sanders, bitcoin, blockchain, clean water, correlation does not imply causation, distributed ledger, Donald Trump, double helix, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, Google bus, Hyperloop, income inequality, Internet of things, job automation, Kevin Kelly, Khan Academy, Kickstarter, Law of Accelerating Returns, license plate recognition, life extension, longitudinal study, Lyft, M-Pesa, Menlo Park, microbiome, mobile money, new economy, personalized medicine, phenotype, precision agriculture, RAND corporation, Ray Kurzweil, recommendation engine, Ronald Reagan, Second Machine Age, self-driving car, Silicon Valley, Skype, smart grid, stem cell, Stephen Hawking, Steve Wozniak, Stuxnet, supercomputer in your pocket, Tesla Model S, The Future of Employment, Thomas Davenport, Travis Kalanick, Turing test, Uber and Lyft, Uber for X, uber lyft, uranium enrichment, Watson beat the top human players on Jeopardy!, zero day

In February 2015, researchers from M.I.T. and from Harvard University released the results of the most comprehensive longitudinal study yet of how the diversity and types of gut flora affect onset of this type of diabetes.3 The scientists tracked what happened to the gut bacteria of a large number of subjects from birth to their third year in life, and found that children who became diabetic suffered a 25 percent reduction in their gut bacteria’s diversity. What’s more, the mix of bacteria shifted away from types known to promote health toward types known to promote inflammation. Correlation is not causation, but the results added to evidence that the bacteria in our intestines have a strong effect on our health. In fact, manipulating the microbiome may even become more important than genomics and gene-based medicine. Unlike genomics and gene therapy, which require a relatively heroic effort to induce physiological changes, tweaking the microbiome appears to be relatively straightforward and safe: just mix up a cocktail of the appropriate bacteria, and transplant it into your gut.


pages: 372 words: 111,573

10% Human: How Your Body's Microbes Hold the Key to Health and Happiness by Alanna Collen

Asperger Syndrome, Barry Marshall: ulcers, Berlin Wall, biofilm, clean water, correlation does not imply causation, David Strachan, discovery of penicillin, Drosophila, Fall of the Berlin Wall, friendly fire, germ theory of disease, global pandemic, hygiene hypothesis, Ignaz Semmelweis: hand washing, illegal immigration, John Snow's cholera map, Kickstarter, Louis Pasteur, Maui Hawaii, meta analysis, meta-analysis, microbiome, phenotype, placebo effect, the scientific method

The data included information about antibiotic use when the children were infants. It turned out that those who had been given antibiotics before the age of two – a startling 74 per cent of them – were on average nearly twice as likely to have developed asthma by the time they were eight. The more courses of antibiotics the children received, the more likely they were to develop asthma, eczema and hay fever. But, as the saying goes, correlation does not always mean causation. The lead researcher on the antibiotics study had discovered four years earlier that the more television children watched, the more likely they were to develop asthma. Of course, despite similar results as in the antibiotics study, no one really believed that the act of watching television could bring about immune dysfunction in the lungs. In fact, the number of hours in front of a television was being used as a proxy for the amount of exercise children were getting.

It took time for antibiotics to reach common usage, for further antibiotic drugs to be developed, for children to grow up with the influence of these drugs on their bodies, and for chronic diseases to develop in their own insidious way. It also takes time for the effects to become clear across populations, countries and continents. If the introduction of antibiotics in 1944 is in some way responsible for our current state of health, the 1950s are exactly when we would expect to see the dawning of their impact. Let us not jump the gun though. As any scientist would hasten to point out, correlation does not always mean causation. The timely introduction of antibiotics may be as unrealistic a connection to rising chronic illness as the self-serve supermarkets that made their debut in the 1940s. Connections alone, whilst useful guides, do not always provide a causal link. An amusing website about spurious correlations tells me that there’s an impressively close correlation between per capita consumption of cheese in the US and the number of people who die each year by becoming tangled in their bed sheets.


pages: 321 words: 85,893

The Vegetarian Myth: Food, Justice, and Sustainability by Lierre Keith

British Empire, car-free, clean water, cognitive dissonance, correlation does not imply causation, Drosophila, dumpster diving, en.wikipedia.org, Gary Taubes, Haber-Bosch Process, longitudinal study, McMansion, meta analysis, meta-analysis, out of africa, peak oil, placebo effect, Rosa Parks, the built environment

The kind of cross-country comparison that Keys did “involves comparing apples with oranges—that is countries with widely varying cultural, social, political and physical environments.”52 With such an infinite number of variables, a finding of definitive causation would be ridiculous. Figure 4A. Correlation between the total fat consumption as a percent of total calorie consumption, and mortality from coronary heart disease in six countries. Redrawn from The Cholesterol Myths by Uffe Ravnskov. John Yudkin’s 1957 study shows the error of conflating correlation with causation. You can see from Figure 4B (page over) that owning a TV and radio had a much stronger association with Coronary Heart Disease (CHD) than any nutritional elements.53 But no one would suggest that TV causes CHD, or that sacrificing our TVs will grant us a longer life. No one went on to investigate whether TVs produced heart-stopping emissions or blood-damaging toxins. No government health agency paid for people to throw out their TVs as a treatment for CHD.


pages: 322 words: 107,576

Bad Science by Ben Goldacre

Asperger Syndrome, correlation does not imply causation, experimental subject, hygiene hypothesis, Ignaz Semmelweis: hand washing, John Snow's cholera map, Louis Pasteur, meta analysis, meta-analysis, Nelson Mandela, offshore financial centre, p-value, placebo effect, publication bias, Richard Feynman, risk tolerance, Ronald Reagan, selection bias, selective serotonin reuptake inhibitor (SSRI), the scientific method, urban planning

Or you could get really serious, and start to manipulate the statistics. For two pages only, this book will now get quite nerdy. I understand if you want to skip it, but know that it is here for the doctors who bought the book to laugh at homeopaths. Here are the classic tricks to play in your statistical analysis to make sure your trial has a positive result. Ignore the protocol entirely Always assume that any correlation proves causation. Throw all your data into a spreadsheet programme and report—as significant—any relationship between anything and everything if it helps your case. If you measure enough, some things are bound to be positive just by sheer luck. Play with the baseline Sometimes, when you start a trial, quite by chance the treatment group is already doing better than the placebo group. If so, then leave it like that.
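The remark in this excerpt that, if you measure enough, some things are bound to be positive by sheer luck can be simulated directly. The sketch below is an illustration under stated assumptions, not anything from the book: it correlates 200 columns of pure noise with an unrelated outcome for 20 hypothetical subjects and counts how many pass a cutoff of |r| > 0.444, which is roughly the conventional 5 per cent two-sided threshold for 20 paired observations.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(42)

N_SUBJECTS = 20    # assumed trial size
N_MEASURES = 200   # assumed number of unrelated things measured
CUTOFF = 0.444     # approximate 5% two-sided threshold for n = 20

# The outcome is pure noise, so nothing measured can truly predict it.
outcome = [random.gauss(0, 1) for _ in range(N_SUBJECTS)]

false_positives = 0
for _ in range(N_MEASURES):
    unrelated = [random.gauss(0, 1) for _ in range(N_SUBJECTS)]
    if abs(pearson_r(unrelated, outcome)) > CUTOFF:
        false_positives += 1

print(f"{false_positives} of {N_MEASURES} noise variables look 'significant'")
```

With a 5 per cent threshold, around one in twenty unrelated measurements is expected to clear the bar, so reporting only the ones that do gives a misleading picture.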


pages: 465 words: 109,653

Free Ride by Robert Levine

A Declaration of the Independence of Cyberspace, Anne Wojcicki, book scanning, borderless world, Buckminster Fuller, citizen journalism, commoditize, correlation does not imply causation, creative destruction, crowdsourcing, death of newspapers, Edward Lloyd's coffeehouse, Electric Kool-Aid Acid Test, Firefox, future of journalism, Googley, Hacker Ethic, informal economy, Jaron Lanier, Joi Ito, Julian Assange, Justin.tv, Kevin Kelly, linear programming, Marc Andreessen, Mitch Kapor, moral panic, offshore financial centre, pets.com, publish or perish, race to the bottom, Saturday Night Live, Silicon Valley, Silicon Valley startup, Skype, spectrum auction, Steve Jobs, Steven Levy, Stewart Brand, subscription business, Telecommunications Act of 1996, Whole Earth Catalog, WikiLeaks

Asked about the study, Andersen says the Pirate Bay file-sharing service doesn’t offer much copyrighted music from major-label acts—although even a cursory glance at its Web site shows that isn’t true. Many factors have hurt music sales, including the closing of so many record stores. But almost every other study has concluded that file sharing played a role,74 and anyone who believes otherwise is running out of alternate explanations. Several studies have shown that individuals who download music illegally also buy it, but that proves only correlation, not causation. Some suggested CD sales fell because music fans are no longer replacing their old records, but “catalog” sales of older releases declined less than overall sales from 2004 to 2009.75 Others speculated that DVD sales cut into the CD market, but now they’re declining as well. Music sales have also declined disproportionately in countries where file sharing is more common. In Spain, where fully 45 percent of Internet users get media from pirate services—about twice the average rate in Europe76—CD sales declined 77 percent since 2001, compared with a continent-wide average decline of 54 percent.77 And Japan is now the No. 1 market for CDs, despite having one-third the population of the United States, partly because many people there access the Internet with mobile phones that don’t run file-sharing programs.78 In June 2010, with U.S. music sales less than half what they were a decade earlier, Oberholzer-Gee and Strumpf published another paper which conceded that file sharing had affected sales, but not that much and in a way that helped society.79 They said that 60 percent of online traffic consisted of file trading—an astonishing statistic that implies illegal downloading accounts for more than half of all Internet use—but that all this piracy accounted for 20 percent or less of the decline in music sales.


pages: 349 words: 114,038

Culture & Empire: Digital Revolution by Pieter Hintjens

4chan, airport security, AltaVista, anti-communist, anti-pattern, barriers to entry, Bill Duvall, bitcoin, blockchain, business climate, business intelligence, business process, Chelsea Manning, clean water, commoditize, congestion charging, Corn Laws, correlation does not imply causation, cryptocurrency, Debian, Edward Snowden, failed state, financial independence, Firefox, full text search, German hyperinflation, global village, GnuPG, Google Chrome, greed is good, Hernando de Soto, hiring and firing, informal economy, intangible asset, invisible hand, James Watt: steam engine, Jeff Rulifson, Julian Assange, Kickstarter, M-Pesa, mass immigration, mass incarceration, mega-rich, MITM: man-in-the-middle, mutually assured destruction, Naomi Klein, national security letter, Nelson Mandela, new economy, New Urbanism, Occupy movement, offshore financial centre, packet switching, patent troll, peak oil, pre–internet, private military company, race to the bottom, rent-seeking, reserve currency, RFC: Request For Comment, Richard Feynman, Richard Stallman, Ross Ulbricht, Satoshi Nakamoto, security theater, selection bias, Skype, slashdot, software patent, spectrum auction, Steve Crocker, Steve Jobs, Steven Pinker, Stuxnet, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, trade route, transaction costs, twin studies, union organizing, wealth creators, web application, WikiLeaks, Y2K, zero day, Zipf's Law

And yet, when people stopped talking about religions, and instead looked at the politics, we found solutions. This is a consistent pattern. Conflict is always political, yet leaders often invoke religion to bolster their followers, and create more tribalism. Outsiders, searching for simplistic explanations, and possibly arms sales, embrace this rhetoric as reality. As the conflict increases, the religious arguments will definitely increase. However, it's correlation, not causation. And in the end, the solution comes from addressing the original political issues. Until then, and as long as possible, the beneficiaries (war can be incredibly profitable!) will pump up the "irreconcilable ancient hatreds" angle. And so it goes with the Global Extremist Islamic Threat to Modern Civilization. It appeals to atheists and Christians alike, and provides convenient cover, both for unprecedented profit-taking, and for creating the spy networks.


pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos

Albert Einstein, Amazon Mechanical Turk, Arthur Eddington, basic income, Bayesian statistics, Benoit Mandelbrot, bioinformatics, Black Swan, Brownian motion, cellular automata, Claude Shannon: information theory, combinatorial explosion, computer vision, constrained optimization, correlation does not imply causation, creative destruction, crowdsourcing, Danny Hillis, data is the new oil, double helix, Douglas Hofstadter, Erik Brynjolfsson, experimental subject, Filter Bubble, future of work, global village, Google Glasses, Gödel, Escher, Bach, information retrieval, job automation, John Markoff, John Snow's cholera map, John von Neumann, Joseph Schumpeter, Kevin Kelly, lone genius, mandelbrot fractal, Mark Zuckerberg, Moneyball by Michael Lewis explains big data, Narrative Science, Nate Silver, natural language processing, Netflix Prize, Network effects, NP-complete, off grid, P = NP, PageRank, pattern recognition, phenotype, planetary scale, pre–internet, random walk, Ray Kurzweil, recommendation engine, Richard Feynman, scientific worldview, Second Machine Age, self-driving car, Silicon Valley, social intelligence, speech recognition, Stanford marshmallow experiment, statistical model, Stephen Hawking, Steven Levy, Steven Pinker, superintelligent machines, the scientific method, The Signal and the Noise by Nate Silver, theory of mind, Thomas Bayes, transaction costs, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, white flight, zero-sum game

This technique, called A/B testing, was at first used mainly in drug trials but has since spread to many fields where data can be gathered on demand, from marketing to foreign aid. It can also be generalized to try many combinations of changes at once, without losing track of which changes lead to which gains (or losses). Companies like Amazon and Google swear by it; you’ve probably participated in thousands of A/B tests without realizing it. A/B testing gives the lie to the oft-heard criticism that big data is only good for finding correlations, not causation. Philosophical fine points aside, learning causality is learning the effects of your actions, and anyone with a stream of data they can affect can do it—from a one-year-old splashing around in the bathtub to a president campaigning for reelection. Learning to relate If we endow Robby the robot with all the learning abilities we’ve seen so far in this book, he’ll be pretty smart but still a bit autistic.
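The idea behind A/B testing described above can be sketched as a small simulation. Everything here is a hypothetical illustration rather than anything from the book: the conversion rates, the function names, and the choice of a two-proportion z-statistic as the comparison are all assumptions. The point is only that random assignment is what lets the observed difference be read as an effect of the change.

```python
import math
import random


def run_ab_test(num_users, rate_a, rate_b, seed=0):
    """Simulate an experiment in which each user is randomly shown A or B.

    `rate_a` and `rate_b` are made-up true conversion rates for illustration.
    """
    rng = random.Random(seed)
    shown = {"A": 0, "B": 0}
    converted = {"A": 0, "B": 0}
    for _ in range(num_users):
        variant = rng.choice("AB")  # random assignment breaks confounding
        true_rate = rate_a if variant == "A" else rate_b
        shown[variant] += 1
        if rng.random() < true_rate:
            converted[variant] += 1
    return shown, converted


def two_proportion_z(shown, converted):
    """z-statistic for the difference in conversion rate (B minus A)."""
    p_a = converted["A"] / shown["A"]
    p_b = converted["B"] / shown["B"]
    pooled = (converted["A"] + converted["B"]) / (shown["A"] + shown["B"])
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / shown["A"] + 1 / shown["B"]))
    return (p_b - p_a) / std_err


if __name__ == "__main__":
    shown, converted = run_ab_test(num_users=20_000, rate_a=0.10, rate_b=0.15)
    print("shown:", shown)
    print("converted:", converted)
    print("z-statistic (B vs A):", round(two_proportion_z(shown, converted), 2))
```

Because the assignment is random, anything else that influences conversions is, on average, balanced between the two groups, which is why a large z-statistic here points to the change itself.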


pages: 425 words: 112,220

The Messy Middle: Finding Your Way Through the Hardest and Most Crucial Part of Any Bold Venture by Scott Belsky

23andMe, 3D printing, Airbnb, Albert Einstein, Anne Wojcicki, augmented reality, autonomous vehicles, Ben Horowitz, bitcoin, blockchain, Chuck Templeton: OpenTable, commoditize, correlation does not imply causation, cryptocurrency, delayed gratification, DevOps, Donald Trump, Elon Musk, endowment effect, hiring and firing, Inbox Zero, iterative process, Jeff Bezos, knowledge worker, Lean Startup, Lyft, Mark Zuckerberg, Marshall McLuhan, minimum viable product, move fast and break things, NetJets, Network effects, new economy, old-boy network, pattern recognition, Paul Graham, ride hailing / ride sharing, Silicon Valley, slashdot, Snapchat, Steve Jobs, subscription business, TaskRabbit, the medium is the message, Travis Kalanick, Uber for X, uber lyft, Y Combinator, young professional

For example, if a product manager is presenting his or her team a series of mock-ups and wireframes that were designed by a designer on the team, the designer should present that portion of the deck. This gives your staff ownership over their work and also allows the most knowledgeable person to lead the discussion. False attribution can wreak havoc in a team. This applies just as much to congratulating the wrong person as it does putting a success down to circumstance instead of skill. In our effort to optimize whatever seems to work, we’re liable to conflate correlation with causation. If things go well, it doesn’t necessarily mean someone’s tactics worked. Go a level deeper to understand whether a success resulted from good timing, external market forces, great skills and execution, or some combination of the above. Attribute success at the element level: the skills, decisions, tactics, relationships, and hard work that contributed to the outcome. Don’t make the mistake of abstracting away the forces driving the success.


pages: 384 words: 112,971

What’s Your Type? by Merve Emre

Albert Einstein, anti-communist, card file, correlation does not imply causation, Frederick Winslow Taylor, God and Mammon, Golden Gate Park, hiring and firing, index card, Isaac Newton, job satisfaction, late capitalism, means of production, Menlo Park, mutually assured destruction, Norman Mailer, p-value, Panopticon Jeremy Bentham, Ralph Waldo Emerson, Socratic dialogue, Stanford prison experiment, traveling salesman, upwardly mobile, uranium enrichment, women in the workforce

It was an intimidating machine, one that resembled a church organ, with a keyboard painted after a Mondrian city grid and four racks of processors, which, when hooked together by a large cable, created a phalanx of 17,000 transistors. The RCA 501 made it possible to score tests at an inhuman rate—one test every nine seconds in 1958, the year the computer first arrived at ETS. It also allowed ETS’s statisticians to write programs that could perform basic statistical analyses across large samples of test subjects: the calculation of means and modes, correlations and causations, the determination of significance. But the computer was more than just a convenient time- and labor-saving device. It was an indispensable technology given the new scale of Isabel’s and ETS’s operations. In just the first few years of their partnership, the indicator swept the country from east to west and back again with a frenetic, vital energy. ETS’s list of early adopters included not only Hay’s corporate clients and MacKinnon’s creative types but also the Protestant Episcopal Church in the United States of America, which had given Form C to seventy-two women applying to serve as directors of religious education for the church’s national council; the Palo Alto Public School District, which had tested two thousand students to help select participants for its gifted child program; Brown University, which had administered the indicator to all 950 members of its 1958 class; and the California State Department of Corrections, which had used the indicator to divide the inmates at their Vacaville prison into low- and high-risk populations.


pages: 514 words: 111,012

The Art of Monitoring by James Turnbull

Amazon Web Services, anti-pattern, cloud computing, continuous integration, correlation does not imply causation, Debian, DevOps, domain-specific language, failed state, Kickstarter, Kubernetes, microservices, performance metric, pull request, Ruby on Rails, software as a service, source of truth, web application, WebSocket

Visualization Visualizing data is both an incredibly powerful analytic and interpretive technique and an amazing learning tool. Throughout the book we'll look at ways to visualize the data and metrics we've collected. However metrics and their visualizations are often tricky to interpret. Humans tend towards apophenia—the perception of meaningful patterns within random data—when viewing visualizations. This often leads to sudden leaps from correlation to causation. This can be further exacerbated by the granularity and resolution of our available data, how we choose to represent it, and the scale on which we represent it. Our ideal visualizations will clearly show the data, with an emphasis on highlighting substance over visuals. In this book we've tried to build visuals that subscribe to these broad rules: Clearly show the data. Induce the viewer to think about the substance, not the visuals.


pages: 538 words: 121,670

Republic, Lost: How Money Corrupts Congress--And a Plan to Stop It by Lawrence Lessig

asset-backed security, banking crisis, carried interest, circulation of elites, cognitive dissonance, corporate personhood, correlation does not imply causation, crony capitalism, David Brooks, Edward Glaeser, Filter Bubble, financial deregulation, financial innovation, financial intermediation, invisible hand, jimmy wales, Martin Wolf, meta analysis, meta-analysis, Mikhail Gorbachev, moral hazard, Pareto efficiency, place-making, profit maximization, Ralph Nader, regulatory arbitrage, rent-seeking, Ronald Reagan, Sam Peltzman, Silicon Valley, single-payer health, The Wealth of Nations by Adam Smith, too big to fail, upwardly mobile, WikiLeaks, Zipcar

As Fiorina and Abrams put it, “the natural place to look for campaign money is in the ranks of the single-issue groups, and a natural strategy to motivate their members is to exaggerate the threats their enemies pose.”29 In this odd and certainly unintended way, then, the demand for cash could also be changing the substance of American politics. Could be, because all I’ve described is correlation, not causation. But at a minimum the correlation should concern us: On some issues, the parties become more united—those issues that appeal to corporate America. On other issues, the parties become more divided—the more campaign funds an issue inspires, the more extremely it gets framed. In both cases, the change correlates with a strategy designed to maximize campaign cash, while weakening the connection between what Congress does (or at least campaigns on) and the potential needs of ordinary Americans.


pages: 473 words: 121,895

Come as You Are: The Surprising New Science That Will Transform Your Sex Life by Emily Nagoski Ph.D.

cognitive dissonance, correlation does not imply causation, delayed gratification, meta analysis, meta-analysis, placebo effect, Skype, Snapchat, spaced repetition, the scientific method, twin studies

Suppose you recognize that nonconcordance exists, you acknowledge that it’s expecting without necessarily indicating enjoying or eagerness, and then you read the research that shows there is a correlation between nonconcordance and sexual dysfunctions related to desire and arousal.21 And so you decide that, because nonconcordance is associated with dysfunction, nonconcordance must be a problem. Which brings me to a sentence every undergraduate who takes a research methods class will memorize: “Correlation does not imply causation.” It refers to the cum hoc ergo propter hoc fallacy—“with this, therefore because of this”—which means that just because two things happen together doesn’t mean that one thing caused the other thing. The quintessential example in the twenty-first century is the relationship between pirates and global warming.22 This is a joke made by Bobby Henderson, as part of the belief system of the Church of the Flying Spaghetti Monster.


pages: 239 words: 77,436

Pure, White and Deadly: How Sugar Is Killing Us and What We Can Do to Stop It by John Yudkin

correlation coefficient, correlation does not imply causation, discovery of penicillin

I already knew from my own work that sugar at our current rate of consumption is a medical disaster. But to learn that Yudkin foresaw what a problem sugar was thirty-six years earlier, and at a much lower dose (i.e. before the advent of high-fructose corn syrup and the two-litre bottle) was a true revelation. Indeed, I was a Yudkin disciple and I hadn’t even realized it. Yudkin didn’t have the voluminous data that exist today. He had correlation, but not causation. He didn’t have mechanism. He didn’t know that sugar caused insulin resistance by being turned into fat in the liver through the process of de novo lipogenesis, or that sugar induced protein damage through the Maillard or browning reaction. He didn’t know that sugar was weakly addictive, although he surmised it. Despite that, Pure, White and Deadly draws direct lines between sugar and dental caries, gout, autoimmune disease, heart disease and cancer.


pages: 252 words: 74,167

Thinking Machines: The Inside Story of Artificial Intelligence and Our Race to Build the Future by Luke Dormehl

Ada Lovelace, agricultural Revolution, AI winter, Albert Einstein, Alexey Pajitnov wrote Tetris, algorithmic trading, Amazon Mechanical Turk, Apple II, artificial general intelligence, Automated Insights, autonomous vehicles, book scanning, borderless world, call centre, cellular automata, Claude Shannon: information theory, cloud computing, computer vision, correlation does not imply causation, crowdsourcing, drone strike, Elon Musk, Flash crash, friendly AI, game design, global village, Google X / Alphabet X, hive mind, industrial robot, information retrieval, Internet of things, iterative process, Jaron Lanier, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, Kickstarter, Kodak vs Instagram, Law of Accelerating Returns, life extension, Loebner Prize, Marc Andreessen, Mark Zuckerberg, Menlo Park, natural language processing, Norbert Wiener, out of africa, PageRank, pattern recognition, Ray Kurzweil, recommendation engine, remote working, RFID, self-driving car, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, social intelligence, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Pinker, strong AI, superintelligent machines, technological singularity, The Coming Technological Singularity, The Future of Employment, Tim Cook: Apple, too big to fail, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!

The case works by measuring the heart’s electrical patterns through the fingertips of the person holding it. An algorithm then analyses the regularity of their heartbeat and suggests if the person should see a doctor. As our environment gets ever smarter, we will enter an age of continuous, real-time risk assessments. For the first time in history it will be possible to draw constant correlations, and possibly causations, between a large number of genomic, physiological, biological and environmental factors on an individual basis. Wearable devices will tirelessly monitor our heart rate, blood oxygen levels, physical activity, breathing patterns, facial expression, lung function, voice inflection, brain waves, posture, sleep quality and more, in addition to external measurements like air quality and noise level.


pages: 268 words: 74,724

Who Needs the Fed?: What Taylor Swift, Uber, and Robots Tell Us About Money, Credit, and Why We Should Abolish America's Central Bank by John Tamny

Airbnb, bank run, Bernie Madoff, bitcoin, Bretton Woods, buy and hold, Carmen Reinhart, corporate raider, correlation does not imply causation, creative destruction, Credit Default Swap, crony capitalism, crowdsourcing, Donald Trump, Downton Abbey, fiat currency, financial innovation, Fractional reserve banking, full employment, George Gilder, Home mortgage interest deduction, Jeff Bezos, job automation, Joseph Schumpeter, Kenneth Rogoff, Kickstarter, liquidity trap, Mark Zuckerberg, market bubble, money market fund, moral hazard, mortgage tax deduction, NetJets, offshore financial centre, oil shock, peak oil, Peter Thiel, price stability, profit motive, quantitative easing, race to the bottom, Ronald Reagan, self-driving car, sharing economy, Silicon Valley, Silicon Valley startup, Steve Jobs, The Wealth of Nations by Adam Smith, too big to fail, Travis Kalanick, Uber for X, War on Poverty, yield curve

But for the purposes of this chapter, FDR’s dollar meddling requires discussion, because one of the most common objections to the Federal Reserve is that since its creation in 1913, the dollar has lost more than 90 percent of its value. It’s a horrid number, and the unseen is the massive economic advances that would have made the abundant present seem impoverished by comparison but that did not come into being. However, this objection to the Fed is one of those instances where correlation is not causation. Lest we forget, FDR decided to devalue the dollar, and per Shlaes, “It did not matter what the Federal Reserve said.” Stated simply, the first major decline in the value of the dollar had nothing to do with the Fed. So incensed was Fed Chairman Eugene Meyer by FDR’s decision that he actually resigned.6 Let’s shift to 1944 and the Bretton Woods monetary conference at the Mount Washington Hotel.


pages: 492 words: 141,544

Red Moon by Kim Stanley Robinson

artificial general intelligence, basic income, blockchain, Brownian motion, correlation does not imply causation, cryptocurrency, Deng Xiaoping, gig economy, Hyperloop, illegal immigration, income inequality, invisible hand, low earth orbit, Magellanic Cloud, megacity, precariat, Schrödinger's Cat, seigniorage, strong AI, Turing machine, universal basic income, zero-sum game

It was a brief message conveyed by laser light. An amateur astronomer observing the moon was in the beam’s target circle, and captured a recording of part of it. It was an encrypted message.” “And you broke the code?” “No. But the timing of this message is suggestive. An hour after this light from the moon was seen, people from all over China began to head for Beijing.” “Coincidence?” Zhou suggested. “Correlation, not causation?” Bo and Dhu did not reply. Ta Shu saw that Zhou was not going to share anything with these two, just out of a general sense of caution. War of the agencies at least, and maybe something more. The discipline inspection commission didn’t have much direct presence on the moon, so far as Ta Shu knew, even if they did oversee the Lunar Authority as they did all the agencies. So as interlopers these two were not going to get very far with locals like Zhou.


pages: 504 words: 129,087

The Ones We've Been Waiting For: How a New Generation of Leaders Will Transform America by Charlotte Alter

"side hustle", 4chan, affirmative action, Affordable Care Act / Obamacare, basic income, Berlin Wall, Bernie Sanders, carbon footprint, clean water, collective bargaining, Columbine, corporate personhood, correlation does not imply causation, Credit Default Swap, crowdsourcing, David Brooks, Donald Trump, double helix, East Village, ending welfare as we know it, Fall of the Berlin Wall, feminist movement, Ferguson, Missouri, financial deregulation, Francis Fukuyama: the end of history, gig economy, glass ceiling, Google Hangouts, housing crisis, illegal immigration, immigration reform, income inequality, Intergovernmental Panel on Climate Change (IPCC), job-hopping, Kevin Kelly, knowledge economy, Lyft, mandatory minimum, Marc Andreessen, Mark Zuckerberg, mass incarceration, McMansion, medical bankruptcy, move fast and break things, Nate Silver, obamacare, Occupy movement, passive income, pre–internet, race to the bottom, RAND corporation, Ronald Reagan, sexual politics, Silicon Valley, single-payer health, Snapchat, TaskRabbit, too big to fail, Uber and Lyft, uber lyft, universal basic income, unpaid internship, We are the 99%, white picket fence, working poor, Works Progress Administration

These findings have been replicated elsewhere: a 2014 peer-reviewed study in the Journal of Social Psychology found that kids who read Harry Potter had more tolerance toward social outsiders than kids who didn’t. It’s hard to establish whether millennials were influenced by Harry Potter or if the series took off because it touched on themes that were already brewing in young minds in the late 1990s and early 2000s. Most likely, the huge influence of Harry Potter and the rise of progressive attitudes among millennials are correlated, not causational. Harry’s world—the heroes and villains, the assumptions and challenges—largely mirrors millennial attitudes about what’s good (teamwork, diversity, tolerance) and what’s bad (bigotry, racial purity, authoritarianism). Not everyone was happy about it: evangelical pastors condemned the book, Pope Benedict XVI warned that it might “distort Christianity,” and it topped the American Library Association’s list of most frequently challenged books, primarily in conservative communities.


pages: 428 words: 134,832

Straphanger by Taras Grescoe

active transport: walking or cycling, Affordable Care Act / Obamacare, airport security, Albert Einstein, big-box store, bike sharing scheme, Boris Johnson, British Empire, call centre, car-free, carbon footprint, City Beautiful movement, congestion charging, correlation does not imply causation, David Brooks, deindustrialization, East Village, edge city, Enrique Peñalosa, extreme commuting, financial deregulation, Frank Gehry, glass ceiling, Golden Gate Park, housing crisis, hydraulic fracturing, indoor plumbing, intermodal, invisible hand, Jane Jacobs, jitney, Joan Didion, Kickstarter, Kitchen Debate, laissez-faire capitalism, Marshall McLuhan, mass immigration, McMansion, megacity, mortgage tax deduction, Network effects, New Urbanism, obamacare, oil shale / tar sands, oil shock, Own Your Own Home, peak oil, pension reform, Peter Calthorpe, Ponzi scheme, Ronald Reagan, Rosa Parks, sensible shoes, Silicon Valley, Skype, the built environment, The Death and Life of Great American Cities, the High Line, transit-oriented development, union organizing, urban planning, urban renewal, urban sprawl, walkable city, white flight, working poor, young professional, Zipcar

Louis and Atlanta, you were five times more likely than the national average to be the victim of a murder or robbery, a burglary, or a motor vehicle theft. It’s an interesting list. One thing the nation’s worst crime hot spots seem to have in common is that they are highly sprawled metropolitan regions—Greater St. Louis covers almost 8,500 square miles—whose atrophied public transport systems make their residents almost completely dependent on cars. Any responsible criminologist would protest that only a fool confounds correlation and causation. Fair enough, though this raises another question: Doesn’t believing that your transportation and housing choices shield you from crime when they actually make you more likely to be a victim of it mean you are already living in a fool’s paradise? Rubber and Rail Planners are at ease talking about residential densities, workplace clusters, and transit ridership rates, but they are strangely silent on the role skin color plays in public transport.


pages: 470 words: 130,269

The Marginal Revolutionaries: How Austrian Economists Fought the War of Ideas by Janek Wasserman

Albert Einstein, American Legislative Exchange Council, anti-communist, battle of ideas, Berlin Wall, Bretton Woods, business cycle, collective bargaining, Corn Laws, correlation does not imply causation, creative destruction, David Ricardo: comparative advantage, different worldview, Donald Trump, experimental economics, Fall of the Berlin Wall, floating exchange rates, Fractional reserve banking, Francis Fukuyama: the end of history, full employment, Gunnar Myrdal, housing crisis, Internet Archive, invisible hand, John von Neumann, Joseph Schumpeter, laissez-faire capitalism, liberal capitalism, market fundamentalism, mass immigration, means of production, Menlo Park, Mont Pelerin Society, New Journalism, New Urbanism, old-boy network, Paul Samuelson, Philip Mirowski, price mechanism, price stability, RAND corporation, random walk, rent control, road to serfdom, Robert Bork, rolodex, Ronald Coase, Ronald Reagan, Silicon Valley, Simon Kuznets, The Chicago School, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, trade liberalization, union organizing, urban planning, Vilfredo Pareto, Washington Consensus, zero-sum game, éminence grise

But they are not made to appreciate one another.”31 Although the debate concerned significant philosophical issues, it was primarily a dispute between two irascible, vain professors, Menger and Schmoller. Simmering beneath the surface for over a decade, petty resentments erupted into an academic contretemps in 1883. The vitriol bewildered participants and observers. The debate resolved little and cost much. More than any other event, the dispute precipitated the formation of a distinctive Austrian School, yet one must not confuse correlation and causation: the latter-day Austrian approach owed little to this contest of egos. Instead, a new cohort of scholars entered the field simultaneously with the struggle and began to enrich the embryonic Austrian approach.32 In the decade after Principles appeared, it barely made an impression outside of Vienna. The silence from the German Reich in particular was deafening. Menger did not see himself as a heterodox thinker and wanted to contribute to the improvement of German-language scholarship.


pages: 301 words: 89,076

The Globotics Upheaval: Globalisation, Robotics and the Future of Work by Richard Baldwin

agricultural Revolution, Airbnb, AltaVista, Amazon Web Services, augmented reality, autonomous vehicles, basic income, business process, business process outsourcing, call centre, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, commoditize, computer vision, Corn Laws, correlation does not imply causation, Credit Default Swap, David Ricardo: comparative advantage, declining real wages, deindustrialization, deskilling, Donald Trump, Douglas Hofstadter, Downton Abbey, Elon Musk, Erik Brynjolfsson, facts on the ground, future of journalism, future of work, George Gilder, Google Glasses, Google Hangouts, hiring and firing, impulse control, income inequality, industrial robot, intangible asset, Internet of things, invisible hand, James Watt: steam engine, Jeff Bezos, job automation, knowledge worker, laissez-faire capitalism, low skilled workers, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, manufacturing employment, Mark Zuckerberg, mass immigration, mass incarceration, Metcalfe’s law, new economy, optical character recognition, pattern recognition, Ponzi scheme, post-industrial society, post-work, profit motive, remote working, reshoring, ride hailing / ride sharing, Robert Gordon, Robert Metcalfe, Ronald Reagan, Second Machine Age, self-driving car, side project, Silicon Valley, Skype, Snapchat, social intelligence, sovereign wealth fund, standardized shipping container, statistical model, Stephen Hawking, Steve Jobs, supply-chain management, TaskRabbit, telepresence, telepresence robot, telerobotics, Thomas Malthus, trade liberalization, universal basic income

Note that this importance of hairdos would not be explicit—you probably couldn’t even be sure if it was baked into the algorithm. In the 1960s and 1970s, something fundamental changed that led many boys to have longer hair and many girls to have shorter hair. Using the 1950s algorithm would thus misclassify many students. The topline here is that AI-trained robots do not understand the world. They just understand patterns in their training data sets. This reliance on correlation rather than causation will inevitably lead to very systematic mistakes when underlying factors change. This is another reason AI robots are unlikely to be trusted with critical tasks. There is no danger in letting them suggest tags for your Facebook friends. There could be real danger if we fully relied on them for more essential tasks. There will long be a demand for having humans in the decision loop.


pages: 688 words: 147,571

Robot Rules: Regulating Artificial Intelligence by Jacob Turner

Ada Lovelace, Affordable Care Act / Obamacare, AI winter, algorithmic trading, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, autonomous vehicles, Basel III, bitcoin, blockchain, brain emulation, Clapham omnibus, cognitive dissonance, corporate governance, corporate social responsibility, correlation does not imply causation, crowdsourcing, distributed ledger, don't be evil, Donald Trump, easy for humans, difficult for computers, effective altruism, Elon Musk, financial exclusion, financial innovation, friendly fire, future of work, hive mind, Internet of things, iterative process, job automation, John Markoff, John von Neumann, Loebner Prize, medical malpractice, Nate Silver, natural language processing, nudge unit, obamacare, off grid, pattern recognition, Peace of Westphalia, race to the bottom, Ray Kurzweil, Rodney Brooks, self-driving car, Silicon Valley, Stanislav Petrov, Stephen Hawking, Steve Wozniak, strong AI, technological singularity, Tesla Model S, The Coming Technological Singularity, The Future of Employment, The Signal and the Noise by Nate Silver, Turing test, Vernor Vinge

If the only data made available to the AI are age and gender, then it is most likely that the AI will select younger men for the job. However, the gender or indeed age of the applicants is not strictly relevant at all to their aptitude. Rather, the key skills which building site labourers need are strength and dexterity. This may be correlated with age, and it may be correlated with gender (especially as regards strength). But it is important not to confuse correlation with causation: both of these data points are merely ciphers for the salient ones of strength and dexterity. If the AI was trained using data based on core aptitudes, then it would result in choices which might still favour young men, but at least it would do so in a way which minimises bias. 3.2.5 Bias in the Training of AI AI training bias applies particularly to reinforcement learning: a type of AI which (as noted in Chapter 2) is trained using a “reward” function when it gets a right answer.


pages: 187 words: 55,801

The New Division of Labor: How Computers Are Creating the Next Job Market by Frank Levy, Richard J. Murnane

Atul Gawande, business cycle, call centre, computer age, Computer Numeric Control, correlation does not imply causation, David Ricardo: comparative advantage, deskilling, Frank Levy and Richard Murnane: The New Division of Labor, Gunnar Myrdal, hypertext link, index card, information asymmetry, job automation, knowledge economy, knowledge worker, low skilled workers, low-wage service sector, pattern recognition, profit motive, Robert Shiller, Ronald Reagan, speech recognition, talking drums, telemarketer, The Wealth of Nations by Adam Smith, working poor

The pattern for routine manual tasks—tasks that might be subsumed by automation—is roughly similar: a slight rise during the 1970s and a steady decline in the subsequent two decades. The share of the labor force working in occupations that emphasize nonroutine manual tasks declined throughout the period. This reflects in part the movement of manufacturing jobs offshore. The data in figures 3.2 (occupations) and 3.5 (tasks) are consistent with our description of computers’ economic impacts. But this correlation does not prove causation—the trend in both figures could have been caused by other factors. To make a stronger case, we must increase the level of detail to look at changes within industries. If our argument is right—if the adoption of computers shifts work away from routine tasks and toward tasks requiring expert thinking and complex communication—it should be observable when we look within industries. Specifically, we can ask: are those industries that invested most heavily in computers the industries where we see the greatest changes in task structure?


pages: 346 words: 90,371

Rethinking the Economics of Land and Housing by Josh Ryan-Collins, Toby Lloyd, Laurie Macfarlane

"Robert Solow", agricultural Revolution, asset-backed security, balance sheet recession, bank run, banking crisis, barriers to entry, basic income, Bretton Woods, business cycle, Capital in the Twenty-First Century by Thomas Piketty, collective bargaining, Corn Laws, correlation does not imply causation, creative destruction, credit crunch, debt deflation, deindustrialization, falling living standards, financial deregulation, financial innovation, Financial Instability Hypothesis, financial intermediation, full employment, garden city movement, George Akerlof, ghettoisation, Gini coefficient, Hernando de Soto, housing crisis, Hyman Minsky, income inequality, information asymmetry, knowledge worker, labour market flexibility, labour mobility, land reform, land tenure, land value tax, Landlord’s Game, low skilled workers, market bubble, market clearing, Martin Wolf, means of production, money market fund, mortgage debt, negative equity, Network effects, new economy, New Urbanism, Northern Rock, offshore financial centre, Pareto efficiency, place-making, price stability, profit maximization, quantitative easing, rent control, rent-seeking, Richard Florida, Right to Buy, rising living standards, risk tolerance, Second Machine Age, secular stagnation, shareholder value, the built environment, The Great Moderation, The Market for Lemons, The Spirit Level, The Wealth of Nations by Adam Smith, Thomas Malthus, transaction costs, universal basic income, urban planning, urban sprawl, working poor, working-age population

Without the existence of a credit- and money-creating banking system, it is impossible to envisage how such huge increases in prices would have been possible given the slower pace of income growth. The relationship between mortgage credit and house prices also applies internationally. An International Monetary Fund (IMF) study of thirty-six advanced and emerging economies (including the UK) found that a 10 percentage point growth in mortgage credit as a percentage of GDP was associated with a 16 percentage point higher growth of real house prices (IMF, 2011). Of course, correlation is not causation and it is likely that rising house prices, potentially driven by other factors, lead to an increase in demand for mortgage credit which itself helps to drive up house prices (Goodhart and Hofmann, 2008). Figure 5.3 Disaggregated nominal credit stocks (loans outstanding) as % of GDP in the UK since 1963 (source: Bank of England, GDP from ONS; credit series are break adjusted) Figure 5.3 shows how, since the early 1980s, UK banks have significantly increased their lending to domestic mortgages relative to GDP.


Upstream: The Quest to Solve Problems Before They Happen by Dan Heath

Affordable Care Act / Obamacare, airport security, Albert Einstein, bank run, British Empire, Buckminster Fuller, call centre, cloud computing, cognitive dissonance, colonial rule, correlation does not imply causation, cuban missile crisis, en.wikipedia.org, epigenetics, illegal immigration, Internet of things, mandatory minimum, millennium bug, move fast and break things, payday loans, Ralph Nader, RAND corporation, randomized controlled trial, self-driving car, Skype, Snapchat, subscription business, urban planning, Watson beat the top human players on Jeopardy!, Y2K

A journalist makes the choice to fight on behalf of the millions of women enduring sexual harassment. A woman pressured into a C-section becomes a champion for thousands of other mothers she’ll never meet. The upstream advocate concludes: I was not the one who created this problem. But I will be the one to fix it. That shift in ownership—and its consequences—is what we will analyze next. I. The old warnings about correlation not equaling causation apply here. There was no guarantee that improving freshmen’s FOT scores would boost the graduation rates. But there were good reasons to believe the two were linked causally, and of course they were tracking their efforts so that they could prove it. CHAPTER 3 A Lack of Ownership Until 1994, Ray Anderson, the founder of the industrial carpet firm Interface, had lived every entrepreneur’s dream.


pages: 223 words: 10,010

The Cost of Inequality: Why Economic Equality Is Essential for Recovery by Stewart Lansley

"Robert Solow", banking crisis, Basel III, Big bang: deregulation of the City of London, Bonfire of the Vanities, borderless world, Branko Milanovic, Bretton Woods, British Empire, business cycle, business process, call centre, capital controls, collective bargaining, corporate governance, corporate raider, correlation does not imply causation, creative destruction, credit crunch, Credit Default Swap, crony capitalism, David Ricardo: comparative advantage, deindustrialization, Edward Glaeser, Everybody Ought to Be Rich, falling living standards, financial deregulation, financial innovation, Financial Instability Hypothesis, floating exchange rates, full employment, Goldman Sachs: Vampire Squid, high net worth, hiring and firing, Hyman Minsky, income inequality, James Dyson, Jeff Bezos, job automation, John Meriwether, Joseph Schumpeter, Kenneth Rogoff, knowledge economy, laissez-faire capitalism, light touch regulation, Long Term Capital Management, low skilled workers, manufacturing employment, market bubble, Martin Wolf, mittelstand, mobile money, Mont Pelerin Society, Myron Scholes, new economy, Nick Leeson, North Sea oil, Northern Rock, offshore financial centre, oil shock, plutocrats, Plutocrats, Plutonomy: Buying Luxury, Explaining Global Imbalances, Right to Buy, rising living standards, Robert Shiller, Robert Shiller, Ronald Reagan, savings glut, shareholder value, The Great Moderation, The Spirit Level, The Wealth of Nations by Adam Smith, Thomas Malthus, too big to fail, Tyler Cowen: Great Stagnation, Washington Consensus, Winter of Discontent, working-age population

‘I could hardly believe how tight the fit was—it was a stunning correlation,’ Moss told the New York Times. ‘And it began to raise the question of whether there are causal links between financial deregulation, economic inequality and instability.’240 Of course, as Moss has accepted, correlation is not the same as causation. As one of his critics, R Glenn Hubbard, dean of the Columbia Business School and top economic adviser to former President George W Bush has put it, ‘Cars go faster every year, and GDP rises every year, but that doesn’t mean speed causes GDP.’241 The correlation could mean that the direction of causation is from slump to inequality. Yet what is significant about this pattern is that in both the 1920s and the pre-2007 period, inequality rose sharply in the years before recession took hold. There is now an increasing, if still small, body of academics that have attributed the crisis at least in part to rising inequality.


pages: 901 words: 234,905

The Blank Slate: The Modern Denial of Human Nature by Steven Pinker

affirmative action, Albert Einstein, Alfred Russel Wallace, anti-communist, British Empire, clean water, cognitive dissonance, Columbine, conceptual framework, correlation coefficient, correlation does not imply causation, cuban missile crisis, Daniel Kahneman / Amos Tversky, Defenestration of Prague, desegregation, epigenetics, Exxon Valdez, George Akerlof, germ theory of disease, ghettoisation, glass ceiling, Hobbesian trap, income inequality, invention of agriculture, invisible hand, Joan Didion, long peace, meta analysis, meta-analysis, More Guns, Less Crime, Murray Gell-Mann, mutually assured destruction, Norman Mailer, Peter Singer: altruism, phenotype, plutocrats, Plutocrats, Potemkin village, prisoner's dilemma, profit motive, QWERTY keyboard, Richard Feynman, Richard Thaler, risk tolerance, Robert Bork, Rodney Brooks, Saturday Night Live, social intelligence, speech recognition, Stanford prison experiment, stem cell, Steven Pinker, The Bell Curve by Richard Herrnstein and Charles Murray, the new new thing, theory of mind, Thomas Malthus, Thorstein Veblen, twin studies, ultimatum game, urban renewal, War on Poverty, women in the workforce, Yogi Berra, zero-sum game

Equally surprising are the sorry standards to which the great scholar here has sunk. The suggestion that a language can be “grown-up” and “masculine” is so subjective as to be meaningless. He attributes a personality trait to an entire people without any evidence, then advances two theories—that phonology reflects personality, and that warm climates breed laziness—without invoking even correlational data, let alone proof of causation. Even on his home ground the reasoning is flimsy. Languages with a consonant-vowel syllable structure like Hawaiian call for longer words to convey the same amount of information, hardly what you would expect in a people without “vigor or energy.” And the consonant-encrusted syllables of English are liable to be swallowed and misheard, hardly what you would expect from a logical, businesslike people.

The parents of an affectionate child may return that affection and thereby act differently from the parents of a child who squirms and wipes off his parents’ kisses. The parents of a quiet, spacey child might feel they are talking to a wall and jabber at him less. The parents of a docile child can get away with setting firm but reasonable limits; the parents of a hellion might find themselves at their wits’ end and either lay down the law or give up. In other words, correlation does not imply causation. A correlation between parents and children does not mean that parents affect children; it could mean that children affect parents, that genes affect both parents and children, or both. It gets worse. In many studies, the same parties (in some studies the parents, in others the children) supply the data on both the parents’ behavior and the child’s. Parents tell the experimenter how they treat their children and what their children are like, or adolescents tell the experimenter what they are like and how their parents treat them.
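The "genes affect both parents and children" reading in the passage above can be illustrated with a toy simulation. This is only a sketch under invented assumptions: the `shared` variable stands in for a common factor, the unit-variance noise terms are arbitrary, and none of the names or numbers come from the book. In this setup parent behaviour has no direct effect on the child at all, yet the two still correlate at roughly 0.5; once parent behaviour is assigned at random, the correlation drops to about zero.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_common_cause(n=5000, seed=1):
    """Return (observed_r, randomized_r) for a hypothetical common-cause model."""
    rng = random.Random(seed)
    # Hypothetical shared factor that feeds into both parent and child traits.
    shared = [rng.gauss(0, 1) for _ in range(n)]
    # Each trait is the shared factor plus independent noise;
    # the parent trait never enters the formula for the child trait.
    parent = [s + rng.gauss(0, 1) for s in shared]
    child = [s + rng.gauss(0, 1) for s in shared]
    # "Experiment": parent behaviour assigned at random, cutting its link
    # to the shared factor while leaving the children unchanged.
    randomized_parent = [rng.gauss(0, 1) for _ in range(n)]
    return pearson(parent, child), pearson(randomized_parent, child)


if __name__ == "__main__":
    observed, randomized = simulate_common_cause()
    print(f"observed parent-child correlation: {observed:.2f}")
    print(f"correlation after random assignment: {randomized:.2f}")
```

The same observed correlation would also appear if children shaped parents, which is why the observational figure alone cannot separate the explanations the passage lists.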


pages: 144 words: 43,356

Surviving AI: The Promise and Peril of Artificial Intelligence by Calum Chace

"Robert Solow", 3D printing, Ada Lovelace, AI winter, Airbnb, artificial general intelligence, augmented reality, barriers to entry, basic income, bitcoin, blockchain, brain emulation, Buckminster Fuller, cloud computing, computer age, computer vision, correlation does not imply causation, credit crunch, cryptocurrency, cuban missile crisis, dematerialisation, discovery of the americas, disintermediation, don't be evil, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, everywhere but in the productivity statistics, Flash crash, friendly AI, Google Glasses, hedonic treadmill, industrial robot, Internet of things, invention of agriculture, job automation, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, Kevin Kelly, life extension, low skilled workers, Mahatma Gandhi, means of production, mutually assured destruction, Nicholas Carr, pattern recognition, peer-to-peer, peer-to-peer model, Peter Thiel, Ray Kurzweil, Rodney Brooks, Second Machine Age, self-driving car, Silicon Valley, Silicon Valley ideology, Skype, South Sea Bubble, speech recognition, Stanislav Petrov, Stephen Hawking, Steve Jobs, strong AI, technological singularity, The Future of Employment, theory of mind, Turing machine, Turing test, universal basic income, Vernor Vinge, wage slave, Wall-E, zero-sum game

The book points out some interesting unexpected side-effects of Big Data. It turns out that having more data beats having better data, if what you want is to be able to understand, predict and influence the behaviour of large numbers of people. It also turns out that if you find a reliable correlation then it often doesn’t matter if there is a causal link between the two phenomena. We all know of cases where correlation has been mistaken for causation and ineffective or counter-productive policies have been imposed as a result. But if a correlation persists long enough it may provide decision-makers with a useful early warning signal. For instance, a supermarket and an insurance company shared data sets and discovered that men buying red meat and milk during the day were better insurance risks than men buying pasta and petrol late at night.


pages: 614 words: 174,226

The Economists' Hour: How the False Prophets of Free Markets Fractured Our Society by Binyamin Appelbaum

"Robert Solow", airline deregulation, Alvin Roth, Andrei Shleifer, anti-communist, battle of ideas, Benoit Mandelbrot, Big bang: deregulation of the City of London, Bretton Woods, British Empire, business cycle, capital controls, Carmen Reinhart, Cass Sunstein, Celtic Tiger, central bank independence, clean water, collective bargaining, Corn Laws, correlation does not imply causation, Credit Default Swap, currency manipulation / currency intervention, David Ricardo: comparative advantage, deindustrialization, Deng Xiaoping, desegregation, Diane Coyle, Donald Trump, ending welfare as we know it, financial deregulation, financial innovation, fixed income, floating exchange rates, full employment, George Akerlof, George Gilder, Gini coefficient, greed is good, Growth in a Time of Debt, income inequality, income per capita, index fund, inflation targeting, invisible hand, Isaac Newton, Jean Tirole, John Markoff, Kenneth Arrow, Kenneth Rogoff, land reform, Long Term Capital Management, low cost airline, manufacturing employment, means of production, Menlo Park, minimum wage unemployment, Mohammed Bouazizi, money market fund, Mont Pelerin Society, Network effects, new economy, oil shock, Paul Samuelson, Philip Mirowski, plutocrats, Plutocrats, price stability, profit motive, Ralph Nader, RAND corporation, rent control, rent-seeking, Richard Thaler, road to serfdom, Robert Bork, Robert Gordon, Ronald Coase, Ronald Reagan, Sam Peltzman, Silicon Valley, Simon Kuznets, starchitect, Steve Jobs, supply-chain management, The Chicago School, The Great Moderation, The Myth of the Rational Market, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, transaction costs, trickle-down economics, ultimatum game, Unsafe at Any Speed, urban renewal, War on Poverty, Washington Consensus

By the end of the decade, Keynesian economics had entered the high summer of its self-regard. Leading economists insisted governments could adjust economic conditions like the settings on a thermostat. The economist A. W. Phillips plotted the relationship between unemployment and wages in the United Kingdom over the previous century, and found wages tended to rise when unemployment was low. Enthusiastically conflating correlation and causation, economists concluded governments could glide up and down a “Phillips curve,” trading off unemployment and inflation. In an influential paper published in 1960, Samuelson and Robert Solow, two of the most important economists of the postwar era, said the American government could choose from a “menu” of unemployment and inflation rates. The available options included 5 to 6 percent unemployment with no inflation or, if one preferred, 3 percent unemployment with 4 to 5 percent inflation.12 In Great Britain, one economist recalled a meeting in the early 1960s that devolved into an emotional confrontation between those wanting to limit unemployment to 1.25 percent and those favoring 1.75 percent: “There was a figure called Professor Frank Paish who proposed 2.5 percent, who was regarded as, more or less, a Nazi.”13 Washington’s embrace of economics began in the engine rooms of government.


pages: 272 words: 66,985

Hyperfocus: How to Be More Productive in a World of Distraction by Chris Bailey

"side hustle", Albert Einstein, Any sufficiently advanced technology is indistinguishable from magic, Cal Newport, Chuck Templeton: OpenTable:, Clayton Christensen, correlation does not imply causation, deliberate practice, functional fixedness, game design, knowledge economy, knowledge worker, Parkinson's law, randomized controlled trial, Richard Feynman, Skype, twin studies, Zipcar

Women experience fewer interruptions and interrupt themselves less overall. And they do so while working on more projects at once. Compared with men, women are also happier and more engaged in the workplace. * Something that’s gone down since the introduction of the smartphone? Chewing gum sales. Since 2007—the year the iPhone was introduced—gum sales have plummeted 17 percent. Obviously correlation doesn’t imply causation, but it does make you wonder. * Curiously, the distractions you’re most likely to fall victim to differ depending on what you’re working on. When you’re doing rote work, you’re significantly more likely to visit Facebook or initiate a face-to-face interaction with a coworker. When you’re focused on more complex work, you’re more likely to check your email. * This isn’t the case if you’re a manager or team leader, however—in this case, 60 percent of your interruptions come from others


pages: 339 words: 105,938

The Skeptical Economist: Revealing the Ethics Inside Economics by Jonathan Aldred

airport security, Berlin Wall, carbon footprint, citizen journalism, clean water, cognitive dissonance, congestion charging, correlation does not imply causation, Diane Coyle, endogenous growth, experimental subject, Fall of the Berlin Wall, first-past-the-post, framing effect, greed is good, happiness index / gross national happiness, hedonic treadmill, Intergovernmental Panel on Climate Change (IPCC), invisible hand, job satisfaction, John Maynard Keynes: Economic Possibilities for our Grandchildren, labour market flexibility, laissez-faire capitalism, libertarian paternalism, longitudinal study, new economy, Pareto efficiency, pension reform, positional goods, Ralph Waldo Emerson, RAND corporation, risk tolerance, school choice, spectrum auction, Thomas Bayes, trade liberalization, ultimatum game

And if it can, is that what these surveys are measuring? With its breezy optimism the happiness experts’ reading of the evidence ignores some awkward objections. To begin with, it is easy to pick holes in the neuroscience. As ever, the problem is not the science itself, but the interpretative spin put on the results. To begin with, the core of the neuroscientific research is a set of correlations which do not demonstrate any causation. There is little understanding of why external stimuli are associated with increased brain activity, so there is no basis for assuming causation. And even the correlations are less robust than they appear, because of the assumptions which have been made to derive them. For instance, most of the research adopts the ‘subtractive method’, in which measurements of brain activity in the control condition (when there is no stimulus) are subtracted from measurements in the experimental condition (when the stimulus is present).


pages: 593 words: 189,857

Stress Test: Reflections on Financial Crises by Timothy F. Geithner

Affordable Care Act / Obamacare, asset-backed security, Atul Gawande, bank run, banking crisis, Basel III, Bernie Madoff, Bernie Sanders, break the buck, Buckminster Fuller, Carmen Reinhart, central bank independence, collateralized debt obligation, correlation does not imply causation, creative destruction, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, currency manipulation / currency intervention, David Brooks, Doomsday Book, eurozone crisis, financial innovation, Flash crash, Goldman Sachs: Vampire Squid, housing crisis, Hyman Minsky, illegal immigration, implied volatility, Kickstarter, London Interbank Offered Rate, Long Term Capital Management, margin call, market fundamentalism, Martin Wolf, McMansion, Mexican peso crisis / tequila crisis, money market fund, moral hazard, mortgage debt, Nate Silver, negative equity, Northern Rock, obamacare, paradox of thrift, pets.com, price stability, profit maximization, pushing on a string, quantitative easing, race to the bottom, RAND corporation, regulatory arbitrage, reserve currency, Saturday Night Live, savings glut, selection bias, short selling, sovereign wealth fund, The Great Moderation, The Signal and the Noise by Nate Silver, Tobin tax, too big to fail, working poor

I helped call more attention to the vulnerabilities that come from risky forms of financing, alongside the IMF’s traditional focus on large fiscal deficits and inflation. And while the Bush team liked to criticize the bailouts of the Clinton era, they ultimately supported large IMF rescue packages for Brazil, Uruguay, and Turkey with the familiar wall-of-money strategy. That was what the IMF was for. Years later, Mervyn King, the governor of the Bank of England, joked at a farewell dinner that I was a textbook proof of the difference between correlation and causation. “Tim was present at all the crises,” he said. “But he didn’t cause the crises. The crises caused him.” Again and again, I got to see how indulgent capital financed booms, how cracks in confidence turned boom to bust to panic, how crisis managers could help contain panics with decisiveness and overwhelming force, and how the kind of actions needed to defuse crises were inherently unpopular and fraught with risk.


pages: 366 words: 76,476

Dataclysm: Who We Are (When We Think No One's Looking) by Christian Rudder

4chan, Affordable Care Act / Obamacare, bitcoin, cloud computing, correlation does not imply causation, crowdsourcing, cuban missile crisis, Donald Trump, Edward Snowden, en.wikipedia.org, Frank Gehry, Howard Zinn, Jaron Lanier, John Markoff, John Snow's cholera map, lifelogging, Mahatma Gandhi, Mikhail Gorbachev, Nate Silver, Nelson Mandela, new economy, obamacare, Occupy movement, p-value, pre–internet, race to the bottom, selection bias, Snapchat, social graph, Solar eclipse in 1919, Steve Jobs, the scientific method

Often that cause is my best guess, given my understanding of all the forces in play. To interpret results—a necessity in any book that isn’t just reams of numbers—I had to choose one explanation from a variety of possibilities. Is there some force besides age behind what I call Wooderson’s law (the fact that straight men of all ages are most interested in twenty-year-old women)? Perhaps. But I think it is very unlikely. “Correlation does not imply causation” is a good thing for everyone to keep in mind—and an excellent check on narrative overreach. But a snappy phrase doesn’t mean that the question of causation isn’t itself interesting, and I’ve tried to attribute causes only where they are most justified. For almost all the parts of Dataclysm that overlap with posts on OkCupid’s blog, I chose to redo the work from scratch, on the most recent data, rather than quote my own previous findings.


pages: 249 words: 77,342

The Behavioral Investor by Daniel Crosby

affirmative action, Asian financial crisis, asset allocation, availability heuristic, backtesting, bank run, Black Swan, buy and hold, cognitive dissonance, colonial rule, compound rate of return, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, diversification, diversified portfolio, Donald Trump, endowment effect, feminist movement, Flash crash, haute cuisine, hedonic treadmill, housing crisis, IKEA effect, impulse control, index fund, Isaac Newton, job automation, longitudinal study, loss aversion, market bubble, market fundamentalism, mental accounting, meta analysis, meta-analysis, Milgram experiment, moral panic, Murray Gell-Mann, Nate Silver, neurotypical, passive investing, pattern recognition, Ponzi scheme, prediction markets, random walk, Richard Feynman, Richard Thaler, risk tolerance, Robert Shiller, science of happiness, Shai Danziger, short selling, South Sea Bubble, Stanford prison experiment, Stephen Hawking, Steve Jobs, stocks for the long run, Thales of Miletus, The Signal and the Noise by Nate Silver, tulip mania, Vanguard fund

And then consider the strange case of the correlation between moves in the S&P 500 and Bangladeshi butter production. Yes, you read that right – Bangladeshi butter production. This relationship, which accounts for 95% of covariance, is of course spurious even though the fit is nearly perfect. The relationship was discovered and set forth by researchers anxious to prove the old axiom that correlation does not equal causation and to show that by analysing a glut of information you are bound to find relationships, even if no causal relationship exists. In a world of big data, we all too often fail to see the forest of “is this a good business?” for looking at the trees of esoteric data points. No matter what exotic economic measures professors and pundits may dream up in the future, there will always be some that show some fleeting correlation with stock returns, but fail to pass the sniff test of “Should it matter when determining whether or not to become partial owner of a business?”
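The point that sifting through a glut of data is bound to turn up relationships can be reproduced with pure noise. The following is a minimal sketch, not the butter study itself: the series lengths, the number of candidates, and the function names are all assumptions made for illustration. It draws one random "target" series and thousands of unrelated random series, then reports the candidate that happens to fit best.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_match(n_points=20, n_candidates=5000, seed=0):
    """Search unrelated noise series for the one best correlated with a target."""
    rng = random.Random(seed)
    # Stand-in for the series we want to "explain" (e.g. yearly returns).
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best_index, best_r = -1, 0.0
    for i in range(n_candidates):
        # Every candidate is independent noise, so any fit is coincidence.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best_r):
            best_index, best_r = i, r
    return best_index, best_r


if __name__ == "__main__":
    index, r = best_spurious_match()
    print(f"best of 5000 unrelated series: #{index}, r = {r:.2f}, r^2 = {r * r:.2f}")
```

Because the winner is chosen after looking at every candidate, its fit says nothing about a causal link; testing it on fresh data would be the natural check.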


pages: 272 words: 78,876

Heart: A History by Sandeep Jauhar

blue-collar work, clean water, correlation does not imply causation, Honoré de Balzac, John Snow's cholera map, mass immigration, medical residency, placebo effect, publish or perish, Rubik’s Cube, selection bias, stem cell, the scientific method

In one example, patients who were depressed after a heart attack were four times as likely to die within six months as those who were not, irrespective of usual Framingham risk factors like high cholesterol, hypertension, obesity, and smoking. In another study, menopausal women with no history of cardiovascular disease who expressed more hopelessness on a psychological questionnaire had more carotid artery thickening and an older vascular age than matched patients who felt good about their lives.* No doubt many of these studies are small, and of course correlation does not prove causation; it is certainly possible that stress leads to unhealthy habits—poor nutrition, less physical activity, more smoking—and this is the real reason for the increased cardiovascular risk. But as with the association of smoking with lung cancer, when so many studies show the same thing and there are mechanisms to explain a causal relationship, it seems perverse to deny that one probably exists.


pages: 335 words: 114,039

David Mitchell: Back Story by David Mitchell

British Empire, call centre, correlation does not imply causation, credit crunch, Desert Island Discs, Downton Abbey, energy security, Kickstarter, lateral thinking

Although it may explain some of the murders (see Book 2). I’m always suspicious of that ‘comedy comes from pain’ reasoning. Trite magazine interviewers talk to comedians, tease a perfectly standard amount of doubt, fear and self-analysis out of them and infer therefrom that it’s this phenomenon of not-feeling-perpetually-fine that allowed them to come up with that amusing routine about towels. Well, correlation is not causation, as they say on Radio 4’s statistics programme More or Less. Everyone’s unhappy sometimes, and not everyone is funny. The interviewers may as well infer that the comedy comes from the inhalation of oxygen. Which of course it partly does. We have no evidence for any joke ever having emanated from a non-oxygen-breathing organism. At a sub-atomic level, oxygen is absolutely packed with hilarions.


pages: 388 words: 119,492

Ghettoside: A True Story of Murder in America by Jill Leovy

Affordable Care Act / Obamacare, always be closing, Cass Sunstein, correlation does not imply causation, Gunnar Myrdal, illegal immigration, mass incarceration

The tables I’ve compiled include names of victims, circumstances of deaths, and, in many cases, observations made at crime scenes and funerals and information provided by families and detectives. Over the years, in search of clarity on clearance rates, I have conducted surveys of case outcomes by calling or visiting the assigned detectives or their field supervisors and asking for updates. For years now, I have tried to penetrate the mystery of disproportionate black homicide. Correlation is not causation. I wanted to know exactly what was happening and why. I’ve sought answers in reported facts and observations, and tried to avoid pat speculation and received wisdom. Mostly, I’ve relied on what I have myself seen or heard directly from those who are close to homicide. I have made deliberate efforts to listen to the bereaved—to seek out the parents, siblings, spouses, and children of black homicide victims, whose viewpoints are under-represented in our national debates over criminal justice.


pages: 467 words: 116,094

I Think You'll Find It's a Bit More Complicated Than That by Ben Goldacre

call centre, conceptual framework, correlation does not imply causation, crowdsourcing, death of newspapers, Desert Island Discs, en.wikipedia.org, experimental subject, Firefox, Flynn Effect, jimmy wales, John Snow's cholera map, Loebner Prize, meta analysis, meta-analysis, moral panic, placebo effect, publication bias, selection bias, selective serotonin reuptake inhibitor (SSRI), Simon Singh, statistical model, stem cell, the scientific method, Turing test, WikiLeaks

Systematic reviews of randomised trials are considered to be the most reliable: because they ensure that your conclusions are based on all of the information, rather than just some of it; and because randomised trials – when conducted properly – are the least vulnerable to bias, and so they are the ‘most fair tests’. After these, there are observational studies: these are much more prone to bias, and produce findings which might just reflect correlation instead of causation (‘People who choose to eat vegetables live longer’) but they are generally cheaper to do. Then there are individual case reports. And then, finally, because medical academics like to think they’re funny, right at the bottom of the hierarchy you will find something called ‘expert opinion’. In the Dartmouth study, among the press releases covering human research, only 17 per cent promoted the studies with the strongest designs, either randomised trials or meta-analyses.


pages: 432 words: 124,635

Happy City: Transforming Our Lives Through Urban Design by Charles Montgomery

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, agricultural Revolution, American Society of Civil Engineers: Report Card, Bernie Madoff, British Empire, Buckminster Fuller, car-free, carbon footprint, centre right, City Beautiful movement, clean water, congestion charging, correlation does not imply causation, East Village, edge city, energy security, Enrique Peñalosa, experimental subject, Frank Gehry, Google Earth, happiness index / gross national happiness, hedonic treadmill, Home mortgage interest deduction, housing crisis, income inequality, income per capita, Induced demand, Intergovernmental Panel on Climate Change (IPCC), invisible hand, Jane Jacobs, license plate recognition, McMansion, means of production, megacity, Menlo Park, meta analysis, meta-analysis, mortgage tax deduction, New Urbanism, Panopticon Jeremy Bentham, peak oil, Ponzi scheme, rent control, ride hailing / ride sharing, risk tolerance, science of happiness, Seaside, Florida, Silicon Valley, starchitect, the built environment, The Death and Life of Great American Cities, the High Line, The Spirit Level, The Wealth of Nations by Adam Smith, trade route, transit-oriented development, upwardly mobile, urban planning, urban sprawl, wage slave, white flight, World Values Survey, zero-sum game, Zipcar

And people who trust their neighbors feel a greater sense of that belonging. And that sense of belonging is influenced by social contact. And casual encounters (such as, say, the kind that might happen around a volleyball court on a Friday night) are just as important to belonging and trust as contact with family and close friends. It is hard to say which condition is lifting the others—Helliwell admits that his statistical analysis demonstrates correlation rather than causation—but what is strikingly apparent is that trust, feelings of belonging, social time, and happiness are like balloons tied together in a bouquet. They rise and fall together. This suggests that it has been a terrible mistake to design cities around the nuclear family at the expense of other ties. But it also suggests that even the high-status, deeply desired, uniquely biophilic brand of verticalism embodied by Vancouverism and McDowell’s high-rise apartment is not a panacea.


pages: 484 words: 136,735

Capitalism 4.0: The Birth of a New Economy in the Aftermath of Crisis by Anatole Kaletsky

Robert Solow, bank run, banking crisis, Benoit Mandelbrot, Berlin Wall, Black Swan, bonus culture, Bretton Woods, BRICs, business cycle, buy and hold, Carmen Reinhart, cognitive dissonance, collapse of Lehman Brothers, Corn Laws, correlation does not imply causation, creative destruction, credit crunch, currency manipulation / currency intervention, David Ricardo: comparative advantage, deglobalization, Deng Xiaoping, Edward Glaeser, Eugene Fama: efficient market hypothesis, eurozone crisis, experimental economics, F. W. de Klerk, failed state, Fall of the Berlin Wall, financial deregulation, financial innovation, Financial Instability Hypothesis, floating exchange rates, full employment, George Akerlof, global rebalancing, Hyman Minsky, income inequality, information asymmetry, invisible hand, Isaac Newton, Joseph Schumpeter, Kenneth Arrow, Kenneth Rogoff, Kickstarter, laissez-faire capitalism, Long Term Capital Management, mandelbrot fractal, market design, market fundamentalism, Martin Wolf, money market fund, moral hazard, mortgage debt, Nelson Mandela, new economy, Northern Rock, offshore financial centre, oil shock, paradox of thrift, Pareto efficiency, Paul Samuelson, peak oil, pets.com, Ponzi scheme, post-industrial society, price stability, profit maximization, profit motive, quantitative easing, Ralph Waldo Emerson, random walk, rent-seeking, reserve currency, rising living standards, Robert Shiller, Ronald Reagan, shareholder value, short selling, South Sea Bubble, sovereign wealth fund, special drawing rights, statistical model, The Chicago School, The Great Moderation, The inhabitant of London could order by telephone, sipping his morning tea in bed, the various products of the whole earth, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, too big to fail, Vilfredo Pareto, Washington Consensus, zero-sum game

The question therefore arises whether Japan is now the most plausible model for a New Normal of sluggish growth and financial paralysis in the United States, Britain, and other economies emerging from the credit crunch. Luckily, this analogy between Japan and the Western world looks increasingly far-fetched. It is certainly true that the Japanese financial system remained paralyzed for a decade as banks and borrowers survived on government life support and failed to recognize their true losses. It is true also that the Japanese economy spent twenty years almost continuously in recession. But correlation is not causation. The question that needs to be asked about the Japanese experience is whether government support for struggling banks and overindebted borrowers caused the twenty years of stagnation or whether twenty years of economic stagnation prevented a recovery for weak borrowers and banks. A similar question must be asked about a fascinating and much-quoted historic study, coauthored by Carmen Reinhart and Kenneth Rogoff, the IMF’s former chief economist, which looked at the macroeconomic effect of financial crises in dozens of countries over the past six hundred years.


pages: 588 words: 131,025

The Patient Will See You Now: The Future of Medicine Is in Your Hands by Eric Topol

23andMe, 3D printing, Affordable Care Act / Obamacare, Anne Wojcicki, Atul Gawande, augmented reality, bioinformatics, call centre, Clayton Christensen, clean water, cloud computing, commoditize, computer vision, conceptual framework, connected car, correlation does not imply causation, creative destruction, crowdsourcing, dark matter, data acquisition, disintermediation, disruptive innovation, don't be evil, Edward Snowden, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Firefox, global village, Google Glasses, Google X / Alphabet X, Ignaz Semmelweis: hand washing, information asymmetry, interchangeable parts, Internet of things, Isaac Newton, job automation, Julian Assange, Kevin Kelly, license plate recognition, lifelogging, Lyft, Mark Zuckerberg, Marshall McLuhan, meta analysis, meta-analysis, microbiome, Nate Silver, natural language processing, Network effects, Nicholas Carr, obamacare, pattern recognition, personalized medicine, phenotype, placebo effect, RAND corporation, randomized controlled trial, Second Machine Age, self-driving car, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, Snapchat, social graph, speech recognition, stealth mode startup, Steve Jobs, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Turing test, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, WikiLeaks, X Prize

Later, a team of four highly respected data scientists wrote in Science that GFT had systematically overestimated the prevalence of flu every week since August 2011, going on to criticize “big data hubris,” the “often implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.”17 They attacked the “algorithm dynamics” of GFT, pointing out that the forty-five search terms used were never documented, key elements such as core search terms were not provided in the publications, and the original algorithm did not undergo constant adjustment and recalibration. What’s more, while the GFT algorithm was static, the search engine itself underwent constant change—as many as six hundred revisions per year—which was not taken into account. Many other editorialists opined on the matter.13–15,18,19 Correlation rather than causation and the critical absence of context were the most prominent critique points. There was also the sampling issue as the crowdsourcing was limited to those doing searches on Google. Further, there was a major analytical problem: GFT performed so many multiple comparisons of data that they were likely to be getting spurious results. These can all be viewed as common traps when we are trying to understand the world through data.13 As Krenchel and Madsbjerg wrote in Wired, “The real big data hubris is not that we have too much confidence in a set of algorithms and methods that aren’t quite there yet.
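The multiple-comparisons problem mentioned in the excerpt above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses purely random data, and the function names `pearson` and `strongest_chance_correlation` are hypothetical and not taken from the book or from GFT. It shows that the more unrelated candidate series are compared against a target, the stronger the best chance correlation tends to look.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_candidates, n_points, seed=0):
    """Largest |r| between one random target and many unrelated random series.

    Every series is independent noise, so any correlation found is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    strongest = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        strongest = max(strongest, abs(pearson(target, candidate)))
    return strongest


if __name__ == "__main__":
    # Hypothetical setting: 20 observations per series.
    for k in (1, 10, 100, 1000):
        print(k, round(strongest_chance_correlation(k, 20), 2))
```

Because the seed is fixed, each larger run includes the candidates of the smaller runs, so the reported maximum can only stay the same or grow as more comparisons are made.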


pages: 502 words: 128,126

Rule Britannia: Brexit and the End of Empire by Danny Dorling, Sally Tomlinson

3D printing, Ada Lovelace, Alfred Russel Wallace, anti-communist, anti-globalists, Big bang: deregulation of the City of London, Boris Johnson, British Empire, centre right, colonial rule, Corn Laws, correlation does not imply causation, David Ricardo: comparative advantage, deindustrialization, Dominic Cummings, Donald Trump, Edward Snowden, en.wikipedia.org, epigenetics, Etonian, falling living standards, Flynn Effect, housing crisis, illegal immigration, imperial preference, income inequality, inflation targeting, invisible hand, knowledge economy, market fundamentalism, mass immigration, megacity, New Urbanism, Nick Leeson, North Sea oil, offshore financial centre, out of africa, Right to Buy, Ronald Reagan, Silicon Valley, South China Sea, sovereign wealth fund, spinning jenny, Steven Pinker, The Wealth of Nations by Adam Smith, Thomas Malthus, University of East Anglia, We are the 99%, wealth creators

As one voter retorted: ‘I am quite trim and voted out of the undemocratic EU ruled by unelected commissioners.’18 On 30 June 2016, ITV reported the news as ‘UEA research claims link between obesity and Brexit voters’.19 On 1 July, the Daily Express ran the headline ‘Outrage after academic says Brexit voters were probably FAT’,20 reporting that ‘the jibe is the latest insult fired at leave voters already labelled “stupid” and “Little Englanders” on social media sites’. No newspaper asked why the correlation was so high and what it could really be telling us. Areas in which people tend to be thinner are often very similar to each other in many other ways. Peter Ormosi managed to produce one of the best examples of why correlation is not causation. He used public health data to suggest that the more obese people there were in an area, the more the area voted Leave.21 Many other analyses showed that the fewer immigrants there were in the area, the more people voted to leave, in both working- and middle-class areas. The village in the Cotswolds where the trimmer one of the two of us lives was very much pro-Leave, for example, and is home to almost no immigrants from overseas.
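The point that areas resembling each other on one measure tend to resemble each other on many others can be sketched with simulated data. The example below is an illustration only, with invented numbers and hypothetical names (`simulate_confounded_areas`, `remove_factor`), and it does not use the public health or voting data discussed in the excerpt. A hidden area-level factor drives two traits that have no effect on each other; the traits still correlate until that factor is accounted for.

```python
import random


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounded_areas(n_areas=5000, seed=1):
    """Return (hidden, trait_a, trait_b) for simulated areas.

    Assumption: both traits equal the hidden factor plus independent noise,
    so neither trait influences the other.
    """
    rng = random.Random(seed)
    hidden, trait_a, trait_b = [], [], []
    for _ in range(n_areas):
        z = rng.gauss(0, 1)
        hidden.append(z)
        trait_a.append(z + rng.gauss(0, 1))
        trait_b.append(z + rng.gauss(0, 1))
    return hidden, trait_a, trait_b


def remove_factor(values, factor):
    """Subtract the hidden factor (its true coefficient is 1 by construction)."""
    return [v - f for v, f in zip(values, factor)]


if __name__ == "__main__":
    hidden, a, b = simulate_confounded_areas()
    print("raw correlation:", round(correlation(a, b), 2))
    adjusted = correlation(remove_factor(a, hidden), remove_factor(b, hidden))
    print("after removing hidden factor:", round(adjusted, 2))
```

In real data the hidden factor is usually unobserved, which is why a strong raw correlation between two area-level measures says little on its own about cause.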


pages: 518 words: 143,914

God Is Back: How the Global Revival of Faith Is Changing the World by John Micklethwait, Adrian Wooldridge

affirmative action, anti-communist,