Bayesian statistics

47 results


pages: 589 words: 69,193

Mastering Pandas by Femi Anthony

Amazon Web Services, Bayesian statistics, correlation coefficient, correlation does not imply causation, Debian, en.wikipedia.org, Internet of things, natural language processing, p-value, random walk, side project, statistical model, Thomas Bayes

Index A .at operatorabout / The .iat and .at operators Active State PythonURL / Third-party Python software installation aggregate methodusing / Using the aggregate method aggregation, in Rabout / Aggregation in R aliases, for Time Series frequenciesabout / Aliases for Time Series frequencies alphaabout / The alpha and p-values alternative hypothesisabout / The null and alternative hypotheses Anacondaabout / Continuum Analytics Anaconda URL / Continuum Analytics Anaconda, Final step for all platforms, Other numeric or analytics-focused Python distributions installing / Installing Anaconda URL, for download / Installing Anaconda installing, on Linux / Linux installing, on Mac OS/X / Mac OS X installing, on Windows / Windows installing, final steps / Final step for all platforms numeric or analytics-focused Python distributions / Other numeric or analytics-focused Python distributions IPython installation / Install via Anaconda (for Linux/Mac OS X) scikit-learn, installing via / Installing via Anaconda appendusing / Using append arithmetic operationsapplying, on columns / Arithmetic operations on columns B Bayesian analysis exampleswitchpoint detection / Bayesian analysis example – Switchpoint detection Bayesiansabout / How the model is defined Bayesian statistical analysisconducting, steps / Conducting Bayesian statistical analysis Bayesian statisticsabout / Introduction to Bayesian statistics reference link / Introduction to Bayesian statistics mathematical framework / Mathematical framework for Bayesian statistics references / Mathematical framework for Bayesian statistics, Applications of Bayesian statistics, References applications / Applications of Bayesian statistics versus Frequentist statistics / Bayesian statistics versus Frequentist statistics Bayes theoryabout / Bayes theory and odds Bernoulli distributionabout / The Bernoulli distribution reference link / The Bernoulli distribution big datareferences / We live in a big data world 4V’s / 4 V's of big data 
about / 4 V's of big data examples / The move towards real-time analytics binomial distributionabout / The binomial distribution Boolean indexingabout / Boolean indexing any() method / The is in and any all methods isin method / The is in and any all methods all method / The is in and any all methods where() method, using / Using the where() method indexes, operations / Operations on indexes C 4-4-5 calendarreference link / pandas/tseries central limit theoremreference link / Background central limit theorem (CLT)about / The mean classes, converter.pyConverter / pandas/tseries Formatters / pandas/tseries Locators / pandas/tseries classes, offsets.pyDateOffset / pandas/tseries BusinessMixin / pandas/tseries MonthOffset / pandas/tseries MonthBegin / pandas/tseries MonthEnd / pandas/tseries BusinessMonthEnd / pandas/tseries BusinessMonthBegin / pandas/tseries YearOffset / pandas/tseries YearBegin / pandas/tseries YearEnd / pandas/tseries BYearEnd / pandas/tseries BYearBegin / pandas/tseries Week / pandas/tseries WeekDay / pandas/tseries WeekOfMonth / pandas/tseries LastWeekOfMonth / pandas/tseries QuarterOffset / pandas/tseries QuarterEnd / pandas/tseries QuarterrBegin / pandas/tseries BQuarterEnd / pandas/tseries BQuarterBegin / pandas/tseries FY5253Quarter / pandas/tseries FY5253 / pandas/tseries Easter / pandas/tseries Tick / pandas/tseries classes, parsers.pyTextFileReader / pandas/io ParserBase / pandas/io CParserWrapper / pandas/io PythonParser / pandas/io FixedWidthReader / pandas/io FixedWithFieldParser / pandas/io classes, plm.pyPanelOLS / pandas/stats MovingPanelOLS / pandas/stats NonPooledPanelOLS / pandas/stats classes, sql.pyPandasSQL / pandas/io PandasSQLAlchemy / pandas/io PandasSQLTable / pandas/io PandasSQLTableLegacy / pandas/io PandasSQLLegacy / pandas/io columnmultiple functions, applying to / Applying multiple functions column namespecifying, in R / Specifying column name in R specifying, in pandas / Specifying column name in pandas 
columnsarithmetic operations, applying on / Arithmetic operations on columns concat functionabout / The concat function concat function, elementsobjs function / The concat function axis function / The concat function join function / The concat function join_axes function / The concat function keys function / The concat function concat operationreference link / The join function Condadocumentation, URL / Final step for all platforms conda commandURL / Final step for all platforms Confidence (Frequentist) intervalversus Credible (Bayesian) interval / Confidence (Frequentist) versus Credible (Bayesian) intervals confidence intervalabout / Confidence intervals example / An illustrative example container types, RVector / R data types List / R data types DataFrame / R data types Matrix / R data types continuous probability distributionsabout / Continuous probability distributions continuous uniform distribution / The continuous uniform distribution exponential distribution / The exponential distribution normal distribution / The normal distribution continuous uniform distributionabout / The continuous uniform distribution Continuum AnalyticsURL / Third-party Python software installation correlationabout / Correlation and linear regression, Correlation reference link / Correlation, An illustrative example Credible (Bayesian) intervalversus Confidence (Frequentist) interval / Confidence (Frequentist) versus Credible (Bayesian) intervals cross-sections / Cross sections cut() function, pandasabout / The pandas solution cut() method, Rabout / An R example using cut() reference link / An R example using cut() Cython / What is pandas?

A Tour of Statistics – The Classical Approach Descriptive statistics versus inferential statistics Measures of central tendency and variability Measures of central tendency The mean The median The mode Computing measures of central tendency of a dataset in Python Measures of variability, dispersion, or spread Range Quartile Deviation and variance Hypothesis testing – the null and alternative hypotheses The null and alternative hypotheses The alpha and p-values Type I and Type II errors Statistical hypothesis tests Background The z-test The t-test Types of t-tests A t-test example Confidence intervals An illustrative example Correlation and linear regression Correlation Linear regression An illustrative example Summary 8. A Brief Tour of Bayesian Statistics Introduction to Bayesian statistics Mathematical framework for Bayesian statistics Bayes theory and odds Applications of Bayesian statistics Probability distributions Fitting a distribution Discrete probability distributions Discrete uniform distributions The Bernoulli distribution The binomial distribution The Poisson distribution The Geometric distribution The negative binomial distribution Continuous probability distributions The continuous uniform distribution The exponential distribution The normal distribution Bayesian statistics versus Frequentist statistics What is probability? How the model is defined Confidence (Frequentist) versus Credible (Bayesian) intervals Conducting Bayesian statistical analysis Monte Carlo estimation of the likelihood function and PyMC Bayesian analysis example – Switchpoint detection References Summary 9.

A Brief Tour of Bayesian Statistics In this chapter, we will take a brief tour of an alternative approach to statistical inference called Bayesian statistics. It is not intended to be a full primer but just serve as an introduction to the Bayesian approach. We will also explore the associated Python-related libraries, how to use pandas, and matplotlib to help with the data analysis. The various topics that will be discussed are as follows: Introduction to Bayesian statistics Mathematical framework for Bayesian statistics Probability distributions Bayesian versus Frequentist statistics Introduction to PyMC and Monte Carlo simulation Illustration of Bayesian inference – Switchpoint detection Introduction to Bayesian statistics The field of Bayesian statistics is built on the work of Reverend Thomas Bayes, an 18th century statistician, philosopher, and Presbyterian minister.


pages: 561 words: 120,899

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant From Two Centuries of Controversy by Sharon Bertsch McGrayne

Bayesian statistics, bioinformatics, British Empire, Claude Shannon: information theory, Daniel Kahneman / Amos Tversky, double helix, Edmond Halley, Fellow of the Royal Society, full text search, Henri Poincaré, Isaac Newton, Johannes Kepler, John Markoff, John Nash: game theory, John von Neumann, linear programming, longitudinal study, meta analysis, meta-analysis, Nate Silver, p-value, Pierre-Simon Laplace, placebo effect, prediction markets, RAND corporation, recommendation engine, Renaissance Technologies, Richard Feynman, Richard Feynman: Challenger O-ring, Robert Mercer, Ronald Reagan, speech recognition, statistical model, stochastic process, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, traveling salesman, Turing machine, Turing test, uranium enrichment, Yom Kippur War

JASA (95) 1282–86 Couzin, Jennifer. (2004) The new math of clinical trials. Science (303) 784–86. DeGroot, Morris H. (1986b) A conversation with Persi Diaconis. Statistical Science (1:3) 319–34. Diaconis P, Efron B. (1983) Computer-intensive methods in statistics. Scientific American (248) 116–30. Diaconis, Persi. (1985) Bayesian statistics as honest work. Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer (1), eds., Lucien M. Le Cam and Richard A. Olshen. Wadsworth. Diaconis P, Holmes S. (1996) Are there still things to do in Bayesian statistics? Erkenntnis (45) 145–58. Diaconis P. (1998) A place for philosophy? The rise of modeling in statistical science. Quarterly of Applied Mathematics (56:4) 797–805. DuMouchel WH, Harris JE. (1983) Bayes methods for combining the results of cancer studies in humans and other species.

Today, Bayes’ rule is used everywhere from DNA de-coding to Homeland Security. Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time”—Provided by publisher. Includes bibliographical references and index. ISBN 978-0-300-16969-0 (hardback) 1. Bayesian statistical decision theory—History. I. Title. QA279.5.M415 2011 519.5’42—dc22 2010045037 A catalogue record for this book is available from the British Library. This paper meets the requirements of ANSI/NISO Z39.48–1992 (Permanence of Paper). 10 9 8 7 6 5 4 3 2 1 When the facts change, I change my opinion. What do you do, sir? —John Maynard Keynes contents Preface and Note to Readers Acknowledgments Part I.

Bayes combined judgments based on prior hunches with probabilities based on repeatable experiments. He introduced the signature features of Bayesian methods: an initial belief modified by objective new information. He could move from observations of the world to abstractions about their probable cause. And he discovered the long-sought grail of probability, what future mathematicians would call the probability of causes, the principle of inverse probability, Bayesian statistics, or simply Bayes’ rule. Given the revered status of his work today, it is also important to recognize what Bayes did not do. He did not produce the modern version of Bayes’ rule. He did not even employ an algebraic equation; he used Newton’s old-fashioned geometric notation to calculate and add areas. Nor did he develop his theorem into a powerful mathematical method. Above all, unlike Price, he did not mention Hume, religion, or God.


Bulletproof Problem Solving by Charles Conn, Robert McLean

active transport: walking or cycling, Airbnb, Amazon Mechanical Turk, asset allocation, availability heuristic, Bayesian statistics, Black Swan, blockchain, business process, call centre, carbon footprint, cloud computing, correlation does not imply causation, Credit Default Swap, crowdsourcing, David Brooks, Donald Trump, Elon Musk, endowment effect, future of work, Hyperloop, Innovator's Dilemma, inventory management, iterative process, loss aversion, meta analysis, meta-analysis, Nate Silver, nudge unit, Occam's razor, pattern recognition, pets.com, prediction markets, principal–agent problem, RAND corporation, randomized controlled trial, risk tolerance, Silicon Valley, smart contracts, stem cell, the rule of 72, the scientific method, The Signal and the Noise by Nate Silver, time value of money, transfer pricing, Vilfredo Pareto, walkable city, WikiLeaks

Adding more variables may improve the performance of the regression analysis—but adding more variables may then be overfitting the data. This problem is a consequence of the underlying mathematics—and a reminder to always use the simplest model that sufficiently explains your phenomenon. Bayesian Statistics and the Space Shuttle Challenger Disaster For those who lived through the Space Shuttle Challenger disaster, it is remembered as an engineering failure. It was that of course, but more importantly it was a problem solving failure. It involved risk assessment relating to O‐ring damage that we now know is best assessed with Bayesian statistics. Bayesian statistics are useful in incomplete data environments, and especially as a way of assessing conditional probability in complex situations. Conditional probability occurs in situations where a set of probable outcomes depends in turn on another set of conditions that are also probabilistic.

To illustrate each of these analytic tools in action, we provide case examples of how they are used in problem solving. We start with simple data analysis and then move on to multiple regression, Bayesian statistics, simulations, constructed experiments, natural experiments, machine learning, crowd‐sourced problem solving, and finish up with another big gun for competitive settings, game theory. Of course each of these tools could warrant a textbook on their own, so this is necessarily only an introduction to the power and applications of each technique. Summary of Case Studies Data visualization: London air quality Multivariate regression: Understanding obesity Bayesian statistics: Space Shuttle Challenger disaster Constructed experiments: RCTs and A|B testing Natural experiments: Voter prejudice Simulations: Climate change example Machine learning: Sleep apnea, bus routing, and shark spotting Crowd‐sourcing algorithms Game theory: Intellectual property and serving in tennis It is a reasonable amount of effort to work through these, but bear with us—these case studies will give you a solid sense of which advanced tool to use in a variety of problem settings.

The resulting posterior probability of failure given launch at 31F is a staggering 99.8%, almost identical to the estimate of another research team who also used Bayesian analysis. Several lessons emerge for the use of big guns in data analysis from the Challenger disaster. First is that the choice of model, in this case Bayesian statistics, can have an impact on conclusions about risks, in this case catastrophic risks. Second is that it takes careful thinking to arrive at the correct conditional probability. Finally, how you handle extreme values like launch temperature at 31F, when the data is incomplete, requires a probabilistic approach where a distribution is fitted to available data. Bayesian statistics may be the right tool to test your hypothesis when the opportunity exists to do updating of a prior probability with new evidence, in this case exploring the full experience of success and failure at a temperature not previously experienced.


pages: 354 words: 105,322

The Road to Ruin: The Global Elites' Secret Plan for the Next Financial Crisis by James Rickards

"Robert Solow", Affordable Care Act / Obamacare, Albert Einstein, asset allocation, asset-backed security, bank run, banking crisis, barriers to entry, Bayesian statistics, Ben Bernanke: helicopter money, Benoit Mandelbrot, Berlin Wall, Bernie Sanders, Big bang: deregulation of the City of London, bitcoin, Black Swan, blockchain, Bonfire of the Vanities, Bretton Woods, British Empire, business cycle, butterfly effect, buy and hold, capital controls, Capital in the Twenty-First Century by Thomas Piketty, Carmen Reinhart, cellular automata, cognitive bias, cognitive dissonance, complexity theory, Corn Laws, corporate governance, creative destruction, Credit Default Swap, cuban missile crisis, currency manipulation / currency intervention, currency peg, Daniel Kahneman / Amos Tversky, David Ricardo: comparative advantage, debt deflation, Deng Xiaoping, disintermediation, distributed ledger, diversification, diversified portfolio, Edward Lorenz: Chaos theory, Eugene Fama: efficient market hypothesis, failed state, Fall of the Berlin Wall, fiat currency, financial repression, fixed income, Flash crash, floating exchange rates, forward guidance, Fractional reserve banking, G4S, George Akerlof, global reserve currency, high net worth, Hyman Minsky, income inequality, information asymmetry, interest rate swap, Isaac Newton, jitney, John Meriwether, John von Neumann, Joseph Schumpeter, Kenneth Rogoff, labor-force participation, large denomination, liquidity trap, Long Term Capital Management, mandelbrot fractal, margin call, market bubble, Mexican peso crisis / tequila crisis, money market fund, mutually assured destruction, Myron Scholes, Naomi Klein, nuclear winter, obamacare, offshore financial centre, Paul Samuelson, Peace of Westphalia, Pierre-Simon Laplace, plutocrats, Plutocrats, prediction markets, price anchoring, price stability, quantitative easing, RAND corporation, random walk, reserve currency, RFID, risk-adjusted returns, Ronald Reagan, Silicon Valley, 
sovereign wealth fund, special drawing rights, stocks for the long run, The Bell Curve by Richard Herrnstein and Charles Murray, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, too big to fail, transfer pricing, value at risk, Washington Consensus, Westphalian system

This railroad incident took place before the Balkan Wars of 1912–13, and six years before the outbreak of the First World War. Yet, based on the French-Russian reaction alone, Somary correctly inferred that world war was inevitable. His analysis was that if an insignificant matter excited geopolitical tensions to the boiling point, then larger matters, which inevitably occur, must lead to war. This inference is a perfect example of Bayesian statistics. Somary, in effect, started with a hypothesis about the probability of world war, which in the absence of any information is weighted fifty-fifty. As incidents like the sanjak railway emerge, they are added to the numerator and denominator of the mathematical form of Bayes’ theorem, increasing the odds of war. Contemporary intelligence analysts call these events “indications and warnings.”
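The updating described in this excerpt can be sketched in a few lines of Python. This is only an illustration: the fifty-fifty starting point comes from the passage, but the function name `update` and the likelihood values assigned to each incident are hypothetical numbers chosen for the example, not figures from the book.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) using Bayes' theorem."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical likelihoods for three successive incidents:
# (P(incident | war is coming), P(incident | war is not coming)).
incidents = [(0.8, 0.4), (0.7, 0.3), (0.9, 0.5)]

belief = 0.5          # fifty-fifty prior in the absence of information
history = [belief]
for p_if_war, p_if_no_war in incidents:
    # Yesterday's posterior becomes today's prior.
    belief = update(belief, p_if_war, p_if_no_war)
    history.append(belief)

print([round(b, 3) for b in history])
```

Each incident that is more likely under the "war" hypothesis than under its alternative raises the estimate, so the sequence of beliefs climbs with every new indication.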

This is why central bank and Wall Street equilibrium models produce consistently weak results in forecasting and risk management. Every analysis starts with the same data. Yet when you enter that data into a deficient model, you get deficient output. Investors who use complexity theory can leave mainstream analysis behind and get better forecasting results. The third tool in addition to behavioral psychology and complexity theory is Bayesian statistics, a branch of etiology also referred to as causal inference. Both terms derive from Bayes’ theorem, an equation first described by Thomas Bayes and published posthumously in 1763. A version of the theorem was elaborated independently and more formally by the French mathematician Pierre-Simon Laplace in 1774. Laplace continued work on the theorem in subsequent decades. Twentieth-century statisticians have developed more rigorous forms.

Austrians made invaluable contributions to the study of choice and markets. Yet their emphasis on the explanatory power of money seems narrow. Money matters, but an emphasis on money to the exclusion of psychology is a fatal flaw. Keynesian and monetarist schools have lately merged into the neoliberal consensus, a nightmarish surf and turf presenting the worst of both. In this book, I write as a theorist using complexity theory, Bayesian statistics, and behavioral psychology to study economics. That approach is unique and not yet a “school” of economic thought. This book also uses one other device—history. When asked to identify which established school of economic thought I find most useful, my reply is Historical. Notable writers of the Historical school include the liberal Walter Bagehot, the Communist Karl Marx, and the conservative Austrian-Catholic Joseph A.


The Book of Why: The New Science of Cause and Effect by Judea Pearl, Dana Mackenzie

affirmative action, Albert Einstein, Asilomar, Bayesian statistics, computer age, computer vision, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, Edmond Halley, Elon Musk, en.wikipedia.org, experimental subject, Isaac Newton, iterative process, John Snow's cholera map, Loebner Prize, loose coupling, Louis Pasteur, Menlo Park, pattern recognition, Paul Erdős, personalized medicine, Pierre-Simon Laplace, placebo effect, prisoner's dilemma, probability theory / Blaise Pascal / Pierre de Fermat, randomized controlled trial, selection bias, self-driving car, Silicon Valley, speech recognition, statistical model, Stephen Hawking, Steve Jobs, strong AI, The Design of Experiments, the scientific method, Thomas Bayes, Turing test

She must abandon the centuries-old dogma of objectivity for objectivity’s sake. Where causation is concerned, a grain of wise subjectivity tells us more about the real world than any amount of objectivity. In the above paragraph, I said that “most of” the tools of statistics strive for complete objectivity. There is one important exception to this rule, though. A branch of statistics called Bayesian statistics has achieved growing popularity over the last fifty years or so. Once considered almost anathema, it has now gone completely mainstream, and you can attend an entire statistics conference without hearing any of the great debates between “Bayesians” and “frequentists” that used to thunder in the 1960s and 1970s. The prototype of Bayesian analysis goes like this: Prior Belief + New Evidence → Revised Belief.

We also need to take into account our prior knowledge about the coin.” Did it come from the neighborhood grocery or a shady gambler? If it’s just an ordinary quarter, most of us would not let the coincidence of nine heads sway our belief so dramatically. On the other hand, if we already suspected the coin was weighted, we would conclude more willingly that the nine heads provided serious evidence of bias. Bayesian statistics give us an objective way of combining the observed evidence with our prior knowledge (or subjective belief) to obtain a revised belief and hence a revised prediction of the outcome of the coin’s next toss. Still, what frequentists could not abide was that Bayesians were allowing opinion, in the form of subjective probabilities, to intrude into the pristine kingdom of statistics. Mainstream statisticians were won over only grudgingly, when Bayesian analysis proved a superior tool for a variety of applications, such as weather prediction and tracking enemy submarines.
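The coin example in this excerpt can be worked through numerically. The sketch below assumes nine heads in nine tosses and compares only two hypotheses, a fair coin and a weighted one. The weighted coin's 0.9 chance of heads and the two prior values are assumptions made for illustration; the excerpt gives no numbers.

```python
P_HEADS_FAIR = 0.5
P_HEADS_WEIGHTED = 0.9   # assumed bias of a weighted coin (hypothetical)
N_HEADS = 9              # nine heads in nine tosses


def posterior_weighted(prior_weighted):
    """P(coin is weighted | nine heads in a row) for a given prior."""
    like_weighted = P_HEADS_WEIGHTED ** N_HEADS
    like_fair = P_HEADS_FAIR ** N_HEADS
    numerator = like_weighted * prior_weighted
    return numerator / (numerator + like_fair * (1.0 - prior_weighted))


def next_toss_heads(prior_weighted):
    """Revised prediction that the next toss comes up heads."""
    post = posterior_weighted(prior_weighted)
    return post * P_HEADS_WEIGHTED + (1.0 - post) * P_HEADS_FAIR


# Hypothetical priors: a quarter from the grocery vs. a coin from a shady gambler.
post_grocery = posterior_weighted(0.001)
post_gambler = posterior_weighted(0.5)

print(f"grocery coin: P(weighted) = {post_grocery:.3f}, "
      f"P(next heads) = {next_toss_heads(0.001):.3f}")
print(f"gambler coin: P(weighted) = {post_gambler:.3f}, "
      f"P(next heads) = {next_toss_heads(0.5):.3f}")
```

The same nine heads move the two beliefs very differently: with a skeptical prior the coin is still probably fair, while with an already suspicious prior the evidence makes bias nearly certain.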

Journal of Educational Statistics 12: 101–223. Galton, F. (1869). Hereditary Genius. Macmillan, London, UK. Galton, F. (1883). Inquiries into Human Faculty and Its Development. Macmillan, London, UK. Galton, F. (1889). Natural Inheritance. Macmillan, London, UK. Goldberger, A. (1972). Structural equation models in the social sciences. Econometrica: Journal of the Econometric Society 40: 979–1001. Lindley, D. (1987). Bayesian Statistics: A Review. CBMS-NSF Regional Conference Series in Applied Mathematics (Book 2). Society for Industrial and Applied Mathematics, Philadelphia, PA. McGrayne, S. B. (2011). The Theory That Would Not Die. Yale University Press, New Haven, CT. Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press, New York, NY. Pearl, J. (2015). Trygve Haavelmo and the emergence of causal calculus.


Super Thinking: The Big Book of Mental Models by Gabriel Weinberg, Lauren McCann

affirmative action, Affordable Care Act / Obamacare, Airbnb, Albert Einstein, anti-pattern, Anton Chekhov, autonomous vehicles, bank run, barriers to entry, Bayesian statistics, Bernie Madoff, Bernie Sanders, Black Swan, Broken windows theory, business process, butterfly effect, Cal Newport, Clayton Christensen, cognitive dissonance, commoditize, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, David Attenborough, delayed gratification, deliberate practice, discounted cash flows, disruptive innovation, Donald Trump, Douglas Hofstadter, Edward Lorenz: Chaos theory, Edward Snowden, effective altruism, Elon Musk, en.wikipedia.org, experimental subject, fear of failure, feminist movement, Filter Bubble, framing effect, friendly fire, fundamental attribution error, Gödel, Escher, Bach, hindsight bias, housing crisis, Ignaz Semmelweis: hand washing, illegal immigration, income inequality, information asymmetry, Isaac Newton, Jeff Bezos, John Nash: game theory, lateral thinking, loss aversion, Louis Pasteur, Lyft, mail merge, Mark Zuckerberg, meta analysis, meta-analysis, Metcalfe’s law, Milgram experiment, minimum viable product, moral hazard, mutually assured destruction, Nash equilibrium, Network effects, nuclear winter, offshore financial centre, p-value, Parkinson's law, Paul Graham, peak oil, Peter Thiel, phenotype, Pierre-Simon Laplace, placebo effect, Potemkin village, prediction markets, premature optimization, price anchoring, principal–agent problem, publication bias, recommendation engine, remote working, replication crisis, Richard Feynman, Richard Feynman: Challenger O-ring, Richard Thaler, ride hailing / ride sharing, Robert Metcalfe, Ronald Coase, Ronald Reagan, school choice, Schrödinger's Cat, selection bias, Shai Danziger, side project, Silicon Valley, Silicon Valley startup, speech recognition, statistical model, Steve Jobs, Steve Wozniak, Steven Pinker, survivorship bias, The Present Situation in Quantum Mechanics, 
the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, transaction costs, uber lyft, ultimatum game, uranium enrichment, urban planning, Vilfredo Pareto, wikimedia commons

Bayesians, by contrast, allow probabilistic judgments about any situation, regardless of whether any observations have yet occurred. To do this, Bayesians begin by bringing related evidence to statistical determinations. For example, picking a penny up off the street, you’d probably initially estimate a fifty-fifty chance that it would come up heads if you flipped it, even if you’d never observed a flip of that particular coin before. In Bayesian statistics, you can bring such knowledge of base rates to a problem. In frequentist statistics, you cannot. Many people find this Bayesian way of looking at probability more intuitive because it is similar to how your beliefs naturally evolve. In everyday life, you aren’t starting from scratch every time, as you would in frequentist statistics. For instance, on policy issues, your starting point is what you currently know on that topic—what Bayesians call a prior—and then when you get new data, you (hopefully) update your prior based on the new information.

., the one-hundred-coin-flips example we presented), the confidence intervals calculated should contain the parameter you are studying (e.g., 50 percent probability of getting heads) to the level of confidence specified (e.g., 95 percent of the time). To many people’s dismay, a confidence interval does not say there is a 95 percent chance of the true value of the parameter being in the interval. By contrast, Bayesian statistics analogously produces credible intervals, which do say that; credible intervals specify the current best estimated range for the probability of the parameter. As such, this Bayesian way of doing things is again more intuitive. In practice, though, both approaches yield very similar conclusions, and as more data becomes available, they should converge on the same conclusion. That’s because they are both trying to estimate the same underlying truth.
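To see how close the two kinds of interval can be in practice, here is a minimal sketch for a coin-flip experiment. It assumes 55 heads in 100 flips (a made-up result), uses the normal-approximation confidence interval, and uses a uniform Beta(1, 1) prior for the credible interval, estimated by sampling from the posterior with the standard library rather than with an exact quantile function.

```python
import math
import random

n, heads = 100, 55   # hypothetical data: 55 heads in 100 flips

# Frequentist 95% confidence interval (normal approximation).
p_hat = heads / n
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
confidence_interval = (p_hat - half_width, p_hat + half_width)

# Bayesian 95% credible interval: with a uniform Beta(1, 1) prior the
# posterior is Beta(1 + heads, 1 + tails); take its 2.5% and 97.5% quantiles
# from sorted posterior draws.
rng = random.Random(0)   # fixed seed so the sketch is reproducible
draws = sorted(rng.betavariate(1 + heads, 1 + n - heads) for _ in range(100_000))
credible_interval = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))])

print("confidence interval:", tuple(round(x, 3) for x in confidence_interval))
print("credible interval:  ", tuple(round(x, 3) for x in credible_interval))
```

The numbers nearly coincide here, but only the credible interval supports the statement that the parameter lies in the range with 95 percent probability, and that statement depends on the chosen prior.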

Crowdsourcing has been effective across a wide array of situations, from soliciting tips in journalism, to garnering contributions to Wikipedia, to solving the real-world problems of companies and governments. For example, Netflix held a contest in 2009 in which crowdsourced researchers beat Netflix’s own recommendation algorithms. Crowdsourcing can help you get a sense of what a wide array of people think about a topic, which can inform your future decision making, updating your prior beliefs (see Bayesian statistics in Chapter 5). It can also help you uncover unknown unknowns and unknown knowns as you get feedback from people with previous experiences you might not have had. In James Surowiecki’s book The Wisdom of Crowds, he examines situations where input from crowds can be particularly effective. It opens with a story about how the crowd at a county fair in 1906, attended by statistician Francis Galton, correctly guessed the weight of an ox.


Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth by Stuart Ritchie

Albert Einstein, anesthesia awareness, Bayesian statistics, Carmen Reinhart, Cass Sunstein, citation needed, Climatic Research Unit, cognitive dissonance, complexity theory, coronavirus, correlation does not imply causation, COVID-19, Covid-19, crowdsourcing, deindustrialization, Donald Trump, double helix, en.wikipedia.org, epigenetics, Estimating the Reproducibility of Psychological Science, Growth in a Time of Debt, Kenneth Rogoff, l'esprit de l'escalier, meta analysis, meta-analysis, microbiome, Milgram experiment, mouse model, New Journalism, p-value, phenotype, placebo effect, profit motive, publication bias, publish or perish, race to the bottom, randomized controlled trial, recommendation engine, rent-seeking, replication crisis, Richard Thaler, risk tolerance, Ronald Reagan, Scientific racism, selection bias, Silicon Valley, Silicon Valley startup, Stanford prison experiment, statistical model, stem cell, Steven Pinker, Thomas Bayes, twin studies, University of East Anglia

Doing away with p-values wouldn’t necessarily improve matters; in fact, by introducing another source of subjectivity, it might make the situation a lot worse.26 With tongue only partly in cheek, John Ioannidis has noted that if we remove all such objective measures we invite a situation where ‘all science will become like nutritional epidemiology’ – a scary prospect indeed.27 The same criticism is often levelled at the other main alternative to p-values: Bayesian statistics. Drawing on a probability theorem devised by the eighteenth-century statistician Thomas Bayes, this method allows researchers to take the strength of previous evidence – referred to as a ‘prior’ – into account when assessing the significance of new findings. For instance, if someone tells you their weather forecast predicts a rainy day in London in the autumn, it won’t take too much to convince you that they’re right.

A Bayesian can build all that pre-existing evidence into their initial calculation – in the latter case, they’d require the new forecast to be extraordinarily convincing in order to overturn all the previous meteorological knowledge.28 This isn’t something you can do so easily with p-values, since they’re almost always calculated independently of any prior evidence. However, the Bayesian ‘prior’ is inherently subjective: we can all agree that the Sahara is hot and dry, but how strongly we should believe before a study starts that a particular drug will reduce depression symptoms, or that a specific government policy will boost economic growth, is wholly debatable. Aside from taking prior evidence into account, Bayesian statistics also have other differences from p-values.29 They’re less affected by sample size, for example: statistical power is not a factor because the Bayesian approach is aimed not at detecting the effect of a particular set of conditions, but simply at weighing up the evidence for and against a hypothesis. Arguably, they’re also closer to how people normally reason about statistics. Bayesians say ‘what is the probability my hypothesis is true, given these observations?’

The broader statistical tradition where p-values sit, incidentally, is called frequentist statistics. That’s because, fundamentally, users of p-values are interested in frequencies – most notably the frequency with which you’ll find results with p-values below 0.05 if you run your study an infinite number of times and the hypothesis you’re testing isn’t true.

30. A useful annotated reading list that serves as an introduction to Bayesian statistics is given by Etz et al., ‘How to Become a Bayesian in Eight Easy Steps: An Annotated Reading List’, Psychonomic Bulletin & Review 25, no. 1 (Feb. 2018): 219–34; https://doi.org/10.3758/s13423-017-1317-5. See also Richard McElreath, Statistical Rethinking: A Bayesian Course with Examples in R and Stan, Chapman & Hall/CRC Texts in Statistical Science Series 122 (Boca Raton: CRC Press/Taylor & Francis Group, 2016).


pages: 442 words: 94,734

The Art of Statistics: Learning From Data by David Spiegelhalter

Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Carmen Reinhart, complexity theory, computer vision, correlation coefficient, correlation does not imply causation, dark matter, Edmond Halley, Estimating the Reproducibility of Psychological Science, Hans Rosling, Kenneth Rogoff, meta analysis, meta-analysis, Nate Silver, Netflix Prize, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, speech recognition, statistical model, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus

So the product of the likelihood ratio and the prior odds ends up being around 72,000/1,000,000, which are odds of around 7/100, corresponding to a probability of 7/107 or 7% that he is a cheat. So we should give him the benefit of the doubt at this stage, whereas we might not be so generous with someone we had just met in the pub. And perhaps we should keep a careful eye on the Archbishop.

Bayesian Statistical Inference

Bayes’ theorem, even if it is not permitted in UK courts, is the scientifically correct way to change our mind on the basis of new evidence. Expected frequencies make Bayesian analysis reasonably straightforward for simple situations that involve only two hypotheses, say about whether someone does or does not have a disease, or has or has not committed an offence. However, things get trickier when we want to apply the same ideas to drawing inferences about unknown quantities that might take on a range of values, such as parameters in statistical models.
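The odds-to-probability step in the poker example above can be checked in a few lines. This is a minimal sketch: the function name `odds_to_probability` is ours, and only the product of likelihood ratio and prior odds (72,000/1,000,000) is taken from the excerpt, since the two factors are not quoted separately.

```python
from fractions import Fraction


def odds_to_probability(odds):
    """Convert odds in favour (a : b written as a/b) into a probability a / (a + b)."""
    return odds / (1 + odds)


# Posterior odds quoted in the excerpt: likelihood ratio x prior odds.
posterior_odds = Fraction(72_000, 1_000_000)
posterior_prob = odds_to_probability(posterior_odds)

# The excerpt rounds the odds to 7/100 before converting.
rounded_prob = odds_to_probability(Fraction(7, 100))

print(f"posterior odds        = {float(posterior_odds):.3f}")
print(f"posterior probability = {float(posterior_prob):.3f}")
print(f"rounded version       = {rounded_prob} = {float(rounded_prob):.3f}")
```

Both the exact and the rounded odds give a probability close to 7%, which matches the figure in the text.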

Locators in italics refer to figures and tables A A/B tests 107 absolute risk 31–2, 36–7, 383 adjustment 110, 133, 135, 383 adjuvant therapy 181–5, 183–4 agricultural experiments 105–6 AI (artificial intelligence) 144–5, 185–6, 383 alcohol consumption 112–13, 299–300 aleatory uncertainty 240, 306, 383 algorithms – accuracy 163–7 – biases 179 – for classification 143–4, 148 – complex 174–7 – contests 148, 156, 175, 277–8 see also Titanic challenge – meaning of 383 – parameters 171 – performance assessment 156–63, 176, 177 – for prediction 144, 148 – robustness 178 – sensitivity 157 – specificity 157 – and statistical variability 178–9 – transparency 179–81 allocation bias 85 analysis 6–12, 15 apophenia 97, 257 Arbuthnot, John 253–5 Archbishop of Canterbury 322–3 arm-crossing behaviour 259–62, 260, 263, 268–70, 269 artificial intelligence (AI) 144–5, 185–6, 383 ascertainment bias 96, 383 assessment of statistical claims 368–71 associations 109–14, 138 autism 113 averages 46–8, 383 B bacon sandwiches 31–4 bar charts 28, 30 Bayes, Thomas 305 Bayes factors 331–2, 333, 384 Bayes’ Theorem 307, 313, 315–16, 384 Bayesian hypothesis testing 219, 305–38 Bayesian learning 331 Bayesian smoothing 330 Bayesian statistical inference 323–34, 325, 384 beauty 179 bell-shaped curves 85–91, 87 Bem, Daryl 341, 358–9 Bernoulli distribution 237, 384 best-fit lines 125, 393 biases 85, 179 bias/variance trade-off 169–70, 384 big data 145–6, 384 binary data 22, 385 binary variables 27 binomial distribution 230–6, 232, 235, 385 birth weight 85–91 blinding 101, 385 BMI (body mass index) 28 body mass index (BMI) 28 Bonferroni correction 280, 290–1, 385 boosting 172 bootstrapping 195–203, 196, 198, 200, 202, 208, 229–30, 386 bowel cancer 233–6, 235 Box, George 139 box-and-whisker plots 42, 43, 44, 45 Bradford-Hill, Austin 114 Bradford-Hill criteria 114–17 brain tumours 95–6, 135, 301–3 breast cancer screening 214–16, 215 breast cancer surgery 181–5, 183–4 Brier score 164–7, 386 Bristol Royal 
Infirmary 19–21, 56–8 C Cairo, Alberto 25, 65 calibration 161–3, 162, 386 Cambridge University 110, 111 cancer – breast 181–5, 183–4, 214–16, 215 – lung 98, 114, 266 – ovarian 361 – risk of 31–6 carbonated soft drinks 113 Cardiac Surgical Registry (CSR) 20–1 case-control studies 109, 386 categorical variables 27–8, 386 causation 96–9, 114–17, 128 reverse causation 112–15, 404 Central Limit Theorem 199, 238–9, 386–7 chance 218, 226 child heart surgery see heart surgery chi-squared goodness-of-fittest 271, 272, 387 chi-squared test of association 268–70, 387 chocolate 348 classical probability 217 classification 143–4, 148–54 classification trees 154–6, 155, 168, 174, 387 cleromancy 81 clinical trials 82–3, 99–107, 131, 280, 347 clustering 147 cohort studies 109, 387 coins 308, 309 communication 66–9, 353, 354, 364–5 complex algorithms 138–9 complexity parameters 171 computer simulation 205–7, 208 conclusions 15, 22, 347 conditional probability 214–16 confidence intervals 241–4, 243, 248–51, 250, 271–3, 335–6, 387–8 confirmatory studies 350–1, 388 confounders 110, 135, 388 confusion matrixes 157 continuous variables 46, 388 control groups 100, 389 control limits 234, 389 correlation 96–7, 113 count variables 44–6, 389 counterfactuals 97–8, 389 crime 83–5, 321–2 see also homicides Crime Survey for England and Wales 83–5 cross-sectional studies 108–9 cross-validation 170–1, 389 CSR(Cardiac Surgical Registry) 20–1 D Data 7–12, 15, 22 data collection 345 data distribution see sample distribution data ethics 371 data literacy 12, 389 data science 11, 145–6, 389 data summaries 40 data visualization 22, 25, 65–6, 69 data-dredging 12 death 9 see also mortality; murder; survival rates deduction 76 deep learning 147, 389 dependent events 214, 389 dependent variables 60, 125–6, 389 deterministic models 128–9, 138 dice 205–7, 206, 213 differences between groups of numbers 51–6 distribution 43 DNA evidence 216 dogs 179 Doll, Richard 114 doping 310–13, 311–12, 314, 315–16 
dot-diagrams 42, 43, 44, 45 dynamic graphics 71 E Ears 108–9 education 95–6, 106–7, 131, 135, 178–9 election result predictions 372–6, 375 see also opinion polls empirical distribution 197, 404 enumerative probability 217–18 epidemiology 95, 117, 389 epistemic uncertainty 240, 306, 308, 309, 390 error matrixes 157, 158, 390 errors in coding 345–6 ESP (extra-sensory perception) 341, 358–9 ethics 371 eugenics 39 expectation 231, 390 expected frequencies 32, 209–13, 211, 214–16, 215, 390 explanatory variables 126, 132–5 exploratory studies 350, 390 exposures 114, 390 external validity 82–3, 390 extra-sensory perception (ESP) 341, 358–9 F False discovery rate 280, 390 false-positives 278–80, 390 feature engineering 147, 390 Fermat, Pierre de 207 final odds 316 financial crisis of 2007–2008 139–40 financial models 139–40 Fisher, Ronald 258, 265–6, 336, 345 five-sigma results 281–2 forensic epidemiology 117, 391 forensic statistics 6 framing 391 – of numbers 24–5 – of questions 79–80 fraud 347–50 funnel plots 234, 391 G Gallup, George 81 Galton, Francis 39–40, 58, 121–2, 238–9 gambler’s fallacy 237 gambling 205–7, 206, 213 garden of forking paths 350 Gaussian distribution see normal distribution GDP (Gross Domestic Product) 8–9 gender discrimination 110, 111 Gini index 49 Gombaud, Antoine 205–7 Gross Domestic Product (GDP) 8–9 Groucho principle 358 H Happiness 9 HARKing 351–2 hazard ratios 357, 391 health 169–70 heart attacks 99–104 Heart Protection Study (HPS) 100–2, 103, 273–5, 274, 282–7 heart surgery 19–21, 22–4, 23, 56–8, 57, 93, 136–8, 137 heights 122–5, 123, 124, 127, 134, 201, 202, 243, 275–8, 276 hernia surgery 106 HES (Hospital Episode Statistics) 20–1 hierarchical modelling 328, 391 Higgs bosons 281–2 histograms 42, 43, 44, 45 homicides 1–6, 222–6, 225, 248, 270–1, 272, 287–94 Hospital Episode Statistics (HES) 20–1 hospitals 19–21, 25–7, 26, 56–61, 138 house prices 48, 112–14 HPS (Heart Protection Study) 100–2, 103, 273–5, 274, 282–7 hypergeometric 
distribution 264, 391 hypotheses 256–7 hypothesis testing 253–303, 336, 392 see also Neyman-Pearson Theory; null hypothesis significance testing; P-values I IARC (International Agency for Research in Cancer) 31 icon arrays 32–4, 33, 392 income 47–8 independent events 214, 392 independent variables 60, 126, 392 induction 76–7, 392 inductive behaviour 283 inductive inference 76–83, 78, 239, 392 infographics 69, 70 insurance 180 ‘intention to treat’ principle 100–1, 392 interactions 172, 392 internal validity 80–1, 392 International Agency for Research in Cancer (IARC) 31 inter-quartile range (IQR) 51, 89, 392 IQ 349 IQR (inter-quartile range) 49, 51, 89, 392 J Jelly beans in a jar 40–6, 48, 49, 50 K Kaggle contests 148, 156, 175, 277–8 see also Titanic challenge k-nearest neighbors algorithm 175 L LASSO 172–4 Law of Large Numbers 237, 393 law of the transposed conditional 216, 313 league tables 25, 130–1 see also tables least-squares regression lines 124, 125, 393 left-handedness 113–14, 229–33, 232 legal cases 313, 321, 331–2 likelihood 327, 336, 394 likelihood ratios 314–23, 319–20, 332, 394 line graphs 4, 5 linear models 132, 138 literal populations 91–2 logarithmic scale 44, 45, 394 logistic regression 136, 172, 173, 394 London Underground 24 loneliness 80 long-run frequency probability 218 look elsewhere effect 282 lung cancer 98, 114, 266 lurking factors 113, 135, 394–5 M Machine learning 139, 144–5, 395 mammography 214–16, 215 margins of error 189, 199, 200, 244–8, 395 mean average 46–8 mean squared error (MSE) 163–4, 165, 395 measurement 77–9 meat 31–4 media 356–8 median average 46, 47–8, 51, 89, 395 Méré, Chevalier de 205–7, 213 meta-analysis 102, 104, 395 metaphorical populations 92–3 mode 46, 48, 395 mortality 47, 113–14 MRP (multilevel regression and post-stratification) 329, 396 MSE (mean squared error) 163–4, 165, 395 mu 190 multilevel regression and post-stratification (MRP) 329, 396 multiple linear regression 132–3, 134 multiple regression 135, 136, 
396 multiple testing 278–80, 290, 396 murders 1–6, 222–6, 225, 248, 270–1, 287–94 N Names, popularity of 66, 67 National Sexual Attitudes and Lifestyle Survey (Natsal) 52, 69, 70, 73–5 natural variability 226 neural networks 174 Neyman, Jerzy 242, 283, 335–6 Neyman-Pearson Theory 282–7, 336–7 NHST (null hypothesis significance testing) 266–71, 294–7, 296 non-significant results 299, 346–7, 370 normal distribution 85–91, 87, 226, 237–9, 396–7 null hypotheses 257–65, 336, 397 null hypothesis significance testing (NHST) 266–71, 294–7, 296 O Objective priors 327 observational data 108, 114–17, 128 odds 34, 314, 316 odds ratios 34–6 one-sided tests 264, 397–8 one-tailed P-values 264, 398 opinion polls 82, 245–7, 246, 328–9 see also election result predictions ovarian cancer 361 over-fitting 167–71, 168 P P-hacking 351 P-values 264–5, 283, 285, 294–303, 336, 401 parameters 88, 240, 398 Pascal, Blaise 207 patterns 146–7 Pearson, Egon 242, 283, 336 Pearson, Karl 58 Pearson correlation coefficient 58, 59, 96–7, 126, 398 percentiles 48, 89, 398–9 performance assessment of algorithms 156–67, 176, 177 permutation tests 261–4, 263, 399 personal probability 218–19 pie charts 28, 29 placebo effect 131 placebos 100, 101, 399 planning 13–15, 344–5 Poisson distribution 223–4, 225, 270–1, 399 poker 322–3 policing 107 popes 114 population distribution 86–91, 195, 399 population growth 61–6, 62–4 population mean 190–1, 395 see also expectation populations 74–5, 80–93, 399 posterior distributions 327, 400 power of a test 285–6, 400 PPDAC (Problem, Plan, Data, Analysis, Conclusion) problem-solving cycle 13–15, 14, 108–9, 148–54, 344–8, 372–6, 400 practical significance 302, 400 prayer 107 precognition 341, 358–9 Predict 2.1 182 prediction 144, 148–54 predictive analytics 144, 400 predictor variables 392 pre-election polls see opinion polls presentation 22–7 press offices 355–6 priming 80 prior distributions 327, 400 prior odds 316 probabilistic forecasts 161, 400 probabilities, accuracy 
163–7 probability 10 meaning of 216–22, 400–1 rules of 210–13 and uncertainty 306–7 probability distribution 90, 401 probability theory 205–27, 268–71 probability trees 210–13, 212 probation decisions 180 Problem, Plan, Data, Analysis, Conclusion (PPDAC) problem-solving cycle 13–15, 14, 108–9, 148–54, 344–8, 372–6, 400 problems 13 processed meat 31–4 propensity 218 proportions, comparisons 28–37, 33, 35 prosecutor’s fallacy 216, 313 prospective cohort studies 109, 401 pseudo-random-number generators 219 publication bias 367–8 publication of findings 355 Q QRPs (questionable research practices) 350–3 quartiles 89, 402 questionable research practices (QRPs) 350–3 Quetelet, Adolphe 226 R Race 179 random forests 174 random match probability 321, 402 random observations 219 random sampling 81–2, 208, 220–2 random variables 221, 229, 402 randomization 108, 266 randomization tests 261–4, 263, 399 randomized controlled trials (RCTs) 100–2, 105–7, 114, 135, 402 randomizing devices 219, 220–1 range 49, 402 rate ratios 357, 402 Receiver Operating Characteristic (ROC) curves 157–60, 160, 402 recidivism algorithms 179–80 regression 121–40 regression analysis 125–8, 127 regression coefficients 126, 133, 403 regression modelling strategies 138–40 regression models 171–4 regression to the mean 125, 129–32, 403 regularization 170 relative risk 31, 403 reliability of data 77–9 replication crisis in science 11–12 representative sampling 82 reproducibility crisis 11–12, 297, 342–7, 403 researcher degrees of freedom 350–1 residual errors 129, 403 residuals 122–5, 403 response variables 126, 135–8 retrospective cohort studies 109, 403 reverse causation 112–15, 404 Richard III 316–21 risk, expression of 34 robust measures 51 ROC (Receiver Operating Characteristic) curves 157–60, 160, 402 Rosling, Hans 71 Royal Statistical Society 68, 79 rules for effective statistical practice 379–80 Ryanair 79 S Salmon 279 sample distribution 43 sample mean 190–1, 395 sample size 191, 192–5, 193–4, 
283–7 sampling 81–2, 93 sampling distributions 197, 404 scatter-plots 2–4, 3 scientific research 11–12 selective reporting 12, 347 sensitivity 157–60, 404 sentencing 180 Sequential Probability Ratio Test (SPRT) 292, 293 sequential testing 291–2, 404 sex ratio 253–5, 254, 261, 265 sexual partners 47, 51–6, 53, 55, 73–5, 191–201, 193–4, 196, 198, 200 Shipman, Harold 1–6, 287–94, 289, 293 shoe sizes 49 shrinkage 327, 404 sigma 190, 281–2 signal and the noise 129, 404 significance testing see null hypothesis significance testing Silver, Nate 27 Simonsohn, Uri 349–52, 366 Simpson’s Paradox 111, 112, 405 size of a test 285–6, 405 skewed distribution 43, 405 smoking 98, 114, 266 social acceptability bias 74 social physics 226 Somerton, Francis see Titanic challenge sortilege 81 sortition 81 Spearman’s rank correlation 58–60, 405 specificity 157–9, 405 speed cameras 130, 131–2 speed of light 247 sports doping 310–13, 311–12, 314, 315–16 sports teams 130–1 spread 49–51 SPRT (Sequential Probability Ratio Test) 292, 293 standard deviation 49, 88, 126, 405 standard error 231, 405–6 statins 36–7, 99–104, 273–5, 274, 282–7 statistical analysis 6–12, 15 statistical inference 208, 219, 229–51, 305–38, 323–8, 335, 404 statistical methods 12, 346–7, 379 statistical models 121, 128–9, 404 statistical practice 365–7 statistical science 2, 7, 404 statistical significance 255, 265–8, 270–82, 404 Statistical Society 68 statistics – assessment of claims 368–71 – as a discipline 10–11 – ideology 334–8 – improvements 362–4 – meaning of 404 – publications 16 – rules for effective practice 379–80 – teaching of 13–15 STEP (Study of the Therapeutic Effects of Intercessory Prayer) 107 storytelling 69–71 stratification 110, 383 Streptomycin clinical trial 105, 114 strip-charts 42, 43, 44, 45 strokes 99–104 Student’s t-statistic 275–7 Study of the Therapeutic Effects of Intercessory Prayer (STEP) 107 subjective probability 218–19 summaries 40, 49, 50, 51 supermarkets 112–14 supervised learning 
143–4, 404 support-vector machines 174 surgery – breast cancer surgery 181–5, 183–4 – heart surgery 19–21, 22–4, 23, 56–8, 57, 93, 136–8, 137 – hernia surgery 106 survival rates 25–7, 26, 56–61, 57, 60–1 systematic reviews 102–4 T T-statistic 275–7, 404 tables 22–7, 23 tail-area 231 tea tasting 266 teachers 178–9 teaching of statistics 13–15 technology 1 telephone polls 82 Titanic challenge 148–56, 150, 152–3, 155, 162, 166–7, 172, 173, 175, 176, 177, 277 transposed conditionals, law of 216, 313 trees 7–8 trends 61–6, 62–4, 67 two-sided tests 265, 397–8 two-tailed P-values 265, 398 Type I errors 283–5, 404 Type II errors 283–5, 407 U Uncertainty 208, 240, 306–7, 383, 390 uncertainty intervals 199, 200, 241, 335 unemployment 8–9, 189–91, 271–3 university education 95–6, 135, 301–3 see also Cambridge University unsupervised learning 147, 407 US Presidents 167–9 V Vaccination 113 validity of data 79–83 variability 10, 49–51, 178–9, 407 variables 27, 56–61 variance 49, 407 Vietnam War draft lottery 81–2 violence 113 virtual populations 92 volunteer bias 85 voting age 79–80 W Waitrose 112–14 weather forecasts 161, 164, 165 weight loss 348 ‘When I’m Sixty-Four’ 351–2 wisdom of crowds 39–40, 48, 51, 407 Z Z-scores 89, 407 PELICAN BOOKS Economics: The User’s Guide Ha-Joon Chang Human Evolution Robin Dunbar Revolutionary Russia: 1891–1991 Orlando Figes The Domesticated Brain Bruce Hood Greek and Roman Political Ideas Melissa Lane Classical Literature Richard Jenkyns Who Governs Britain?

CHAPTER 9: Putting Probability and Statistics Together

1 To derive this distribution, we could calculate the probability of two left-handers as 0.2 × 0.2 = 0.04, the probability of two right-handers as 0.8 × 0.8 = 0.64, and so the probability of one of each must be 1 − 0.04 − 0.64 = 0.32.

2 There are important exceptions to this – some distributions have such long, ‘heavy’ tails that their expectations and standard deviations do not exist, and so averages have nothing to converge to.

3 If we can assume that all our observations are independent and come from the same population distribution, the standard error of their average is just the standard deviation of the population distribution divided by the square root of the sample size.

4 We shall see in Chapter 12 that practitioners of Bayesian statistics are happy using probabilities for epistemic uncertainty about parameters.

5 Strictly speaking, a 95% confidence interval does not mean there is a 95% probability that this particular interval contains the true value, although in practice people often give this incorrect interpretation.

6 Both of whom I had the pleasure of knowing in their more advanced years.

7 More precisely, 95% confidence intervals are often set as plus or minus 1.96 standard errors, based on assuming a precise normal sampling distribution for the statistic.

8 With 1,000 participants, the margin of error (in %) is at most ±100/√1,000 = 3%.
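The arithmetic in notes 1, 3 and 8 above is easy to reproduce. The sketch below uses our own function names (`standard_error`, `max_margin_of_error_percent`); the formulas are the ones stated in the notes, and the population standard deviation passed to `standard_error` is a made-up value for illustration only.

```python
import math

# Note 1: two people, each left-handed with probability 0.2.
p_left = 0.2
p_two_left = p_left * p_left                    # 0.2 x 0.2
p_two_right = (1 - p_left) * (1 - p_left)       # 0.8 x 0.8
p_one_each = 1 - p_two_left - p_two_right       # what is left over
# Cross-check with the binomial count: two orderings of (left, right).
p_one_each_direct = 2 * p_left * (1 - p_left)


def standard_error(population_sd, sample_size):
    """Note 3: standard error of an average of independent observations."""
    return population_sd / math.sqrt(sample_size)


def max_margin_of_error_percent(sample_size):
    """Note 8: rough upper bound on the margin of error, in percentage points."""
    return 100 / math.sqrt(sample_size)


print(p_two_left, p_two_right, round(p_one_each, 2))
print(standard_error(population_sd=10.0, sample_size=100))  # hypothetical sd of 10
print(round(max_margin_of_error_percent(1000), 2))
```

For 1,000 participants the bound evaluates to about 3.16 percentage points, which the note rounds to 3%.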


The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns, Aaron Roth

23andMe, affirmative action, algorithmic trading, Alvin Roth, Bayesian statistics, bitcoin, cloud computing, computer vision, crowdsourcing, Edward Snowden, Elon Musk, Filter Bubble, general-purpose programming language, Google Chrome, ImageNet competition, Lyft, medical residency, Nash equilibrium, Netflix Prize, p-value, Pareto efficiency, performance metric, personalized medicine, pre–internet, profit motive, quantitative trading / quantitative finance, RAND corporation, recommendation engine, replication crisis, ride hailing / ride sharing, Robert Bork, Ronald Coase, self-driving car, short selling, sorting algorithm, speech recognition, statistical model, Stephen Hawking, superintelligent machines, telemarketer, Turing machine, two-sided market, Vilfredo Pareto

Differential privacy can also be interpreted as a promise that no outside observer can learn very much about any individual because of that person’s specific data, while still allowing observers to change their beliefs about particular individuals as a result of learning general facts about the world, such as that smoking and lung cancer are correlated. To clarify this, we need to think for a moment about how learning (machine or otherwise) works. The framework of Bayesian statistics provides a mathematical formalization of learning. A learner starts out with some set of initial beliefs about the world. Whenever he observes something, he changes his beliefs about the world. After he updates his beliefs, he now has a new set of beliefs about the world (his posterior beliefs). Differential privacy provides the following guarantee: for every individual in the dataset, and for any observer no matter what their initial beliefs about the world were, after observing the output of a differentially private computation, their posterior belief about anything is close to what it would have been had they observed the output of the same computation run without the individual’s data.
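One way to see how such a guarantee limits belief updating is a small Bayes calculation. The sketch below is an illustration under our own assumptions, not the construction used in the book: it uses randomized response, where each person reports their true answer with probability 0.75 and the opposite answer otherwise. The value 0.75 and all function names are hypothetical. Whatever prior the observer starts with, seeing one person's report can multiply the odds about that person's true answer by at most a fixed factor.

```python
import math

# Hypothetical mechanism: report the truth with probability 0.75, lie otherwise.
P_TRUTH = 0.75
# Privacy parameter implied by this mechanism: ln(0.75 / 0.25) = ln 3.
EPSILON = math.log(P_TRUTH / (1 - P_TRUTH))


def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a yes/no hypothesis."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)


def odds(probability):
    return probability / (1 - probability)


def odds_shift(prior):
    """Factor by which the observer's odds change after seeing a 'yes' report."""
    post = posterior(prior, P_TRUTH, 1 - P_TRUTH)
    return odds(post) / odds(prior)


for prior in (0.01, 0.5, 0.9):
    post = posterior(prior, P_TRUTH, 1 - P_TRUTH)
    print(f"prior={prior:.2f}  posterior={post:.3f}  odds shift={odds_shift(prior):.3f}")

print(f"bound exp(epsilon) = {math.exp(EPSILON):.3f}")
```

The odds shift is the same for every prior and equals exp(ε), which is the sense in which the observer's posterior stays close to what it would have been without that person's report.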

See also p-hacking advantages of machine learning, 190–93 advertising, 191–92 Afghanistan, 50–51 age data, 27–29, 65–66, 86–89 aggregate data, 2, 30–34, 50–51 AI labs, 145–46 alcohol use data, 51–52 algebraic equations, 37 algorithmic game theory, 100–101 Amazon, 60–61, 116–17, 121, 123, 125 analogies, 57–63 anonymization of data “de-anonymizing,” 2–3, 14–15, 23, 25–26 reidentification of anonymous data, 22–31, 33–34, 38 shortcomings of anonymization methods, 23–29 and weaknesses of aggregate data, 31–32 Apple, 47–50 arbitrary harms, 38 Archimedes, 160–62 arms races, 180–81 arrest data, 92 artificial intelligence (AI), 13, 176–77, 179–82 Atari video games, 132 automation, 174–78, 180 availability of data, 1–3, 51, 66–67 averages, 40, 44–45 backgammon, 131 backpropagation algorithm, 9–10, 78–79, 145–46 “bad equilibria,” 95, 97, 136 Baidu, 148–51, 166, 185 bans on data uses, 39 Bayesian statistics, 38–39, 173 behavioral data, 123 benchmark datasets, 136 Bengio, Yoshua, 133 biases and algorithmic fairness, 57–63 and data collection, 90–93 and word embedding, 58–63, 77–78 birth date information, 23 bitcoin, 183–84 blood-type compatibility, 130 board games, 131–32 Bonferroni correction, 149–51, 153, 156, 164 book recommendation algorithms, 117–21 Bork, Robert, 24 bottlenecks, 107 breaches of data, 32 British Doctors Study, 34–36, 39, 51 brute force tasks, 183–84, 186 Cambridge University, 51–52 Central Intelligence Agency (CIA), 49–50 centralized differential privacy, 46–47 chain reaction intelligence growth, 185 cheating, 115, 148, 166 choice, 101–3 Chrome browser, 47–48, 195 classification of data, 146–48, 152–55 cloud computing, 121–23 Coase, Ronald, 159 Coffee Meets Bagel (dating app), 94–97, 100–101 coin flips, 42–43, 46–47 Cold War, 100 collaborative filtering, 23–24, 116–18, 123–25 collective behavioral data, 105–6, 109, 123–24 collective good, 112 collective language, 64 collective overfitting, 136.


Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Geron

Amazon Mechanical Turk, Bayesian statistics, centre right, combinatorial explosion, constrained optimization, correlation coefficient, crowdsourcing, en.wikipedia.org, iterative process, Netflix Prize, NP-complete, optical character recognition, P = NP, p-value, pattern recognition, performance metric, recommendation engine, self-driving car, SpamAssassin, speech recognition, statistical model

Bayes’ theorem

Unfortunately, in a Gaussian mixture model (and many other problems), the denominator p(X) is intractable, as it requires integrating over all the possible values of z (Equation 9-3). This means considering all possible combinations of cluster parameters and cluster assignments.

Equation 9-3. The evidence p(X) is often intractable

This is one of the central problems in Bayesian statistics, and there are several approaches to solving it. One of them is variational inference, which picks a family of distributions q(z; λ) with its own variational parameters λ (lambda), then optimizes these parameters to make q(z) a good approximation of p(z|X). This is achieved by finding the value of λ that minimizes the KL divergence from q(z) to p(z|X), noted DKL(q‖p). The KL divergence equation is shown in Equation 9-4, and it can be rewritten as the log of the evidence (log p(X)) minus the evidence lower bound (ELBO).
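The identity stated above, DKL(q‖p) = log p(X) − ELBO, can be verified numerically when z takes only a few discrete values, so the evidence is a plain sum rather than an intractable integral. This is a toy sketch: the prior, likelihood and q values below are made-up numbers, not taken from the book.

```python
import math

# Toy model with three possible latent values z (all numbers are hypothetical).
prior = [0.5, 0.3, 0.2]          # p(z)
likelihood = [0.10, 0.40, 0.70]  # p(X | z) for the one observed X

joint = [p * l for p, l in zip(prior, likelihood)]  # p(X, z)
evidence = sum(joint)                               # p(X), the denominator
posterior = [j / evidence for j in joint]           # p(z | X) by Bayes' theorem

# An arbitrary approximating distribution q(z).
q = [0.2, 0.3, 0.5]

# KL divergence from q(z) to p(z | X).
kl = sum(qi * math.log(qi / pi) for qi, pi in zip(q, posterior))

# Evidence lower bound: E_q[log p(X, z)] - E_q[log q(z)].
elbo = sum(qi * (math.log(ji) - math.log(qi)) for qi, ji in zip(q, joint))

print(f"log p(X)        = {math.log(evidence):.6f}")
print(f"ELBO            = {elbo:.6f}")
print(f"KL(q || p)      = {kl:.6f}")
print(f"log p(X) - ELBO = {math.log(evidence) - elbo:.6f}")
```

Because the KL divergence is never negative, the ELBO is always at most log p(X), and maximizing the ELBO over q is equivalent to minimizing the KL divergence.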

A simpler approach to maximizing the ELBO is called black box stochastic variational inference (BBSVI): at each iteration, a few samples are drawn from q and they are used to estimate the gradients of the ELBO with regard to the variational parameters λ, which are then used in a gradient ascent step. This approach makes it possible to use Bayesian inference with any kind of model (provided it is differentiable), even deep neural networks: this is called Bayesian deep learning.

Tip: If you want to dive deeper into Bayesian statistics, check out the Bayesian Data Analysis book by Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin.

Gaussian mixture models work great on clusters with ellipsoidal shapes, but if you try to fit a dataset with different shapes, you may have bad surprises. For example, let’s see what happens if we use a Bayesian Gaussian mixture model to cluster the moons dataset (see Figure 9-24):

Figure 9-24. moons_vs_bgm_diagram

Oops, the algorithm desperately searched for ellipsoids, so it found 8 different clusters instead of 2.
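The sample-then-ascend loop described for BBSVI can be sketched on a model small enough to check by hand. Everything here is an assumption for illustration: the model (z ~ N(0, 1), x | z ~ N(z, 1), one observation x = 2), the fixed-variance Gaussian q, and the use of the reparameterization estimator for the gradient, since the excerpt does not say which estimator is used. For this model the exact posterior mean is x / 2 = 1, which gives a target to compare against.

```python
import random

random.seed(0)

X_OBS = 2.0            # hypothetical single observation
SIGMA_Q = 0.5 ** 0.5   # q(z) = N(mu, 0.5); only mu is optimized in this sketch


def grad_log_joint(z):
    """d/dz of log p(z) + log p(x | z) for z ~ N(0, 1), x | z ~ N(z, 1)."""
    return -z + (X_OBS - z)


def fit_mu(steps=2000, n_samples=50, learning_rate=0.01):
    """Stochastic gradient ascent on the ELBO with respect to mu."""
    mu = 0.0
    for _ in range(steps):
        grad = 0.0
        for _ in range(n_samples):
            # Draw z from q by shifting and scaling standard normal noise.
            z = mu + SIGMA_Q * random.gauss(0.0, 1.0)
            # The entropy of q does not depend on mu, so only the
            # log-joint term contributes to the gradient.
            grad += grad_log_joint(z)
        mu += learning_rate * grad / n_samples
    return mu


mu_hat = fit_mu()
print(f"estimated posterior mean: {mu_hat:.3f} (exact value: 1.0)")
```

Each step uses a noisy gradient estimate from a handful of samples, so the result fluctuates around the exact value rather than hitting it exactly; more samples or a smaller learning rate reduce that noise.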


Dinosaurs Rediscovered by Michael J. Benton

All science is either physics or stamp collecting, Bayesian statistics, biofilm, bioinformatics, David Attenborough, Ernest Rutherford, germ theory of disease, Isaac Newton, lateral thinking, North Sea oil, nuclear winter

With colleagues Manabu Sakamoto and Chris Venditti from the University of Reading, we put together an even larger supertree of all dinosaur species, and dated it as accurately as we could. We then ran calculations to work out whether speciation and extinction rates were stable, rising, or falling through the Mesozoic. We were looking for one of three possible outcomes: that overall the balance of speciation and extinction gave ever-rising values, or levelling off, or declining values. We used Bayesian statistical methods, which involve seeding the calculations with a starting model, and then running the data millions or billions of times to assess how well the starting model fits the data, allowing for every possible source of uncertainty, and repeatedly adjusting the model to make it fit better. In this case, Manabu modelled uncertainty about dating the rocks, gaps in the record, accuracy of the phylogenetic tree, and many other issues.

McNeill 215, 216, 218, 228–29, 234, 252 Allen, Percy 73 alligators 118, 164–65, 194 Allosaurus 49, 121, 188 animated skin of 250 diet 206 fact file 188–89 feeding mechanisms 186–88, 190–91, 193, 193 medullary bone 145 Morrison Formation 69, 71 movement 248 skulls 17–18, X teeth and bite force 188, 189, 192, 196 Alvarez, Luis 259–62, 260, 264, 267, 285, 286 Alvarez, Walter 259, 260, 261–62, 264 amber dinosaurs preserved in 131–32, VI extracting DNA from fossils in 136, 137 American Museum of Natural History (AMNH) 54, 156, 166, 243 American National Science Foundation 52 Amherst College Museum, Connecticut 223, 224–25, 227 Amphicoelias 206 analogues, modern 16 Anatosaurus 221, 221 Anchiornis 68–69, 70, V fact file 70 feathers 125, 126 flight 245 footprints 224–25, 225 angiosperms 78–79 animation 249–52, 251 Ankylosaurus 65, 79, 272 extinction 276 fact file 272–73 Hell Creek Formation 270 use of arms and legs 236 Anning, Mary 195 apatite 142 Apatosaurus 206 Archaeopteryx 110, 112, IV as ‘missing link’ fossil 114, 121 fact file 112–13 flight 114, 124, 247 Richard Owen and 111, 114 skeleton found at Solnhofen 111, 277 archosauromorphs 35–36, 37 archosaurs 16, 21–22, 35, 39, 56 Armadillosuchus 201 Asaro, Frank 259 Asilisaurus 32–33 asteroid impact 254–69, 275–76, 280, 281, 286–87, XIX Attenborough, David 98, 213 B Bakker, Bob 109–10, 115, 126 asteroid impact and extinction 262 Deinonychus 110, 111, 221, 244–45 dinosaurs as warm-blooded creatures 109, 116, 117 modern birds as dinosaurs 110 speed of dinosaurs 230 validity of Owen’s Dinosauria 57, 59 Baron, Matt 80–83 Barosaurus 206 Barreirosuchus 201 Barrett, Paul 80–83 Baryonyx 193 Bates, Karl 192 Bayesian statistical methods 273, 275 BBC Horizon 229, 264–65 Walking with Dinosaurs 249–52, 251 beetles 78, 139, 204 Beloc, Haiti 265–66, 265 Bernard Price Palaeontological Institute 160, 163 Bernardi, Massimo 43, 46 biodiversity, documenting 52 bioinformatics 52 bipedal dinosaurs arms and legs 235–40 early images of 219–21 
movement and posture 221–22, 222, 249 speed 228 Bird, Roland T. 242–43 birds 145 brains 129 breathing 118 eggs 155, 158, 159, 166 evolution of 277, 278–79, 279–81, 280 feathers 125–26, 127 flight 244, 247, 248 gastroliths 194 growth 174 identifying ancestral genetic sequences 151–52 intelligence 128 as living dinosaurs 110–15, 118, 120–21, 124, 132 and the mass extinction 277–81 medullary bone 143, 145 Mesozoic birds from China 118–24 movement 234 sexual selection 126 using feet to hold prey down 235, 235 bite force 191–94 blood, identifying dinosaur 141–43 Bonaparte, José 239 bones 99 age of 155 bone histology 116–18, 119 bone remodelling 116–17 casting 100 composition 142 excavating from rock 87–99, 105 extracting blood from 141–42 first found 65 first illustrated 65 growth lines 116, 117, 154–55, 170, 172–73, 184 how dinosaurs’ jaws worked 186 mapping 93–94 reconstructing 99–101 structures 170, XIII Brachiosaurus 49, 69, 178–79 diet 206, 207–8 fact file 178–79 Morrison Formation 69 size 175 bracketing 15–17 brain size 128–30, XI, XII breakpoint analysis 42, 43 breathing 118 Bristol City Museum 104 Bristol Dinosaur Project 101–4 British Museum, London 111, 114 Brontosaurus 69, 225 Brookes, Richard 65 Brown, Barnum 273 Brusatte, Steve 32, 36–37, 39 bubble plots 42, 43 Buckland, William 67, 195 Buckley, Michael 142 Burroughs, Edgar Rice, The Land that Time Forgot 134 Butler, Richard 32 Button, David 208, 213 C Camarasaurus 175, 206, 208–9, 209, 213, IX Cano, Raúl 136 Carcharodontosaurus 196 Carnegie, Andrew 211 Carnian Pluvial Episode 40, 42, 43, 45, 46, 50 carnivores 201 see also individual dinosaurs Carnotaurus 201, 238, 239, 240 fact file 239 carotenoids 124 cartilage 142 Caudipteryx 121, 123 fact file 123 Centrosaurus 87, 88 fact file 88–89 ceratopsians 79, 143, 156 diversity of 272, 275 use of arms and legs 236 Ceratosaurus 69, 71, 187, 206 Cetiosaurus 57, 66 Chapman Andrews, Roy 156, 166 Charig, Alan 22–23, 34, 39 Chasmosaurus 87 Chen, Pei-ji 121 Chicxulub 
crater, Mexico 264–68, 267, 285, 286 Chin, Karen 195, 204 China Jurassic dinosaurs 68–71 Mesozoic birds from 118–24 Chinsamy-Turan, Anusuya 145 chitin 139 chromosomes 151–52 Chukar partridges 248 clades 55, 82, 110 cladistics 53–55, 82–83 cladograms 55, 56 Clashach, Scotland 85, 86 classic model 21, 21 classification, evolutionary trees 52–84, 60–61 climate climate change 22, 40, 41, 43 Cretaceous 269 identifying ancient 46–47 Late Triassic 40, 41, 43, 49 Triassic Period 48, 49 cloning 134–35, 137, 148–51, 150 Coelophysis 193, 236, I, X Colbert, Ned 22, 23, 34 Romer-Colbert ecological relay model 22, 35, 36, 39–40 size and core temperature 118 cold-blooded animals 116 collagen 142, 143 colour of dinosaurs 124–25 of feathers 8–10, 17, 139, V computational methods 35–39 Conan Doyle, Sir Arthur, The Lost World 133–34, 133, 135 Confuciusornis 144, 145, 147, XIII fact file 146–47 conifers 22, 131, 197, III Connecticut Valley 223–26, 224–25, 227, 243 contamination of DNA 138 continental plates 47 Cope, Edward 208 coprolites 195, 195, 197, 204 coprophagy 204 crests 126, 128, 143 Cretaceous 50, 71–75 birds 277–78 climate 269 decline of dinosaurs 274, 275 dinosaur evolution rates 77 ecosystems 205 in North America 240–42 ornithopods 71 sauropods 71 see also Early Cretaceous; Late Cretaceous Cretaceous–Palaeogene boundary 260, 261–62, 265–66, 269 evolution of birds 276, 277, 278–79 Cretaceous Terrestrial Revolution 77–80, 131 Crichton, Michael, Jurassic Park 134–35, 136 criticism and scientific method 287–88 crocodiles 218 Adamantina Formation food web 201–3 eggs and babies 155, 159, 164, 165 feeding methods 194 function of the snout 193 crurotarsans 39 CT (computerized tomographic) scanning 97, 99 dinosaur embryos 160, 162 dinosaur skulls 163, 191 Currie, Phil 86, 91, 121 Cuvier, Georges 257 D Dal Corso, Jacopo 40 Daohugou Bed, China 68 Darwin, Charles 23, 107, 114, 132, 287 Daspletosaurus 170, 171 dating dinosaurian diversification 44–46 de-extinction science 149, 151 
death of dinosaurs see extinction Deccan Traps 268, 285, 287 Deinonychus 112, 114, 121 fact file 112–13 John Ostrom’s monograph on 110, 111, 113, 116, 244–45 movement 221 dentine 196, 197 Dial, Ken 248 diet collapsing food webs 204–5 dinosaur food webs 201–4 fossil evidence for 194–95 microwear on teeth and diet 199–201 niche division and specialization in 205–13 digital models 17, 18, 19, 191–94, 231–34, 249, 252 dimorphism, sexual 126, 143 dinomania 107 Dinosaur Park Formation, Drumheller 86, 91–99, 100 Dinosaur Provincial Park, Alberta 86, 87, 91–92, 91 Dinosaur Ridge, Colorado 240 Dinosauria 33, 55, 82, 107 discovery of the clade 57–59 Diplodocus 175, 210–11, II diet 207, 208–9, 213 fact file 210–11 Morrison Formation 69 skulls IX teeth and bite force 209, 213 diversification of dinosaurs 29, 44–46 DNA (deoxyribonucleic acid) 134–35 cloning 148–51 dinosaurian genome 151–52 extracting from fossils in amber 136 extracting from museum skins and skeletons 138 identifying dinosaur 136–37 survival of in fossils 138–39, 141 Doda, Bajazid 180 Dolly the sheep 148, 149 Dromaeosaurus 87, 121 duck-billed dinosaurs see hadrosaurs dung beetles 204 dwarf dinosaurs 180–84 Dysalotosaurus 145 Dzik, Jerzy 29, 31 E Early Cretaceous diversity of species on land and in sea 78 Jehol Beds 124 Wealden 72–74, 74, 75, 78 ecological relay model 21, 22, 35, 36, 39 ecology, and the origin of dinosaurs 23–25 education, using dinosaurs in 101–4 eggs, birds 155, 158, 159, 166 eggs, dinosaur 154, 155–56 dinosaur embryos 160–63 nests and parental care 163–67 size of 158–59 El Kef, Tunisia 276 Elgin, Scotland 25–26, 26, 34, 85–86 embryos, dinosaur 154, 160–63 enamel, tooth 196, 197 enantiornithines 277–78 encephalization quotient (EQ) 130 engineering models 17–18 Eoraptor 29 Erickson, Greg 154–55, 170, 172–73, 184–85, 197 eumelanin 124 eumelanosomes V Euoplocephalus 87, 88 fact file 88–89 Europasaurus 117 European Synchrotron Radiation Facility (ESRF) 162 evolution 13, 23, 40 evolutionary trees 
52–84, 60–61, 281 Richard Owen’s views on 106–7, 114 size and 181, 184 Evolution (journal) 109 excavations 87–99 Dinosaur Park Formation 86, 91–99, 100 recording 92–97 extant phylogenetic bracket 16, 217 external fundamental system (EFS) 170 extinction Carnian Pluvial Episode 40, 42, 43, 45, 46, 50 end-Triassic event 64 mass extinction 254–85 Permian–Triassic mass extinction 14, 33–34, 46, 222 sudden or gradual 270–75 eyes 100 F faeces, fossil 194, 195, 197, 204 Falkingham, Peter 192, 226 feathers 99, 245 in amber 131, VI bird feathers 125–26, 127 colour of 8–10, 17, 139, V as insulation 126 melanosomes 8–10, 8, 17, 124–25, 132, V sexual signalling 126, 128, 143 Sinosauropteryx 8–9, 8, 10, 17, 119, 120–21, 125, 126 Field, Dan 279, 281 films, dinosaurs in 249–52 Jurassic Park 134–35, 136, 217, 252 finding dinosaurs 87–105 finite element analysis (FEA) 18, 190–91, 199, 208 fishes 128, 159, 163–64, 196 flight 244–49 flowering plants 78–79, III food webs 71–75, 201–4 Adamantina Formation 201–4, 202–3 collapsing 204–5 Wealden 74, 75 footprints 223–27, 240 megatracksites 242 photogrammetry 94 swimming tracks 242, 243 fossils casting 100 extracting skeletons from 94–99, 105 plants 269 reconstructing 99–101 scanning 97, 99 survival of organic molecules in 138–39, 141 Framestore 249–50 Froude, William 228–29 G Galton, Peter 58, 59, 110, 115, 221, 221 Garcia, Mariano 232, 234 gastroliths 194 Gatesy, Stephen 226, 231 gaur 148–49 Gauthier, Jacques 53, 59, 245 genetic engineering, bringing dinosaurs back to life with 148–51 genome, dinosaurian 151–52 geological time scale 6–7, 44–45 gharials 193, 194 gigantothermy 117, 118 Gill, Pam 199 glasses, impact 265–66, 269 gliding 245, 247, 248 Gorgosaurus 87, 170, 171 Granger, Walter 157 Great Exhibition (1851) 107, 108 Gregory, William 157 Grimaldi, David 131 growth dwarf dinosaurs 180–84 growth rates 154, 170–74, 184 growth rings 116, 117, 154–55, 170, 172–73, 184 growth spurts 145 how dinosaurs could be so huge 175–79 Gryposaurus 87 
Gubbio, Italy 260, 261–62, 265, 266, 286 H hadrosaurs 79, 143 Dinosaur Park Formation 91–99, 100 diversity of 272, 275 first skeleton 218–19, 220 teeth 196–97, 198, 201, XVIII use of arms and legs 236 Hadrosaurus foulkii 220 Haiti 265–66, 265 Haldane, J.


pages: 294 words: 81,292

Our Final Invention: Artificial Intelligence and the End of the Human Era by James Barrat

AI winter, AltaVista, Amazon Web Services, artificial general intelligence, Asilomar, Automated Insights, Bayesian statistics, Bernie Madoff, Bill Joy: nanobots, brain emulation, cellular automata, Chuck Templeton: OpenTable:, cloud computing, cognitive bias, commoditize, computer vision, cuban missile crisis, Daniel Kahneman / Amos Tversky, Danny Hillis, data acquisition, don't be evil, drone strike, Extropian, finite state, Flash crash, friendly AI, friendly fire, Google Glasses, Google X / Alphabet X, Isaac Newton, Jaron Lanier, John Markoff, John von Neumann, Kevin Kelly, Law of Accelerating Returns, life extension, Loebner Prize, lone genius, mutually assured destruction, natural language processing, Nicholas Carr, optical character recognition, PageRank, pattern recognition, Peter Thiel, prisoner's dilemma, Ray Kurzweil, Rodney Brooks, Search for Extraterrestrial Intelligence, self-driving car, semantic web, Silicon Valley, Singularitarianism, Skype, smart grid, speech recognition, statistical model, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, strong AI, Stuxnet, superintelligent machines, technological singularity, The Coming Technological Singularity, Thomas Bayes, traveling salesman, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, zero day

But by the time the tragedy unfolded, Holtzman told me, Good had retired. He was not in his office but at home, perhaps calculating the probability of God’s existence. According to Dr. Holtzman, sometime before he died, Good updated that probability from zero to point one. He did this because as a statistician, he was a long-term Bayesian. Named for the eighteenth-century mathematician and minister Thomas Bayes, Bayesian statistics’ main idea is that in calculating the probability of some statement, you can start with a personal belief. Then you update that belief as new evidence comes in that supports your statement or doesn’t. If Good’s original disbelief in God had remained 100 percent, no amount of data, not even God’s appearance, could change his mind. So, to be consistent with his Bayesian perspective, Good assigned a small positive probability to the existence of God to make sure he could learn from new data, if it arose.
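The point about a zero prior can be checked with a few lines of arithmetic. The sketch below is only an illustration of Bayes's rule, not anything taken from the book: the function names and the likelihood values 0.99 and 0.01 are hypothetical, and the only number borrowed from the excerpt is the prior of 0.1.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | E) given P(H), P(E | H) and P(E | not H)."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    if evidence == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return prior * likelihood_if_true / evidence


def update_repeatedly(prior, likelihood_if_true, likelihood_if_false, n):
    """Apply the same piece of evidence n times, feeding each posterior back in as the prior."""
    belief = prior
    for _ in range(n):
        belief = bayes_update(belief, likelihood_if_true, likelihood_if_false)
    return belief


# Hypothetical likelihoods: the evidence is 99 times more probable if the statement is true.
stuck = update_repeatedly(0.0, 0.99, 0.01, 50)   # a prior of exactly zero never moves
moved = update_repeatedly(0.1, 0.99, 0.01, 3)    # a small positive prior can be revised upward
```

A prior of exactly zero multiplies every likelihood by zero, so the posterior stays at zero no matter how strong the evidence is; any positive prior, however small, leaves room to learn.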

Aboujaoude, Elias accidents AI and, see risks of artificial intelligence nuclear power plant Adaptive AI affinity analysis agent-based financial modeling “Age of Robots, The” (Moravec) Age of Spiritual Machines, The: When Computers Exceed Human Intelligence (Kurzweil) AGI, see artificial general intelligence AI, see artificial intelligence AI-Box Experiment airplane disasters Alexander, Hugh Alexander, Keith Allen, Paul Allen, Robbie Allen, Woody AM (Automatic Mathematician) Amazon Anissimov, Michael anthropomorphism apoptotic systems Apple iPad iPhone Siri Arecibo message Aristotle artificial general intelligence (AGI; human-level AI): body needed for definition of emerging from financial markets first-mover advantage in jump to ASI from; see also intelligence explosion by mind-uploading by reverse engineering human brain time and funds required to develop Turing test for artificial intelligence (AI): black box tools in definition of drives in, see drives as dual use technology emotional qualities in as entertainment examples of explosive, see intelligence explosion friendly, see Friendly AI funding for jump to AGI from Joy on risks of, see risks of artificial intelligence Singularity and, see Singularity tight coupling in utility function of virtual environments for artificial neural networks (ANNs) artificial superintelligence (ASI) anthropomorphizing gradualist view of dealing with jump from AGI to; see also intelligence explosion morality of nanotechnology and runaway Artilect War, The (de Garis) ASI, see artificial superintelligence Asilomar Guidelines ASIMO Asimov, Isaac: Three Laws of Robotics of Zeroth Law of Association for the Advancement of Artificial Intelligence (AAAI) asteroids Atkins, Brian and Sabine Automated Insights availability bias Banks, David L. Bayes, Thomas Bayesian statistics Biden, Joe biotechnology black box systems Blue Brain project Bok globules Borg, Scott Bostrom, Nick botnets Bowden, B. V. 
brain augmentation of, see intelligence augmentation basal ganglia in cerebral cortex in neurons in reverse engineering of synapses in uploading into computer Brautigan, Richard Brazil Brooks, Rodney Busy Child scenario Butler, Samuel CALO (Cognitive Assistant that Learns and Organizes) Carr, Nicholas cave diving Center for Applied Rationality (CFAR) Chandrashekar, Ashok chatbots chess-playing computers Deep Blue China Chinese Room Argument Cho, Seung-Hui Church, Alonso Churchill, Winston Church-Turing hypothesis Clarke, Arthur C.


pages: 283 words: 81,376

The Doomsday Calculation: How an Equation That Predicts the Future Is Transforming Everything We Know About Life and the Universe by William Poundstone

Albert Einstein, anthropic principle, Any sufficiently advanced technology is indistinguishable from magic, Arthur Eddington, Bayesian statistics, Benoit Mandelbrot, Berlin Wall, bitcoin, Black Swan, conceptual framework, cosmic microwave background, cosmological constant, cosmological principle, cuban missile crisis, dark matter, digital map, discounted cash flows, Donald Trump, Doomsday Clock, double helix, Elon Musk, Gerolamo Cardano, index fund, Isaac Newton, Jaron Lanier, Jeff Bezos, John Markoff, John von Neumann, mandelbrot fractal, Mark Zuckerberg, Mars Rover, Peter Thiel, Pierre-Simon Laplace, probability theory / Blaise Pascal / Pierre de Fermat, RAND corporation, random walk, Richard Feynman, ride hailing / ride sharing, Rodney Brooks, Ronald Reagan, Ronald Reagan: Tear down this wall, Sam Altman, Schrödinger's Cat, Search for Extraterrestrial Intelligence, self-driving car, Silicon Valley, Skype, Stanislav Petrov, Stephen Hawking, strong AI, Thomas Bayes, Thomas Malthus, time value of money, Turing test

The Copernican method is like the sleek tech gadget that comes with intelligent defaults. The Carter-Leslie argument promises to be more customizable, more suited to those who like to tinker. Gott’s 1993 article does not mention Bayes’s theorem or prior probabilities. For some Nature readers that was a great sin. I asked Gott why he omitted Bayes, and he had a quick answer: “Bayesians.” “I didn’t put any Bayesian statistics in this paper because I didn’t want to muddy the waters,” he explained. “Because Bayesian people will argue about their priors, endlessly. I had a falsifiable hypothesis.” The long-standing complaint is that prior probabilities are subjective. A Bayesian prediction can be a case of garbage in, garbage out. There is plenty of scope to slant the results to one’s liking, and to wrap them up in the flag of impartial mathematics.

Twenty-Four Dogs in Albuquerque 1. “incredibly irresponsible”; “Anybody can see it’s garbage”: Caves interview, December 12, 2017. 2. “Gott dismisses the entire process”: Caves 2000, 2. 3. “it was important to find”: Caves 2000, 2. 4. “a notarized list of…24 dogs”: Caves 2000, 15. 5. “Gott is on record as applying”: Caves 2008, 2. 6. “We can distinguish two forms”: Bostrom 2002, 89. 7. “I didn’t put any Bayesian statistics”: Gott interview, July 31, 2017. 8. “When you can’t identify any time scales”: Caves 2008, 11. 9. “No other formula in the alchemy of logic”: Keynes 1921, 89. 10. Goodman’s objection to Gott: Goodman 1994. 11. Jeffreys prior compatible with location- and scale-invariance: This fact was demonstrated not by Jeffreys but by Washington University physicist E. T. Jaynes. See Jaynes 1968. 12.


pages: 319 words: 90,965

The End of College: Creating the Future of Learning and the University of Everywhere by Kevin Carey

Albert Einstein, barriers to entry, Bayesian statistics, Berlin Wall, business cycle, business intelligence, carbon-based life, Claude Shannon: information theory, complexity theory, David Heinemeier Hansson, declining real wages, deliberate practice, discrete time, disruptive innovation, double helix, Douglas Engelbart, Downton Abbey, Drosophila, Firefox, Frank Gehry, Google X / Alphabet X, informal economy, invention of the printing press, inventory management, John Markoff, Khan Academy, Kickstarter, low skilled workers, Lyft, Marc Andreessen, Mark Zuckerberg, meta analysis, meta-analysis, natural language processing, Network effects, open borders, pattern recognition, Peter Thiel, pez dispenser, ride hailing / ride sharing, Ronald Reagan, Ruby on Rails, Sand Hill Road, self-driving car, Silicon Valley, Silicon Valley startup, social web, South of Market, San Francisco, speech recognition, Steve Jobs, technoutopianism, transcontinental railway, uber lyft, Vannevar Bush

In describing how the brain reacts to surprise, Lue said that “everything is a function of risk and opportunity.” To survive and prosper in the world with limited cognitive capacity, humans filter waves of constant sensory information through neural patterns—heuristics and mental shortcuts that our minds use to weigh the odds that what we are sensing is familiar and categorizable based on our past experience. Sebastian Thrun’s self-driving car does this with Bayesian statistics built into silicon and code, while the human mind uses electrochemical processes that we still don’t fully understand. But the underlying principle is the same: Based on the pattern of lines and shapes and edges, that is probably a boulder and I should drive around it. That is probably a group of three young women eating lunch at a table near the sushi bar and I should pay them no mind. Heuristics are also critically important to the market for higher education.

., 90–91, 98 Air Force, 91 Artificial Intelligence (AI), 11, 79, 136, 153, 159, 170, 264n Adaptive Control of Thought—Rational (ACT-R) model for, 101–4 cognitive tutoring using, 103, 105, 138, 179, 210 Dartmouth conference on, 79, 101 learning pathways for, 155 personalized learning with, 5, 232 theorem prover based in, 110 Thrun’s work in, 147–50 Arum, Richard, 9, 10, 36, 85, 244 Associate’s degrees, 6, 61, 117, 141, 193, 196, 198 Atlantic magazine, 29, 65, 79, 123 AT&T, 146 Australian National University, 204 Bachelor’s degrees, 6–9, 31, 36, 60–61, 64 for graduate school admission, 30 percentage of Americans with, 8, 9, 57, 77 professional versus liberal arts, 35 required for public school teachers, 117 social mobility and, 76 time requirement for, 6, 22 value in labor market of, 58 Badges, digital, 207–12, 216–18, 233, 245, 248 Barzun, Jacques, 32–34, 44, 45, 85 Bayesian statistics, 181 Bell Labs, 123–24 Bellow, Saul, 59, 78 Berlin, University of, 26, 45-46 Bhave, Amol, 214–15 Bing, 212 Binghamton, State University of New York at, 183–84 Bishay, Shereef, 139, 140 Bloomberg, Michael, 251 Blue Ocean Strategy (Kim and Mauborgne), 130 Bologna, University of, 16–17, 21, 41 Bonn, University of, 147 Bonus Army, 51 Borders Books, 127 Boston College, 164, 175 Boston Gazette, 95 Boston Globe, 2 Boston University (BU), 59, 61–62, 64 Bowen, William G., 112–13 Bowman, John Gabbert, 74–75 Brigham Young University, 2 Brilliant, 213 British Army, 98 Brookings Institution, 54 Brooklyn College, 44 Brown v.


pages: 397 words: 102,910

The Idealist: Aaron Swartz and the Rise of Free Culture on the Internet by Justin Peters

4chan, activist lawyer, Any sufficiently advanced technology is indistinguishable from magic, Bayesian statistics, Brewster Kahle, buy low sell high, crowdsourcing, disintermediation, don't be evil, global village, Hacker Ethic, hypertext link, index card, informal economy, information retrieval, Internet Archive, invention of movable type, invention of writing, Isaac Newton, John Markoff, Joi Ito, Lean Startup, moral panic, Paul Buchheit, Paul Graham, profit motive, RAND corporation, Republic of Letters, Richard Stallman, selection bias, semantic web, Silicon Valley, social web, Steve Jobs, Steven Levy, Stewart Brand, strikebreaker, Vannevar Bush, Whole Earth Catalog, Y Combinator

So we need an algorithm or computer program that would encourage lots of people to identify the fights and to start the campaigns,” McLean told the Sydney Morning Herald in 2014. “We’d put the tools that we have at our disposal in their hands.”32 Swartz had actually been building tools like these for several months with his colleagues at ThoughtWorks. Victory Kit, as the project was called, was an open-source version of the expensive community-organizing software used by groups such as MoveOn. Victory Kit incorporated Bayesian statistics—an analytical method that gets smarter as it goes along by consistently incorporating new information into its estimates—to improve activists’ ability to reach and organize their bases. “In the end, a lot of what the software was about was doing quite sophisticated A/B testing of messages for advocacy,” remembered Swartz’s friend Nathan Woodhull.33 Swartz was scheduled to present Victory Kit to the group at the Holmes retreat.
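The excerpt does not say how Victory Kit applied Bayesian statistics to message testing, so the following is only a generic sketch of one common approach, a Beta-Binomial A/B comparison. Every name here (`beta_posterior`, `prob_b_beats_a`), the uniform Beta(1, 1) prior, and the response counts are assumptions made for illustration.

```python
import random


def beta_posterior(successes, trials, prior_alpha=1.0, prior_beta=1.0):
    """Beta posterior parameters for a response rate after binomial data."""
    if successes < 0 or successes > trials:
        raise ValueError("successes must be between 0 and trials")
    return prior_alpha + successes, prior_beta + (trials - successes)


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def prob_b_beats_a(post_a, post_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under the two posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        if rng.betavariate(*post_b) > rng.betavariate(*post_a):
            wins += 1
    return wins / draws


# Hypothetical counts: message A got 10 responses from 100 sends, message B got 30 from 100.
post_a = beta_posterior(10, 100)
post_b = beta_posterior(30, 100)
chance_b_is_better = prob_b_beats_a(post_a, post_b)
```

Because each posterior can serve as the prior for the next batch of sends, the estimate is revised as responses arrive, which is the sense in which such a method "gets smarter as it goes along."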

Ashcroft, 137–38, 140 FBI file on, 191–92, 223 fleeing the system, 8, 145, 151, 158–59, 161, 171, 173, 193, 248, 267 and free culture movement, 3–4, 141, 152–55, 167, 223 and Harvard, 3, 205, 207, 223, 224, 229 health issues of, 9, 150, 165–66, 222 immaturity of, 8–9 and Infogami, 147, 148–51, 158 interests of, 6–7, 8–9, 204, 221 “Internet and Mass Collaboration, The,” 166–67 lawyers for, 6, 254–55 legacy of, 14–15, 268, 269–70 and Library of Congress, 139 and Malamud, 187–93, 222, 223 manifesto of, 6–7, 178–81, 189–90, 201, 228–30, 247 mass downloading of documents by, 1, 3, 188–94, 197–202, 207, 213, 215, 222, 228, 235 media stories about, 125 and MIT, 1, 3, 201, 204, 207, 213, 222, 227, 232, 249–50, 262 and money, 170–71 on morality and ethics, 205–6 and Open Library, 163, 173, 179, 223, 228 and PCCC, 202–3, 225 as private person/isolation of, 2–3, 5, 124, 127, 143, 154–55, 158–60, 166, 169, 205, 224, 227, 228, 248–49, 251 and public domain, 123 as public speaker, 213–14, 224, 243, 257 and Reddit, see Reddit The Rules broken by, 14 “saving the world” on bucket list of, 7, 8, 15, 125, 151–52, 181, 205–6, 247–48, 266, 267, 268 self-help program of, 251–53 and theinfo.org, 172–73 and US Congress, 224–25, 239–40 Swartz, Robert: and Aaron’s death, 261, 262, 264 and Aaron’s early years, 124, 127 and Aaron’s legal woes, 232, 250, 254 and MIT Media Lab, 203–4, 212, 219, 232, 250 and technology, 124, 212 Swartz, Susan, 128–29, 160, 192 Swartz’s legal case: as “the bad thing,” 3, 7–8, 234 change in defense strategy, 256–57 evidence-suppression hearing, 259–60 facts of, 11 felony charges in, 235, 253 grand jury, 232–33 indictment, 1, 5, 8, 10, 11, 233, 234, 235–37, 241, 253–54 investigation and capture, 215–17, 223, 228 JSTOR’s waning interest in, 231–32 manifesto as evidence in, 228–30 motion to suppress, 6 motives sought in, 223, 229 Norton subpoenaed in, 1–2, 227–29 ongoing, 248, 249–51 online petitions against, 236–37 original charges in, 218, 222 plea deals offered, 
227, 250 possible prison sentence, 1, 2, 5, 7–8, 11, 222, 232, 235–36, 253, 260 potential harm assessed, 218, 219, 222, 235 prosecutor’s zeal in, 7–8, 11, 218, 222–24, 235–37, 253–54, 259–60, 263, 264 search and seizure in, 6, 223–24, 256–57 Symbolics, 103 systems, flawed, 265–67 T. & J. W. Johnson, 49 Tammany Hall, New York, 57 tech bubble, 146, 156 technology: Bayesian statistics in, 258–59 burgeoning, 69, 71, 84, 87–88 communication, 12, 13, 18, 87–88 computing, see computers and digital culture, 122 and digital utopia, 91, 266–67 of electronic publishing, 120 and intellectual property, 90–91 and irrational exuberance, 146 in library of the future, 81–83 as magic, 152 moving inexorably forward, 134 overreaching police action against, 233 power of metadata, 128, 130 as private property, 210 resisting change caused by, 120 saving humanity via, 101 thinking machines, 102 unknown, future, 85 and World War II, 208 telephone, invention of, 69 Templeton, Brad, 261 theinfo.org, 172–73 theme parks, 134 ThoughtWorks, 9, 248, 257, 258 “thumb drive corps,” 187, 191, 193 Toyota Motor Corporation, “lean production” of, 7, 257, 265 Trumbull, John, McFingal, 26 trust-busting, 75 Tucher, Andie, 34 Tufte, Edward, 263–64 “tuft-hunter,” use of term, 28 Tumblr, 240 Twain, Mark, 60, 62, 73 Tweed, William “Boss,” 57 Twitter, 237 Ulrich, Lars, 133 United States: Articles of Confederation, 26 copyright laws in, 26–27 economy of, 44–45, 51, 55, 56 freedom to choose in, 80, 269 industrialization, 57 literacy in, 25, 26–27, 39, 44, 48 migration to cities in, 57 national identity of, 28, 32 new social class in, 69–70 opportunity in, 58, 80 poverty in, 59 railroads, 55, 56 rustic nation of, 44–45 values of, 85 UNIVAC computer, 81, 90 Universal Studios Orlando, 134 University of Illinois at Urbana-Champaign, 94, 95–96, 112–15 Unix, 104 US Chamber of Commerce, 239 utilitarianism, 214 Valenti, Jack, 111, 132 Van Buren, Martin, 44 Van Dyke, Henry, The National Sin of Literary Piracy, 61 venture 
capital, 146 Viaweb, 146 Victor, O.


pages: 412 words: 115,266

The Moral Landscape: How Science Can Determine Human Values by Sam Harris

Albert Einstein, banking crisis, Bayesian statistics, cognitive bias, end world poverty, endowment effect, energy security, experimental subject, framing effect, hindsight bias, impulse control, John Nash: game theory, longitudinal study, loss aversion, meta analysis, meta-analysis, out of africa, pattern recognition, placebo effect, Ponzi scheme, Richard Feynman, risk tolerance, scientific worldview, stem cell, Stephen Hawking, Steven Pinker, the scientific method, theory of mind, ultimatum game, World Values Survey

If we are measuring sanity in terms of sheer numbers of subscribers, then atheists and agnostics in the United States must be delusional: a diagnosis which would impugn 93 percent of the members of the National Academy of Sciences.63 There are, in fact, more people in the United States who cannot read than who doubt the existence of Yahweh.64 In twenty-first-century America, disbelief in the God of Abraham is about as fringe a phenomenon as can be named. But so is a commitment to the basic principles of scientific thinking—not to mention a detailed understanding of genetics, special relativity, or Bayesian statistics. The boundary between mental illness and respectable religious belief can be difficult to discern. This was made especially vivid in a recent court case involving a small group of very committed Christians accused of murdering an eighteen-month-old infant.65 The trouble began when the boy ceased to say “Amen” before meals. Believing that he had developed “a spirit of rebellion,” the group, which included the boy’s mother, deprived him of food and water until he died.

The ACC and the caudate display an unusual degree of connectivity, as the surgical lesioning of the ACC (a procedure known as a cingulotomy) causes atrophy of the caudate, and the disruption of this pathway is thought to be the basis of the procedure’s effect in treating conditions like obsessive-compulsive disorder (Rauch et al., 2000; Rauch et al., 2001). There are, however, different types of uncertainty. For instance, there is a difference between expected uncertainty—where one knows that one’s observations are unreliable—and unexpected uncertainty, where something in the environment indicates that things are not as they seem. The difference between these two modes of cognition has been analyzed within a Bayesian statistical framework in terms of their underlying neurophysiology. It appears that expected uncertainty is largely mediated by acetylcholine and unexpected uncertainty by norepinephrine (Yu & Dayan, 2005). Behavioral economists sometimes distinguish between “risk” and “ambiguity”: the former being a condition where probability can be assessed, as in a game of roulette, the latter being the uncertainty borne of missing information.


pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos

Albert Einstein, Amazon Mechanical Turk, Arthur Eddington, basic income, Bayesian statistics, Benoit Mandelbrot, bioinformatics, Black Swan, Brownian motion, cellular automata, Claude Shannon: information theory, combinatorial explosion, computer vision, constrained optimization, correlation does not imply causation, creative destruction, crowdsourcing, Danny Hillis, data is the new oil, double helix, Douglas Hofstadter, Erik Brynjolfsson, experimental subject, Filter Bubble, future of work, global village, Google Glasses, Gödel, Escher, Bach, information retrieval, job automation, John Markoff, John Snow's cholera map, John von Neumann, Joseph Schumpeter, Kevin Kelly, lone genius, mandelbrot fractal, Mark Zuckerberg, Moneyball by Michael Lewis explains big data, Narrative Science, Nate Silver, natural language processing, Netflix Prize, Network effects, NP-complete, off grid, P = NP, PageRank, pattern recognition, phenotype, planetary scale, pre–internet, random walk, Ray Kurzweil, recommendation engine, Richard Feynman, scientific worldview, Second Machine Age, self-driving car, Silicon Valley, social intelligence, speech recognition, Stanford marshmallow experiment, statistical model, Stephen Hawking, Steven Levy, Steven Pinker, superintelligent machines, the scientific method, The Signal and the Noise by Nate Silver, theory of mind, Thomas Bayes, transaction costs, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, white flight, zero-sum game

The distinction between descriptive and normative theories was articulated by John Neville Keynes in The Scope and Method of Political Economy (Macmillan, 1891). Chapter Six Sharon Bertsch McGrayne tells the history of Bayesianism, from Bayes and Laplace to the present, in The Theory That Would Not Die (Yale University Press, 2011). A First Course in Bayesian Statistical Methods,* by Peter Hoff (Springer, 2009), is an introduction to Bayesian statistics. The Naïve Bayes algorithm is first mentioned in Pattern Classification and Scene Analysis,* by Richard Duda and Peter Hart (Wiley, 1973). Milton Friedman argues for oversimplified theories in “The methodology of positive economics,” which appears in Essays in Positive Economics (University of Chicago Press, 1966). The use of Naïve Bayes in spam filtering is described in “Stopping spam,” by Joshua Goodman, David Heckerman, and Robert Rounthwaite (Scientific American, 2005).


pages: 829 words: 186,976

The Signal and the Noise: Why So Many Predictions Fail-But Some Don't by Nate Silver

"Robert Solow", airport security, availability heuristic, Bayesian statistics, Benoit Mandelbrot, Berlin Wall, Bernie Madoff, big-box store, Black Swan, Broken windows theory, business cycle, buy and hold, Carmen Reinhart, Claude Shannon: information theory, Climategate, Climatic Research Unit, cognitive dissonance, collapse of Lehman Brothers, collateralized debt obligation, complexity theory, computer age, correlation does not imply causation, Credit Default Swap, credit default swaps / collateralized debt obligations, cuban missile crisis, Daniel Kahneman / Amos Tversky, diversification, Donald Trump, Edmond Halley, Edward Lorenz: Chaos theory, en.wikipedia.org, equity premium, Eugene Fama: efficient market hypothesis, everywhere but in the productivity statistics, fear of failure, Fellow of the Royal Society, Freestyle chess, fudge factor, George Akerlof, global pandemic, haute cuisine, Henri Poincaré, high batting average, housing crisis, income per capita, index fund, information asymmetry, Intergovernmental Panel on Climate Change (IPCC), Internet Archive, invention of the printing press, invisible hand, Isaac Newton, James Watt: steam engine, John Nash: game theory, John von Neumann, Kenneth Rogoff, knowledge economy, Laplace demon, locking in a profit, Loma Prieta earthquake, market bubble, Mikhail Gorbachev, Moneyball by Michael Lewis explains big data, Monroe Doctrine, mortgage debt, Nate Silver, negative equity, new economy, Norbert Wiener, PageRank, pattern recognition, pets.com, Pierre-Simon Laplace, prediction markets, Productivity paradox, random walk, Richard Thaler, Robert Shiller, Robert Shiller, Rodney Brooks, Ronald Reagan, Saturday Night Live, savings glut, security theater, short selling, Skype, statistical model, Steven Pinker, The Great Moderation, The Market for Lemons, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, too big to fail, 
transaction costs, transfer pricing, University of East Anglia, Watson beat the top human players on Jeopardy!, wikimedia commons

Scott Armstrong, The Wharton School, University of Pennsylvania LIBRARY OF CONGRESS CATALOGING IN PUBLICATION DATA Silver, Nate. The signal and the noise : why most predictions fail but some don’t / Nate Silver. p. cm. Includes bibliographical references and index. ISBN 978-1-101-59595-4 1. Forecasting. 2. Forecasting—Methodology. 3. Forecasting—History. 4. Bayesian statistical decision theory. 5. Knowledge, Theory of. I. Title. CB158.S54 2012 519.5'42—dc23 2012027308 While the author has made every effort to provide accurate telephone numbers, Internet addresses, and other contact information at the time of publication, neither the publisher nor the author assumes any responsibility for errors, or for changes that occur after publication. Further, publisher does not have any control over and does not assume any responsibility for author or third-party Web sites or their content.

In essence, this player could go to work every day for a year and still lose money. This is why it is sometimes said that poker is a hard way to make an easy living. Of course, if this player really did have some way to know that he was a long-term winner, he’d have reason to persevere through his losses. In reality, there’s no sure way for him to know that. The proper way for the player to estimate his odds of being a winner, instead, is to apply Bayesian statistics,31 where he revises his belief about how good he really is, on the basis of both his results and his prior expectations. If the player is being honest with himself, he should take quite a skeptical attitude toward his own success, even if he is winning at first. The player’s prior belief should be informed by the fact that the average poker player by definition loses money, since the house takes some money out of the game in the form of the rake while the rest is passed around between the players.32 The Bayesian method described in the book The Mathematics of Poker, for instance, would suggest that a player who had made $30,000 in his first 10,000 hands at a $100/$200 limit hold ’em game was nevertheless more likely than not to be a long-term loser.
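The update described in this excerpt can be sketched with a normal prior and a normal likelihood. This is a minimal illustration, not the method from The Mathematics of Poker: the prior mean and spread, the per-100-hands standard deviation, and the helper names `normal_posterior` and `prob_positive` are all assumptions chosen for the example. Win rates are in big bets per 100 hands, so $30,000 over 10,000 hands at a $200 big bet is 1.5.

```python
import math

def normal_posterior(prior_mean, prior_sd, obs_mean, obs_se):
    """Combine a normal prior with a normal observation (precision weighting)."""
    prior_prec = 1.0 / prior_sd ** 2
    obs_prec = 1.0 / obs_se ** 2
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mean = post_var * (prior_prec * prior_mean + obs_prec * obs_mean)
    return post_mean, math.sqrt(post_var)

def prob_positive(mean, sd):
    """P(X > 0) for X ~ Normal(mean, sd)."""
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))

# Observed result: $30,000 / $200 per big bet = 150 big bets in 10,000 hands.
observed_rate = 150 / (10_000 / 100)       # 1.5 big bets per 100 hands

# ASSUMPTION: standard deviation of results per 100 hands (hypothetical value).
sd_per_100 = 17.0
observed_se = sd_per_100 / math.sqrt(10_000 / 100)

# ASSUMPTION: skeptical prior; the average player loses because of the rake.
prior_mean, prior_sd = -1.0, 1.0

post_mean, post_sd = normal_posterior(prior_mean, prior_sd, observed_rate, observed_se)
p_winner = prob_positive(post_mean, post_sd)
print(f"posterior win rate: {post_mean:.2f} +/- {post_sd:.2f} big bets per 100 hands")
print(f"probability of being a long-term winner: {p_winner:.2f}")
```

Under these assumed inputs the posterior stays below break-even despite the winning start, which is the qualitative point of the excerpt; a less skeptical prior or a smaller variance would change the answer.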

McGrayne, The Theory That Would Not Die, Kindle location 7. 61. Raymond S. Nickerson, “Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy,” Psychological Methods, 5, 2 (2000), pp. 241–301. http://203.64.159.11/richman/plogxx/gallery/17/%E9%AB%98%E7%B5%B1%E5%A0%B1%E5%91%8A.pdf. 62. Andrew Gelman and Cosma Tohilla Shalizi, “Philosophy and the Practice of Bayesian Statistics,” British Journal of Mathematical and Statistical Psychology, pp. 1–31, January 11, 2012. http://www.stat.columbia.edu/~gelman/research/published/philosophy.pdf. 63. Although there are several different formulations of the steps in the scientific method, this version is mostly drawn from “APPENDIX E: Introduction to the Scientific Method,” University of Rochester. http://teacher.pas.rochester.edu/phy_labs/appendixe/appendixe.html. 64.


pages: 573 words: 157,767

From Bacteria to Bach and Back: The Evolution of Minds by Daniel C. Dennett

Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Andrew Wiles, Bayesian statistics, bioinformatics, bitcoin, Build a better mousetrap, Claude Shannon: information theory, computer age, computer vision, double entry bookkeeping, double helix, Douglas Hofstadter, Elon Musk, epigenetics, experimental subject, Fermat's Last Theorem, Gödel, Escher, Bach, information asymmetry, information retrieval, invention of writing, Isaac Newton, iterative process, John von Neumann, Menlo Park, Murray Gell-Mann, Necker cube, Norbert Wiener, pattern recognition, phenotype, Richard Feynman, Rodney Brooks, self-driving car, social intelligence, sorting algorithm, speech recognition, Stephen Hawking, Steven Pinker, strong AI, The Wealth of Nations by Adam Smith, theory of mind, Thomas Bayes, trickle-down economics, Turing machine, Turing test, Watson beat the top human players on Jeopardy!, Y2K

The Reverend Thomas Bayes (1701–1761) developed a method of calculating probabilities based on one’s prior expectations. Each problem is couched thus: Given that your expectations based on past experience (including, we may add, the experience of your ancestors as passed down to you) are such and such (expressed as probabilities for each alternative), what effect on your future expectations should the following new data have? What adjustments in your probabilities would it be rational for you to make? Bayesian statistics, then, is a normative discipline, purportedly prescribing the right way to think about probabilities.41 So it is a good candidate for a competence model of the brain: it works as an expectation-generating organ, creating new affordances on the fly. Consider the task of identifying handwritten symbols (letters and digits). It is no accident that such a task is often used by Internet sites as a test to distinguish real human beings from bots programmed to invade websites: handwriting perception, like speech perception, has proven to be an easy task for humans but an exceptionally challenging task for computers.

Friston, Karl, Michael Levin, Biswa Sengupta, and Giovanni Pezzulo. 2015. “Knowing One’s Place: A Free-Energy Approach to Pattern Regulation.” Journal of the Royal Society Interface, 12: 20141383. Frith, Chris D. 2012. “The Role of Metacognition in Human Social Interactions.” Philosophical Transactions of the Royal Society B: Biological Sciences 367 (1599): 2213–2223. Gelman, Andrew. 2008. “Objections to Bayesian Statistics.” Bayesian Anal. 3 (3): 445–449. Gibson, James J. 1966. “The Problem of Temporal Order in Stimulation and Perception.” Journal of Psychology 62 (2): 141–149. —. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin. Godfrey-Smith, Peter. 2003. “Postscript on the Baldwin Effect and Niche Construction.” In Evolution and Learning: The Baldwin Effect Reconsidered, edited by Bruce H.


Natural Language Processing with Python and spaCy by Yuli Vasiliev

Bayesian statistics, computer vision, database schema, en.wikipedia.org, loose coupling, natural language processing, Skype, statistical model

More no-nonsense books from NO STARCH PRESS PYTHON CRASH COURSE, 2ND EDITION A Hands-On, Project-Based Introduction to Programming by ERIC MATTHES MAY 2019, 544 pp., $39.95 ISBN 978-1-59327-928-8 MATH ADVENTURES WITH PYTHON An Illustrated Guide to Exploring Math with Code by PETER FARRELL JANUARY 2019, 304 pp., $29.95 ISBN 978-1-59327-867-0 THE BOOK OF R A First Course in Programming and Statistics by TILMAN M. DAVIES JULY 2016, 832 pp., $49.95 ISBN 978-1-59327-651-5 BAYESIAN STATISTICS THE FUN WAY Understanding Statistics and Probability with Star Wars, LEGO, and Rubber Ducks by WILL KURT JULY 2019, 256 pp., $34.95 ISBN 978-1-59327-956-1 PYTHON ONE-LINERS by CHRISTIAN MAYER SPRING 2020, 256 pp., $39.95 ISBN 978-1-7185-0050-1 AUTOMATE THE BORING STUFF WITH PYTHON, 2ND EDITION Practical Programming for Total Beginners by AL SWEIGART NOVEMBER 2019, 592 pp., $39.95 ISBN 978-1-59327-992-9 PHONE: 800.420.7240 OR 415.863.9900 EMAIL: SALES@NOSTARCH.COM WEB: WWW.NOSTARCH.COM BUILD YOUR OWN NLP APPLICATIONS Natural Language Processing with Python and spaCy will show you how to create NLP applications like chatbots, text-condensing scripts, and order-processing tools quickly and easily.


Analysis of Financial Time Series by Ruey S. Tsay

Asian financial crisis, asset allocation, Bayesian statistics, Black-Scholes formula, Brownian motion, business cycle, capital asset pricing model, compound rate of return, correlation coefficient, data acquisition, discrete time, frictionless, frictionless market, implied volatility, index arbitrage, Long Term Capital Management, market microstructure, martingale, p-value, pattern recognition, random walk, risk tolerance, short selling, statistical model, stochastic process, stochastic volatility, telemarketer, transaction costs, value at risk, volatility smile, Wiener process, yield curve

In this chapter, we introduce the ideas of MCMC methods and data augmentation that are widely applicable in finance. In particular, we discuss Bayesian inference via Gibbs sampling and demonstrate various applications of MCMC methods. Rapid developments in the MCMC methodology make it impossible to cover all the new methods available in the literature. Interested readers are referred to some recent books on Bayesian and empirical Bayesian statistics (e.g., Carlin and Louis, 2000; Gelman, Carlin, Stern, and Rubin, 1995). For applications, we focus on issues related to financial econometrics. The demonstrations shown in this chapter only represent a small fraction of all possible applications of the techniques in finance. As a matter of fact, it is fair to say that Bayesian inference and the MCMC methods discussed here are applicable to most, if not all, of the studies in financial econometrics.

Such a prior distribution is called a conjugate prior distribution. For MCMC methods, use of conjugate priors means that a closed-form solution for the conditional posterior distributions is available. Random draws of the Gibbs sampler can then be obtained by using the commonly available computer routines of probability distributions. In what follows, we review some well-known conjugate priors. For more information, readers are referred to textbooks on Bayesian statistics (e.g., DeGroot, 1970, Chapter 9). Result 1: Suppose that $x_1, \ldots, x_n$ form a random sample from a normal distribution with mean $\mu$, which is unknown, and variance $\sigma^2$, which is known and positive. Suppose that the prior distribution of $\mu$ is a normal distribution with mean $\mu_o$ and variance $\sigma_o^2$. Then the posterior distribution of $\mu$ given the data and prior is normal with mean $\mu_*$ and variance $\sigma_*^2$ given by

$$\mu_* = \frac{\sigma^2 \mu_o + n \sigma_o^2 \bar{x}}{\sigma^2 + n \sigma_o^2} \quad \text{and} \quad \sigma_*^2 = \frac{\sigma^2 \sigma_o^2}{\sigma^2 + n \sigma_o^2},$$

where $\bar{x} = \sum_{i=1}^{n} x_i / n$ is the sample mean. In Bayesian analysis, it is often convenient to use the precision parameter $\eta = 1/\sigma^2$ (i.e., the inverse of the variance $\sigma^2$).
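Result 1 can be checked numerically. The sketch below implements the posterior mean and variance formulas for a normal sample with known variance and a normal prior on the mean; the function name `normal_mean_posterior` and the sample values are illustrative assumptions, not taken from the book.

```python
def normal_mean_posterior(xs, sigma2, mu_o, sigma_o2):
    """Posterior of mu for a Normal(mu, sigma2) sample with a Normal(mu_o, sigma_o2) prior.

    sigma2 is the known data variance; returns (mu_star, sigma_star2).
    """
    n = len(xs)
    x_bar = sum(xs) / n                      # sample mean
    denom = sigma2 + n * sigma_o2
    mu_star = (sigma2 * mu_o + n * sigma_o2 * x_bar) / denom
    sigma_star2 = (sigma2 * sigma_o2) / denom
    return mu_star, sigma_star2

# Hypothetical data: three observations, unit data variance, Normal(0, 1) prior.
mu_star, sigma_star2 = normal_mean_posterior([1.0, 2.0, 3.0], 1.0, 0.0, 1.0)
print(mu_star, sigma_star2)

# In precision form the posterior precision is the prior precision
# plus n times the data precision.
posterior_precision = 1.0 / 1.0 + 3 * (1.0 / 1.0)
print(1.0 / sigma_star2, posterior_precision)
```

The precision form makes the structure of the update easier to see: precisions add, and the posterior mean is a precision-weighted average of the prior mean and the sample mean.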


pages: 206 words: 70,924

The Rise of the Quants: Marschak, Sharpe, Black, Scholes and Merton by Colin Read

"Robert Solow", Albert Einstein, Bayesian statistics, Black-Scholes formula, Bretton Woods, Brownian motion, business cycle, capital asset pricing model, collateralized debt obligation, correlation coefficient, Credit Default Swap, credit default swaps / collateralized debt obligations, David Ricardo: comparative advantage, discovery of penicillin, discrete time, Emanuel Derman, en.wikipedia.org, Eugene Fama: efficient market hypothesis, financial innovation, fixed income, floating exchange rates, full employment, Henri Poincaré, implied volatility, index fund, Isaac Newton, John Meriwether, John von Neumann, Joseph Schumpeter, Kenneth Arrow, Long Term Capital Management, Louis Bachelier, margin call, market clearing, martingale, means of production, moral hazard, Myron Scholes, Paul Samuelson, price stability, principal–agent problem, quantitative trading / quantitative finance, RAND corporation, random walk, risk tolerance, risk/return, Ronald Reagan, shareholder value, Sharpe ratio, short selling, stochastic process, Thales and the olive presses, Thales of Miletus, The Chicago School, the scientific method, too big to fail, transaction costs, tulip mania, Works Progress Administration, yield curve

He postulated that the rational decision-maker will align his or her beliefs of unknown probabilities to the consensus bets of impartial bookmakers, a technique often called the Dutch Book. Thirty years later, the great mind Leonard “Jimmie” Savage (1917–1971) elaborated his concept into an axiomatic approach to decision-making under uncertainty using arguments remarkably similar to Ramsey’s logic. The concepts of Ramsey and Savage also formed the basis for the theory of Bayesian statistics and are important in many aspects of financial decision-making. Marschak’s great insight While Ramsey created and Savage broadened the logical landscape for the inclusion of uncertainty into decision-making, it was not possible to incorporate their logic until the finance discipline could develop actual measures of uncertainty. Of course, modern financial analysis depends crucially even today on such a methodology to measure uncertainty.


pages: 654 words: 191,864

Thinking, Fast and Slow by Daniel Kahneman

Albert Einstein, Atul Gawande, availability heuristic, Bayesian statistics, Black Swan, Cass Sunstein, Checklist Manifesto, choice architecture, cognitive bias, complexity theory, correlation coefficient, correlation does not imply causation, Daniel Kahneman / Amos Tversky, delayed gratification, demand response, endowment effect, experimental economics, experimental subject, Exxon Valdez, feminist movement, framing effect, hedonic treadmill, hindsight bias, index card, information asymmetry, job satisfaction, John von Neumann, Kenneth Arrow, libertarian paternalism, loss aversion, medical residency, mental accounting, meta analysis, meta-analysis, nudge unit, pattern recognition, Paul Samuelson, pre–internet, price anchoring, quantitative trading / quantitative finance, random walk, Richard Thaler, risk tolerance, Robert Metcalfe, Ronald Reagan, Shai Danziger, Supply of New York City Cabdrivers, The Chicago School, The Wisdom of Crowds, Thomas Bayes, transaction costs, union organizing, Walter Mischel, Yom Kippur War

So if you believe that there is a 40% chance that it will rain sometime tomorrow, you must also believe that there is a 60% chance it will not rain tomorrow, and you must not believe that there is a 50% chance that it will rain tomorrow morning. And if you believe that there is a 30% chance that candidate X will be elected president, and an 80% chance that he will be reelected if he wins the first time, then you must believe that the chances that he will be elected twice in a row are 24%. The relevant “rules” for cases such as the Tom W problem are provided by Bayesian statistics. This influential modern approach to statistics is named after an English minister of the eighteenth century, the Reverend Thomas Bayes, who is credited with the first major contribution to a large problem: the logic of how people should change their mind in the light of evidence. Bayes’s rule specifies how prior beliefs (in the examples of this chapter, base rates) should be combined with the diagnosticity of the evidence, the degree to which it favors the hypothesis over the alternative.
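The arithmetic in this excerpt, and the way Bayes's rule combines a base rate with the diagnosticity of evidence, can be sketched in a few lines. The rain and election figures come from the passage; the 3% base rate, the likelihood ratio of 4, and the helper name `bayes_posterior` are assumptions chosen for illustration.

```python
def bayes_posterior(base_rate, likelihood_ratio):
    """Bayes's rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = base_rate / (1.0 - base_rate)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Complement rule: a 40% chance of rain implies a 60% chance of no rain.
p_no_rain = 1.0 - 0.40

# Conjunction rule: P(elected twice) = P(elected) * P(reelected | elected).
p_elected_twice = 0.30 * 0.80

# ASSUMPTION: a 3% base rate, and evidence 4 times more likely under the
# hypothesis than under the alternative (hypothetical numbers).
posterior = bayes_posterior(0.03, 4.0)

print(f"P(no rain) = {p_no_rain:.2f}")
print(f"P(elected twice) = {p_elected_twice:.2f}")
print(f"posterior = {posterior:.3f}")
```

With a low base rate, even evidence that favors the hypothesis fourfold leaves the posterior well below 50%, which is why neglecting base rates leads to overconfident judgments.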

.); WYSIATI (what you see is all there is) and associative memory; abnormal events and; anchoring and; causality and; confirmation bias and; creativity and; and estimates of causes of death Åstebro, Thomas Atlantic, The attention; in self-control Attention and Effort (Kahneman) Auerbach, Red authoritarian ideas availability; affect and; and awareness of one’s biases; expectations about; media and; psychology of; risk assessment and, see risk assessment availability cascades availability entrepreneurs bad and good, distinctions between banks bank teller problem Barber, Brad Bargh, John baseball baseball cards baseline predictions base rates; in cab driver problem; causal; in helping experiment; low; statistical; in Tom W problem; in Yale exam problem basic assessments basketball basketball tickets bat-and-ball problem Baumeister, Roy Bayes, Thomas Bayesian statistics Bazerman, Max Beane, Billy Beatty, Jackson Becker, Gary “Becoming Famous Overnight” (Jacoby) behavioral economics Behavioral Insight Team “Belief in the Law of Small Numbers” (Tversky and Kahneman) beliefs: bias for; past, reconstruction of Benartzi, Shlomo Bentham, Jeremy Berlin, Isaiah Bernoulli, Daniel Bernouilli, Nicholas Beyth, Ruth bicycle messengers Black Swan, The (Taleb) blame Blink (Gladwell) Borg, Björn Borgida, Eugene “Boys Will Be Boys” (Barber and Odean) Bradlee, Ben brain; amygdala in; anterior cingulate in; buying and selling and; emotional framing and; frontal area of; pleasure and; prefrontal area of; punishment and; sugar in; threats and; and variations of probabilities British Toxicology Society broad framing Brockman, John broken-leg rule budget forecasts Built to Last (Collins and Porras) Bush, George W.


pages: 586 words: 186,548

Architects of Intelligence by Martin Ford

3D printing, agricultural Revolution, AI winter, Apple II, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, barriers to entry, basic income, Baxter: Rethink Robotics, Bayesian statistics, bitcoin, business intelligence, business process, call centre, cloud computing, cognitive bias, Colonization of Mars, computer vision, correlation does not imply causation, crowdsourcing, DARPA: Urban Challenge, deskilling, disruptive innovation, Donald Trump, Douglas Hofstadter, Elon Musk, Erik Brynjolfsson, Ernest Rutherford, Fellow of the Royal Society, Flash crash, future of work, gig economy, Google X / Alphabet X, Gödel, Escher, Bach, Hans Rosling, ImageNet competition, income inequality, industrial robot, information retrieval, job automation, John von Neumann, Law of Accelerating Returns, life extension, Loebner Prize, Mark Zuckerberg, Mars Rover, means of production, Mitch Kapor, natural language processing, new economy, optical character recognition, pattern recognition, phenotype, Productivity paradox, Ray Kurzweil, recommendation engine, Robert Gordon, Rodney Brooks, Sam Altman, self-driving car, sensor fusion, sentiment analysis, Silicon Valley, smart cities, social intelligence, speech recognition, statistical model, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Pinker, strong AI, superintelligent machines, Ted Kaczynski, The Rise and Fall of American Growth, theory of mind, Thomas Bayes, Travis Kalanick, Turing test, universal basic income, Wall-E, Watson beat the top human players on Jeopardy!, women in the workforce, working-age population, zero-sum game, Zipcar

The basic problem is, how do we go beyond specific experiences to general truths? Or from the past to the future? In the case that Roger Shepard was thinking about, he was working on the basic mathematics of how might an organism, having experienced a certain stimulus to have some good or negative consequence, figure out which other things in the world are likely to have that same consequence? Roger had introduced some mathematics based on Bayesian statistics for solving that problem, which was a very elegant formulation of the general theory of how organisms could generalize from experience and he was looking to neural networks to try to take that theory and implement it in a more scalable way. Somehow, I wound up working with him on this project. Through that, I was exposed to both neural networks, as well as to Bayesian analyses of cognition early on, and you can view most of my career since then as working through those same ideas and methods.

Even a very young child can learn this new causal relation between moving your finger in a certain way and a screen lighting up, and that is how all sorts of other possibilities of action open to you. These problems of how we make a generalization from just one or a few examples are what I started working on with Roger Shepard when I was just an undergraduate. Early on, we used these ideas from Bayesian statistics, Bayesian inference, and Bayesian networks, to use the mathematics of probability theory to formulate how people’s mental models of the causal structure of the world might work. It turns out that tools that were developed by mathematicians, physicists, and statisticians to make inferences from very sparse data in a statistical setting were being deployed in the 1990s in machine learning and AI, and it revolutionized the field.


pages: 345 words: 75,660

Prediction Machines: The Simple Economics of Artificial Intelligence by Ajay Agrawal, Joshua Gans, Avi Goldfarb

"Robert Solow", Ada Lovelace, AI winter, Air France Flight 447, Airbus A320, artificial general intelligence, autonomous vehicles, basic income, Bayesian statistics, Black Swan, blockchain, call centre, Capital in the Twenty-First Century by Thomas Piketty, Captain Sullenberger Hudson, collateralized debt obligation, computer age, creative destruction, Daniel Kahneman / Amos Tversky, data acquisition, data is the new oil, deskilling, disruptive innovation, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, everywhere but in the productivity statistics, Google Glasses, high net worth, ImageNet competition, income inequality, information retrieval, inventory management, invisible hand, job automation, John Markoff, Joseph Schumpeter, Kevin Kelly, Lyft, Minecraft, Mitch Kapor, Moneyball by Michael Lewis explains big data, Nate Silver, new economy, On the Economy of Machinery and Manufactures, pattern recognition, performance metric, profit maximization, QWERTY keyboard, race to the bottom, randomized controlled trial, Ray Kurzweil, ride hailing / ride sharing, Second Machine Age, self-driving car, shareholder value, Silicon Valley, statistical model, Stephen Hawking, Steve Jobs, Steven Levy, strong AI, The Future of Employment, The Signal and the Noise by Nate Silver, Tim Cook: Apple, Turing test, Uber and Lyft, uber lyft, US Airways Flight 1549, Vernor Vinge, Watson beat the top human players on Jeopardy!, William Langewiesche, Y Combinator, zero-sum game

Validere improves the efficiency of oil custody transfer by predicting the water content of incoming crude. These applications are a microcosm of what most businesses will be doing in the near future. If you’re lost in the fog trying to figure out what AI means for you, then we can help you understand the implications of AI and navigate through the advances in this technology, even if you’ve never programmed a convolutional neural network or studied Bayesian statistics. If you are a business leader, we provide you with an understanding of AI’s impact on management and decisions. If you are a student or recent graduate, we give you a framework for thinking about the evolution of jobs and the careers of the future. If you are a financial analyst or venture capitalist, we offer a structure around which you can develop your investment theses. If you are a policy maker, we give you guidelines for understanding how AI is likely to change society and how policy might shape those changes for the better.


pages: 267 words: 72,552

Reinventing Capitalism in the Age of Big Data by Viktor Mayer-Schönberger, Thomas Ramge

accounting loophole / creative accounting, Air France Flight 447, Airbnb, Alvin Roth, Atul Gawande, augmented reality, banking crisis, basic income, Bayesian statistics, bitcoin, blockchain, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, Cass Sunstein, centralized clearinghouse, Checklist Manifesto, cloud computing, cognitive bias, conceptual framework, creative destruction, Daniel Kahneman / Amos Tversky, disruptive innovation, Donald Trump, double entry bookkeeping, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Ford paid five dollars a day, Frederick Winslow Taylor, fundamental attribution error, George Akerlof, gig economy, Google Glasses, information asymmetry, interchangeable parts, invention of the telegraph, inventory management, invisible hand, James Watt: steam engine, Jeff Bezos, job automation, job satisfaction, joint-stock company, Joseph Schumpeter, Kickstarter, knowledge worker, labor-force participation, land reform, lone genius, low cost airline, low cost carrier, Marc Andreessen, market bubble, market design, market fundamentalism, means of production, meta analysis, meta-analysis, Moneyball by Michael Lewis explains big data, multi-sided market, natural language processing, Network effects, Norbert Wiener, offshore financial centre, Parag Khanna, payday loans, peer-to-peer lending, Peter Thiel, Ponzi scheme, prediction markets, price anchoring, price mechanism, purchasing power parity, random walk, recommendation engine, Richard Thaler, ride hailing / ride sharing, Sam Altman, Second Machine Age, self-driving car, Silicon Valley, Silicon Valley startup, six sigma, smart grid, smart meter, Snapchat, statistical model, Steve Jobs, technoutopianism, The Future of Employment, The Market for Lemons, The Nature of the Firm, transaction costs, universal basic income, William Langewiesche, Y Combinator

The idea was that every day, four hundred nationalized factories around the country would send data to Cybersyn’s nerve center in Santiago, the capital, where it would then be fed into a mainframe computer, scrutinized, and compared against forecasts. Divergences would be flagged and brought to the attention of factory directors, then to government decision makers sitting in a futuristic operations room. From there the officials would send directives back to the factories. Cybersyn was quite sophisticated for its time, employing a network approach to capturing and calculating economic activity and using Bayesian statistical models. Most important, it relied on feedback that would loop back into the decision-making processes. The system never became fully operational. Its communications network was in place and was used in the fall of 1972 to keep the country running when striking transportation workers blocked goods from entering Santiago. The computer-analysis part of Cybersyn was mostly completed, too, but its results were often unreliable and slow.


pages: 277 words: 87,082

Beyond Weird by Philip Ball

Albert Einstein, Bayesian statistics, cosmic microwave background, dark matter, dematerialisation, Ernest Rutherford, experimental subject, Isaac Newton, John von Neumann, Kickstarter, Murray Gell-Mann, Richard Feynman, Schrödinger's Cat, Stephen Hawking, theory of mind, Thomas Bayes

Those beliefs do not become realized as facts until they impinge on the consciousness of the observer – and so the facts are specific to every observer (although different observers can find themselves agreeing on the same facts). This notion takes its cue from standard Bayesian probability theory, introduced in the eighteenth century by the English mathematician and clergyman Thomas Bayes. In Bayesian statistics, probabilities are not defined with reference to some objective state of affairs in the world, but instead quantify personal degrees of belief of what might happen – which we update as we acquire new information. The QBist view, however, says something much more profound than simply that different people know different things. Rather, it asserts that there are no things that can be meaningfully spoken of beyond the self.


pages: 301 words: 85,126

AIQ: How People and Machines Are Smarter Together by Nick Polson, James Scott

Air France Flight 447, Albert Einstein, Amazon Web Services, Atul Gawande, autonomous vehicles, availability heuristic, basic income, Bayesian statistics, business cycle, Cepheid variable, Checklist Manifesto, cloud computing, combinatorial explosion, computer age, computer vision, Daniel Kahneman / Amos Tversky, Donald Trump, Douglas Hofstadter, Edward Charles Pickering, Elon Musk, epigenetics, Flash crash, Grace Hopper, Gödel, Escher, Bach, Harvard Computers: women astronomers, index fund, Isaac Newton, John von Neumann, late fees, low earth orbit, Lyft, Magellanic Cloud, mass incarceration, Moneyball by Michael Lewis explains big data, Moravec's paradox, more computing power than Apollo, natural language processing, Netflix Prize, North Sea oil, p-value, pattern recognition, Pierre-Simon Laplace, ransomware, recommendation engine, Ronald Reagan, self-driving car, sentiment analysis, side project, Silicon Valley, Skype, smart cities, speech recognition, statistical model, survivorship bias, the scientific method, Thomas Bayes, Uber for X, uber lyft, universal basic income, Watson beat the top human players on Jeopardy!, young professional

Allen WannaCry (ransomware attack) waterfall diagram Watson (IBM supercomputer) Waymo (autonomous-car company) WeChat word vectors word2vec model (Google) World War I World War II Battle of the Bulge Bayesian search and Hopper, Grace, and Schweinfurt-Regensburg mission (World War II) Statistical Research Group (Columbia) and Wald’s survivability recommendations for aircraft Yormark, Brett YouTube Zillow ABOUT THE AUTHORS NICK POLSON is professor of Econometrics and Statistics at the Chicago Booth School of Business. He does research on artificial intelligence, Bayesian statistics, and deep learning, and is a frequent speaker at conferences. He lives in Chicago. JAMES SCOTT is associate professor of Statistics at the University of Texas at Austin. He earned his Ph.D. in statistics from Duke University in 2009 after studying mathematics at the University of Cambridge on a Marshall Scholarship. He has published over 45 peer-reviewed scientific articles, and he has worked with clients across many industries to help them understand the power of their data.


pages: 290 words: 82,871

The Hidden Half: How the World Conceals Its Secrets by Michael Blastland

air freight, Alfred Russel Wallace, banking crisis, Bayesian statistics, Berlin Wall, central bank independence, cognitive bias, complexity theory, Deng Xiaoping, Diane Coyle, Donald Trump, epigenetics, experimental subject, full employment, George Santayana, hindsight bias, income inequality, manufacturing employment, mass incarceration, meta analysis, meta-analysis, minimum wage unemployment, nudge unit, oil shock, p-value, personalized medicine, phenotype, Ralph Waldo Emerson, random walk, randomized controlled trial, replication crisis, Richard Thaler, selection bias, the map is not the territory, the scientific method, The Wisdom of Crowds, twin studies

., ‘Non-steroidal Anti-inflammatory Drug Use is Associated with Increased Risk of Out-of-Hospital Cardiac Arrest: A Nationwide Case-time-control Study’, European Heart Journal – Cardiovascular Pharmacotherapy, vol. 3, no. 2, 2017, pp. 100–107. 4 I wrote about this case in a blog for the Winton Centre for Risk and Evidence Communication: ‘Here we Go Again’, 21 March 2017. 5 See, for example, James Ware, ‘The Limitations of Risk Factors as Prognostic Tools’, New England Journal of Medicine, 21 December 2006; and Tjeerd-Pieter van Staa et al., ‘Prediction of Cardiovascular Risk Using Framingham, ASSIGN and QRISK2: How Well Do They Predict Individual Rather than Population Risk?’, PLOS One, 1 October 2014. 6 This is a metaphor often used by some statisticians. I have a lot of time for it. But we are teetering here on the brink of a discussion of Bayesian statistics, and had better resist. Readers can find plenty of such discussions elsewhere. 7 We simply don’t have the data to do it at the individual level. Some people think we do, but to begin to convert one to the other requires a series of medical trials involving multiple tests on the same person, known as ‘N of 1’ trials, and these are not standard. 8 For a favourable explanation of how NNTs are calculated, their advantages, and for a searchable database of NNTs for different treatments, see: theNNT.com. 9 The wide variability of response in individuals that could produce the kind of average effect shown in the chart – but might also be consistent with a quite different set of individual reactions – is discussed in two articles by Stephen Senn on https://errorstatistics.com: ‘Responder Despondency’ and ‘Painful Dichotomies’.


The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences by Rob Kitchin

Bayesian statistics, business intelligence, business process, cellular automata, Celtic Tiger, cloud computing, collateralized debt obligation, conceptual framework, congestion charging, corporate governance, correlation does not imply causation, crowdsourcing, discrete time, disruptive innovation, George Gilder, Google Earth, Infrastructure as a Service, Internet Archive, Internet of things, invisible hand, knowledge economy, late capitalism, lifelogging, linked data, longitudinal study, Masdar, means of production, Nate Silver, natural language processing, openstreetmap, pattern recognition, platform as a service, recommendation engine, RFID, semantic web, sentiment analysis, slashdot, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart grid, smart meter, software as a service, statistical model, supply-chain management, the scientific method, The Signal and the Noise by Nate Silver, transaction costs

Inferential statistics seek to explain, not simply describe, the patterns and relationships that may exist within a dataset, and to test the strength and significance of associations between variables. They include parametric statistics which are employed to assess hypotheses using interval and ratio level data, such as correlation and regression; non-parametric statistics used for testing hypotheses using nominal or ordinal-level data; and probabilistic statistics that determine the probability of a condition occurring, such as Bayesian statistics. The armoury of descriptive and inferential statistics that have traditionally been used to analyse small data are also being applied to big data, though as discussed in Chapter 9 this is not always straightforward because many of these techniques were developed to draw insights from relatively scarce rather than exhaustive data. Nonetheless, the techniques do provide a means of making sense of massive amounts of data.


Learn Algorithmic Trading by Sebastien Donadio

active measures, algorithmic trading, automated trading system, backtesting, Bayesian statistics, buy and hold, buy low sell high, cryptocurrency, DevOps, en.wikipedia.org, fixed income, Flash crash, Guido van Rossum, latency arbitrage, locking in a profit, market fundamentalism, market microstructure, martingale, natural language processing, p-value, paper trading, performance metric, prediction markets, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, sorting algorithm, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, type inference, WebSocket, zero-sum game

This is known as machine learning, the fundamentals of which were developed in the 1800s and early 1900s and have been worked on ever since. Recently, there has been a resurgence in interest in machine learning algorithms and applications owing to the availability of extremely cost-effective processing power and the easy availability of large datasets. Understanding machine learning techniques in great detail is a massive field at the intersection of linear algebra, multivariate calculus, probability theory, frequentist and Bayesian statistics, and an in-depth analysis of machine learning is beyond the scope of a single book. Machine learning methods, however, are surprisingly easily accessible in Python and quite intuitive to understand, so we will explain the intuition behind the methods and see how they find applications in algorithmic trading. But first, let's introduce some basic concepts and notation that we will need for the rest of this chapter.


pages: 340 words: 97,723

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity by Amy Webb

Ada Lovelace, AI winter, Airbnb, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, artificial general intelligence, Asilomar, autonomous vehicles, Bayesian statistics, Bernie Sanders, bioinformatics, blockchain, Bretton Woods, business intelligence, Cass Sunstein, Claude Shannon: information theory, cloud computing, cognitive bias, complexity theory, computer vision, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, Deng Xiaoping, distributed ledger, don't be evil, Donald Trump, Elon Musk, Filter Bubble, Flynn Effect, gig economy, Google Glasses, Grace Hopper, Gödel, Escher, Bach, Inbox Zero, Internet of things, Jacques de Vaucanson, Jeff Bezos, Joan Didion, job automation, John von Neumann, knowledge worker, Lyft, Mark Zuckerberg, Menlo Park, move fast and break things, natural language processing, New Urbanism, one-China policy, optical character recognition, packet switching, pattern recognition, personalized medicine, RAND corporation, Ray Kurzweil, ride hailing / ride sharing, Rodney Brooks, Rubik’s Cube, Sand Hill Road, Second Machine Age, self-driving car, SETI@home, side project, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart cities, South China Sea, sovereign wealth fund, speech recognition, Stephen Hawking, strong AI, superintelligent machines, technological singularity, The Coming Technological Singularity, theory of mind, Tim Cook: Apple, trade route, Turing machine, Turing test, uber lyft, Von Neumann architecture, Watson beat the top human players on Jeopardy!, zero day

James Andrews, mathematician and professor at Florida State University who specialized in group theory and knot theory. Jean Bartik, mathematician and one of the original programmers for the ENIAC computer. Albert Turner Bharucha-Reid, mathematician and theorist who made significant contributions in Markov chains, probability theory, and statistics. David Blackwell, statistician and mathematician who made significant contributions to game theory, information theory, probability theory, and Bayesian statistics. Mamie Phipps Clark, a PhD and social psychologist whose research focused on self-consciousness. Thelma Estrin, who pioneered the application of computer systems in neurophysiological and brain research. She was a researcher in the Electroencephalography Department of the Neurological Institute of Columbia Presbyterian at the time of the Dartmouth Summer Research Project. Evelyn Boyd Granville, a PhD in mathematics who developed the computer programs used for trajectory analysis in the first US-manned missions to space and the moon.


pages: 407 words: 104,622

The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution by Gregory Zuckerman

affirmative action, Affordable Care Act / Obamacare, Albert Einstein, Andrew Wiles, automated trading system, backtesting, Bayesian statistics, beat the dealer, Benoit Mandelbrot, Berlin Wall, Bernie Madoff, blockchain, Brownian motion, butter production in bangladesh, buy and hold, buy low sell high, Claude Shannon: information theory, computer age, computerized trading, Credit Default Swap, Daniel Kahneman / Amos Tversky, diversified portfolio, Donald Trump, Edward Thorp, Elon Musk, Emanuel Derman, endowment effect, Flash crash, George Gilder, Gordon Gekko, illegal immigration, index card, index fund, Isaac Newton, John Meriwether, John Nash: game theory, John von Neumann, Loma Prieta earthquake, Long Term Capital Management, loss aversion, Louis Bachelier, mandelbrot fractal, margin call, Mark Zuckerberg, More Guns, Less Crime, Myron Scholes, Naomi Klein, natural language processing, obamacare, p-value, pattern recognition, Peter Thiel, Ponzi scheme, prediction markets, quantitative hedge fund, quantitative trading / quantitative finance, random walk, Renaissance Technologies, Richard Thaler, Robert Mercer, Ronald Reagan, self-driving car, Sharpe ratio, Silicon Valley, sovereign wealth fund, speech recognition, statistical arbitrage, statistical model, Steve Jobs, stochastic process, the scientific method, Thomas Bayes, transaction costs, Turing machine

Rather than manually programming in static knowledge about how language worked, they created a program that learned from data. Brown, Mercer, and the others relied upon Bayesian mathematics, which had emerged from the statistical rule proposed by Reverend Thomas Bayes in the eighteenth century. Bayesians will attach a degree of probability to every guess and update their best estimates as they receive new information. The genius of Bayesian statistics is that it continuously narrows a range of possibilities. Think, for example, of a spam filter, which doesn’t know with certainty if an email is malicious, but can be effective by assigning odds to each one received by constantly learning from emails previously classified as “junk.” (This approach wasn’t as strange as it might seem. According to linguists, people in conversation unconsciously guess the next words that will be spoken, updating their expectations along the way.)
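The spam-filter idea in the excerpt above can be illustrated with a minimal sketch of Bayes' rule in odds form. It is not taken from the book: the function name, the prior, and the per-word likelihood ratios are all hypothetical, and the words are assumed to be conditionally independent (the usual "naive Bayes" simplification).

```python
def update_spam_probability(prior, likelihood_ratios):
    """Return P(spam | evidence) using Bayes' rule in odds form.

    prior: initial probability that a message is spam (0 < prior < 1).
    likelihood_ratios: for each observed word,
        P(word | spam) / P(word | not spam).
    Assumes the words are conditionally independent (naive Bayes).
    """
    odds = prior / (1.0 - prior)          # convert probability to odds
    for ratio in likelihood_ratios:
        odds *= ratio                     # each piece of evidence rescales the odds
    return odds / (1.0 + odds)            # convert odds back to a probability


# Hypothetical example: start undecided, then see three words whose
# likelihood ratios were (supposedly) learned from previously labelled mail.
prior = 0.5
ratios = [4.0, 0.5, 3.0]
posterior = update_spam_probability(prior, ratios)
print(f"P(spam | words) = {posterior:.3f}")
```

In a real filter the ratios would be estimated from counts of words in messages already classified as junk or not junk, which is the "constantly learning" step the excerpt describes.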


pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt

Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bioinformatics, computer vision, correlation does not imply causation, crowdsourcing, distributed generation, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize

Finally, don’t ignore the necessary data intuition when you make use of algorithms. Just because your method converges, it doesn’t mean the results are meaningful. Make sure you’ve created a reasonable narrative and ways to check its validity. Chapter 12. Epidemiology The contributor for this chapter is David Madigan, professor and chair of statistics at Columbia. Madigan has over 100 publications in such areas as Bayesian statistics, text mining, Monte Carlo methods, pharmacovigilance, and probabilistic graphical models. Madigan’s Background Madigan went to college at Trinity College Dublin in 1980, and specialized in math except for his final year, when he took a bunch of stats courses, and learned a bunch about computers: Pascal, operating systems, compilers, artificial intelligence, database theory, and rudimentary computing skills.


pages: 398 words: 120,801

Little Brother by Cory Doctorow

airport security, Bayesian statistics, Berlin Wall, citizen journalism, Firefox, game design, Golden Gate Park, Haight Ashbury, Internet Archive, Isaac Newton, Jane Jacobs, Jeff Bezos, mail merge, Mitch Kapor, MITM: man-in-the-middle, RFID, Sand Hill Road, Silicon Valley, slashdot, Steve Jobs, Steve Wozniak, Thomas Bayes, web of trust, zero day

They hopped from Xbox to Xbox until they found one that was connected to the Internet, then they injected their material as undecipherable, encrypted data. No one could tell which of the Internet's packets were Xnet and which ones were just plain old banking and e-commerce and other encrypted communication. You couldn't find out who was tying the Xnet, let alone who was using the Xnet. But what about Dad's "Bayesian statistics?" I'd played with Bayesian math before. Darryl and I once tried to write our own better spam filter and when you filter spam, you need Bayesian math. Thomas Bayes was an 18th century British mathematician that no one cared about until a couple hundred years after he died, when computer scientists realized that his technique for statistically analyzing mountains of data would be super-useful for the modern world's info-Himalayas.


pages: 755 words: 121,290

Statistics hacks by Bruce Frey

Bayesian statistics, Berlin Wall, correlation coefficient, Daniel Kahneman / Amos Tversky, distributed generation, en.wikipedia.org, feminist movement, G4S, game design, Hacker Ethic, index card, Milgram experiment, p-value, place-making, reshoring, RFID, Search for Extraterrestrial Intelligence, SETI@home, Silicon Valley, statistical model, Thomas Bayes

William Skorupski is currently an assistant professor in the School of Education at the University of Kansas, where he teaches courses in psychometrics and statistics. He earned his Bachelor's degree in educational research and psychology from Bucknell University in 2000, and his Doctorate in psychometric methods from the University of Massachusetts, Amherst in 2004. His primary research interest is in the application of mathematical models to psychometric data, including the use of Bayesian statistics for solving practical measurement problems. He also enjoys applying his knowledge of statistics and probability to everyday situations, such as playing poker against the author of this book! Acknowledgments I'd like to thank all the contributors to this book, both those who are listed in the "Contributors" section and those who helped with ideas, reviewed the manuscript, and provided suggestions of sources and resources.


pages: 415 words: 125,089

Against the Gods: The Remarkable Story of Risk by Peter L. Bernstein

"Robert Solow", Albert Einstein, Alvin Roth, Andrew Wiles, Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Big bang: deregulation of the City of London, Bretton Woods, business cycle, buttonwood tree, buy and hold, capital asset pricing model, cognitive dissonance, computerized trading, Daniel Kahneman / Amos Tversky, diversified portfolio, double entry bookkeeping, Edmond Halley, Edward Lloyd's coffeehouse, endowment effect, experimental economics, fear of failure, Fellow of the Royal Society, Fermat's Last Theorem, financial deregulation, financial innovation, full employment, index fund, invention of movable type, Isaac Newton, John Nash: game theory, John von Neumann, Kenneth Arrow, linear programming, loss aversion, Louis Bachelier, mental accounting, moral hazard, Myron Scholes, Nash equilibrium, Norman Macrae, Paul Samuelson, Philip Mirowski, probability theory / Blaise Pascal / Pierre de Fermat, random walk, Richard Thaler, Robert Shiller, spectrum auction, statistical model, stocks for the long run, The Bell Curve by Richard Herrnstein and Charles Murray, The Wealth of Nations by Adam Smith, Thomas Bayes, trade route, transaction costs, tulip mania, Vanguard fund, zero-sum game

John Maynard Keynes. Vol. 1: Hopes Betrayed. New York: Viking. Slovic, Paul, Baruch Fischoff, and Sarah Lichtenstein, 1990. "Rating the Risks." In Glickman and Gough, 1990, pp. 61-75. Smith, Clifford W., Jr., 1995. "Corporate Risk Management: Theory and Practice." Journal of Derivatives, Summer, pp. 21-30. Smith, M. F. M., 1984. "Present Position and Potential Developments: Some Personal Views of Bayesian Statistics." Journal of the Royal Statistical Association, Vol. 147, Part 3, pp. 245-259. Smithson, Charles W., and Clifford W. Smith, Jr., 1995. Managing Financial Risk: A Guide to Derivative Products, Financial Engineering, and Value Maximization. New York: Irwin.* Sorensen, Eric, 1995. "The Derivative Portfolio Matrix-Combining Market Direction with Market Volatility." Institute for Quantitative Research in Finance, Spring 1995 Seminar.


pages: 483 words: 141,836

Red-Blooded Risk: The Secret History of Wall Street by Aaron Brown, Eric Kim

activist fund / activist shareholder / activist investor, Albert Einstein, algorithmic trading, Asian financial crisis, Atul Gawande, backtesting, Basel III, Bayesian statistics, beat the dealer, Benoit Mandelbrot, Bernie Madoff, Black Swan, business cycle, capital asset pricing model, central bank independence, Checklist Manifesto, corporate governance, creative destruction, credit crunch, Credit Default Swap, disintermediation, distributed generation, diversification, diversified portfolio, Edward Thorp, Emanuel Derman, Eugene Fama: efficient market hypothesis, experimental subject, financial innovation, illegal immigration, implied volatility, index fund, Long Term Capital Management, loss aversion, margin call, market clearing, market fundamentalism, market microstructure, money market fund, money: store of value / unit of account / medium of exchange, moral hazard, Myron Scholes, natural language processing, open economy, Pierre-Simon Laplace, pre–internet, quantitative trading / quantitative finance, random walk, Richard Thaler, risk tolerance, risk-adjusted returns, risk/return, road to serfdom, Robert Shiller, shareholder value, Sharpe ratio, special drawing rights, statistical arbitrage, stochastic volatility, stocks for the long run, The Myth of the Rational Market, Thomas Bayes, too big to fail, transaction costs, value at risk, yield curve

If you accept that your entire earthly life is the appropriate numeraire for decision making, then the rest of Pascal’s case is easy to accept. Just as Archimedes claimed that with a long enough lever he could move the earth, I claim that with a big enough numeraire, I can make any faith-based action seem reasonable. Frequentist statistics suffers from paradoxes because it doesn’t insist everything be stated in moneylike terms, without which there’s no logical connection between frequency and degree of belief. Bayesian statistics suffers from insisting on a single, universal numeraire, which is often not appropriate. One thing we know about money is that it can’t buy everything. One thing we know about people is they have multiple natures, and groups of people are even more complicated. There are many numeraires, more than there are people. Picking the right one is key to getting meaningful statistical results. The only statistical analyses that can be completely certain are ones that are pure mathematical results, and ones that refer to gamelike situations in which all outside considerations are excluded by rule and the numeraire is specified.


No Slack: The Financial Lives of Low-Income Americans by Michael S. Barr

active measures, asset allocation, Bayesian statistics, business cycle, Cass Sunstein, conceptual framework, Daniel Kahneman / Amos Tversky, financial exclusion, financial innovation, Home mortgage interest deduction, income inequality, information asymmetry, labor-force participation, late fees, London Interbank Offered Rate, loss aversion, market friction, mental accounting, Milgram experiment, mobile money, money market fund, mortgage debt, mortgage tax deduction, New Urbanism, p-value, payday loans, race to the bottom, regulatory arbitrage, Richard Thaler, risk tolerance, Robert Shiller, the payments system, transaction costs, unbanked and underbanked, underbanked

Barr, Anjali Kumar, and Robert E. Litan, 117–41. Brookings. Romich, Jennifer, Sarah Gordon, and Eric N. Waithaka. 2009. “A Tool for Getting By or Getting Ahead? Consumers’ Views on Prepaid Cards.” Working Paper 2009-WP-09. Terre Haute: Indiana State University, Networks Financial Institute (http://ssrn.com/abstract=1491645). Rossi, Peter E., Greg M. Allenby, and Robert McCulloch. 2005. Bayesian Statistics and Marketing. West Sussex, U.K.: John Wiley & Sons. Sawtooth Software. 2008. “Proceedings of the Sawtooth Software Conference, October 2007” (www.sawtoothsoftware.com/download/techpap/2007Proceedings.pdf). Seidman, Ellen, Moez Hababou, and Jennifer Kramer. 2005. A Financial Services Survey of Low- and Moderate-Income Households. Chicago: Center for Financial Services Innovation (http://cfsinnovation.com/system/files/imported/managed_documents/threecitysurvey.pdf).


pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths

4chan, Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, algorithmic trading, anthropic principle, asset allocation, autonomous vehicles, Bayesian statistics, Berlin Wall, Bill Duvall, bitcoin, Community Supported Agriculture, complexity theory, constrained optimization, cosmological principle, cryptocurrency, Danny Hillis, David Heinemeier Hansson, delayed gratification, dematerialisation, diversification, Donald Knuth, double helix, Elon Musk, fault tolerance, Fellow of the Royal Society, Firefox, first-price auction, Flash crash, Frederick Winslow Taylor, George Akerlof, global supply chain, Google Chrome, Henri Poincaré, information retrieval, Internet Archive, Jeff Bezos, Johannes Kepler, John Nash: game theory, John von Neumann, Kickstarter, knapsack problem, Lao Tzu, Leonard Kleinrock, linear programming, martingale, Nash equilibrium, natural language processing, NP-complete, P = NP, packet switching, Pierre-Simon Laplace, prediction markets, race to the bottom, RAND corporation, RFC: Request For Comment, Robert X Cringely, Sam Altman, sealed-bid auction, second-price auction, self-driving car, Silicon Valley, Skype, sorting algorithm, spectrum auction, Stanford marshmallow experiment, Steve Jobs, stochastic process, Thomas Bayes, Thomas Malthus, traveling salesman, Turing machine, urban planning, Vickrey auction, Vilfredo Pareto, Walter Mischel, Y Combinator, zero-sum game

Laplace was born in Normandy: For more details on Laplace’s life and work, see Gillispie, Pierre-Simon Laplace. distilled down to a single estimate: Laplace’s Law is derived by working through the calculation suggested by Bayes—the tricky part is the sum over all hypotheses, which involves a fun application of integration by parts. You can see a full derivation of Laplace’s Law in Griffiths, Kemp, and Tenenbaum, “Bayesian Models of Cognition.” From the perspective of modern Bayesian statistics, Laplace’s Law is the posterior mean of the binomial rate using a uniform prior. If you try only once and it works out: You may recall that in our discussion of multi-armed bandits and the explore/exploit dilemma in chapter 2, we also touched on estimates of the success rate of a process—a slot machine—based on a set of experiences. The work of Bayes and Laplace undergirds many of the algorithms we discussed in that chapter, including the Gittins index.
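The statement above that Laplace's Law is the posterior mean of a binomial rate under a uniform prior can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the closed form used is the standard result (w + 1) / (n + 2) for w successes in n attempts, compared against a simple midpoint-rule integration of the posterior.

```python
from fractions import Fraction


def laplace_estimate(successes, attempts):
    """Laplace's Law: (w + 1) / (n + 2) for w successes in n attempts."""
    if not 0 <= successes <= attempts:
        raise ValueError("need 0 <= successes <= attempts")
    return Fraction(successes + 1, attempts + 2)


def posterior_mean_numeric(successes, attempts, steps=10_000):
    """Posterior mean of the success rate p under a uniform prior on [0, 1].

    The unnormalised posterior is p**w * (1 - p)**(n - w); both integrals
    are approximated with the midpoint rule on `steps` equal slices.
    """
    failures = attempts - successes
    numerator = 0.0    # approximates the integral of p * posterior(p)
    denominator = 0.0  # approximates the integral of posterior(p)
    for i in range(steps):
        p = (i + 0.5) / steps
        weight = p ** successes * (1.0 - p) ** failures
        numerator += p * weight
        denominator += weight
    return numerator / denominator


# "If you try only once and it works out": one success in one attempt.
print(laplace_estimate(1, 1))           # 2/3
print(posterior_mean_numeric(1, 1))     # close to 0.6667
```

With no data at all the estimate is 1/2, and it moves toward the observed success frequency as attempts accumulate.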


pages: 574 words: 164,509

Superintelligence: Paths, Dangers, Strategies by Nick Bostrom

agricultural Revolution, AI winter, Albert Einstein, algorithmic trading, anthropic principle, anti-communist, artificial general intelligence, autonomous vehicles, barriers to entry, Bayesian statistics, bioinformatics, brain emulation, cloud computing, combinatorial explosion, computer vision, cosmological constant, dark matter, DARPA: Urban Challenge, data acquisition, delayed gratification, demographic transition, different worldview, Donald Knuth, Douglas Hofstadter, Drosophila, Elon Musk, en.wikipedia.org, endogenous growth, epigenetics, fear of failure, Flash crash, Flynn Effect, friendly AI, Gödel, Escher, Bach, income inequality, industrial robot, informal economy, information retrieval, interchangeable parts, iterative process, job automation, John Markoff, John von Neumann, knowledge worker, longitudinal study, Menlo Park, meta analysis, meta-analysis, mutually assured destruction, Nash equilibrium, Netflix Prize, new economy, Norbert Wiener, NP-complete, nuclear winter, optical character recognition, pattern recognition, performance metric, phenotype, prediction markets, price stability, principal–agent problem, race to the bottom, random walk, Ray Kurzweil, recommendation engine, reversible computing, social graph, speech recognition, Stanislav Petrov, statistical model, stem cell, Stephen Hawking, strong AI, superintelligent machines, supervolcano, technological singularity, technoutopianism, The Coming Technological Singularity, The Nature of the Firm, Thomas Kuhn: the structure of scientific revolutions, transaction costs, Turing machine, Vernor Vinge, Watson beat the top human players on Jeopardy!, World Values Survey, zero-sum game

They also provide important insight into the concept of causality.28 One advantage of relating learning problems from specific domains to the general problem of Bayesian inference is that new algorithms that make Bayesian inference more efficient will then yield immediate improvements across many different areas. Advances in Monte Carlo approximation techniques, for example, are directly applied in computer vision, robotics, and computational genetics. Another advantage is that it lets researchers from different disciplines more easily pool their findings. Graphical models and Bayesian statistics have become a shared focus of research in many fields, including machine learning, statistical physics, bioinformatics, combinatorial optimization, and communication theory.35 A fair amount of the recent progress in machine learning has resulted from incorporating formal results originally derived in other academic fields. (Machine learning applications have also benefitted enormously from faster computers and greater availability of large data sets


pages: 579 words: 183,063

Tribe of Mentors: Short Life Advice From the Best in the World by Timothy Ferriss

23andMe, A Pattern Language, agricultural Revolution, Airbnb, Albert Einstein, Bayesian statistics, bitcoin, Black Swan, blockchain, Brownian motion, Buckminster Fuller, Clayton Christensen, cloud computing, cognitive dissonance, Colonization of Mars, corporate social responsibility, cryptocurrency, David Heinemeier Hansson, dematerialisation, don't be evil, double helix, effective altruism, Elon Musk, Ethereum, ethereum blockchain, family office, fear of failure, Gary Taubes, Geoffrey West, Santa Fe Institute, Google Hangouts, Gödel, Escher, Bach, haute couture, helicopter parent, high net worth, In Cold Blood by Truman Capote, income inequality, index fund, Jeff Bezos, job satisfaction, Johann Wolfgang von Goethe, Kevin Kelly, Lao Tzu, Law of Accelerating Returns, Lyft, Mahatma Gandhi, Marc Andreessen, Marshall McLuhan, Mikhail Gorbachev, minimum viable product, move fast and break things, Naomi Klein, non-fiction novel, Peter Thiel, profit motive, Ralph Waldo Emerson, Ray Kurzweil, Saturday Night Live, side project, Silicon Valley, Skype, smart cities, smart contracts, Snapchat, Steve Jobs, Steven Pinker, Stewart Brand, TaskRabbit, Tesla Model S, too big to fail, Turing machine, uber lyft, web application, Whole Earth Catalog, Y Combinator

Sometimes it isn’t, and I need to spend time doing other stuff before I’m ready. Often, I end up realizing that those things aren’t important and I just forget about them forever. What is one of the best or most worthwhile investments you’ve ever made? Lots of time spent doing math and philosophy has paid off and will continue to pay off, I have (almost) no doubt. Questioning the foundation of Bayesian statistics has been a very valuable process. Reworking definitions and impossibility results from consensus literature has been equally valuable. What purchase of $100 or less has most positively impacted your life in the last six months (or in recent memory)? An audio lecture series on institutional economics called “International Economic Institutions: Globalism vs. Nationalism.” It was interesting/important to me because it was the first information about institutional design that I’ve ever really internalized.


Statistics in a Nutshell by Sarah Boslaugh

Antoine Gombaud: Chevalier de Méré, Bayesian statistics, business climate, computer age, correlation coefficient, experimental subject, Florence Nightingale: pie chart, income per capita, iterative process, job satisfaction, labor-force participation, linear programming, longitudinal study, meta analysis, meta-analysis, p-value, pattern recognition, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, purchasing power parity, randomized controlled trial, selection bias, six sigma, statistical model, The Design of Experiments, the scientific method, Thomas Bayes, Vilfredo Pareto

The Reverend Thomas Bayes Bayes’ theorem was developed by a British Nonconformist minister, the Reverend Thomas Bayes (1702–1761). Bayes studied logic and theology at the University of Edinburgh and earned his livelihood as a minister in Holborn and Tunbridge Wells, England. However, his fame today rests on his theory of probability, which was developed in his essay, published after his death by the Royal Society of London. There is an entire field of study today known as Bayesian statistics, which is based on the notion of probability as a statement of strength of belief rather than as a frequency of occurrence. However, it is uncertain whether Bayes himself would have embraced this definition because he published relatively little on mathematics during his lifetime. Enough Exposition, Let’s Do Some Statistics! Statistics is something you do, not something you read about, so the real purpose of the preceding theoretical presentation is to give you the information you need to perform calculations about the probability of events and to use the concepts introduced to be able to reason using your knowledge of statistics.


pages: 685 words: 203,949

The Organized Mind: Thinking Straight in the Age of Information Overload by Daniel J. Levitin

airport security, Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, Bayesian statistics, big-box store, business process, call centre, Claude Shannon: information theory, cloud computing, cognitive bias, complexity theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, delayed gratification, Donald Trump, en.wikipedia.org, epigenetics, Eratosthenes, Exxon Valdez, framing effect, friendly fire, fundamental attribution error, Golden Gate Park, Google Glasses, haute cuisine, impulse control, index card, indoor plumbing, information retrieval, invention of writing, iterative process, jimmy wales, job satisfaction, Kickstarter, life extension, longitudinal study, meta analysis, meta-analysis, more computing power than Apollo, Network effects, new economy, Nicholas Carr, optical character recognition, Pareto efficiency, pattern recognition, phenotype, placebo effect, pre–internet, profit motive, randomized controlled trial, Rubik’s Cube, shared worldview, Skype, Snapchat, social intelligence, statistical model, Steve Jobs, supply-chain management, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Thomas Bayes, Turing test, ultimatum game, zero-sum game

For every 5 people who take the treatment, 1 will be cured (because that person actually has the disease) and .25 will have the side effects. In this case, with two tests, you’re now about 4 times more likely to experience the cure than the side effects, a nice reversal of what we saw before. (If it makes you uncomfortable to talk about .25 of a person, just multiply all the numbers above by 4.) We can take Bayesian statistics a step further. Suppose a newly published study shows that if you are a woman, you’re ten times more likely to get the disease than if you’re a man. You can construct a new table to take this information into account, and to refine the estimate that you actually have the disease. The calculations of probabilities in real life have applications far beyond medical matters. I asked Steve Wynn, who owns five casinos (at his Wynn and Encore hotels in Las Vegas, and the Wynn, Encore, and Palace in Macau), “Doesn’t it hurt, just a little, to see customers walking away with large pots of your money?”
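The refinement described above, where a higher base rate for one group changes the estimate that a person who tests positive actually has the disease, is a direct application of Bayes' theorem. The sketch below uses hypothetical numbers throughout (the base rates, sensitivity, and false-positive rate are not the book's figures) and assumes that repeated tests are conditionally independent.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' theorem.

    prior: P(disease) before testing (the base rate for this group).
    sensitivity: P(positive | disease).
    false_positive_rate: P(positive | no disease).
    """
    true_positive = prior * sensitivity
    false_positive = (1.0 - prior) * false_positive_rate
    return true_positive / (true_positive + false_positive)


# Hypothetical test characteristics and base rates; the only feature taken
# from the text is that the second base rate is ten times the first.
SENSITIVITY = 0.95
FALSE_POSITIVE_RATE = 0.02
base_rate_men = 0.001
base_rate_women = 10 * base_rate_men

after_one_test_men = posterior_given_positive(
    base_rate_men, SENSITIVITY, FALSE_POSITIVE_RATE)
after_one_test_women = posterior_given_positive(
    base_rate_women, SENSITIVITY, FALSE_POSITIVE_RATE)

# A second positive result: the first posterior becomes the new prior
# (assuming the two tests err independently).
after_two_tests_women = posterior_given_positive(
    after_one_test_women, SENSITIVITY, FALSE_POSITIVE_RATE)

print(f"men, one positive test:    {after_one_test_men:.3f}")
print(f"women, one positive test:  {after_one_test_women:.3f}")
print(f"women, two positive tests: {after_two_tests_women:.3f}")
```

The same function serves both steps because Bayesian updating is sequential: whatever is believed after one piece of evidence is the starting point for the next.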


pages: 1,737 words: 491,616

Rationality: From AI to Zombies by Eliezer Yudkowsky

Albert Einstein, Alfred Russel Wallace, anthropic principle, anti-pattern, anti-work, Arthur Eddington, artificial general intelligence, availability heuristic, Bayesian statistics, Berlin Wall, Build a better mousetrap, Cass Sunstein, cellular automata, cognitive bias, cognitive dissonance, correlation does not imply causation, cosmological constant, creative destruction, Daniel Kahneman / Amos Tversky, dematerialisation, different worldview, discovery of DNA, Douglas Hofstadter, Drosophila, effective altruism, experimental subject, Extropian, friendly AI, fundamental attribution error, Gödel, Escher, Bach, hindsight bias, index card, index fund, Isaac Newton, John Conway, John von Neumann, Long Term Capital Management, Louis Pasteur, mental accounting, meta analysis, meta-analysis, money market fund, Nash equilibrium, Necker cube, NP-complete, P = NP, pattern recognition, Paul Graham, Peter Thiel, Pierre-Simon Laplace, placebo effect, planetary scale, prediction markets, random walk, Ray Kurzweil, reversible computing, Richard Feynman, risk tolerance, Rubik’s Cube, Saturday Night Live, Schrödinger's Cat, scientific mainstream, scientific worldview, sensible shoes, Silicon Valley, Silicon Valley startup, Singularitarianism, Solar eclipse in 1919, speech recognition, statistical model, Steven Pinker, strong AI, technological singularity, The Bell Curve by Richard Herrnstein and Charles Murray, the map is not the territory, the scientific method, Turing complete, Turing machine, ultimatum game, X Prize, Y Combinator, zero-sum game

Some frequentists criticize Bayesians for treating probabilities as subjective states of belief, rather than as objective frequencies of events. Kruschke and Yudkowsky have replied that frequentism is even more “subjective” than Bayesianism, because frequentism’s probability assignments depend on the intentions of the experimenter.10 Importantly, this philosophical disagreement shouldn’t be conflated with the distinction between Bayesian and frequentist data analysis methods, which can both be useful when employed correctly. Bayesian statistical tools have become cheaper to use since the 1980s, and their informativeness, intuitiveness, and generality have come to be more widely appreciated, resulting in “Bayesian revolutions” in many sciences. However, traditional frequentist methods remain more popular, and in some contexts they are still clearly superior to Bayesian approaches. Kruschke’s Doing Bayesian Data Analysis is a fun and accessible introduction to the topic.11 In light of evidence that training in statistics—and some other fields, such as psychology—improves reasoning skills outside the classroom, statistical literacy is directly relevant to the project of overcoming bias.

I responded—note that this was completely spontaneous—“What on Earth do you mean? You can’t avoid assigning a probability to the mathematician making one statement or another. You’re just assuming the probability is 1, and that’s unjustified.” To which the one replied, “Yes, that’s what the Bayesians say. But frequentists don’t believe that.” And I said, astounded: “How can there possibly be such a thing as non-Bayesian statistics?” That was when I discovered that I was of the type called “Bayesian.” As far as I can tell, I was born that way. My mathematical intuitions were such that everything Bayesians said seemed perfectly straightforward and simple, the obvious way I would do it myself; whereas the things frequentists said sounded like the elaborate, warped, mad blasphemy of dreaming Cthulhu. I didn’t choose to become a Bayesian any more than fishes choose to breathe water.


pages: 827 words: 239,762

The Golden Passport: Harvard Business School, the Limits of Capitalism, and the Moral Failure of the MBA Elite by Duff McDonald

activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, Albert Einstein, barriers to entry, Bayesian statistics, Bernie Madoff, Bob Noyce, Bonfire of the Vanities, business cycle, business process, butterfly effect, capital asset pricing model, Capital in the Twenty-First Century by Thomas Piketty, Clayton Christensen, cloud computing, collateralized debt obligation, collective bargaining, commoditize, corporate governance, corporate raider, corporate social responsibility, creative destruction, deskilling, discounted cash flows, disintermediation, disruptive innovation, Donald Trump, family office, financial innovation, Frederick Winslow Taylor, full employment, George Gilder, glass ceiling, global pandemic, Gordon Gekko, hiring and firing, income inequality, invisible hand, Jeff Bezos, job-hopping, John von Neumann, Joseph Schumpeter, Kenneth Arrow, Kickstarter, London Whale, Long Term Capital Management, market fundamentalism, Menlo Park, new economy, obamacare, oil shock, pattern recognition, performance metric, Peter Thiel, plutocrats, Plutocrats, profit maximization, profit motive, pushing on a string, Ralph Nader, Ralph Waldo Emerson, RAND corporation, random walk, rent-seeking, Ronald Coase, Ronald Reagan, Sam Altman, Sand Hill Road, Saturday Night Live, shareholder value, Silicon Valley, Skype, Social Responsibility of Business Is to Increase Its Profits, Steve Jobs, survivorship bias, The Nature of the Firm, the scientific method, Thorstein Veblen, union organizing, urban renewal, Vilfredo Pareto, War on Poverty, William Shockley: the traitorous eight, women in the workforce, Y Combinator

Anyone who has come across a decision tree when contemplating the choices and uncertainties in business owes them a debt. In short, their work opened up just about any business problem to mathematical analysis, without necessarily sacrificing expert opinion in the process. In 1959, Schlaifer published Probability and Statistics for Business Decisions, and in 1961, Raiffa and Schlaifer coauthored Applied Statistical Decision Theory, which “set the direction of Bayesian statistics for the next two decades.”10 But this was geeky stuff, especially for the more “broad-gauged” crowd at HBS. So even if the School was trying as hard as it could to keep up with the GSIAs of the world, it still felt a need to apologize for getting too geeky with Applied Statistical Decision Theory. Calling it “a new type of publication,” Dean Teele explained that “[whereas] most reports . . . published by the Division of Research have as their intended audience informed and forward-looking business executives in general, the new series has been written primarily for specialists. . . .”11 Translation: You may not understand it, but that doesn’t mean you’re not “informed and forward-looking.”


pages: 764 words: 261,694

The Elements of Statistical Learning (Springer Series in Statistics) by Trevor Hastie, Robert Tibshirani, Jerome Friedman

Bayesian statistics, bioinformatics, computer age, conceptual framework, correlation coefficient, G4S, greed is good, linear programming, p-value, pattern recognition, random walk, selection bias, speech recognition, statistical model, stochastic process, The Wisdom of Crowds

A modified principal component technique based on the lasso, Journal of Computational and Graphical Statistics 12: 531–547. Jones, L. (1992). A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Annals of Statistics 20: 608–613. Jordan, M. (2004). Graphical models, Statistical Science (Special Issue on Bayesian Statistics) 19: 140–155. Jordan, M. and Jacobs, R. (1994). Hierarchical mixtures of experts and the EM algorithm, Neural Computation 6: 181–214. Kalbfleisch, J. and Prentice, R. (1980). The Statistical Analysis of Failure Time Data, Wiley, New York. Kaufman, L. and Rousseeuw, P. (1990). Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York. Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory, MIT Press, Cambridge, MA.