Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schonberger, Kenneth Cukier
23andMe, Affordable Care Act / Obamacare, airport security, barriers to entry, Berlin Wall, big data - Walmart - Pop Tarts, Black Swan, book scanning, business intelligence, business process, call centre, cloud computing, computer age, correlation does not imply causation, dark matter, double entry bookkeeping, Eratosthenes, Erik Brynjolfsson, game design, IBM and the Holocaust, index card, informal economy, intangible asset, Internet of things, invention of the printing press, Jeff Bezos, Joi Ito, lifelogging, Louis Pasteur, Mark Zuckerberg, Menlo Park, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, obamacare, optical character recognition, PageRank, paypal mafia, performance metric, Peter Thiel, post-materialism, random walk, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, smart grid, smart meter, social graph, speech recognition, Steve Jobs, Steven Levy, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!
Next [>] Mike Flowers and New York City’s analytics—Based on interview with Cukier, July 2012. For a good description, see: Alex Howard, “Predictive data analytics is saving lives and taxpayer dollars in New York City,” O’Reilly Media, June 26, 2012 (http://strata.oreilly.com/2012/06/predictive-data-analytics-big-data-nyc.html). [>] Walmart and Pop-Tarts—Hays, “What Wal-Mart Knows About Customers’ Habits.” [>] Big data’s use in slums and in modeling refugee movements—Nathan Eagle, “Big Data, Global Development, and Complex Systems,” http://www.youtube.com/watch?v=yaivtqlu7iM. Perception of time—Benedict Anderson, Imagined Communities (Verso, 2006). [>] “What’s past is prologue”—William Shakespeare, “The Tempest,” Act 2, Scene I. [>] CERN experiment and data storage—Cukier email exchange with CERN researchers, November 2012.
We will still need causal studies and controlled experiments with carefully curated data in certain cases, such as designing a critical airplane part. But for many everyday needs, knowing what, not why, is good enough. And big-data correlations can point the way toward promising areas in which to explore causal relationships. These quick correlations let us save money on plane tickets, predict flu outbreaks, and know which manholes or overcrowded buildings to inspect in a resource-constrained world. They may enable health insurance firms to provide coverage without a physical exam and lower the cost of reminding the sick to take their medication. Languages are translated and cars drive themselves on the basis of predictions made through big-data correlations. Walmart can learn which flavor Pop-Tarts to stock at the front of the store before a hurricane. (Answer: strawberry.) Of course, causality is nice when you can get it.
. [>] Recommendations one-third of Amazon’s income—This figure has never been officially confirmed by the company but has been published in numerous analyst reports and articles in the media, including “Building with Big Data: The Data Revolution Is Changing the Landscape of Business,” The Economist, May 26, 2011 (http://www.economist.com/node/18741392/). The figure was also referenced by two former Amazon executives in interviews with Cukier. Netflix price information—Xavier Amatriain and Justin Basilico, “Netflix Recommendations: Beyond the 5 stars (Part 1),” Netflix blog, April 6, 2012. [>] “Fooled by Randomness”—Nassim Nicholas Taleb, Fooled by Randomness (Random House, 2008); for more, see Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable (2nd ed., Random House, 2010). [>] Walmart and Pop-Tarts—Constance L. Hays, “What Wal-Mart Knows About Customers’ Habits,” New York Times, November 14, 2004 (http://www.nytimes.com/2004/11/14/business/yourmoney/14wal.html). [>] Examples of predictive models by FICO, Experian, and Equifax—Scott Thurm, “Next Frontier in Credit Scores: Predicting Personal Behavior,” Wall Street Journal, October 27, 2011 (http://online.wsj.com/article/SB10001424052970203687504576655182086300912.html). [>] Aviva’s predictive models—Leslie Scism and Mark Maremont, “Insurers Test Data Profiles to Identify Risky Clients,” Wall Street Journal, November 19, 2010 (http://online.wsj.com/article/SB10001424052748704648604575620750998072986.html).
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz
affirmative action, AltaVista, Amazon Mechanical Turk, Asian financial crisis, Bernie Sanders, big data - Walmart - Pop Tarts, Cass Sunstein, computer vision, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, desegregation, Donald Trump, Edward Glaeser, Filter Bubble, game design, happiness index / gross national happiness, income inequality, Jeff Bezos, John Snow's cholera map, longitudinal study, Mark Zuckerberg, Nate Silver, peer-to-peer lending, Peter Thiel, price discrimination, quantitative hedge fund, Ronald Reagan, Rosa Parks, sentiment analysis, Silicon Valley, statistical model, Steve Jobs, Steven Levy, Steven Pinker, TaskRabbit, The Signal and the Noise by Nate Silver, working poor
And, in the prediction business, you just need to know that something works, not why. For example, Walmart uses data from sales in all their stores to know what products to shelve. Before Hurricane Frances, a destructive storm that hit the Southeast in 2004, Walmart suspected—correctly—that people’s shopping habits may change when a city is about to be pummeled by a storm. They pored through sales data from previous hurricanes to see what people might want to buy. A major answer? Strawberry Pop-Tarts. This product sells seven times faster than normal in the days leading up to a hurricane. Based on their analysis, Walmart had trucks loaded with strawberry Pop-Tarts heading down Interstate 95 toward stores in the path of the hurricane. And indeed, these Pop-Tarts sold well. Why Pop-Tarts? Probably because they don’t require refrigeration or cooking.
Why strawberry? No clue. But when hurricanes hit, people turn to strawberry Pop-Tarts apparently. So in the days before a hurricane, Walmart now regularly stocks its shelves with boxes upon boxes of strawberry Pop-Tarts. The reason for the relationship doesn’t matter. But the relationship itself does. Maybe one day food scientists will figure out the association between hurricanes and toaster pastries filled with strawberry jam. But, while waiting for some such explanation, Walmart still needs to stock its shelves with strawberry Pop-Tarts when hurricanes are approaching and save the Rice Krispies treats for sunnier days. This lesson is also clear in the story of Orley Ashenfelter. What Seder is to horses, Ashenfelter, an economist at Princeton, may be to wine. A little over a decade ago, Ashenfelter was frustrated.
., 227, 228 127 Hours (movie), 90, 91 Optimal Decisions Group, 262 Or, Flora, 266 Ortiz, David “Big Papi,” 197–200, 200n, 203 “out-of-sample” tests, 250–51 Page, Larry, 60, 61, 62, 103 pancreatic cancer, Columbia University-Microsoft study of, 28–29 Pandora, 203 Pantheon project (Massachusetts Institute of Technology), 184–85 parents/parenting and child abuse, 145–47, 149–50, 161 and examples of Big Data searches, 22 and prejudice against children, 134–36, 135n Parks, Rosa, 93, 94 Parr, Ben, 153–54 Pathak, Parag, 235–36 PatientsLikeMe.com, 205 patterns, and data science as intuitive, 27, 33 Paul, Chris, 37 paying back loans, 257–61 PECOTA model, 199–200, 200n pedigrees of basketball players, 67 of horses, 66–67, 69, 71 pedometer, Chance emphasis on, 252–53 penis and Freud’s theories, 46 and phallic symbols in dreams, 46–47 size of, 17, 19, 123–24, 124n, 127 “penistrian,” 45, 46, 48, 50 Pennsylvania State University, income of graduates of, 237–39 Peysakhovich, Alex, 254 phallic symbols, in dreams, 46–48 Philadelphia Daily News, and words as data, 95 Philippines, cigarette economy in, 102 physical appearance and dating, 82, 120n and parents prejudice against children, 135–36 and truth about sex, 120, 120n, 125–26, 127 physics, as science, 272–73 pictures, as data, 97–102, 103 Pierson, Emma, 160n Piketty, Thomas, 283 Pinky Pizwaanski (horse), 70 pizza, information about, 77 PlentyOfFish (dating site), 139 Plomin, Robert, 249–50 political science, and digital revolution, 244, 274 politics and A/B testing, 211–14 complexity of, 273 and ignoring what people tell you, 157 and origin of political preferences, 169–71 and truth about the internet, 140–44 and words as data, 95–97 See also conservatives; Democrats; liberals; Republicans polls Google searches compared with, 9 and lying, 107 reliability of, 12 See also specific poll or topic Pop-Tarts, 72 Popp, Noah, 202 Popper, Karl, 45, 272, 273 PornHub (website), 14, 50–52, 54, 116, 120–22, 274 pornography as addiction, 219 
and bias of social media, 151 and breastfeeding, 19 cartoon, 52 child, 121 and digital revolution, 279 and gays, 114–15, 114n, 116, 117, 119 honesty of data about, 53–54 and incest, 50–52 in India, 19 and lying, 110 popular videos on, 152 popularity of, 53, 151 and power of Big Data, 53 search engines for, 61n and truth about sex, 114–15, 117 unemployed and, 58, 59 Posada, Jorge, 200 poverty and life expectancy, 176–78 and words as data, 93, 94 See also income distribution predictions and data science as intuitive, 27 and getting the numbers right, 74 and what counts as data, 74 and what vs. why it works, 71 See also specific topic pregnancy, 20, 187–90 prejudice implicit, 132–34 of parents against children, 134–36, 135n subconscious, 134, 163 truth about, 128–40, 162–63 See also bias; hate; race/racism; Stormfront Premise, 101–2, 103 price discrimination, 262–65 prison conditions, and crime, 235 privacy issues, and danger of empowered government, 267–70 property rights, and words as data, 93, 94 proquest.com, 95 Prosper (lending site), 257 Psy, “Gangnam Style” video of, 152 psychics, 266 psychology and digital revolution, 274, 277–78, 279 as science, 273 as soft science, 273 and traditional research methods, 274 Quantcast, 137 questions asking the right, 21–22 and dating, 82–83 race/racism causes of, 18–19 elections of 2008 and, 2, 6–7, 12, 133 elections of 2012 and, 2–3, 8, 133 elections of 2016 and, 8, 11, 12, 14, 133 explicit, 133, 134 and Harvard Crimson editorial about Zuckerberg, 155 and lying, 109 map of, 7–9 and Obama, 2, 6–7, 8–9, 12, 133, 240, 243–44 and predicting success in basketball, 35, 36–37 and Republicans, 3, 7, 8 Stephens-Davidowitz’s study of, 2–3, 6–7, 12, 14, 243–44 and Trump, 8, 9, 11, 12, 14, 133 and truth about hate and prejudice, 129–34, 162–63 See also Muslims; “nigger” randomized controlled experiments and A/B testing, 209–21 and causality, 208–9 rape, 121–22, 190–91 Rawlings, Craig, 80 “rawtube” (porn site), 59 Reagan, Andy, 88, 90, 91 
Reagan, Ronald, 227 regression discontinuity, 234–36 Reisinger, Joseph, 101–2, 103 relationships, lasting, 31–33 religion, and life expectancy, 177 Renaissance (hedge fund), 246 Republicans core principles of, 94 and origins of political preferences, 170–71 and racism, 3, 7, 8 and words as data, 93–97 See also specific person or election research and expansion of research methodology, 275–76 See also specific researcher or research reviews, of businesses, 265 “Rocket Tube” (gay porn site), 115 Rolling Stones, 278 Romney, Mitt, 10, 212 Roseau County, Minnesota, successful/notable Americans from, 186, 187 Runaway Bride (movie), 192, 195 sabermetricians, 198–99 San Bernardino, California, shooting in, 129–30 Sands, Emily, 202 science and Big Data, 273 and experiments, 272–73 real, 272–73 at scale, 276 soft, 273 search engines differentiation of Google from other, 60–62 for pornography, 61n reliability of, 60 word-count, 71 See also specific engine searchers, typing errors by, 48–50 searches negative words used in, 128–29 See also specific search “secrets about people,” 155–56 Seder, Jeff, 63–66, 68–70, 71, 74, 155, 256 segregation, 141–44.
Data-Ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else by Steve Lohr
"Robert Solow", 23andMe, Affordable Care Act / Obamacare, Albert Einstein, big data - Walmart - Pop Tarts, bioinformatics, business cycle, business intelligence, call centre, cloud computing, computer age, conceptual framework, Credit Default Swap, crowdsourcing, Daniel Kahneman / Amos Tversky, Danny Hillis, data is the new oil, David Brooks, East Village, Edward Snowden, Emanuel Derman, Erik Brynjolfsson, everywhere but in the productivity statistics, Frederick Winslow Taylor, Google Glasses, impulse control, income inequality, indoor plumbing, industrial robot, informal economy, Internet of things, invention of writing, Johannes Kepler, John Markoff, John von Neumann, lifelogging, Mark Zuckerberg, market bubble, meta analysis, meta-analysis, money market fund, natural language processing, obamacare, pattern recognition, payday loans, personalized medicine, precision agriculture, pre–internet, Productivity paradox, RAND corporation, rising living standards, Robert Gordon, Second Machine Age, self-driving car, Silicon Valley, Silicon Valley startup, six sigma, skunkworks, speech recognition, statistical model, Steve Jobs, Steven Levy, The Design of Experiments, the scientific method, Thomas Kuhn: the structure of scientific revolutions, unbanked and underbanked, underbanked, Von Neumann architecture, Watson beat the top human players on Jeopardy!
Exploiting correlation is the first wave of the big-data phenomenon, and it can be extremely powerful. Indeed, useful and profitable observations increasingly do come from “listening to the data” to find correlations. A handful of large corporations have been at this for years, using their own data. A canonical example of this kind of data discovery is the Pop-Tarts-and-beer case at Walmart from a decade ago. The giant retailer, mining the historical purchasing data from its stores, found that consumers in the path of a predicted hurricane bought strawberry Pop-Tarts at seven times the usual rate and the best-selling item of all before a hurricane was beer. Walmart’s store managers don’t care why that purchasing pattern occurs. They’re just going to stock up on beer and strawberry Pop-Tarts when hurricane warnings come their way.
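The kind of comparison described in this excerpt, a product's sales rate before a storm against its usual rate, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sales_lift`, the record layout, and every number in the toy data are hypothetical, chosen so that the strawberry Pop-Tarts row reproduces the "seven times" figure quoted above. It is not Walmart's actual method or data.

```python
from collections import defaultdict


def sales_lift(daily_sales):
    """Return {product: pre-storm daily sales rate / normal daily sales rate}.

    daily_sales is an iterable of (product, units_sold, pre_storm) tuples,
    one per product per day (a hypothetical layout, for illustration only).
    """
    units = defaultdict(lambda: [0, 0])  # product -> [normal units, pre-storm units]
    days = defaultdict(lambda: [0, 0])   # product -> [normal days, pre-storm days]
    for product, sold, pre_storm in daily_sales:
        idx = 1 if pre_storm else 0
        units[product][idx] += sold
        days[product][idx] += 1

    lift = {}
    for product in units:
        # Skip products that lack one kind of day or have no baseline sales.
        if days[product][0] == 0 or days[product][1] == 0 or units[product][0] == 0:
            continue
        normal_rate = units[product][0] / days[product][0]
        storm_rate = units[product][1] / days[product][1]
        lift[product] = storm_rate / normal_rate
    return lift


# Made-up toy data: two ordinary days and one pre-storm day per product.
toy_sales = [
    ("strawberry_pop_tarts", 10, False),
    ("strawberry_pop_tarts", 10, False),
    ("strawberry_pop_tarts", 70, True),
    ("beer", 40, False),
    ("beer", 40, False),
    ("beer", 80, True),
]
lifts = sales_lift(toy_sales)
for product, ratio in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(f"{product}: {ratio:.1f}x the usual daily rate")
```

Note that a ratio like this says nothing about why the pattern occurs, which is the point the excerpt makes. It also separates two questions the excerpt raises: which item's sales rise the most in relative terms, and which item sells the most in absolute terms.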
Randall, 40 Mount Sinai Hospital, 8, 13–14, 15 data science and genomic research at, 163–65, 171, 173–81 medical data and human experience, 68–70 Mundie, Craig, 203 Nakashima, George, 65 Naked Society, The (Packard), 184 Narayanan, Arvind, 204 Nest learning thermostat, 143–45 Google and, 152–53 human behavior and, 147–52 Never-Ending Language Learning system (NELL), of Carnegie Mellon University, 110–11 New York State, Medicaid fraud prevention in, 48 Norvig, Peter, 116 Norway, 48 “notice and choice,” in data collection of personal information, 186, 187–88 Noyes, Eliot, 49 “numerical imagination,” of Hammerbacher, 13–14 Oak Ridge National Laboratory, 176 Obama administration, big data and, 203–4 O’Donnell, Tim, 180–81 OfficeMax, 188–89 Olmo, Harold, 126 Olson, Mike, 101 online advertising, 84–85 as “socio-technical construct,” 193–95 open-source code, IBM and, 9 operations research, 154 optimization, at IBM, 46 Packard, Vance, 184 Palmisano, Samuel, 49–51, 53 “Parable of Google Flu: Traps in Big Data Analysis, The” (Science), 108 Pattern Recognition (Gibson), 154 Paul, Sharoda, 135 payday lending market, 104–7 Pennebaker, James, 199 Pentland, Alex, 15, 203–4, 206 Perlich, Claudia, 120 personality traits, values, and needs, 198–99 personally identifying information, privacy concerns and, 187–92 Pieroni, Stephanie, 36 Pitts, Martha, 57 Pitts, Shereline, 57 Pop-Tarts, beer, and hurricane data, 104 precision agriculture, E. & J.
., 5–6 Snyder, Steven, 165–67, 170 social networks, research using human behavior and, 86–94 retail use, 153–62 spread of information and, 73–74 Twitter posts and, 197–202 see also privacy concerns Social Security numbers, data used to predict person’s, 187–88 software, origin of term, 96 Solow, Robert, 72 Speakeasy programming language, 160 Spee (Harvard club), 28–30 Spohrer, Jim, 25 Stanford University, 211–12 Starbucks, 157 Stockholm, rush-hour pricing in, 47 storytelling, computer algorithms and, 120–21, 149, 165–66, 205, 214 structural racism, in big data racial profiling, 194–95 Structure of Scientific Revolutions, The (Kuhn), 175 Sweeney, Latanya, 193–95 System S, at IBM, 40 Tarbell, Ida, 208 Taylor, Frederick Winslow, 207–8 Tecco, Halle, 16, 25, 28, 168–69 Tetlock, Philip, 67–68 thermostats, learning by, 143–45, 147–53 Thinking, Fast and Slow (Kahneman), 66–67 toggling, 84 Truth in Lending Act (1968), 185 T-shaped people, 25 Tukey, John, 96–97 Turing, Alan, 178–79 Tversky, Amos, 66 Twitter, 85 posts studied for personal information, 197–202 “Two Cultures, The” (Snow), 5–6 “universal machine” (Turing’s theoretical computer), 179 universities, data science and, 15–16, 97–98, 211–12 Unlocking the Value of Personal Data: From Collection to Usage (World Economic Forum), 203 “Unreasonable Effectiveness of Data, The” (Norvig), 116 use-only restrictions, on data, 203 Uttamchandani, Menka, 77–78, 80, 212 VALS (Values, Attitudes, and Lifestyles), 155 Van Alstyne, Marshall, 74 Vance, Ashlee, 85 Vargas, Veronica, 159–60 Varma, Anil, 136–37 Veritas, 91 vineyards, data used for precision agriculture in, 123–33, 212 Vivero, David, 29 Vladeck, David, 203, 204 von Neumann, John, 54 Von Neumann architecture, 54 Walker, Donald, 2, 63, 212 Walmart, 104, 154 Watson, Thomas Jr., 49 Watson technology, of IBM, 45, 66–67, 120, 205 as cloud service, 9, 54 Jeopardy and, 7, 40, 111, 114 medical diagnoses and, 69–70, 109 Watts, Duncan J., 86 weather analysis, with big data, 129–32 
Weitzner, Daniel, 184 “Why ask Why?” (Gelman and Imbens), 115–16 winemaking, precision agriculture and, 123–33, 212 Wing, Michael, 49–50 workforce rebalancing, at IBM, 57 World Economic Forum, 203 Yarkoni, Tal, 199 Yoshimi, Bill, 198 ZestFinance, data correlation and, 104–7 Zeyliger, Philip, 100–101 Zhou, Michelle, 197–202 Zuckerberg, Mark, 28, 86, 89 ABOUT THE AUTHOR Photo by Fred Conrad STEVE LOHR reports on technology, business, and economics for the New York Times.
World Without Mind: The Existential Threat of Big Tech by Franklin Foer
artificial general intelligence, back-to-the-land, Berlin Wall, big data - Walmart - Pop Tarts, big-box store, Buckminster Fuller, citizen journalism, Colonization of Mars, computer age, creative destruction, crowdsourcing, data is the new oil, don't be evil, Donald Trump, Double Irish / Dutch Sandwich, Douglas Engelbart, Edward Snowden, Electric Kool-Aid Acid Test, Elon Musk, Fall of the Berlin Wall, Filter Bubble, global village, Google Glasses, Haight Ashbury, hive mind, income inequality, intangible asset, Jeff Bezos, job automation, John Markoff, Kevin Kelly, knowledge economy, Law of Accelerating Returns, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, means of production, move fast and break things, new economy, New Journalism, Norbert Wiener, offshore financial centre, PageRank, Peace of Westphalia, Peter Thiel, planetary scale, Ray Kurzweil, self-driving car, Silicon Valley, Singularitarianism, software is eating the world, Steve Jobs, Steven Levy, Stewart Brand, strong AI, supply-chain management, the medium is the message, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas L Friedman, Thorstein Veblen, Upton Sinclair, Vernor Vinge, Whole Earth Catalog, yellow journalism
The essence of the algorithm is entirely uncomplicated: John MacCormick, Nine Algorithms That Changed the Future (Princeton University Press, 2012), 3–4. “We can stop looking for models. We can analyze the data without hypotheses”: Chris Anderson, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” Wired, June 23, 2008. Walmart’s algorithms found that people desperately buy strawberry Pop-Tarts: Constance L. Hays, “What Wal-Mart Knows About Customers’ Habits,” New York Times, November 14, 2004. Sweeney conducted a study that found that users with African American names: Latanya Sweeney, “Discrimination in Online Ad Delivery,” Communications of the ACM 56, no. 5 (May 2013): 44–54. Every product you use: Charlie Rose Show, November 7, 2011. a “personalized newspaper”: Alexandra Chang, “Liveblog: Facebook Reveals a ‘New Look for News Feed’,” Wired, March 7, 2013.
We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.” On one level, this is undeniable. Algorithms can translate languages without understanding words, simply by uncovering the patterns that undergird the construction of sentences. They can find coincidences that humans might never even think to seek. Walmart’s algorithms found that people desperately buy strawberry Pop-Tarts as they prepare for massive storms. Still, even as an algorithm mindlessly implements its procedures—and even as it learns to see new patterns in the data—it reflects the minds of its creators, the motives of its trainers. Both Amazon and Netflix use algorithms to make recommendations about books and films. (One-third of purchases on Amazon come from these recommendations.)
The year Facebook went public, it recorded $1.1 billion in American profits, but didn’t pay a cent of federal or state income tax. Indeed, it earned a $429 million refund. According to Citizens for Tax Justice, Facebook bilked the treasury by taking a single deduction: It wrote off the stock options it gave to its executives. It’s hard to have sympathy with Walmart or Home Depot or the other big-box stores. They hardly pay the largest tax rates in the nation. Still, they cough up a reasonable sum. Over the last decade, Walmart, the supposed Beast of Bentonville, handed over about 30 percent of its income in taxes; Home Depot paid 38 percent. We can bemoan the fact that they don’t pay more, yet it seems reasonable to note that their prime competitor isn’t paying even half that rate. Amazon averaged an effective tax rate of 13 percent—that includes taxes owed to states and localities, as well as the feds and foreign governments.
Trees on Mars: Our Obsession With the Future by Hal Niedzviecki
"Robert Solow", Ada Lovelace, agricultural Revolution, Airbnb, Albert Einstein, anti-communist, big data - Walmart - Pop Tarts, big-box store, business intelligence, Colonization of Mars, computer age, crowdsourcing, David Brooks, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Flynn Effect, Google Glasses, hive mind, Howard Zinn, if you build it, they will come, income inequality, Internet of things, invention of movable type, Jaron Lanier, Jeff Bezos, job automation, John von Neumann, knowledge economy, Kodak vs Instagram, life extension, Lyft, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Peter H. Diamandis: Planetary Resources, Peter Thiel, Pierre-Simon Laplace, Ponzi scheme, precariat, prediction markets, Ralph Nader, randomized controlled trial, Ray Kurzweil, ride hailing / ride sharing, rising living standards, Ronald Reagan, self-driving car, shareholder value, sharing economy, Silicon Valley, Silicon Valley startup, Skype, Steve Jobs, TaskRabbit, technological singularity, technoutopianism, Ted Kaczynski, Thomas L Friedman, Uber and Lyft, uber lyft, working poor
The virtual reality we create out of millions of points of data can be analyzed, manipulated, and broken down the way the real world just can’t. From a data-crunching, owning-the-future perspective, converting the real into the virtual is becoming more and more desirable. Knowing what is going to happen is far more efficient than responding to events after the fact. And systematically controlling what is going to happen is most efficient of all.
In the first phase of big data and information technology, Walmart deduced that before big storms, sales of Pop-Tarts skyrocket. Google analyzed millions of search terms and correlated searches for products and symptoms related to flu with actual flu outbreaks coming in the days and weeks ahead, ultimately providing faster and more accurate information about what areas were going to be ravaged by flu than the Centers for Disease Control and Prevention were able to achieve at the time.
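The idea of correlating searches with outbreaks that arrive later can be illustrated with a lagged Pearson correlation. The sketch below is a simplified stand-in, not Google's actual procedure. The helper names `pearson` and `lagged_correlation` are hypothetical, and the weekly series are made up, constructed so that case counts trail search counts by exactly one week.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(searches, cases, lag):
    """Correlate searches in week t with reported cases in week t + lag."""
    if lag < 0 or lag >= len(searches):
        raise ValueError("lag must be between 0 and len(searches) - 1")
    if lag == 0:
        return pearson(searches, cases)
    return pearson(searches[:-lag], cases[lag:])


# Made-up weekly counts: after the first week, cases are twice the
# previous week's searches, so the signal leads by exactly one week.
weekly_searches = [5, 9, 20, 35, 30, 15, 8, 5]
weekly_cases = [4, 10, 18, 40, 70, 60, 30, 16]

best_lag = max(range(4), key=lambda k: lagged_correlation(weekly_searches, weekly_cases, k))
print("lag with the strongest correlation (weeks):", best_lag)
```

A strong lagged correlation of this kind shows only that one series tends to move ahead of the other. It is the "what" rather than the "why" that the surrounding excerpts discuss.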
We’ll watch it happen on the phone.”57 But maybe when we envision “progress” we should be picturing the cloistered aisles of the Walmart Supercenter where the real technological changes have affected billions of people—eliminating their jobs, supporting outsourcing to countries where it’s normal to have entire weeks when the smog is so thick you can’t see the sun, and all to give us shoppers slightly cheaper toasters and the Pop-Tarts to go with them. But, hey, things surely aren’t so very grim. For every takedown of Walmart—and really, what could be easier?—there’s a fabulous new business starting in the cloud that will eventually reverse the trend and make things good again. Everyone picks on Walmart. So let’s apply this same formula of information technology creating efficiencies that enhance productivity and profit but ultimately cost jobs and drop wages to a high-tech darling as far away from the Walmart aesthetic of buzzing fluorescent lights and glassy-eyed greeters as possible.
In fact, as the Big Data authors tell us, it was Walmart that revolutionized, or you might say disrupted, retailing. In the 1990s, they developed a new tracking system called Retail Link. This massive IT system enabled and required suppliers to monitor what was selling where at Walmart stores. It then became their problem to keep up. “Wal-Mart used data to become, in effect, the world’s largest consignment shop.”53 Walmart’s world domination came by introducing cutting-edge, disruptive IT to its systems ranging from stock to shipping to employee monitoring. The more efficient Walmart’s system became, the more it could best all competitors in terms of price. Information technology allows Walmart to manage a massive outsourcing operation that wouldn’t have been possible a generation ago. Thus, Walmart doesn’t make anything, it doesn’t own any of the warehouses it stores its goods in, it outsources as much of its logistics as possible.
Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost, Tom Fawcett
Albert Einstein, Amazon Mechanical Turk, big data - Walmart - Pop Tarts, bioinformatics, business process, call centre, chief data officer, Claude Shannon: information theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, data acquisition, David Brooks, en.wikipedia.org, Erik Brynjolfsson, Gini coefficient, information retrieval, intangible asset, iterative process, Johann Wolfgang von Goethe, Louis Pasteur, Menlo Park, Nate Silver, Netflix Prize, new economy, p-value, pattern recognition, placebo effect, price discrimination, recommendation engine, Ronald Coase, selection bias, Silicon Valley, Skype, speech recognition, Steve Jobs, supply-chain management, text mining, The Signal and the Noise by Nate Silver, Thomas Bayes, transaction costs, WikiLeaks
Understanding the process and the stages helps to structure our data-analytic thinking, and to make it more systematic and therefore less prone to errors and omissions. There is convincing evidence that data-driven decision-making and big data technologies substantially improve business performance. Data science supports data-driven decision-making—and sometimes conducts such decision-making automatically—and depends upon technologies for “big data” storage and engineering, but its principles are separate. The data science principles we discuss in this book also differ from, and are complementary to, other important technologies, such as statistical hypothesis testing and database querying (which have their own books and classes). The next chapter describes some of these differences in more detail. * * *  Of course! What goes better with strawberry Pop-Tarts than a nice cold beer?  Target was successful enough that this case raised ethical questions on the deployment of such techniques.
., Answering Business Questions with These Techniques B bag of words approach, Bag of Words bags, Bag of Words base rates, Class Probability Estimation and Logistic “Regression”, Holdout Data and Fitting Graphs, Problems with Unbalanced Classes baseline classifiers, Advantages and Disadvantages of Naive Bayes baseline methods, of data science, Summary Basie, Count, Example: Jazz Musicians Bayes rate, Bias, Variance, and Ensemble Methods Bayes, Thomas, Bayes’ Rule Bayesian methods, Bayes’ Rule, Summary Bayes’ Rule, Bayes’ Rule–A Model of Evidence “Lift” beer and lottery example, Example: Beer and Lottery Tickets–Example: Beer and Lottery Tickets Beethoven, Ludwig van, Example: Evidence Lifts from Facebook “Likes” beginning cross-validation, From Holdout Evaluation to Cross-Validation behavior description, From Business Problems to Data Mining Tasks Being John Malkovich (film), Data Reduction, Latent Information, and Movie Recommendation Bellkors Pragmatic Chaos (Netflix Challenge team), Data Reduction, Latent Information, and Movie Recommendation benefit improvement, calculating, Costs and benefits benefits and underlying profit calculation, ROC Graphs and Curves data-driven decision-making, Data Science, Engineering, and Data-Driven Decision Making estimating, Costs and benefits in budgeting, Ranking Instead of Classifying nearest-neighbor methods, Computational efficiency bi-grams, N-gram Sequences bias errors, ensemble methods and, Bias, Variance, and Ensemble Methods–Bias, Variance, and Ensemble Methods Big Data data science and, Data Processing and “Big Data”–Data Processing and “Big Data” evolution of, From Big Data 1.0 to Big Data 2.0–From Big Data 1.0 to Big Data 2.0 on Amazon and Google, Thinking Data-Analytically, Redux big data technologies, Data Processing and “Big Data” state of, From Big Data 1.0 to Big Data 2.0 utilizing, Data Processing and “Big Data” Big Red proposal example, Example Data Mining Proposal–Flaws in the Big Red Proposal Bing, Why Text 
Is Important, Representation Black-Scholes model, Models, Induction, and Prediction blog postings, Why Text Is Important blog posts, Example: Targeting Online Consumers With Advertisements Borders (book retailer), Achieving Competitive Advantage with Data Science breast cancer example, Example: Logistic Regression versus Tree Induction–Example: Logistic Regression versus Tree Induction Brooks, David, What Data Can’t Do: Humans in the Loop, Revisited browser cookies, Example: Targeting Online Consumers With Advertisements Brubeck, Dave, Example: Jazz Musicians Bruichladdich single malt scotch, Understanding the Results of Clustering Brynjolfsson, Erik, Data Science, Engineering, and Data-Driven Decision Making, Data Processing and “Big Data” budget, Ranking Instead of Classifying budget constraints, Profit Curves building modeling labs, From Holdout Evaluation to Cross-Validation building models, Data Mining and Its Results, Business Understanding, From Holdout Evaluation to Cross-Validation Bunnahabhain single malt whiskey, Example: Whiskey Analytics, Hierarchical Clustering business news stories example, Example: Clustering Business News Stories–The news story clusters business problems changing definition of, to fit available data, Changing the Way We Think about Solutions to Business Problems–Changing the Way We Think about Solutions to Business Problems data exploration vs., Stepping Back: Solving a Business Problem Versus Data Exploration–Stepping Back: Solving a Business Problem Versus Data Exploration engineering problems vs., Other Data Science Tasks and Techniques evaluating in a proposal, Be Ready to Evaluate Proposals for Data Science Projects expected value framework, structuring with, The Expected Value Framework: Structuring a More Complicated Business Problem–The Expected Value Framework: Structuring a More Complicated Business Problem exploratory data mining vs., The Fundamental Concepts of Data Science unique context of, What Data Can’t Do: Humans 
in the Loop, Revisited using expected values to provide framework for, The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces–The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces business strategy, Data Science and Business Strategy–A Firm’s Data Science Maturity accepting creative ideas, Be Ready to Accept Creative Ideas from Any Source case studies, examining, Examine Data Science Case Studies competitive advantages, Achieving Competitive Advantage with Data Science–Achieving Competitive Advantage with Data Science, Sustaining Competitive Advantage with Data Science–Superior Data Science Management data scientists, evaluating, Superior Data Scientists–Superior Data Scientists evaluating proposals, Be Ready to Evaluate Proposals for Data Science Projects–Flaws in the Big Red Proposal historical advantages and, Formidable Historical Advantage intangible collateral assets and, Unique Intangible Collateral Assets intellectual property and, Unique Intellectual Property managing data scientists effectively, Superior Data Science Management–Superior Data Science Management maturity of the data science, A Firm’s Data Science Maturity–A Firm’s Data Science Maturity thinking data-analytically for, Thinking Data-Analytically, Redux–Thinking Data-Analytically, Redux C Caesars Entertainment, Data and Data Science Capability as a Strategic Asset call center example, Profiling: Finding Typical Behavior–Profiling: Finding Typical Behavior Capability Maturity Model, A Firm’s Data Science Maturity Capital One, Data and Data Science Capability as a Strategic Asset, From an Expected Value Decomposition to a Data Science Solution Case-Based Reasoning, How Many Neighbors and How Much Influence?
A separate study, conducted by economist Prasanna Tambe of NYU’s Stern School, examined the extent to which big data technologies seem to help firms (Tambe, 2012). He finds that, after controlling for various possible confounding factors, using big data technologies is associated with significant additional productivity growth. Specifically, one standard deviation higher utilization of big data technologies is associated with 1%–3% higher productivity than the average firm; one standard deviation lower in terms of big data utilization is associated with 1%–3% lower productivity. This leads to potentially very large productivity differences between the firms at the extremes. From Big Data 1.0 to Big Data 2.0 One way to think about the state of big data technologies is to draw an analogy with the business adoption of Internet technologies.
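As a rough illustration of why the gap between firms at the extremes can be large, the sketch below scales the quoted 1%–3% association by the number of standard deviations separating two firms. The linear scaling, the function name `productivity_gap`, and the choice of 2 and 4 standard deviations are assumptions made here for illustration only; they do not come from Tambe (2012), and the figures describe an association, not a causal effect.

```python
def productivity_gap(sd_apart: float, effect_per_sd: float) -> float:
    """Return the productivity gap (as a fraction) between two firms whose
    big data utilization differs by `sd_apart` standard deviations.

    Assumes the association scales linearly with utilization, which is an
    assumption of this sketch and not a claim from the study.
    """
    return sd_apart * effect_per_sd


# Range quoted in the text: 1% to 3% per standard deviation.
LOW_EFFECT, HIGH_EFFECT = 0.01, 0.03

for sd_apart in (2, 4):  # hypothetical distances between two firms
    low = productivity_gap(sd_apart, LOW_EFFECT)
    high = productivity_gap(sd_apart, HIGH_EFFECT)
    print(f"{sd_apart} SD apart: {low:.0%} to {high:.0%} productivity gap")
```

Under these assumptions, a firm one standard deviation above average and a firm one standard deviation below it are 2 standard deviations apart, which corresponds to a gap of roughly 2% to 6%; firms further toward the extremes would differ by proportionally more.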
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O'Neil
Affordable Care Act / Obamacare, Bernie Madoff, big data - Walmart - Pop Tarts, call centre, carried interest, cloud computing, collateralized debt obligation, correlation does not imply causation, Credit Default Swap, credit default swaps / collateralized debt obligations, crowdsourcing, Emanuel Derman, housing crisis, I will remember that I didn’t make the world, and it doesn’t satisfy my equations, illegal immigration, Internet of things, late fees, mass incarceration, medical bankruptcy, Moneyball by Michael Lewis explains big data, new economy, obamacare, Occupy movement, offshore financial centre, payday loans, peer-to-peer lending, Peter Thiel, Ponzi scheme, prediction markets, price discrimination, quantitative hedge fund, Ralph Nader, RAND corporation, recommendation engine, Rubik’s Cube, Sharpe ratio, statistical model, Tim Cook: Apple, too big to fail, Unsafe at Any Speed, Upton Sinclair, Watson beat the top human players on Jeopardy!, working poor
American Express learned this the hard way: Ron Lieber, “American Express Kept a (Very) Watchful Eye on Charges,” New York Times, January 30, 2009, www.nytimes.com/2009/01/31/your-money/credit-and-debit-cards/31money.html. Douglas Merrill’s idea: Steve Lohr, “Big Data Underwriting for Payday Loans,” New York Times, January 19, 2015, http://bits.blogs.nytimes.com/2015/01/19/big-data-underwriting-for-payday-loans/. On the company web page: Website ZestFinance.com, accessed January 9, 2016, www.zestfinance.com/. A typical $500 loan: Lohr, “Big Data Underwriting.” ten thousand data points: Michael Carney, “Flush with $20M from Peter Thiel, ZestFinance Is Measuring Credit Risk Through Non-traditional Big Data,” Pando, July 31, 2013, https://pando.com/2013/07/31/flush-with-20m-from-peter-thiel-zestfinance-is-measuring-credit-risk-through-non-traditional-big-data/. one of the first peer-to-peer exchanges, Lending Club: Richard MacManus, “Facebook App, Lending Club, Passes Half a Million Dollars in Loans,” Readwrite, July 29, 2007, http://readwrite.com/2007/07/29/facebook_app_lending_club_passes_half_a_million_in_loans.
I’ve got loads of memories of people grabbing seconds of asparagus or avoiding the string beans. But they’re all mixed up and hard to formalize in a comprehensive list. The better solution would be to train the model over time, entering data every day on what I’d bought and cooked and noting the responses of each family member. I would also include parameters, or constraints. I might limit the fruits and vegetables to what’s in season and dole out a certain amount of Pop-Tarts, but only enough to forestall an open rebellion. I also would add a number of rules. This one likes meat, this one likes bread and pasta, this one drinks lots of milk and insists on spreading Nutella on everything in sight. If I made this work a major priority, over many months I might come up with a very good model. I would have turned the food management I keep in my head, my informal internal model, into a formal external one.
., schools, to return to that example, evaluates teachers largely on the basis of students’ test scores, while ignoring how much the teachers engage the students, work on specific skills, deal with classroom management, or help students with personal and family problems. It’s overly simple, sacrificing accuracy and insight for efficiency. Yet from the administrators’ perspective it provides an effective tool to ferret out hundreds of apparently underperforming teachers, even at the risk of misreading some of them. Here we see that models, despite their reputation for impartiality, reflect goals and ideology. When I removed the possibility of eating Pop-Tarts at every meal, I was imposing my ideology on the meals model. It’s something we do without a second thought. Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics. Whether or not a model works is also a matter of opinion. After all, a key component of every model, whether formal or informal, is its definition of success.
Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris
always be closing, big data - Walmart - Pop Tarts, business intelligence, business process, call centre, commoditize, data acquisition, digital map, en.wikipedia.org, global supply chain, high net worth, if you build it, they will come, intangible asset, inventory management, iterative process, Jeff Bezos, job satisfaction, knapsack problem, late fees, linear programming, Moneyball by Michael Lewis explains big data, Netflix Prize, new economy, performance metric, personalized medicine, quantitative hedge fund, quantitative trading / quantitative finance, recommendation engine, RFID, search inside the book, shareholder value, six sigma, statistical model, supply-chain management, text mining, the scientific method, traveling salesman, yield management
So-called advanced planning and scheduling approaches also recognize material constraints in terms of current inventory and planned deliveries or allocations. As Wal-Mart’s data warehouse introduced additional information about customer behavior, applications using Wal-Mart’s massive database began to extend well beyond their supply chain. Wal-Mart now collects more data about more consumers than anyone in the private sector. Wal-Mart marketers mine this data to ensure that customers have the products they want, when they want them, and at the right price. For example, they’ve learned that before a hurricane, consumers stock up on food items that don’t require cooking or refrigeration. The top seller: Strawberry Pop Tarts. We expect that Wal-Mart asks Kellogg to rush shipments of them to stores just before a hurricane hits. In short, there are many analytical applications behind Wal-Mart’s success as the world’s largest retailer. Wal-Mart may be the world’s largest retailer, but at least it knows where all its stores are located.
In other cases, companies are managing logistics for their customers (refer to “Typical Analytical Applications in Supply Chains” box on the next page). Connecting Customers and Suppliers The mother of all supply chain analytics competitors is Wal-Mart. The company collects massive amounts of sales and inventory data (583 terabytes as of April 2006) into a single integrated technology platform. Its managers routinely analyze manifold aspects of its supply chain, and store managers use analytical tools to optimize product assortment; they examine not only detailed sales data but also qualitative factors such as the opportunity to tailor assortments to local community needs.18 The most distinctive element of Wal-Mart’s supply chain data is its availability to suppliers. Wal-Mart buys products from more than 17,400 suppliers in eighty countries, and each one uses the company’s Retail Link system to track the movement of its products—in fact, the system’s use is mandatory.
At Netflix, the most strategic application may be predicting customer movie preferences, but the company also employs testing and detailed analysis in its supply chain and its advertising. Harrah’s started in loyalty and service but also does detailed analyses of its slot machine pricing and placement, the design of its Web site, and many other issues in its business. Wal-Mart, Progressive Insurance, and the hospital supply distributor Owens & Minor are all examples of firms that started with an internal analytical focus but have broadened it externally—to suppliers in the case of Wal-Mart and to customers for the other two firms. Analytical competitors need a primary focus for their analytical activity, but once an analytical, test-and-learn culture has been created, it’s impossible to stop it from spreading. An Enterprise-Level Approach to and Management of Analytics Companies and organizations that compete analytically don’t entrust analytical activities just to one group within the company or to a collection of disparate employees across the organization.
The Network Imperative: How to Survive and Grow in the Age of Digital Business Models by Barry Libert, Megan Beck
active measures, Airbnb, Amazon Web Services, asset allocation, autonomous vehicles, big data - Walmart - Pop Tarts, business intelligence, call centre, Clayton Christensen, cloud computing, commoditize, crowdsourcing, disintermediation, diversification, Douglas Engelbart, future of work, Google Glasses, Google X / Alphabet X, Infrastructure as a Service, intangible asset, Internet of things, invention of writing, inventory management, iterative process, Jeff Bezos, job satisfaction, Kevin Kelly, Kickstarter, late fees, Lyft, Mark Zuckerberg, Oculus Rift, pirate software, ride hailing / ride sharing, self-driving car, sharing economy, Silicon Valley, Silicon Valley startup, six sigma, software as a service, software patent, Steve Jobs, subscription business, TaskRabbit, Travis Kalanick, uber lyft, Wall-E, women in the workforce, Zipcar
Most major retailers develop relationships with bloggers and sponsor posts that advertise their goods. They often maintain a significant presence on all major social media platforms (Starbucks has more than a million followers on Instagram—pretty good for a coffee company), and they use big data analytics to learn about and better serve their customers. You might think that the whole world is moving online and to the digital network and that brick-and-mortar is going the way of the dinosaurs, but some traditional retailers have found that their physical assets can be used to complement their emerging technology and network business models. Macy’s and Walmart, along with several others, have become masters of omnichannel strategies. Their customers can shop at home, in stores, or even on their phones and receive the product through delivery or in-store pickup—whichever is most convenient.
Those that fail to meet worker needs will see their best and brightest head off in search of their next great role. PRINCIPLE 8 MEASUREMENT From Accounting to Big Data Not everything that counts can be counted, and not everything that can be counted counts. —William Bruce Cameron, sociologist YOU MIGHT NOT EXPECT a chain of barbecue joints with eleven people on its information technology staff to be an innovator in the use of big data. If so, you’re in for a surprise. Big data isn’t just for the Googles, Apples, Amazons, and Facebooks. Using big data doesn’t have to be a complicated, resource-heavy, yearlong endeavor. Big data, for our purposes, is nothing more than large sets of information that can be analyzed to understand useful patterns, often, but not always, related to human behavior.
The odds of an ongoing relationship are greatly improved, however, if Caesars presents that customer with a free meal coupon or some other token while he is still in the casino. A real-time, integrated, big data system allows Caesars to take advantage of these opportunities. These three factors—measuring all assets, looking outward, and using real-time data—create great competitive advantage. As Laura Dickey said, it’s just not reasonable to do business without big data. Principle 8, Measurement: From Accounting to Big Data The eighth principle is to shift from basic accounting data, focused on the physical and having significant time delays, to big data analytics—including intangible, external assets and real-time analysis. On the left side of the measurement spectrum are organizations that count up their property, plant, and equipment, tally them in spreadsheets, e-mail them to finance, and report once a month.
The Formula: How Algorithms Solve All Our Problems-And Create More by Luke Dormehl
3D printing, algorithmic trading, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, big data - Walmart - Pop Tarts, call centre, Cass Sunstein, Clayton Christensen, commoditize, computer age, death of newspapers, deferred acceptance, disruptive innovation, Edward Lorenz: Chaos theory, Erik Brynjolfsson, Filter Bubble, Flash crash, Florence Nightingale: pie chart, Frank Levy and Richard Murnane: The New Division of Labor, Google Earth, Google Glasses, High speed trading, Internet Archive, Isaac Newton, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Kevin Kelly, Kodak vs Instagram, lifelogging, Marshall McLuhan, means of production, Nate Silver, natural language processing, Netflix Prize, Panopticon Jeremy Bentham, pattern recognition, price discrimination, recommendation engine, Richard Thaler, Rosa Parks, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, Slavoj Žižek, social graph, speech recognition, Steve Jobs, Steven Levy, Steven Pinker, Stewart Brand, the scientific method, The Signal and the Noise by Nate Silver, upwardly mobile, Wall-E, Watson beat the top human players on Jeopardy!, Y Combinator
CHAPTER 3 Do Algorithms Dream of Electric Laws? A decade ago, Walmart stumbled upon an oddball piece of information while using its data-mining algorithms to comb through the mountains of information generated by its 245 million weekly customers. What it discovered was that, alongside the expected emergency supplies of duct tape, beer and bottled water, no product saw more of an increase in demand during severe weather warnings than strawberry Pop-Tarts. To test this insight, when news broke about the impending Hurricane Frances in 2004, Walmart bosses ordered trucks stocked with the Kellogg’s snack to be delivered to all its stores in the hurricane’s path. When these sold out just as quickly, Walmart bosses knew that they had gained a valuable glimpse into both consumer habits and the power of The Formula.1 Walmart executives weren’t alone in seeing the value of this discovery.
Elias, Norbert. The Civilizing Process (New York: Urizen Books, 1978). 13 This wave metaphor was not, in itself, new: the German sociologist Norbert Elias had referred to “a wave of advancing integration over several centuries” in his book The Civilizing Process, as had other writers over the previous century. 14 Richtel, Matt. “How Big Data Is Playing Recruiter for Specialized Workers.” New York Times, April 27, 2013. nytimes.com/2013/04/28/technology/how-big-data-is-playing-recruiter-for-specialized-workers.html?_r=0. 15 Kwoh, Leslie. “Facebook Profiles Found to Predict Job Performance.” Wall Street Journal, February 21, 2012. online.wsj.com/news/articles/SB10001424052970204909104577235474086304212. 16 Bulmer, Michael. Francis Galton: Pioneer of Heredity and Biometry (Baltimore: Johns Hopkins University Press, 2003). 17 Pearson, Karl.
When these sold out just as quickly, Walmart bosses knew that they had gained a valuable glimpse into both consumer habits and the power of The Formula.1 Walmart executives weren’t alone in seeing the value of this discovery. At the time, psychologist Colleen McCue and Los Angeles police chief Charlie Beck were collaborating on a paper for the law-enforcement magazine The Police Chief. They too seized upon Walmart’s revelation as a way of reimagining police work in a form that would be more predictive and less reactive. Entitled “Predictive Policing: What Can We Learn from Walmart and Amazon about Fighting Crime in a Recession?,” their 2009 paper immediately captured the imagination of law-enforcement professionals around the country when it was published.2 What McCue and Beck meant by “predictive policing” was that, thanks to advances in computing, crime data could now be gathered and analyzed in near-real time—and subsequently used to anticipate, prevent and respond more effectively to those crimes that would take place in the future.
The Internet of Us: Knowing More and Understanding Less in the Age of Big Data by Michael P. Lynch
Affordable Care Act / Obamacare, Amazon Mechanical Turk, big data - Walmart - Pop Tarts, bitcoin, Cass Sunstein, Claude Shannon: information theory, crowdsourcing, Edward Snowden, Firefox, Google Glasses, hive mind, income inequality, Internet of things, John von Neumann, meta analysis, meta-analysis, Nate Silver, new economy, Panopticon Jeremy Bentham, patient HM, prediction markets, RFID, sharing economy, Steve Jobs, Steven Levy, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, WikiLeaks
Similarly, Google Flu Trends doesn’t care why people are searching as they do; it just correlates the data. And Walmart doesn’t care why people buy more Pop-Tarts before a hurricane, nor do insurance companies care why certain credit scores correlate with certain medication adherences; they care only that they do. As Viktor Mayer-Schönberger and Kenneth Cukier put it, “predictions based on correlations lie at the heart of big data. Correlation analyses are now used so frequently that we sometimes fail to appreciate the inroads they have made. And the uses will only increase.” 4 Does the use of big data in this way, however, really signal the end of theory, as Anderson alleged? The answer is no. And, as we’ll see, that is a very good thing. Start with Rudder and Anderson’s remarks. As Rudder puts it, big data seems to allow us to investigate by direct inspection.
As a consequence of the increasing importance of data analytics, we might employ “big data” in a third sense—to refer to firms like Google or Amazon that utilize data analytics as an essential part of their business model, and government agencies like the NSA that use these techniques as an essential part of, well, their business model. In this third sense, Big Data is like Big Oil. Large oil conglomerates are powerful because they control how the world’s major energy resource is not only distributed but how it is extracted. The tech giants are similar. Energy is not information, but both are resources, and resources by which the world runs. And Big Data, like Big Oil, is big precisely because it can control access to data as well as the extraction of information and knowledge from that data. Big Data refines data for information and knowledge, and we need to pay attention to that fact because knowledge, like energy, is not just a passive, inert resource.
Search as I just did for “Web 3.0 and …” and Google will suggest “big data” and “education”; search for “knowledge and …” and you might get “power” and “information systems.” Complete is a familiar, if rather gentle, form of big data analysis. It works because Google knows not only what much of the world is searching for on the Web, but also what you’ve been searching for. That data is useless without Google’s proprietary analytic tools for transforming the numbers and words into a predictive search. These predictions aren’t perfect. But they are amazingly good, and getting better all the time. Google has done more than perhaps any other single high-profile company or entity to usher in the brave new world of big data. As I noted in the first chapter, the term “big data” can refer to three different things. The first is the ever-expanding volume of data being collected by our digital devices.
Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech by Sara Wachter-Boettcher
Airbnb, airport security, AltaVista, big data - Walmart - Pop Tarts, Donald Trump, Ferguson, Missouri, Firefox, Grace Hopper, job automation, Kickstarter, lifelogging, Mark Zuckerberg, Menlo Park, move fast and break things, natural language processing, pattern recognition, Peter Thiel, recommendation engine, ride hailing / ride sharing, self-driving car, Silicon Valley, Silicon Valley startup, Snapchat, Steve Jobs, Tim Cook: Apple, Travis Kalanick, upwardly mobile, women in the workforce, zero-sum game
Writer Jesse Barron calls this “cuteness applied in the service of power-concealment”:15 an effort, on the part of tech companies, to make you feel safe and comfortable using their products, while they quietly hold the upper hand. According to Barron, tech products do this by employing “caretaker speech”—the linguistics term used to describe the way we talk to children. For example, when Seamless, a popular food delivery app, sends cutesy emails about the status of his order, he writes, “I picture a cool babysitter, Skylar, with his jean vest, telling me as he microwaves a pop-tart that ‘deliciousness is in the works,’ his tone just grazing the surface of mockery.” 16 But no matter how cool the babysitter—no matter how far past bedtime Skylar lets us stay up—at the end of the evening we’re still kids under someone else’s control. The result is an environment where we start to accept that the tech products we use, and the companies behind them, know best—and we’re just along for the ride.
Maggie Delano, “I Tried Tracking My Period and It Was Even Worse Than I Could Have Imagined,” Medium, February 23, 2015, https://medium.com/@maggied/i-tried-tracking-my-period-and-it-was-even-worse-than-i-could-have-imagined-bb46f869f45. 2. Glow, “About Glow,” Wayback Machine, September 21, 2013, https://web.archive.org/web/20130921143302/https://www.glowing.com/about. 3. Kia Kokalitcheva, “Glow Brings in $17M in New Funding, Puts Big Data to Work for Women’s Health,” VentureBeat, October 2, 2014, http://venturebeat.com/2014/10/02/glow-brings-in-17m-in-new-funding-as-puts-big-data-to-task-with-fertility-challenges. 4. Glow, “About Glow,” Wayback Machine, March 27, 2014, https://web.archive.org/web/20140327011628/https://glowing.com/about. 5. Erin Abler, Twitter post, January 31, 2017 (6:12 p.m.), https://twitter.com/erinabler/status/826614200114016256. 6. Michael M. Grynbaum, “New York’s Cabbies Like Credit Cards?
Five seconds in, I’m already trying to ignore the app’s assumptions that pregnancy is why I want to track my period. The app also assumes that I’m sexually active with someone who can get me pregnant.1 The first screen in Glow’s onboarding process. What if none of these options apply to you? Delano’s experience with Glow might have made sense back in 2013, when Glow launched with the mission of using big data “to help get you pregnant.” 2 But in 2014, the founders realized that about half of Glow’s users were actually using the app to avoid getting pregnant.3 So, with $17 million in new funding in hand, the team set out to transform Glow from a narrow, fertility-focused experience to a product that could serve all women—including, it would seem, women like Delano. “We live in a time when people are tracking everything about their bodies . . . yet it’s still uncomfortable to talk about your reproductive health, whether you’re trying to get pregnant or just wondering how ‘normal’ your period is,” the company website stated.
Platform Revolution: How Networked Markets Are Transforming the Economy--And How to Make Them Work for You by Sangeet Paul Choudary, Marshall W. van Alstyne, Geoffrey G. Parker
3D printing, Affordable Care Act / Obamacare, Airbnb, Alvin Roth, Amazon Mechanical Turk, Amazon Web Services, Andrei Shleifer, Apple's 1984 Super Bowl advert, autonomous vehicles, barriers to entry, big data - Walmart - Pop Tarts, bitcoin, blockchain, business cycle, business process, buy low sell high, chief data officer, Chuck Templeton: OpenTable:, clean water, cloud computing, connected car, corporate governance, crowdsourcing, data acquisition, data is the new oil, digital map, discounted cash flows, disintermediation, Edward Glaeser, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, financial innovation, Haber-Bosch Process, High speed trading, information asymmetry, Internet of things, inventory management, invisible hand, Jean Tirole, Jeff Bezos, jimmy wales, John Markoff, Khan Academy, Kickstarter, Lean Startup, Lyft, Marc Andreessen, market design, Metcalfe’s law, multi-sided market, Network effects, new economy, payday loans, peer-to-peer lending, Peter Thiel, pets.com, pre–internet, price mechanism, recommendation engine, RFID, Richard Stallman, ride hailing / ride sharing, Robert Metcalfe, Ronald Coase, Satoshi Nakamoto, self-driving car, shareholder value, sharing economy, side project, Silicon Valley, Skype, smart contracts, smart grid, Snapchat, software is eating the world, Steve Jobs, TaskRabbit, The Chicago School, the payments system, Tim Cook: Apple, transaction costs, Travis Kalanick, two-sided market, Uber and Lyft, Uber for X, uber lyft, winner-take-all economy, zero-sum game, Zipcar
It referred to an unusual housing option for professionals who planned to attend the upcoming joint convention of two industrial design organizations, the International Congress of Societies of Industrial Design (ICSID) and the Industrial Designers Society of America (IDSA): If you’re heading out to the ICSID/IDSA World Congress/Connecting ’07 event in San Francisco next week and have yet to make accommodations, well, consider networking in your jam-jams. That’s right. For “an affordable alternative to hotels in the city,” imagine yourself in a fellow design industry person’s home, fresh awake from a snooze on the ol’ air mattress, chatting about the day’s upcoming events over Pop Tarts and OJ. The hosts for this “networking in your jam-jams” opportunity were Brian Chesky and Joe Gebbia, budding designers who’d moved to San Francisco only to find they couldn’t afford the rent on the loft they shared. Strapped for cash, they impulsively decided to make air mattresses and their own services as part-time tour guides available to convention attendees. Chesky and Gebbia attracted three weekend guests and made a thousand bucks, which covered the next month’s rent.
He lists eight markets with the potential to generate new multi-billion-dollar industries based on smart connections among industrial devices:
• Security: using platform-based networks to protect industrial assets from attacks
• Network: designing, building, and servicing the networks that will link and control industrial tools
• Connected services: developing software and systems to manage the new networks
• Product as a service: transitioning industrial companies from selling machines and tools to selling services facilitated by platform connections
• Payments: implementing new ways to create and capture value from industrial equipment
• Retrofits: equipping the $6.8 trillion worth of existing industrial machinery in the U.S. to participate in the new industrial Internet
• Translation: teaching a wide array of devices and software systems to share data and communicate with one another
• Vertical applications: finding ways to connect industrial tools at various places in the value chain to solve specific problems
In total, Mount concludes (drawing on data from a World Economic Forum report) that the Industrial Awakening will generate $14.2 trillion of global output by 2030.13 Economist Jeremy Rifkin has deftly summarized this development, as well as some of its broader implications: There are now 11 billion sensors connecting devices to the internet of things. By 2030, 100 trillion sensors will be [in place] … continually sending big data to the communications, energy and logistics internets. Anyone will be able to access the internet of things and use big data and analytics to develop predictive algorithms that can speed efficiency, dramatically increase productivity and lower the marginal cost of producing and distributing physical things, including energy, products and services, to near zero, just as we now do with information goods.14 We may not be on the verge of seeing the majority of physical goods priced at or even near to zero—not yet.
In response, over 2,000 extension developers signed up in the first twelve months. The power of APIs to attract extension developers and the value they can create is enormous. Compare the financial results experienced by two major retailers: traditional giant Walmart and online platform Amazon. Amazon has some thirty-three open APIs as well as over 300 API “mashups” (i.e., combination tools that span two or more APIs), enabling e-commerce, cloud computing, messaging, search engine optimization, and payments. By contrast, Walmart has just one API, an e-commerce tool.14 Partly as a result of this difference, Amazon’s stock market capitalization exceeded that of Walmart for the first time in June 2015, reflecting Wall Street’s bullish view of Amazon’s future growth prospects.15 Other platform businesses have reaped similar benefits from their APIs. Cloud computing and computer services platform Salesforce generates 50 percent of its revenues through APIs, while for travel platform Expedia, the figure is 90 percent.16 The third category of developers who add value to the interactions on a platform are data aggregators.
The Age of Entitlement: America Since the Sixties by Christopher Caldwell
1960s counterculture, affirmative action, Affordable Care Act / Obamacare, anti-communist, Bernie Sanders, big data - Walmart - Pop Tarts, blue-collar work, Cass Sunstein, choice architecture, computer age, crack epidemic, crony capitalism, Daniel Kahneman / Amos Tversky, David Attenborough, desegregation, disintermediation, disruptive innovation, Edward Snowden, Erik Brynjolfsson, Ferguson, Missouri, financial deregulation, financial innovation, Firefox, full employment, George Gilder, global value chain, Home mortgage interest deduction, illegal immigration, immigration reform, informal economy, Jeff Bezos, John Markoff, Kevin Kelly, libertarian paternalism, Mark Zuckerberg, Martin Wolf, mass immigration, mass incarceration, mortgage tax deduction, Nate Silver, new economy, Norman Mailer, post-industrial society, pre–internet, profit motive, reserve currency, Richard Thaler, Robert Bork, Robert Gordon, Robert Metcalfe, Ronald Reagan, Rosa Parks, Silicon Valley, Skype, South China Sea, Steve Jobs, Thomas Kuhn: the structure of scientific revolutions, Thomas L Friedman, too big to fail, transatlantic slave trade, transcontinental railway, War on Poverty, Whole Earth Catalog, zero-sum game
Internet businesses were favored, and even subsidized, in ways that would remove middlemen from every walk of life, most conspicuously from retail. Now that people interacted with almost everything through a computer, their tiniest velleities could be tabulated. As Google, Facebook, and Amazon served customers, they were also harvesting and correlating information on them. Around 2010, this process came to be called Big Data. It was, at first, an entertaining curiosity. Walmart discovered through its algorithms that, when storms are coming, people buy more strawberry Pop-Tarts. Target could identify pregnant women from their tendency to buy unscented lotion in the third month of a pregnancy and then mineral supplements a few weeks thereafter. Marketers and advertisers now felt they held in their hands the same kind of esoteric, all-explaining truth that Alfred Kinsey’s Sexual Behavior in the Human Male had provided enthusiasts of sex in the late 1940s—a truth that is indifferent to what you say your morals or opinions are.
Original interview at “new NASA goal = Muslim outreach—Bolden,” YouTube, July 6, 2010. “Either we must allow”: Bertrand Russell, The Impact of Science on Society (London: George Allen & Unwin, 1952), 95. In hundreds of cities: Jeffrey Weiss, “Lunch Rush,” Dallas Morning News, October 12, 1993. Walmart discovered: Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (New York: Houghton Mifflin Harcourt, 2013), 54. Target could identify: Ibid., 58. “Society will need”: Mayer-Schönberger and Cukier, Big Data, 7. Google claimed to predict: Ibid., 15. SWIFT: Ibid. When pundits sought new ways: Fareed Zakaria, “Sanctions Russia Will Respect,” Washington Post, February 13, 2015. In their communication with investors: John Cassidy, “How Eliot Spitzer Humbled Wall Street,” New Yorker, April 7, 2003, 54–73.
God Is Not Great: How Religion Poisons Everything, a broadside by the political journalist Christopher Hitchens, topped the New York Times bestseller list. Less well understood was that the internet approach to data, and to reality, undermined all types of thinking aimed at understanding systems from the outside—not just religion but also science, political ideology, and deductive reasoning. Big Data worked by correlation, not by logic. As the Oxford technology expert Viktor Mayer-Schönberger put it, “Society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what.” Big Data was a reassertion by powerful corporations of a right that had been stripped from other Americans: the right to stereotype. If you’re the sort of person who does x, you’re the sort of person who’ll like y. The information-gathering capacities of the new internet firms brought them into both collusion and competition with government.
Move Fast and Break Things: How Facebook, Google, and Amazon Cornered Culture and Undermined Democracy by Jonathan Taplin
1960s counterculture, affirmative action, Affordable Care Act / Obamacare, Airbnb, Amazon Mechanical Turk, American Legislative Exchange Council, Apple's 1984 Super Bowl advert, back-to-the-land, barriers to entry, basic income, battle of ideas, big data - Walmart - Pop Tarts, bitcoin, Brewster Kahle, Buckminster Fuller, Burning Man, Clayton Christensen, commoditize, creative destruction, crony capitalism, crowdsourcing, data is the new oil, David Brooks, David Graeber, don't be evil, Donald Trump, Douglas Engelbart, Dynabook, Edward Snowden, Elon Musk, equal pay for equal work, Erik Brynjolfsson, future of journalism, future of work, George Akerlof, George Gilder, Google bus, Hacker Ethic, Howard Rheingold, income inequality, informal economy, information asymmetry, information retrieval, Internet Archive, Internet of things, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Markoff, John Maynard Keynes: technological unemployment, John von Neumann, Joseph Schumpeter, Kevin Kelly, Kickstarter, labor-force participation, life extension, Marc Andreessen, Mark Zuckerberg, Menlo Park, Metcalfe’s law, Mother of all demos, move fast and break things, natural language processing, Network effects, new economy, Norbert Wiener, offshore financial centre, packet switching, Paul Graham, paypal mafia, Peter Thiel, plutocrats, pre–internet, Ray Kurzweil, recommendation engine, rent-seeking, revision control, Robert Bork, Robert Gordon, Robert Metcalfe, Ronald Reagan, Ross Ulbricht, Sam Altman, Sand Hill Road, secular stagnation, self-driving car, sharing economy, Silicon Valley, Silicon Valley ideology, smart grid, Snapchat, software is eating the world, Steve Jobs, Stewart Brand, technoutopianism, The Chicago School, The Market for Lemons, The Rise and Fall of American Growth, Tim Cook: Apple, trade route, transfer pricing, Travis Kalanick, trickle-down economics, Tyler Cowen: Great Stagnation, universal basic income, unpaid internship, We wanted flying cars, instead we got 140 characters, web application, Whole Earth Catalog, winner-take-all economy, women in the workforce, Y Combinator
And it’s ludicrous to believe that this stuff doesn’t alter our brains. It’s also equally ludicrous to believe that—at the very least—this mass distraction and manipulation is not convenient for the people who are in charge. People are starving. They may not know it because they’re being fed mass-produced garbage. The packaging is colorful and loud, but it’s produced in the same factories that make Pop-Tarts and iPads by people sitting around thinking, “What can we do to get people to buy more of these?” And they’re very good at their jobs. But that’s what it is you’re getting, because that’s what they’re making. They’re selling you something. And the world is built on this now. Politics and government are built on this; corporations are built on this. Interpersonal relationships are built on this.
The very rich, when they get to be 130 years old or more, would be so fearful of ordinary causes of death—a car accident, a plane crash, a terrorist bomb—that, having spent millions of dollars on immortality, they might be afraid to leave their mansions for fear of losing money on their investment. I would say it takes no big leap to guess that both Peter Thiel and Larry Page truly believe that technology can deliver happiness. In a new book, The Internet of Us: Knowing More and Understanding Less in the Age of Big Data, Michael Patrick Lynch starts with a thought experiment: “Imagine a society where smartphones are miniaturized and hooked directly into a person’s brain.” Google’s Larry Page is already working on this. Then Lynch takes us several generations into the future, where we have stopped learning by observation and reason and have become totally dependent on the Google Now chip in our brains. And then imagine some disaster disables the worldwide communications grid.
Schumacher, Small Is Beautiful: Economics as if People Mattered (New York: Harper, 1973). Yuval Levin, Fractured Republic: Renewing America’s Social Contract in the Age of Individualism (New York: Basic Books, 2016). Toni Morrison, Ta-Nehisi Coates, and Sonia Sanchez, “Art is Dangerous,” VOX, June 17, 2016, www.vox.com/2016/6/17/11955704/ta-nehisi-coates-toni-morrison-sonia-sanchez-in-conversation. Yuval Noah Harari, “Big Data, Google, and the End of Free Will,” Financial Times, August 26, 2016, www.ft.com/content/50bb4830-6a4c-11e6-ae5b-a7cc5dd5a28c.
The Future of the Professions: How Technology Will Transform the Work of Human Experts by Richard Susskind, Daniel Susskind
23andMe, 3D printing, additive manufacturing, AI winter, Albert Einstein, Amazon Mechanical Turk, Amazon Web Services, Andrew Keen, Atul Gawande, Automated Insights, autonomous vehicles, Big bang: deregulation of the City of London, big data - Walmart - Pop Tarts, Bill Joy: nanobots, business process, business process outsourcing, Cass Sunstein, Checklist Manifesto, Clapham omnibus, Clayton Christensen, clean water, cloud computing, commoditize, computer age, Computer Numeric Control, computer vision, conceptual framework, corporate governance, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, death of newspapers, disintermediation, Douglas Hofstadter, en.wikipedia.org, Erik Brynjolfsson, Filter Bubble, full employment, future of work, Google Glasses, Google X / Alphabet X, Hacker Ethic, industrial robot, informal economy, information retrieval, interchangeable parts, Internet of things, Isaac Newton, James Hargreaves, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Khan Academy, knowledge economy, lifelogging, lump of labour, Marshall McLuhan, Metcalfe’s law, Narrative Science, natural language processing, Network effects, optical character recognition, Paul Samuelson, personalized medicine, pre–internet, Ray Kurzweil, Richard Feynman, Second Machine Age, self-driving car, semantic web, Shoshana Zuboff, Skype, social web, speech recognition, spinning jenny, strong AI, supply-chain management, telepresence, The Future of Employment, the market place, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, transaction costs, Turing test, Watson beat the top human players on Jeopardy!, WikiLeaks, young professional
In relation to the latter, on one view, the ‘proportion of the world’s data that comes from such sensors is expected to increase from 11 percent in 2005 to 42 percent in 2020’.40 The upshot of all of this is that great volumes of data are now at large, and the broad aim of data scientists is to develop methods for collecting, analysing, and exploiting these data. Case studies of success in Big Data abound. One (not entirely uncontroversial) illustration is Google Flu Trends, a system that can identify outbreaks of flu earlier than was possible in the past, by identifying geographical clustering of users whose search requests are made up of similar symptoms. Another is provided by Walmart, which analysed the buying habits of its customers prior to hurricanes and found not just that flashlights were in greater demand but so too were Pop-Tarts; and this insight enabled them to stock up accordingly when the next storm came round. Natural language translation systems and self-driving cars are also said to operate on the back of Big Data techniques.41 While there are many ways in which Big Data is valuable,42 most specialists in the field would agree with Mayer-Schönberger and Cukier that, ‘[a]t its core, big data is about predictions … it’s about applying math to huge quantities of data in order to infer probabilities … these systems perform well because they are fed with lots of data on which to base their predictions’.43 More extravagantly, Eric Siegel, a computer scientist, goes further when he speaks of ‘computers automatically developing new knowledge and capabilities by furiously feeding on modern society’s greatest and most potent unnatural resource: data’.44 If we combine these views of Big Data, we can see its promise for the professions—as a way of making predictions and as a way of generating new knowledge.
Liddy, that the future of audit was ‘the capacity to examine 100 percent of a client’s transactions’.275 This ambition of ‘100 per-cent testing’—using all available data, and not just a representative sample—is a particular case of a more general ambition very much in vogue in statistics, as discussed by Viktor Mayer-Schönberger and Kenneth Cukier in their book Big Data. One of the general features of Big Data, the authors argue, is precisely this move from taking small samples of data to using all the data instead (as they put it, ‘from some to all’).276 The next step on from 100 per cent testing is a phenomenon referred to by auditors at the vanguard as ‘continuous auditing’. Combining ongoing review of transactions and traditional financial accounts with platforms that can draw on more varied data sources, the aim is real-time insight into a company’s financial health. Again, this is a reflection of a general ambition in Big Data—to use data derived from many different sources, in different formats, and with less formal structure (not, for example, data that are carefully presented in a spreadsheet).
Coffee, Gatekeepers: The Role of the Professions and Corporate Governance (2006), 15. 273 ‘The Dozy Watchdogs’, Economist, 13 Dec. 2014. 274 James Shanteau, ‘Cognitive Heuristics and Biases in Behavioral Auditing: Review, Comments, and Observations’, Accounting, Organizations, and Society, 14: 1 (1989), 165–77. 275 James P. Liddy, ‘The Future of Audit’, Forbes, 4 Aug. 2014 <http://www.forbes.com> (accessed 8 March 2015). 276 Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (2013), 26. 277 Mayer-Schönberger and Cukier, Big Data, 32. 278 Mayer-Schönberger and Cukier, Big Data, and James Surowiecki, ‘A Billion Prices Now’, New Yorker, 30 May 2011. 279 Michael Andersen, ‘Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment’, NiemanLab, 23 June 2009 <http://www.niemanlab.org> (accessed 8 March 2015). 280 <https://www.xbrl.org>. 281 For instance, ‘the long shadow of the gentleman architect still hangs over the profession’, in Dickon Robinson et al., ‘The Future for Architects?’
To Save Everything, Click Here: The Folly of Technological Solutionism by Evgeny Morozov
3D printing, algorithmic trading, Amazon Mechanical Turk, Andrew Keen, augmented reality, Automated Insights, Berlin Wall, big data - Walmart - Pop Tarts, Buckminster Fuller, call centre, carbon footprint, Cass Sunstein, choice architecture, citizen journalism, cloud computing, cognitive bias, creative destruction, crowdsourcing, data acquisition, Dava Sobel, disintermediation, East Village, en.wikipedia.org, Fall of the Berlin Wall, Filter Bubble, Firefox, Francis Fukuyama: the end of history, frictionless, future of journalism, game design, Gary Taubes, Google Glasses, illegal immigration, income inequality, invention of the printing press, Jane Jacobs, Jean Tirole, Jeff Bezos, jimmy wales, Julian Assange, Kevin Kelly, Kickstarter, license plate recognition, lifelogging, lone genius, Louis Pasteur, Mark Zuckerberg, market fundamentalism, Marshall McLuhan, moral panic, Narrative Science, Nelson Mandela, Nicholas Carr, packet switching, PageRank, Parag Khanna, Paul Graham, peer-to-peer, Peter Singer: altruism, Peter Thiel, pets.com, placebo effect, pre–internet, Ray Kurzweil, recommendation engine, Richard Thaler, Ronald Coase, Rosa Parks, self-driving car, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Skype, Slavoj Žižek, smart meter, social graph, social web, stakhanovite, Steve Jobs, Steven Levy, Stuxnet, technoutopianism, the built environment, The Chicago School, The Death and Life of Great American Cities, the medium is the message, The Nature of the Firm, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas L Friedman, transaction costs, urban decay, urban planning, urban sprawl, Vannevar Bush, WikiLeaks
., 1972), 6. 182 ShotSpotter: Ethan Watters, “Shot Spotter,” Wired, March 2007, http://www.wired.com/wired/archive/15.04/shotspotter.html. 183 PredPol: on PredPol and predictive policing in general, see “Sci-fi Policing: Predicting Crime before It Occurs,” Associated Press, July 1, 2012; Joel Rubin, “Stopping Crime before It Starts,” Los Angeles Times, August 21, 2010, http://articles.latimes.com/2010/aug/21/local/la-me-predictcrime-20100427-1. 183 Consider the New York Police Department’s latest innovation: “NYPD, Microsoft Push Big Data Policing into Spotlight,” Informationweek, August 20, 2012, http://www.informationweek.com/security/privacy/nypd-microsoft-push-big-data-policing-in/240005838. 183 “understand the unique groups in their customer base”: C. Beck and C. McCue, “Predictive Policing: What Can We Learn from Wal-Mart and Amazon About Fighting Crime in a Recession?,” Police Chief 76, no. 11 (2009), http://www.policechiefmagazine.org/magazine/index.cfm?fuseaction=print_display&article_id=1942&issue_id=112009. 185 “Predictive algorithms are not magic boxes”: Andrew Guthrie Ferguson, “Predictive Policing: The Future of Reasonable Suspicion,” Emory Law Journal, May 2, 2012, http://ssrn.com/abstract=2050001. 185 “the environmental vulnerability that encouraged”: ibid. 185 financial authorities in Hong Kong and Australia: for more on this, see Jeremy Grant, “Australia Clamps Down on ‘Algo’ Trading,” Financial Times, August 13, 2012, http://www.ft.com/intl/cms/s/0/ad11c4bc-e4f2-11e1-8e29-00144feab49a.html, and “Hong Kong Considers Annual Inspections of Algorithms,” Automated Trader, July 26, 2012, http://www.automatedtrader.net/headlines/129847/hong-kong-considers-annual-inspections-of-algorithms. 186 Facebook began using PhotoDNA: Riva Richmond, “Facebook’s New Way to Combat Child Pornography,” New York Times Gadgetwise, May 19, 2011, http://gadgetwise.blogs.nytimes.com/2011/05/19/facebook-to-combat-child-porn-using-microsofts-technology.
186 “We’ve never wanted to set up an environment”: Joseph Menn, “Social Networks Scan for Sexual Predators, with Uneven Results,” Reuters, July 12, 2012, http://www.reuters.com/article/2012/07/12/us-usa-internet-predators-idUSBRE86B05G20120712. 187 A headline that appeared in the Wall Street Journal: “Can Data Mining Stop the Killing?
Not surprisingly, gamification has already become a favorite trick in the solutionist tool kit. That everything can be gamified does not mean that everything ought to be. Wired reports on how game theorist Jesse Schell, attempting to show that gamification has its limits, gave a conference talk describing “a world in which a person’s every action—brushing their teeth, showing up to work on time, tattooing an advertisement for Pop-Tarts onto their forearm—earned points.” Alas, Schell’s attempt to encourage more critical thinking by gamification apologists backfired. As Schell told Wired, “I’ve had dozens of people come to me saying, ‘Your talk was so influential to me that I started a company. . .’ All I can think is, oh God, don’t blame me for that.” It all looks extremely appealing—especially to the bored and tired citizenry.
Of course, algorithms can be configured differently—and some independent labels might choose to release music that is bound to remain unpopular—but it’s hard to expect the major labels to pass up the opportunity to make more, and safer, money by deploying the algorithms.
Surviving Big Data
As we transition into the meme-saturated world of “algorithmic audiences,” it becomes very hard to remember the time when serious news media didn’t obsess over whether something was a “total bummer” and reported news that was important and worth caring about, regardless of how it affected the emotional well-being of the audience. To celebrate “the age of big data” and acquiesce to the ongoing invasion of journalism by various statistical measures and indicators is to give in to solutionism and endorse a very different, complacent kind of journalism. Ignorance of one’s audience—and a certain inefficiency that this introduces into the world of journalism—is not necessarily a problem that needs to be solved, even if the latest tools make the solutions trivial and obvious.