58 results
Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel
Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, backtesting, Black Swan, book scanning, bounce rate, business intelligence, business process, butter production in bangladesh, call centre, Charles Lindbergh, commoditize, computer age, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data is the new oil, en.wikipedia.org, Erik Brynjolfsson, Everything should be made as simple as possible, experimental subject, Google Glasses, happiness index / gross national happiness, job satisfaction, Johann Wolfgang von Goethe, lifelogging, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mass immigration, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, Norbert Wiener, personalized medicine, placebo effect, prediction markets, Ray Kurzweil, recommendation engine, risk-adjusted returns, Ronald Coase, Search for Extraterrestrial Intelligence, self-driving car, sentiment analysis, Shai Danziger, software as a service, speech recognition, statistical model, Steven Levy, text mining, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra, zero-sum game
Netflix Prize team BellKor’s Pragmatic Chaos: “BellKor’s Pragmatic Chaos Is the Winner of the $1 Million Netflix Prize!!!!” September 17, 2009. www2.research.att.com/~volinsky/netflix/bpc.html. Regarding SpaceShipOne and the XPrize: XPrize Foundation, “Ansari X Prize,” XPrize Foundation, updated April 25, 2012. http://space.xprize.org/ansari-x-prize. Netflix Prize team PragmaticTheory: PragmaticTheory website. https://sites.google.com/site/pragmatictheory/. Netflix Prize team BigChaos: Istvan Pilaszy, “Lessons That We Learned from the Netflix Prize,” Predictive Analytics World Washington, DC, Conference, October 21, 2009, Washington, DC. www.predictiveanalyticsworld.com/dc/2009/agenda.php#day2–13. Clive Thompson, “If You Liked This, You’re Sure to Love That,” New York Times, November 21, 2008. www.nytimes.com/2008/11/23/magazine/23Netflix-t.html. Netflix Prize team The Ensemble: Blog post by Aron, “Netflix Prize Conclusion,” The Ensemble, September 22, 2009. www.the-ensemble.com/content/netflix-prize-conclusion#comments.
Information Security Amazon Data Security Competition. https://sites.google.com/site/amazonaccessdatacompetition/. Approaches to the Netflix Prize: Clive Thompson, “If You Liked This, You’re Sure to Love That,” New York Times, November 21, 2008. www.nytimes.com/2008/11/23/magazine/23Netflix-t.html. Regarding collaboration rather than competition on the Netflix Prize: Jordan Ellenberg, “This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize,” Wired, February 25, 2008. www.wired.com/techbiz/media/magazine/16-03/mf_netflix. Overview of several uses of ensembles by Netflix Prize teams: Todd Holloway, “Ensemble Learning Better Predictions through Diversity,” ETech 2008, March 11, 2008. http://abeautifulwww.com/EnsembleLearningETech.pdf. Andreas Töscher from Netflix Prize team BigChaos: “Advanced Approaches for Recommender System and the Netflix Prize,” Predictive Analytics World San Francisco Conference, February 28, 2009, San Francisco, CA. www.predictiveanalyticsworld.com/sanfrancisco/2009/agenda.php#advancedapproaches.
Quote from Bart Baesens: Bart Baesens, PhD, “Building Bulletproof Models,” sascom Magazine, 3rd quarter, 2010. www.sas.com/news/feature/FSmodels.html. Chapter 5 Layperson competitors for the Netflix Prize: Eric Siegel, PhD, “Casual Rocket Scientists: An Interview with a Layman Leading the Netflix Prize, Martin Chabbert,” September 2009. www.predictiveanalyticsworld.com/layman-netflix-leader.php. $1 million Netflix Prize: Netflix Prize, September 21, 2009. www.netflixprize.com/. Seventy percent of Netflix movie choices based on recommendations: Jeffrey M. O’Brien, “The Netflix Effect,” Wired Magazine Online, December 12, 2002. http://www.wired.com/wired/archive/10.12/netflix.html. Michael Liedtke, “Netflix Recommendations Are About to Get Better, Say Execs,” Huffington Post Online, April 9, 2012. www.huffingtonpost.com/2012/04/09/netflix-recommendations_n_1413179.html. Netflix Prize team BellKor’s Pragmatic Chaos: “BellKor’s Pragmatic Chaos Is the Winner of the $1 Million Netflix Prize!!!!”
Netflixed: The Epic Battle for America's Eyeballs by Gina Keating
activist fund / activist shareholder / activist investor, barriers to entry, business intelligence, collaborative consumption, corporate raider, inventory management, Jeff Bezos, late fees, Mark Zuckerberg, McMansion, Menlo Park, Netflix Prize, new economy, out of africa, performance metric, Ponzi scheme, pre–internet, price stability, recommendation engine, Saturday Night Live, shareholder value, Silicon Valley, Silicon Valley startup, Steve Jobs, subscription business, Superbowl ad, telemarketer, X Prize
Craft Blockbuster Online, vice president of strategic planning Rick Ellis Blockbuster Online, operations consultant Shane Evangelist Blockbuster Online, senior vice president and general manager Gary Fernandes Board member Bill Fields Chairman/chief executive before Antioco Sarah Gustafson Blockbuster Online, senior director, customer analytics Jules Haimovitz Board member Lillian Hessel Blockbuster Online, vice president, customer marketing Jim Keyes Chairman/chief executive Karen Raskopf Corporate Communications, senior vice president, Nick Shepherd Chief operating officer Michael Siftar Blockbuster Online, director, applications development Nigel Travis President Strauss Zelnick Board member Larry Zine Chief financial officer COSTARS (ALPHABETICAL) Robert Bell AT&T Laboratory, Statistics Division, researcher Jeff Bezos Amazon.com founder/chief executive Martin Chabbert Netflix Prize winner, French-Canadian programmer Tom Dooley Viacom, senior vice president Roger Enrico PepsiCo, chairman John Fleming Walmart, chief executive Brett Icahn Carl Icahn’s son Carl Icahn Blockbuster investor/board member Michael Jahrer Netflix Prize winner, Big Chaos team, machine learning researcher Mike Kaltschnee HackingNetflix, founder/blogger Gregg Kaplan Redbox, chief executive Mel Karmazin Viacom, chief operating officer Yehuda Koren Netflix Prize winner, AT&T Laboratory, scientist Warren Lieberfarb Warner Home Video, president Joe Malugen Movie Gallery, chairman/chief executive Dave Novak Yum!
Brands, chairman/chief executive Michael Pachter Wedbush Morgan, analyst Martin Piotte Netflix Prize winner, French-Canadian programmer Sumner Redstone Viacom, chairman Stuart Skorman Reel.com, founder/chief executive Andreas Toscher Netflix Prize winner, Big Chaos team, machine learning researcher Chris Volinsky Netflix Prize winner, AT&T Laboratory, Statistics Division, executive director Mark Wattles Hollywood Video, founder/ chief executive CONTENTS About the Author Title Page Copyright Dedication CAST OF CHARACTERS PROLOGUE CHAPTER ONE A SHOT IN THE DARK (1997–1998) CHAPTER TWO THE GOOD, THE BAD, AND THE UGLY (1998–1999) CHAPTER THREE THE GOLD RUSH (1999–2000) CHAPTER FOUR WAR OF THE WORLDS (2001–2003) CHAPTER FIVE THE PROFESSIONAL (2003–2004) CHAPTER SIX SOME LIKE IT HOT (2004–2005) CHAPTER SEVEN WALL STREET (2004–2005) CHAPTER EIGHT KICK ASS (2004–2005) CHAPTER NINE THE BEST YEARS OF OUR LIVES (2005–2006) CHAPTER TEN THE EMPIRE STRIKES BACK (2006–2007) CHAPTER ELEVEN THE INCREDIBLES (2006–2009) CHAPTER TWELVE HIGH NOON (2007–2008) CHAPTER THIRTEEN THE GREAT ESCAPE (2007–2009) CHAPTER FOURTEEN TRUE GRIT (2009–2010) CHAPTER FIFTEEN CINEMA PARADISO (2011) EPILOGUE AFTERWORD ACKNOWLEDGMENTS A NOTE ON SOURCES BIBLIOGRAPHY INDEX PROLOGUE IT IS EARLY MORNING ON a workday in the spring of 1997.
Volinsky also loved movies, and both he and Bell, who also found his vocation in baseball stats, were excited about the chance to experiment with Netflix’s huge trove of real-world data—a set of customer ratings that was a hundred times larger than any they had ever seen. Bell had entered and won contests before the Netflix Prize, but the $1 million and the open-door nature of the competition—anybody with a PC and an Internet connection could enter—gave the contest a special allure. It quickly became a leading topic of conversation in the research and academic communities that Bell traveled in, and he relished the chance to see how he stacked up against his peers. About fifteen people showed up for a brainstorming session Volinsky organized shortly after the Netflix Prize was announced, but active members dwindled after a couple of weeks to just three—Bell, Volinsky, and their younger Israeli colleague, Yehuda Koren. At first they watched as the Netflix-sponsored leaderboard lit up with a couple of hundred solutions—at least two of which bettered Cinematch within a week.
What Algorithms Want: Imagination in the Age of Computing by Ed Finn
Airbnb, Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Amazon Web Services, bitcoin, blockchain, Chuck Templeton: OpenTable, Claude Shannon: information theory, commoditize, Credit Default Swap, crowdsourcing, cryptocurrency, disruptive innovation, Donald Knuth, Douglas Engelbart, Elon Musk, factory automation, fiat currency, Filter Bubble, Flash crash, game design, Google Glasses, Google X / Alphabet X, High speed trading, hiring and firing, invisible hand, Isaac Newton, iterative process, Jaron Lanier, Jeff Bezos, job automation, John Conway, John Markoff, Just-in-time delivery, Kickstarter, late fees, lifelogging, Loebner Prize, Lyft, Mother of all demos, Nate Silver, natural language processing, Netflix Prize, new economy, Nicholas Carr, Norbert Wiener, PageRank, peer-to-peer, Peter Thiel, Ray Kurzweil, recommendation engine, Republic of Letters, ride hailing / ride sharing, Satoshi Nakamoto, self-driving car, sharing economy, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, social graph, software studies, speech recognition, statistical model, Steve Jobs, Steven Levy, Stewart Brand, supply-chain management, TaskRabbit, technological singularity, technoutopianism, The Coming Technological Singularity, the scientific method, The Signal and the Noise by Nate Silver, The Structural Transformation of the Public Sphere, The Wealth of Nations by Adam Smith, transaction costs, traveling salesman, Turing machine, Turing test, Uber and Lyft, Uber for X, uber lyft, urban planning, Vannevar Bush, Vernor Vinge, wage slave
If Netflix has allowed us to glimpse the aesthetics of these modes of abstraction, a deeper look at algorithmic arbitrage will reveal the political economy of the algorithmic era and, ultimately, the nature of algorithmically inflected value. Notes 1. Willimon, House of Cards. 2. “Global Internet Phenomenon Report.” 3. Evers, “The Post: New ISP Performance Data For December.” 4. “The Netflix Prize Rules.” 5. Töscher, Jahrer, and Bell, “The BigChaos Solution to the Netflix Grand Prize.” 6. Hallinan and Striphas, “Recommended for You.” 7. Buley, “Netflix Settles Privacy Lawsuit, Cancels Prize Sequel.” 8. Strogatz, “The End of Insight.”; Arbesman, “Explain It to Me Again, Computer.” 9. Amatriain, “Netflix Recommendations.” 10. “Introducing Netflix Social.” 11. Amatriain, “Netflix Recommendations.” 12. Madrigal, “How Netflix Reverse Engineered Hollywood.” 13. Ibid. 14. Fritz, “Cadre of Film Buffs Helps Netflix Viewers Sort through the Clutter.” 15.
Only us—small, solitary, striving, battling one another. I pray to myself, for myself. Frank Underwood1 The Netflix Prize If Apple and Google want to dominate our relationships with search, access, and personal information, the movies-on-demand company Netflix wants to own the leisure time we spend on video entertainment. While less omnipresent than Google, the company’s influence on digital culture is still striking: on any given day in 2014, roughly a third of all Internet data downloaded during peak periods consisted of streaming files from Netflix.2 By the end of 2013, the company’s 40 million subscribers watched a billion hours of content each month.3 In 2006, Netflix announced a mathematical competition with a million dollar prize: improve the company’s recommendation algorithm by at least 10 percent. Modeled on other contests like DARPA grand challenges and the Loebner Prize (the annual Turing Test competition), the Netflix Prize invited outside researchers to teach them new algorithmic tricks that could improve the efficiency with which they recommended movies to their customers.
The system could look for patterns in these ratings, so if someone with a similar history to you had just given a new film five stars, the system might predict that you would also like that film. It was a mathematical approach to recommendations, one that ignored the complex position of Hollywood entertainment and movie rentals as culture machines. The Netflix Prize led to a heated competition between rival teams of computer scientists and statisticians gunning for the prestige and cash bounty of besting the Cinematch ratings. After three years, a combined team titled BellKor’s Pragmatic Chaos won the prize, inching out an equally effective algorithm (as measured by Netflix Prize rules) because they submitted their entry twenty minutes earlier. The winning algorithm combined hundreds of different approaches in an “ensemble” of predictors, blending a combination of randomly generated probes and observed features (for instance, weighting user ratings differently based on temporal features like weekday vs. weekend).5 The problem as Netflix framed it, and as the various contestants took it on, was almost purely mathematical.
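The "ensemble" idea in the excerpt above can be illustrated with a minimal sketch. It blends only two predictors with a single least-squares weight, whereas the winning entry reportedly combined hundreds; the function names and the toy ratings are hypothetical and are not taken from the BigChaos paper.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


def fit_blend_weight(pred_a, pred_b, actual):
    """Least-squares weight w for the blend w * pred_a + (1 - w) * pred_b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    # If both predictors agree everywhere, any weight gives the same blend.
    return num / den if den else 0.5


def blend(pred_a, pred_b, w):
    """Weighted average of two predictors' outputs."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]


# Toy data (hypothetical): true star ratings and two imperfect predictors.
actual = [4.0, 3.0, 5.0, 2.0, 1.0]
model_a = [3.5, 3.5, 4.0, 2.5, 2.0]
model_b = [4.5, 2.0, 4.5, 3.0, 1.5]

w = fit_blend_weight(model_a, model_b, actual)
blended = blend(model_a, model_b, w)
print(w, rmse(model_a, actual), rmse(model_b, actual), rmse(blended, actual))
```

On the data used to fit the weight, the blend can never do worse than either predictor alone, because weights of 0 and 1 are among the candidates. In practice the weight would be fit on held-out ratings to avoid overfitting.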
Beautiful Visualization by Julie Steele
barriers to entry, correlation does not imply causation, data acquisition, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta analysis, meta-analysis, natural language processing, Netflix Prize, pattern recognition, peer-to-peer, performance metric, QR code, recommendation engine, semantic web, social graph, sorting algorithm, Steve Jobs, web application, wikimedia commons
In the case of movies, intuitively, the measure indicates that two movies are similar if users who rated one highly rated the other highly or, conversely, users who rated one poorly rated the other poorly. We’ll use this similarity measure to generate similarity data for all 17,700 movies in the Netflix Prize dataset, then generate coordinates based on that data. If we were interested in building an actual movie recommender system, we might do so simply by recommending the movies that were similar to those a user had rated highly. However, the goal here is just to gain insight into the dynamics of such a recommender system. Labeling The YELLOWPAGES.COM visualization was easier to label than this Netflix Prize visualization for a number of reasons, including fewer nodes and shorter labels, but mostly because the nodes were more uniformly distributed. Although the Netflix Prize visualization has a large number of clusters, most of the movies are contained in only a small number of those clusters.
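The excerpt does not name the similarity measure it uses. One measure that behaves as described is the Pearson correlation over users who rated both movies, sketched below as an assumption rather than as the chapter's actual method. The movie and user identifiers are hypothetical.

```python
import math


def movie_similarity(ratings_a, ratings_b):
    """Pearson correlation over users who rated both movies.

    ratings_a and ratings_b map user id -> star rating.
    Returns 0.0 when there are fewer than two common users
    or when either movie's common ratings have zero variance.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[u] for u in common]
    ys = [ratings_b[u] for u in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Toy ratings (hypothetical): movies A and B share fans, movie C does not.
ratings = {
    "A": {"u1": 5, "u2": 4, "u3": 1, "u4": 2},
    "B": {"u1": 4, "u2": 5, "u3": 2, "u4": 1},
    "C": {"u1": 1, "u2": 2, "u3": 5, "u4": 4},
}

sim_ab = movie_similarity(ratings["A"], ratings["B"])
sim_ac = movie_similarity(ratings["A"], ratings["C"])
print(sim_ab, sim_ac)
```

A positive score means the same users tended to rate both movies high or both low; a negative score means their tastes split. A layout step could then place high-similarity movies close together.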
For the visualization in Figure 9-8, the first strategy was used because it illustrates the highly nonuniform distribution both of movies in general and of movies with large numbers of ratings (indicated by larger circles). However, for the enlargements of the visualization in the subsequent figures, the second strategy was used for improved readability. Figure 9-8. Visualization of the 17,700 movies in the Netflix Prize dataset Closer Looks Other than ratings, the only data in the Netflix Prize dataset is the titles and release dates for the movies. However, competitors in the Netflix Prize have found that latent attributes, such as the amount of violence in a movie or the gender of the user, are important predictors of preference. Not surprisingly, some of the clusters appear to be explainable by these attributes. Why other clusters emerge from user preferences, however, is more difficult to explain.
Chapter 8, Visualizing the U.S. Senate Social Graph (1991–2009), by Andrew Odewahn, uses quantitative evidence to evaluate a qualitative story about voting coalitions in the United States Senate. Chapter 9, The Big Picture: Search and Discovery, by Todd Holloway, uses a proximity graphing technique to explore the dynamics of search and discovery as they apply to YELLOWPAGES.COM and the Netflix Prize. Chapter 10, Finding Beautiful Insights in the Chaos of Social Network Visualizations, by Adam Perer, empowers users to dig into chaotic social network visualizations with interactive techniques that integrate visualization and statistics. Chapter 11, Beautiful History: Visualizing Wikipedia, by Martin Wattenberg and Fernanda Viégas, takes readers through the process of exploring an unknown phenomenon through visualization, from initial sketches to published scientific papers.
The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns, Aaron Roth
23andMe, affirmative action, algorithmic trading, Alvin Roth, Bayesian statistics, bitcoin, cloud computing, computer vision, crowdsourcing, Edward Snowden, Elon Musk, Filter Bubble, general-purpose programming language, Google Chrome, ImageNet competition, Lyft, medical residency, Nash equilibrium, Netflix Prize, p-value, Pareto efficiency, performance metric, personalized medicine, pre–internet, profit motive, quantitative trading / quantitative finance, RAND corporation, recommendation engine, replication crisis, ride hailing / ride sharing, Robert Bork, Ronald Coase, self-driving car, short selling, sorting algorithm, speech recognition, statistical model, Stephen Hawking, superintelligent machines, telemarketer, Turing machine, two-sided market, Vilfredo Pareto
In fact, Sweeney subsequently estimated from US Census data that 87 percent of the US population can be uniquely identified from this data triple. But now that we know this, can the problem of privacy be solved by simply concealing information about birthdate, sex, and zip code in future data releases? It turns out that lots of less obvious things can also identify you—like the movies you watch. In 2006, Netflix launched the Netflix Prize competition, a public data science competition to find the best “collaborative filtering” algorithm to power Netflix’s movie recommendation engine. A key feature of Netflix’s service is its ability to recommend to users movies that they might like, given how they have rated past movies. (This was especially important when Netflix was primarily a mail-order DVD rental service, rather than a streaming service—it was harder to quickly browse or sample movies.)
For each movie, Netflix knew both what score the user had given the movie (on a scale of 1 to 5 stars) and the date on which the user provided the rating. The goal of a collaborative filtering engine is to predict how a given user will rate a movie she hasn’t seen yet. The engine can then recommend to a user the movies that it predicts she will rate the highest. Netflix had a basic recommendation system based on collaborative filtering, but the company wanted a better one. The Netflix Prize competition offered $1 million for improving the accuracy of Netflix’s existing system by 10 percent. A 10 percent improvement is hard, so Netflix expected a multiyear competition. An improvement of 1 percent over the previous year’s state of the art qualified a competitor for an annual $50,000 progress prize, which would go to the best recommendation system submitted that year. Of course, to build a recommendation system, you need data, so Netflix publicly released a lot of it—a dataset consisting of more than a hundred million movie rating records, corresponding to the ratings that roughly half a million users gave to a total of nearly eighteen thousand movies.
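The excerpt does not say how "accuracy" was scored. The contest measured it as root-mean-square error (RMSE) on held-out ratings, so a 10 percent improvement means a 10 percent reduction in RMSE. The arithmetic can be sketched as follows; the RMSE values are illustrative, not actual contest scores, and the function names are hypothetical.

```python
def percent_improvement(baseline_rmse, candidate_rmse):
    """Relative reduction in RMSE, as a percentage of the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


def wins_grand_prize(baseline_rmse, candidate_rmse, threshold=10.0):
    """True when the candidate beats the baseline by at least `threshold` percent."""
    return percent_improvement(baseline_rmse, candidate_rmse) >= threshold


# Illustrative values only: a baseline RMSE of 0.95 stars.
baseline = 0.95
for candidate in (0.90, 0.86, 0.85):
    gain = percent_improvement(baseline, candidate)
    print(f"{candidate:.2f}: {gain:.2f}% better, wins={wins_grand_prize(baseline, candidate)}")
```

Because the threshold is relative, each further hundredth of a star is a larger share of what remains, which is one way to see why the last fraction of a percent took so long.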
In fact, a gay mother of two who was not open about her sexual orientation sued Netflix, alleging that the ability to de-anonymize the dataset “would negatively affect her ability to pursue her livelihood and support her family, and would hinder her and her children’s ability to live peaceful lives.” She was worried that her sexual orientation would become clear if people knew what movies she watched on Netflix. The lawsuit sought the maximum penalty allowed by the Video Privacy Protection Act: $2,500 for each of Netflix’s more than two million subscribers. Netflix settled this lawsuit for undisclosed financial terms, and cancelled a planned second Netflix Prize competition. The history of data anonymization is littered with many more such failures. The problem is that a surprisingly small number of apparently idiosyncratic facts about you—like when you watched a particular movie, or the last handful of items you purchased on Amazon—are enough to uniquely identify you among the billions of people in the world, or at least among those appearing in a large database.
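The point about a handful of facts singling out one record can be shown with a toy sketch. It uses exact matching on hypothetical (movie, date) pairs; it is a simplification for illustration, not the method used in the actual attack on the Netflix dataset, and every identifier below is invented.

```python
def matching_records(dataset, known_facts):
    """Return the pseudonymous ids whose records contain every known fact."""
    known = set(known_facts)
    return [pid for pid, facts in dataset.items() if known <= facts]


# Hypothetical "anonymized" release: pseudonymous id -> set of (movie, rating date).
dataset = {
    "user_001": {("Movie A", "2005-03-01"), ("Movie B", "2005-03-04"), ("Movie C", "2005-04-10")},
    "user_002": {("Movie A", "2005-03-01"), ("Movie D", "2005-03-09")},
    "user_003": {("Movie A", "2005-03-02"), ("Movie B", "2005-03-04"), ("Movie E", "2005-05-20")},
}

# Facts an outsider might learn about a target, e.g. from public reviews.
one_fact = [("Movie A", "2005-03-01")]
two_facts = one_fact + [("Movie B", "2005-03-04")]

print(matching_records(dataset, one_fact))   # two candidates remain
print(matching_records(dataset, two_facts))  # a single record remains
```

Each extra fact shrinks the candidate set. Once one record is left, everything else in it (the full rating history) is attached to a known person, even though no name was ever released.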
Bold: How to Go Big, Create Wealth and Impact the World by Peter H. Diamandis, Steven Kotler
3D printing, additive manufacturing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, augmented reality, autonomous vehicles, Charles Lindbergh, cloud computing, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, dematerialisation, deskilling, disruptive innovation, Elon Musk, en.wikipedia.org, Exxon Valdez, fear of failure, Firefox, Galaxy Zoo, Google Glasses, Google Hangouts, gravity well, ImageNet competition, industrial robot, Internet of things, Jeff Bezos, John Harrison: Longitude, John Markoff, Jono Bacon, Just-in-time delivery, Kickstarter, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, loss aversion, Louis Pasteur, low earth orbit, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, Mars Rover, meta analysis, meta-analysis, microbiome, minimum viable product, move fast and break things, Narrative Science, Netflix Prize, Network effects, Oculus Rift, optical character recognition, packet switching, PageRank, pattern recognition, performance metric, Peter H. Diamandis: Planetary Resources, Peter Thiel, pre–internet, Ray Kurzweil, recommendation engine, Richard Feynman, ride hailing / ride sharing, risk tolerance, rolodex, self-driving car, sentiment analysis, shareholder value, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart grid, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Stewart Brand, superconnector, technoutopianism, telepresence, telepresence robot, Turing test, urban renewal, web application, X Prize, Y Combinator, zero-sum game
When asked about their experience, Vor-Tek member and tattoo artist Fred Giovannitti said, “We get asked all the time, ‘How long have you been in the oil industry?’ and I ask back, ‘Counting today?’ ” The lesson here is that in incentive competitions, results can come from the most unusual of places, from players you would never expect, and from technologies you might never suspect. Lee Stein, an XPRIZE benefactor, says, “When you are looking for a needle in the haystack, incentive competitions help the needle come to you.” Case Study 2: The Netflix Prize The best incentive prizes are those that solve important puzzles that people want solved and people want to solve—and there’s a difference. The Wendy Schmidt Oil Cleanup XCHALLENGE falls directly into the former category. It took me over ten years to raise the money for the Ansari XPRIZE, but Wendy Schmidt stepped forward to fund the Oil Cleanup Challenge in less than forty-eight hours. Certainly one reason I raised money for the Oil Cleanup Challenge so quickly was the fact that by then I had a track record of success and a considerably thicker Rolodex, but a more important factor was the 800,000 gallons of crude gushing into the Gulf Coast each day.
By the middle 2000s, Netflix engineers had plucked all the low-hanging fruit and the rate of Cinematch optimization had slowed to a crawl. Every time one of their recommendations was a clear miss—based on your interest in Breakfast at Tiffany’s we think you’ll enjoy Naked Lunch—customers got angry. And with new competitors sprouting up in the likes of Hulu, Amazon, and YouTube, this ire was getting expensive. So Netflix decided to attack the problem head-on, announcing the Netflix Prize in October 2006—a million-dollar purse for whoever could write an algorithm that improved their existing system by 10 percent.15 And this contest is a perfect example of what happens when you design prizes around intrinsic motivations. Competition, coding, and movies—what could be more fun than that? Within two weeks, Netflix had received nearly 170 submissions, three of them outperforming Cinematch.
Within ten months, there were over 20,000 teams from 150 different countries involved. By the time the contest was won, in 2009, that figure had doubled to 40,000 teams. But the results that Netflix saw extended far beyond the number of contestants entered in a contest. As Jordan Ellenberg explained in Wired: “Secrecy hasn’t been a big part of the Netflix competition. The prize hunters, even the leaders, are startlingly open about the methods they’re using, acting more like academics huddled over a knotty problem than entrepreneurs jostling for a $1 million payday. In December 2006, a competitor called ‘simonfunk’ posted a complete description of his algorithm (which at the time was tied for third place), giving everyone else the opportunity to piggyback on his progress. ‘We had no idea the extent to which people would collaborate with each other,’ says Jim Bennett, vice president for recommendation systems at Netflix.”16 And this isn’t an aberration.
Big Data at Work: Dispelling the Myths, Uncovering the Opportunities by Thomas H. Davenport
Automated Insights, autonomous vehicles, bioinformatics, business intelligence, business process, call centre, chief data officer, cloud computing, commoditize, data acquisition, disruptive innovation, Edward Snowden, Erik Brynjolfsson, intermodal, Internet of things, Jeff Bezos, knowledge worker, lifelogging, Mark Zuckerberg, move fast and break things, Narrative Science, natural language processing, Netflix Prize, New Journalism, recommendation engine, RFID, self-driving car, sentiment analysis, Silicon Valley, smart grid, smart meter, social graph, sorting algorithm, statistical model, Tesla Model S, text mining, Thomas Davenport
GE is primarily focused on big data for improving services and is already using data science to optimize the service contracts and maintenance intervals for industrial products. Google (the ultimate big data firm, of course) uses data scientists to refine its core search and ad-serving algorithms. Zynga uses data scientists to target games and game-related products to customers. Netflix created the well-known Netflix Prize for the data science team that could optimize the company’s movie recommendations for customers. The testing firm Kaplan uses its data scientists to begin advising customers on effective learning and test-preparation strategies. These companies’ big data efforts are directly focused on products, services, and customers. This has important implications, of course, for the organizational locus of big data and the processes and pace of new product development.
On the decision side, the primary value from big data derives from adding new sources of data to explanatory and predictive models. Many big data enthusiasts argue that there is more value from adding new sources of data to a model than to refining the model itself. For example, Anand Rajaram, who works at @WalMartLabs and teaches at Stanford, ran a bit of a natural experiment in one of his Stanford classes along the lines of the Netflix Prize—the contest that invited anyone to try to improve the Netflix customer video preference algorithm and win a million bucks.16 One of the groups in Rajaram’s classes used the data that Netflix provided and applied very sophisticated algorithms to it. Another group supplemented the data (illegally, according to the rules of the competition) with movie genre data from the Internet Movie Database.
There are many other examples of this phenomenon in both online and primarily offline businesses. GE is mainly focused on big data for improving services: among other things, to optimize the service contracts and maintenance intervals for industrial products. The real estate site Zillow created the Zestimate home price estimate, as well as rental cost Zestimates and a national home value index. Netflix created the Netflix Prize for the data science team that could optimize the company’s movie recommendations for customers and, as I noted in chapter 2, is now using big data to help in the creation of proprietary content. The testing firm Kaplan uses its big data to begin advising customers on effective learning and test-preparation strategies. Novartis focuses on big data (the health-care industry calls it informatics) to develop new drugs.
The New Kingmakers by Stephen O'Grady
AltaVista, Amazon Web Services, barriers to entry, cloud computing, correlation does not imply causation, crowdsourcing, David Heinemeier Hansson, DevOps, Jeff Bezos, Khan Academy, Kickstarter, Marc Andreessen, Mark Zuckerberg, Netflix Prize, Paul Graham, Ruby on Rails, Silicon Valley, Skype, software as a service, software is eating the world, Steve Ballmer, Steve Jobs, Tim Cook: Apple, Y Combinator
Netflix’s own algorithm, Cinematch, attempted to predict what rating a given user would assign to a given film. On October 2, 2006, Netflix announced the Netflix Prize: The first team of non-employees that could best their in-house algorithm by 10% would claim $1,000,000. This prize had two major implications. First, it implied that the benefits of an improved algorithm would exceed one million dollars for Netflix, presumably through customer acquisition and improvements in retention. Second, it implied that crowd-sourcing had the potential to deliver better results than the organization could produce on its own. In this latter assumption, Netflix was proven correct. On October 8—just six days after the prize was announced—an independent team bested the Netflix algorithm, albeit by substantially less than ten percent. The 10% threshold was finally reached in 2009.
The 10% threshold was finally reached in 2009. In September of that year, Netflix announced that the team “BellKor’s Pragmatic Chaos”—composed of researchers from AT&T Labs, Pragmatic Theory, and Yahoo!—had won the Netflix Prize, taking home a million dollars for their efforts. A year earlier, meanwhile, Netflix had enabled the recruitment of millions of other developers by providing official APIs. In September 2008, Netflix launched developer.netflix.com, where developers could independently register with Netflix to get access to APIs that would enable them to build applications that would manage users’ video queues, check availability, and access account details. Just as Netflix believed that the wider world might be able to build a better algorithm, so too did it believe that out of the millions of developers in the world, one of them might be able to build a better application than Netflix itself.
Think Twice: Harnessing the Power of Counterintuition by Michael J. Mauboussin
affirmative action, asset allocation, Atul Gawande, availability heuristic, Benoit Mandelbrot, Bernie Madoff, Black Swan, butter production in bangladesh, Cass Sunstein, choice architecture, Clayton Christensen, cognitive dissonance, collateralized debt obligation, Daniel Kahneman / Amos Tversky, deliberate practice, disruptive innovation, Edward Thorp, experimental economics, financial innovation, framing effect, fundamental attribution error, Geoffrey West, Santa Fe Institute, George Akerlof, hindsight bias, hiring and firing, information asymmetry, libertarian paternalism, Long Term Capital Management, loose coupling, loss aversion, mandelbrot fractal, Menlo Park, meta analysis, meta-analysis, money market fund, Murray Gell-Mann, Netflix Prize, pattern recognition, Philip Mirowski, placebo effect, Ponzi scheme, prediction markets, presumed consent, Richard Thaler, Robert Shiller, statistical model, Steven Pinker, The Wisdom of Crowds, ultimatum game
Using consumer feedback, Cinematch rapidly improved its ability to anticipate consumer tastes and now drives well over half of Netflix’s rentals, keeping users happy and reducing reliance on new releases. But the company’s executives realized that Cinematch did not have all the answers. So in 2006, they issued a challenge: Netflix will pay a $1 million prize for a program that predicts consumer preferences 10 percent better than Cinematch. As I write this, the Netflix Prize is still up for grabs, with the leading group 9.80 percent better than Cinematch. Two points bear emphasis. First, some really great minds are toiling on a problem that is worth a lot less to them than it is to Netflix. (Netflix executives freely admit that a winning algorithm is worth more than $1 million.) Second, Cinematch, or whatever program ultimately unseats it, is vastly better than the video-store employee in New York City.15 The night-and-day contrast between the quality of advice from Netflix’s algorithms and the local video-store clerk illustrates this chapter’s first decision mistake: using experts instead of mathematical models.
Anders Ericsson, Neil Charness, Paul J. Feltovich, and Robert R. Hoffman (Cambridge: Cambridge University Press, 2006), 87–103. 15. For more on the contest, see www.netflixprize.com. Clive Thompson, “If You Liked This, You’re Sure to Love That,” New York Times Magazine, November 23, 2008. Jordan Ellenberg, “The Netflix Challenge: This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize,” Wired Magazine, March 2008, 114–122. 16. Paul E. Meehl, Clinical versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence (Minneapolis: University of Minnesota Press, 1954); Robyn M. Dawes, David Faust, and Paul E. Meehl, “Clinical versus Actuarial Judgment,” in Heuristics and Biases: The Psychology of Intuitive Judgment, ed. Thomas Gilovich, Dale Griffin, and Daniel Kahneman (Cambridge: Cambridge University Press, 2002), 716–729; Reid Hastie and Robyn M.
“Transmission of Information and Herd Behavior: An Application to Financial Markets.” Physical Review Letters 85, no. 26 (2000): 5659–5662. Ehrlich, Paul, and Brian Walker. “Rivets and Redundancy.” BioScience 48, no. 5 (1998): 387. Eisenhardt, Kathleen M., and Donald N. Sull. “Strategy as Simple Rules.” Harvard Business Review, January 2001, 107–116. Ellenberg, Jordan. “The Netflix Challenge: This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize.” Wired Magazine, March, 2008, 114–122. Epley, Nicholas, and Thomas Gilovich. “The Anchoring-and-Adjustment Heuristic: Why the Adjustments Are Insufficient.” Psychological Science 17, no. 4 (2006): 311–318. Ernst, Cécile, and Jules Angst. Birth Order: Its Influence on Personality. Berlin: Springer-Verlag, 1983. Fauconnier, Gilles, and Mark Turner. The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities.
Smarter Than You Think: How Technology Is Changing Our Minds for the Better by Clive Thompson
4chan, A Declaration of the Independence of Cyberspace, augmented reality, barriers to entry, Benjamin Mako Hill, butterfly effect, citizen journalism, Claude Shannon: information theory, conceptual framework, corporate governance, crowdsourcing, Deng Xiaoping, discovery of penicillin, disruptive innovation, Douglas Engelbart, drone strike, Edward Glaeser, Edward Thorp, en.wikipedia.org, experimental subject, Filter Bubble, Freestyle chess, Galaxy Zoo, Google Earth, Google Glasses, Gunnar Myrdal, Henri Poincaré, hindsight bias, hive mind, Howard Rheingold, information retrieval, iterative process, jimmy wales, Kevin Kelly, Khan Academy, knowledge worker, lifelogging, Mark Zuckerberg, Marshall McLuhan, Menlo Park, Netflix Prize, Nicholas Carr, Panopticon Jeremy Bentham, patent troll, pattern recognition, pre–internet, Richard Feynman, Ronald Coase, Ronald Reagan, Rubik’s Cube, sentiment analysis, Silicon Valley, Skype, Snapchat, Socratic dialogue, spaced repetition, superconnector, telepresence, telepresence robot, The Nature of the Firm, the scientific method, The Wisdom of Crowds, theory of mind, transaction costs, Vannevar Bush, Watson beat the top human players on Jeopardy!, WikiLeaks, X Prize, éminence grise
the Galaxy Zoo: Tim Adams, “Galaxy Zoo and the New Dawn of Citizen Science,” The Observer (UK), March 17, 2012, accessed March 24, 2013, www.guardian.co.uk/science/2012/mar/18/galaxy-zoo-crowdsourcing-citizen-scientists. a one-million-dollar prize: Eliot Van Buskirk, “BellKor’s Pragmatic Chaos Wins $1 Million Netflix Prize by Mere Minutes,” Wired, September 21, 2009, accessed March 24, 2013, www.wired.com/business/2009/09/bellkors-pragmatic-chaos-wins-1-million-netflix-prize/; I also previously reported on the Netflix Prize in “If You Liked This, You’re Sure to Love That,” The New York Times Magazine, November 21, 2008, accessed March 24, 2013, www.nytimes.com/2008/11/23/magazine/23Netflix-t.html. I didn’t specifically note the increasing secrecy of the participants over time in my article, but the teams remarked on this in my interviews.
At best, companies have been able to deploy fairly simple polling-and-voting group thinking projects, often to tap in to what their customers want; clothing firms like Threadless let their users vote on user-submitted designs. Others have solved the motivational problem by offering substantial prizes. Netflix, for example, offered a one-million-dollar prize for whoever could improve its movie-recommendation algorithm by 10 percent. But while prizes motivate hard work, they can inhibit sharing. When people are competing for a big prize, they’re often not willing to talk about their smartest breakthrough ideas for fear that a rival will steal their work. (Indeed, as teams got closer to winning the Netflix prize, they became increasingly secretive.) Other corporations have solved the problems of motivation and secrecy by turning inward and creating internal “decision markets” where employees can pose ideas and vote on the best ones.
AIQ: How People and Machines Are Smarter Together by Nick Polson, James Scott
Air France Flight 447, Albert Einstein, Amazon Web Services, Atul Gawande, autonomous vehicles, availability heuristic, basic income, Bayesian statistics, business cycle, Cepheid variable, Checklist Manifesto, cloud computing, combinatorial explosion, computer age, computer vision, Daniel Kahneman / Amos Tversky, Donald Trump, Douglas Hofstadter, Edward Charles Pickering, Elon Musk, epigenetics, Flash crash, Grace Hopper, Gödel, Escher, Bach, Harvard Computers: women astronomers, index fund, Isaac Newton, John von Neumann, late fees, low earth orbit, Lyft, Magellanic Cloud, mass incarceration, Moneyball by Michael Lewis explains big data, Moravec's paradox, more computing power than Apollo, natural language processing, Netflix Prize, North Sea oil, p-value, pattern recognition, Pierre-Simon Laplace, ransomware, recommendation engine, Ronald Reagan, self-driving car, sentiment analysis, side project, Silicon Valley, Skype, smart cities, speech recognition, statistical model, survivorship bias, the scientific method, Thomas Bayes, Uber for X, uber lyft, universal basic income, Watson beat the top human players on Jeopardy!, young professional
Then in 2009, after two years of refining their algorithm, a team calling themselves BellKor’s Pragmatic Chaos finally submitted the million-dollar piece of code, beating Netflix’s engine by 10.06%. And it’s a good thing they didn’t pause to watch an extra episode of The Big Bang Theory before hitting the submit button. BellKor reached the finish line of the two-year race just 19 minutes and 54 seconds ahead of a second team, The Ensemble, who submitted an algorithm also reaching 10.06% improvement—just not quite fast enough. In retrospect, the Netflix Prize was a perfect symbol of the company’s early reliance on a core machine-learning task: algorithmically predicting how a subscriber would rate a film. Then, in March of 2011, three little words changed the future of Netflix forever: House of Cards. House of Cards was the first “Netflix Original Series,” the company’s first try at producing TV rather than merely distributing it. The production team behind House of Cards originally went to all the major networks with their idea, and every single one was interested.
How can Netflix make a recommendation on the basis of your viewing history, using other people’s viewing histories, when yours is unprecedented and theirs will never be repeated? The solution to all three issues is careful modeling. Just as Wald solved his missing-data problem by building a model of a B-17’s encounter with an enemy fighter, Netflix solved its problem by building a model of a subscriber’s encounter with a film. And while Netflix’s current model is proprietary, the million-dollar model built by team BellKor’s Pragmatic Chaos, winner of the Netflix Prize, is posted for free on the web.8 Here’s the gist of how it works. (Remember, Netflix predicts ratings on a 1-to-5 scale, from which a like/dislike prediction can be made using a simple cutoff, e.g., four stars.) The fundamental equation here is Predicted Rating = Overall Average + Film Offset + User Offset + User-Film Interaction. The first three pieces of this equation are easy to explain
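The additive equation quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prize-winning model: the function names (`fit_offsets`, `predict`, `likes`) are hypothetical, the offsets are estimated by plain averaging, and the user-film interaction term is left as an input that defaults to zero. The 1-to-5 scale and the four-star cutoff come from the excerpt.

```python
from collections import defaultdict


def fit_offsets(ratings):
    """Estimate the first three terms from (user, film, stars) triples.

    Assumption: offsets are simple averages of what is left over after
    removing the earlier terms. Real systems fit these terms differently.
    """
    overall = sum(stars for _, _, stars in ratings) / len(ratings)

    # Film offset: how far a film's ratings sit above or below the overall average.
    by_film = defaultdict(list)
    for _, film, stars in ratings:
        by_film[film].append(stars - overall)
    film_offset = {film: sum(v) / len(v) for film, v in by_film.items()}

    # User offset: how generous or harsh a user is once the film is accounted for.
    by_user = defaultdict(list)
    for user, film, stars in ratings:
        by_user[user].append(stars - overall - film_offset[film])
    user_offset = {user: sum(v) / len(v) for user, v in by_user.items()}

    return overall, film_offset, user_offset


def predict(model, user, film, interaction=0.0):
    """Predicted Rating = Overall Average + Film Offset + User Offset + Interaction."""
    overall, film_offset, user_offset = model
    raw = overall + film_offset.get(film, 0.0) + user_offset.get(user, 0.0) + interaction
    return min(5.0, max(1.0, raw))  # keep the prediction on the 1-to-5 scale


def likes(model, user, film, cutoff=4.0):
    """Turn a predicted rating into like/dislike with a simple cutoff."""
    return predict(model, user, film) >= cutoff
```

Unknown users or films fall back to an offset of zero, so a brand-new subscriber is predicted to give each film its overall average plus the film's own offset.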
See health care and medicine Medtronic Menger, Karl Microsoft Microsoft Azure modeling assumptions and deep-learning models imputation and Inception latent feature massive models missing data and model rust natural language processing and prediction rules as reality versus rules-based (top-down) models training the model Moneyball Moore’s law Moravec paradox Morgenstern, Oskar Musk, Elon natural language processing (NLP) ambiguity and bottom-up approach chatbots digital assistants future trends Google Translate growth of statistical NLP knowing how versus knowing that natural language revolution “New Deal” for human-machine linguistic interaction prediction rules and programing language revolution robustness and rule bloat and speech recognition top-down approach word co-location statistics word vectors naturally occurring radioactive materials (NORM) Netflix Crown, The (series) data scientists history of House of Cards (series) Netflix Prize for recommender system personalization recommender systems neural networks deep learning and Friends new episodes and Inception model prediction rules and New England Patriots Newton, Isaac Nightingale, Florence coxcomb diagram (1858) Crimean War and early years and training evidence-based medicine legacy of “lady with the lamp” medical statistics legacy of nursing reform legacy of Nvidia Obama, Barack Office of Scientific Research and Development parallax pattern recognition cucumber sorting input and output learning a pattern maximum heart rate and prediction rules and toilet paper theft and See also prediction rules PayPal personalization conditional probability and latent feature models and Netflix and Wald’s survivability recommendations for aircraft and See also recommender systems; suggestion engines philosophy Pickering, Edward C.
Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schonberger, Kenneth Cukier
23andMe, Affordable Care Act / Obamacare, airport security, barriers to entry, Berlin Wall, big data - Walmart - Pop Tarts, Black Swan, book scanning, business intelligence, business process, call centre, cloud computing, computer age, correlation does not imply causation, dark matter, double entry bookkeeping, Eratosthenes, Erik Brynjolfsson, game design, IBM and the Holocaust, index card, informal economy, intangible asset, Internet of things, invention of the printing press, Jeff Bezos, Joi Ito, lifelogging, Louis Pasteur, Mark Zuckerberg, Menlo Park, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, obamacare, optical character recognition, PageRank, paypal mafia, performance metric, Peter Thiel, post-materialism, random walk, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, smart grid, smart meter, social graph, speech recognition, Steve Jobs, Steven Levy, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!
Still, within days, the New York Times cobbled together searches like “60 single men” and “tea for good health” and “landscapers in Lilburn, Ga” to successfully identify user number 4417749 as Thelma Arnold, a 62-year-old widow from Lilburn, Georgia. “My goodness, it’s my whole personal life,” she told the Times reporter when he came knocking. “I had no idea somebody was looking over my shoulder.” The ensuing public outcry led to the ouster of AOL’s chief technology officer and two other employees. Yet a mere two months later, in October 2006, the movie rental service Netflix did something similar in launching its “Netflix Prize.” The company released 100 million rental records from nearly half a million users—and offered a bounty of a million dollars to any team that could improve its film recommendation system by at least 10 percent. Again, personal identifiers had been carefully removed from the data. And yet again, a user was re-identified: a mother and a closeted lesbian in America’s conservative Midwest, who because of this later sued Netflix under the pseudonym “Jane Doe.”
Netflix identified individual—Ryan Singel, “Netflix Spilled Your Brokeback Mountain Secret, Lawsuit Claims,” Wired, December 17, 2009 (http://www.wired.com/threatlevel/2009/12/netflix-privacy-lawsuit/). On the Netflix data release—Arvind Narayanan and Vitaly Shmatikov, “Robust De-Anonymization of Large Sparse Datasets,” Proceedings of the 2008 IEEE Symposium on Security and Privacy, p. 111 et seq. (http://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf); Arvind Narayanan and Vitaly Shmatikov, “How to Break the Anonymity of the Netflix Prize Dataset,” October 18, 2006, arXiv:cs/0610105 [cs.CR] (http://arxiv.org/abs/cs/0610105). On identifying people from three characteristics—Philippe Golle, “Revisiting the Uniqueness of Simple Demographics in the US Population,” Association for Computing Machinery Workshop on Privacy in Electronic Society 5 (2006), p. 77. On the structural weakness of anonymization—Paul Ohm, “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization,” 57 UCLA Law Review 1701 (2010).
Scientific American, March 30, 2007 (http://www.scientificamerican.com/article.cfm?id=confirmed-the-us-census-b). Murray, Alexander. Reason and Society in the Middle Ages. Oxford University Press, 1978. Nalimov, E. V., G. McC. Haworth, and E. A. Heinz. “Space-Efficient Indexing of Chess Endgame Tables.” ICGA Journal 23, no. 3 (2000), pp. 148–162. Narayanan, Arvind, and Vitaly Shmatikov. “How to Break the Anonymity of the Netflix Prize Dataset.” October 18, 2006, arXiv:cs/0610105 (http://arxiv.org/abs/cs/0610105). ———. “Robust De-Anonymization of Large Sparse Datasets.” Proceedings of the 2008 IEEE Symposium on Security and Privacy, p. 111 (http://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf). Nazareth, Rita, and Julia Leite. “Stock Trading in U.S. Falls to Lowest Level Since 2008.” Bloomberg, August 13, 2012 (http://www.bloomberg.com/news/2012-08-13/stock-trading-in-u-s-hits-lowest-level-since-2008-as-vix-falls.html).
The Creativity Code: How AI Is Learning to Write, Paint and Think by Marcus Du Sautoy
3D printing, Ada Lovelace, Albert Einstein, Alvin Roth, Andrew Wiles, Automated Insights, Benoit Mandelbrot, Claude Shannon: information theory, computer vision, correlation does not imply causation, crowdsourcing, data is the new oil, Donald Trump, double helix, Douglas Hofstadter, Elon Musk, Erik Brynjolfsson, Fellow of the Royal Society, Flash crash, Gödel, Escher, Bach, Henri Poincaré, Jacquard loom, John Conway, Kickstarter, Loebner Prize, mandelbrot fractal, Minecraft, music of the spheres, Narrative Science, natural language processing, Netflix Prize, PageRank, pattern recognition, Paul Erdős, Peter Thiel, random walk, Ray Kurzweil, recommendation engine, Rubik’s Cube, Second Machine Age, Silicon Valley, speech recognition, Turing test, Watson beat the top human players on Jeopardy!, wikimedia commons
The challenge was to produce an algorithm that was 10 per cent better than Netflix’s own algorithm in predicting what these 2,817,131 recommendations were. Given the data you had seen, your algorithm needed to predict how user 234,654 rated film 2666. To add some spice to the challenge, the first team that beat the Netflix algorithm by 10 per cent would receive a $1,000,000 prize. The catch was that if you won, you had to disclose your algorithm and give Netflix a non-exclusive licence to use the algorithm to recommend films to its users. Several progress prizes were offered on the way to the one-million-dollar prize. Each year a prize of $50,000 would be awarded to the team that had produced the best results so far, provided they at least improved on the previous year’s progress winner by 1 per cent. Again, to claim the prize you had to disclose the code you were using to drive your algorithm.
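The excerpt does not say how "10 per cent better" was measured. The contest scored entries by root-mean-square error (RMSE) between predicted and actual ratings, so the target amounts to a 10 per cent reduction in RMSE relative to Netflix's own algorithm. A minimal sketch of that arithmetic follows; the function names and the toy numbers in the usage note are illustrative assumptions, not contest data.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def improvement(baseline_rmse, candidate_rmse):
    """Fractional reduction in RMSE relative to the baseline algorithm."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


def qualifies(baseline_rmse, candidate_rmse, threshold=0.10):
    """True if the candidate beats the baseline by at least `threshold`."""
    return improvement(baseline_rmse, candidate_rmse) >= threshold
```

With a hypothetical baseline RMSE of 1.0, a candidate at 0.95 is a 5 per cent improvement and falls short, while a candidate at 0.89 clears the bar.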
It is open to biases coming from those classifying and will end up teaching the computer what we know already rather than finding new underlying trends. It could cause the algorithm to get stuck in a particularly human way of looking at the data. The best scenario is to train the algorithm to learn and spot patterns from pure raw data. This is what Netflix was hoping to do when it issued its Netflix prize challenge in 2006. It had developed its own algorithm to push users towards films they would like but thought a competition might stimulate the discovery of better algorithms. By that point Netflix had a huge amount of data from users who had watched films and rated them on a scale of 1–5. So it decided to publish 100,480,507 ratings spanning 480,189 anonymous customers evaluating 17,770 movies.
This is borne out by the dots not being all over the place. The shadow has picked up a pattern in the data. If you look at the actual films that were plotted, then indeed you will see that this shadow has picked out traits that we would recognise as distinct in films. Drama films are appearing in the top-right quadrant, action movies in the bottom left. This is the approach the team that eventually won the Netflix prize in 2009 successfully implemented. They essentially sought to identify a shadow in twenty dimensions that corresponds to twenty independent traits of films that would help predict what films users would like. The power of a computer is that it can run over a whole range of different shadows and pick out the best one to reveal structure, something that our brains and eyes cannot hope to do. Interestingly, some of the traits that the model picked out could be clearly identified: for example, action films or drama films.
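The "shadow" idea is usually implemented as a latent-factor model: each user and each film gets a short vector of trait scores, and the predicted taste for a film is the dot product of the two. The sketch below is a generic illustration of that technique, not the winning team's code; `sgd_step`, the learning rate, and the two-trait example are assumptions made for the demonstration (the excerpt mentions twenty traits).

```python
def dot(a, b):
    """Dot product of two equal-length trait vectors."""
    return sum(x * y for x, y in zip(a, b))


def sgd_step(user_vec, film_vec, target, lr=0.1, reg=0.0):
    """One gradient step on squared error for a single observed rating.

    `target` is the part of the rating the traits should explain, for
    example the rating minus an average. Returns updated copies of both
    vectors and leaves the inputs untouched.
    """
    err = target - dot(user_vec, film_vec)
    new_user = [u + lr * (err * f - reg * u) for u, f in zip(user_vec, film_vec)]
    new_film = [f + lr * (err * u - reg * f) for u, f in zip(user_vec, film_vec)]
    return new_user, new_film


def fit_single_rating(user_vec, film_vec, target, steps=200):
    """Repeat sgd_step on one rating; a real model loops over all ratings."""
    for _ in range(steps):
        user_vec, film_vec = sgd_step(user_vec, film_vec, target)
    return user_vec, film_vec
```

Starting from small trait values such as `[0.1, 0.1]` for both vectors and a target of 1.0, each step nudges the dot product toward the target. Nothing tells the model what a trait means; labels such as "action" or "drama" are read off afterwards by looking at which films score high on each coordinate.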
What's Yours Is Mine: Against the Sharing Economy by Tom Slee
4chan, Airbnb, Amazon Mechanical Turk, asset-backed security, barriers to entry, Berlin Wall, big-box store, bitcoin, blockchain, citizen journalism, collaborative consumption, congestion charging, Credit Default Swap, crowdsourcing, data acquisition, David Brooks, don't be evil, gig economy, Hacker Ethic, income inequality, informal economy, invisible hand, Jacob Appelbaum, Jane Jacobs, Jeff Bezos, Khan Academy, Kibera, Kickstarter, license plate recognition, Lyft, Marc Andreessen, Mark Zuckerberg, move fast and break things, natural language processing, Netflix Prize, Network effects, new economy, Occupy movement, openstreetmap, Paul Graham, peer-to-peer, peer-to-peer lending, Peter Thiel, pre–internet, principal–agent problem, profit motive, race to the bottom, Ray Kurzweil, recommendation engine, rent control, ride hailing / ride sharing, sharing economy, Silicon Valley, Snapchat, software is eating the world, South of Market, San Francisco, TaskRabbit, The Nature of the Firm, Thomas L Friedman, transportation-network company, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, ultimatum game, urban planning, WikiLeaks, winner-take-all economy, Y Combinator, Zipcar
There is every reason to believe that most Netflix ratings are independent and honest. When you rate a movie you can offer your opinion freely, having no reason to expect reward or punishment for any particular rating. You also have an incentive to give a rating that matches your actual opinion, as it enables Netflix to recommend movies that better match your tastes. Figure 3 shows the distribution of ratings for a set of 100 million ratings that Netflix released for its Netflix Prize competition. Figure 3. Ratings in the Netflix contest data set. The ratings are distributed among the available scores with a peak at about 3.5, so a rating of 4 or 5 is a pretty good rating and Netflix ratings help us to discriminate between one-star stinkers and five-star favorites. Yelp is a rating site for restaurants and other small businesses. Each rating is made by an individual customer (who may remain anonymous).
BlaBlaCar ratings. Even though Sharing Economy ratings are typically crammed into a small range, could a 4.9 rating still indicate a better experience than a 4.7? All the evidence to date says no, it cannot. Even in rating systems with widely-distributed ratings like Netflix, the relationship between an individual rating and the quality of user experience is murky. One of the results of the Netflix Prize competition was that individual ratings turn out to depend on factors that have nothing to do with the movie itself: people tend to grade relative to the existing rating, so highly rated films tend to stay highly rated. The best competitors managed to compensate for these effects, but only in an environment where individual movies were getting millions of ratings, quite different from the Sharing Economy case.
The Filter Bubble: What the Internet Is Hiding From You by Eli Pariser
A Declaration of the Independence of Cyberspace, A Pattern Language, Amazon Web Services, augmented reality, back-to-the-land, Black Swan, borderless world, Build a better mousetrap, Cass Sunstein, citizen journalism, cloud computing, cognitive dissonance, crowdsourcing, Danny Hillis, data acquisition, disintermediation, don't be evil, Filter Bubble, Flash crash, fundamental attribution error, global village, Haight Ashbury, Internet of things, Isaac Newton, Jaron Lanier, Jeff Bezos, jimmy wales, Kevin Kelly, knowledge worker, Mark Zuckerberg, Marshall McLuhan, megacity, Metcalfe’s law, Netflix Prize, new economy, PageRank, paypal mafia, Peter Thiel, recommendation engine, RFID, Robert Metcalfe, sentiment analysis, shareholder value, Silicon Valley, Silicon Valley startup, social graph, social software, social web, speech recognition, Startup school, statistical model, stem cell, Steve Jobs, Steven Levy, Stewart Brand, technoutopianism, the scientific method, urban planning, Whole Earth Catalog, WikiLeaks, Y Combinator
They can solve for serendipity, by designing filtering systems to expose people to topics outside their normal experience. This will often be in tension with pure optimization in the short term, because a personalization system with an element of randomness will (by definition) get fewer clicks. But as the problems of personalization become better known, it may be a good move in the long run—consumers may choose systems that are good at introducing them to new topics. Perhaps what we need is a kind of anti-Netflix Prize—a Serendipity Prize for systems that are the best at holding readers’ attention while introducing them to new topics and ideas. If this shift toward corporate responsibility seems improbable, it’s not without precedent. In the mid-1800s, printing a newspaper was hardly a reputable business. Papers were fiercely partisan and recklessly ideological. They routinely altered facts to suit their owners’ vendettas of the day, or just to add color.
When you use AddThis to share a piece of content on ABC News’s site (or anyone else’s), AddThis places a tracking cookie on your computer that can be used to target advertising to people who share items from particular sites. 6 “the cost is information about you”: Chris Palmer, phone interview with author, Dec 10, 2010. 7 accumulated an average of 1,500 pieces of data: Stephanie Clifford, “Ads Follow Web Users, and Get More Personal,” New York Times, July 30, 2009, accessed Dec. 19, 2010, www.nytimes.com/2009/07/31/business/media/31privacy.html. 7 96 percent of Americans: Richard Behar, “Never Heard of Acxiom? Chances Are It’s Heard of You.” Fortune, Feb. 23, 2004, accessed Dec. 19, 2010, http://money.cnn.com/magazines/fortune/fortune_archive/2004/02/23/362182/index.htm. 8 Netflix can predict: Marshall Kirkpatrick, “They Did It! One Team Reports Success in the $1m Netflix Prize,” ReadWriteWeb, June 26, 2009, accessed Dec. 19, 2010, www.readwriteweb.com/archives/they_did_it_one_team_reports_success_in_the_1m_net.php. 8 Web site that isn’t customized . . . will seem quaint: Marshall Kirkpatrick, “Facebook Exec: All Media Will Be Personalized in 3 to 5 Years,” ReadWriteWeb, Sept. 29, 2010, accessed Jan. 30, 2011, www.readwriteweb.com/archives/facebook_exec_all_media_will_be_personalized_in_3.php. 8 “now the web is about ‘me’ ”: Josh Catone, “Yahoo: The Web’s Future Is Not in Search,” ReadWriteWeb, June 4, 2007, accessed Dec. 19, 2010, www.readwriteweb.com/archives/yahoo_personalization.php. 8 “tell them what they should be doing”: James Farrar, “Google to End Serendipity (by Creating It),” ZDNet, Aug. 17, 2010, accessed Dec. 19, 2010, www.zdnet.com/blog/sustainability/google-to-end-serendipity-by-creating-it/1304. 8 are becoming a primary news source: Pew Research Center, “Americans Spend More Time Following the News,” Sept. 12, 2010, accessed Feb 7, 2011, http://people-press.org/report/?
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos
Albert Einstein, Amazon Mechanical Turk, Arthur Eddington, basic income, Bayesian statistics, Benoit Mandelbrot, bioinformatics, Black Swan, Brownian motion, cellular automata, Claude Shannon: information theory, combinatorial explosion, computer vision, constrained optimization, correlation does not imply causation, creative destruction, crowdsourcing, Danny Hillis, data is the new oil, double helix, Douglas Hofstadter, Erik Brynjolfsson, experimental subject, Filter Bubble, future of work, global village, Google Glasses, Gödel, Escher, Bach, information retrieval, job automation, John Markoff, John Snow's cholera map, John von Neumann, Joseph Schumpeter, Kevin Kelly, lone genius, mandelbrot fractal, Mark Zuckerberg, Moneyball by Michael Lewis explains big data, Narrative Science, Nate Silver, natural language processing, Netflix Prize, Network effects, NP-complete, off grid, P = NP, PageRank, pattern recognition, phenotype, planetary scale, pre–internet, random walk, Ray Kurzweil, recommendation engine, Richard Feynman, scientific worldview, Second Machine Age, self-driving car, Silicon Valley, social intelligence, speech recognition, Stanford marshmallow experiment, statistical model, Stephen Hawking, Steven Levy, Steven Pinker, superintelligent machines, the scientific method, The Signal and the Noise by Nate Silver, theory of mind, Thomas Bayes, transaction costs, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, white flight, zero-sum game
With a decision tree, the choice of whether to use a learner can be contingent on other learners’ predictions. Either way, to obtain a learner’s prediction for a given training example, we must first apply it to the original training set excluding that example and use the resulting classifier—otherwise the committee risks being dominated by learners that overfit, since they can predict the correct class just by remembering it. The Netflix Prize winner used metalearning to combine hundreds of different learners. Watson uses it to choose its final answer from the available candidates. Nate Silver combines polls in a similar way to predict election results. This type of metalearning is called stacking and is the brainchild of David Wolpert, whom we met in Chapter 3 as the author of the “no free lunch” theorem. An even simpler metalearner is bagging, invented by the statistician Leo Breiman.
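The held-out step described above is easy to get wrong, so here is a small sketch of why it matters. It is a toy under stated assumptions: it uses numeric targets rather than classes, two hypothetical base learners (`fit_mean` and an overfitting `fit_memorizer`), and a metalearner that merely grid-searches one blend weight. It is not how the Netflix Prize winner, Watson, or Nate Silver combine models.

```python
def fit_mean(train):
    """Base learner 1: always predicts the average training target."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg


def fit_memorizer(train):
    """Base learner 2: returns the target of the nearest training point.

    On its own training set it is always exactly right, so it overfits.
    """
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def in_sample(fit, data):
    """Naive approach: predict each example with a model that has seen it."""
    model = fit(data)
    return [model(x) for x, _ in data]


def held_out(fit, data):
    """Stacking approach: predict each example with a model trained without it."""
    preds = []
    for i, (x, _) in enumerate(data):
        model = fit(data[:i] + data[i + 1:])
        preds.append(model(x))
    return preds


def best_blend_weight(preds_a, preds_b, targets):
    """Metalearner: weight on learner A (0.0 to 1.0) with the lowest squared error."""
    def loss(w):
        return sum((t - (w * a + (1 - w) * b)) ** 2
                   for a, b, t in zip(preds_a, preds_b, targets))
    return min((i / 10 for i in range(11)), key=loss)
```

On noisy data such as `[(0, 3.0), (1, 5.0), (2, 3.0), (3, 5.0)]`, blending on in-sample predictions hands the memorizer all of the weight because it has simply remembered every answer. Blending on held-out predictions exposes that its guesses for unseen points are worse than the plain average, which is the failure mode the excerpt warns about.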
We don’t have the Master Algorithm yet, just a glimpse of what it might look like. What if something fundamental is still missing, something all of us in the field, steeped in its history, can’t see? We need new ideas, and ideas that are not just variations on the ones we already have. That’s why I wrote this book: to start you thinking. I teach an evening class on machine learning at the University of Washington. In 2007, soon after the Netflix Prize was announced, I proposed it as one of the class projects. Jeff Howbert, a student in the class, got hooked and continued to work on it after the class was over. He wound up being a member of one of the two winning teams, two years after learning about machine learning for the first time. Now it’s your turn. To learn more about machine learning, check out the section on further readings at the end of the book.
See Markov logic networks (MLNs) Moby Dick (Melville), 72 Molecular biology, data and, 14 Moneyball (Lewis), 39 Mooney, Ray, 76 Moore’s law, 287 Moravec, Hans, 288 Muggleton, Steve, 80 Multilayer perceptron, 108–111 autoencoder, 116–118 Bayesian, 170 driving a car and, 113 Master Algorithm and, 244 NETtalk system, 112 reinforcement learning and, 222 support vector machines and, 195 Music composition, case-based reasoning and, 199 Music Genome Project, 171 Mutation, 124, 134–135, 241, 252 Naïve Bayes classifier, 151–153, 171, 304 Bayesian networks and, 158–159 clustering and, 209 Master Algorithm and, 245 medical diagnosis and, 23 relational learning and, 228–229 spam filters and, 23–24 text classification and, 195–196 Narrative Science, 276 National Security Agency (NSA), 19–20, 232 Natural selection, 28–29, 30, 52 as algorithm, 123–128 Nature Bayesians and, 141 evolutionaries and, 137–142 symbolists and, 141 Nature (journal), 26 Nature vs. nurture debate, machine learning and, 29, 137–139 Neal, Radford, 170 Nearest-neighbor algorithms, 24, 178–186, 202, 306–307 dimensionality and, 186–190 Negative examples, 67 Netflix, 12–13, 183–184, 237, 266 Netflix Prize, 238, 292 Netscape, 9 NETtalk system, 112 Network effect, 12, 299 Neumann, John von, 72, 123 Neural learning, fitness and, 138–139 Neural networks, 99, 100, 112–114, 122, 204 convolutional, 117–118, 302–303 Master Algorithm and, 240, 244, 245 reinforcement learning and, 222 spin glasses and, 102–103 Neural network structure, Baldwin effect and, 139 Neurons action potentials and, 95–96, 104–105 Hebb’s rule and, 93–94 McCulloch-Pitts model of, 96–97 processing in brain and, 94–95 See also Perceptron Neuroscience, Master Algorithm and, 26–28 Newell, Allen, 224–226, 302 Newhouse, Neil, 17 Newman, Mark, 160 Newton, Isaac, 293 attribute selection, 189 laws of, 4, 14, 15, 46, 235 rules of induction, 65–66, 81, 82 Newtonian determinism, Laplace and, 145 Newton phase of science, 39–400 New York Times (newspaper), 115, 
117 Ng, Andrew, 117, 297 Nietzche, Friedrich, 178 NIPS.
Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt
Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bioinformatics, computer vision, correlation does not imply causation, crowdsourcing, distributed generation, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize
What it predicts depends on the particular dataset, but some examples include whether or not a given person will get in a car crash, or like a particular film. A training set is provided, an evaluation metric determined up front, and some set of rules is provided about, for example, how often competitors can submit their predictions, whether or not teams can merge into larger teams, and so on. Examples of machine learning competitions include the annual Knowledge Discovery and Data Mining (KDD) competition, the one-time million-dollar Netflix prize (a competition that lasted two years), and, as we’ll learn a little later, Kaggle itself. Some remarks about data science competitions are warranted. First, data science competitions are part of the data science ecosystem—one of the cultural forces at play in the current data science landscape, and so aspiring data scientists ought to be aware of them. Second, creating these competitions puts one in a position to codify data science, or define its scope.
Chris Mulligan, a student in Rachel’s class, created this leapfrogging visualization to capture the competition in real time as it progressed throughout the semester.
This leapfrogging effect is good and bad. It encourages people to squeeze out better performing models, possibly at the risk of overfitting, but it also tends to make models much more complicated as they get better. One reason you don’t want competitions lasting too long is that, after a while, the only way to inch up performance is to make things ridiculously complicated. For example, the original Netflix Prize lasted two years and the final winning model was too complicated for them to actually put into production.
Their Customers
So why would companies pay to work with Kaggle? The hole that Kaggle is filling is the following: there’s a mismatch between those who need analysis and those with skills. Even though companies desperately need analysis, they tend to hoard data; this is the biggest obstacle for success for those companies that even host a Kaggle competition.
Claudia admits to being slightly biased herself toward institutions—in her experience, certain institutions prepare better work.
Data Mining Competitions
Claudia drew a distinction between different types of data mining competitions. The first is the “sterile” kind, where you’re given a clean, prepared data matrix, a standard error measure, and features that are often anonymized. This is a pure machine learning problem. Examples of this first kind are KDD Cup 2009 and the Netflix Prize, and many of the Kaggle competitions. In such competitions, your approach would emphasize algorithms and computation. The winner would probably have heavy machines and huge modeling ensembles.
KDD Cups
All the KDD Cups, with their tasks and corresponding datasets, can be found at http://www.kdd.org/kddcup/index.php. Here’s a list:
KDD Cup 2010: Student performance evaluation
KDD Cup 2009: Customer relationship prediction
KDD Cup 2008: Breast cancer
KDD Cup 2007: Consumer recommendations
KDD Cup 2006: Pulmonary embolisms detection from image data
KDD Cup 2005: Internet user search query categorization
KDD Cup 2004: Particle physics; plus protein homology prediction
KDD Cup 2003: Network mining and usage log analysis
KDD Cup 2002: BioMed document; plus gene role classification
KDD Cup 2001: Molecular bioactivity; plus protein locale prediction
KDD Cup 2000: Online retailer website clickstream analysis
KDD Cup 1999: Computer network intrusion detection
KDD Cup 1998: Direct marketing for profit optimization
KDD Cup 1997: Direct marketing for lift curve optimization
On the other hand, you have the “real world” kind of data mining competition, where you’re handed raw data (which is often in lots of different tables and not easily joined), you set up the model yourself, and come up with task-specific evaluations.
The Elements of Data Analytic Style by Jeff Leek
One of the earliest descriptions of this idea was of a much simplified version based on bootstrapping samples and building multiple prediction functions - a process called bagging (short for bootstrap aggregating). Random forests, another successful prediction algorithm, is based on a similar idea with classification trees. 7.8 Prediction is about tradeoffs Interpretability versus accuracy Speed versus accuracy Simplicity versus accuracy Scalability versus accuracy In some areas one of these components may be more important than others. The Netflix prize awarded $1,000,000 to the best solution to predicting what movies people would like. But the winning solution was so complicated it was never implemented by Netflix because it was impractical. 8. Causality The gold standard for causal data analysis is to combine specific experimental designs such as randomized studies with standard statistical analysis techniques. If correctly performed, the experimental design makes it possible to identify how variables affect each other on average. 8.1 Causal data analysis of non-randomized experiments is often difficult to justify.
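The bagging idea described at the start of this excerpt (bootstrap samples, one prediction function per sample, then aggregation) can be sketched as follows. This is a minimal illustration, not the book's code: the base learner (a least-squares line), the function names, and the default of 25 models are all assumptions chosen to keep the example short.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (assumed base learner)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate bootstrap sample (all x identical): fall back to the mean.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bagged_predict(xs, ys, x_new, n_models=25, seed=0):
    """Average the predictions of models fit on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        intercept, slope = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(intercept + slope * x_new)
    # Aggregate: for regression, a plain average of the individual predictions.
    return sum(predictions) / len(predictions)
```

Calling `bagged_predict(xs, ys, 5.0)` returns the averaged prediction at `x = 5`. Random forests follow the same resample-and-aggregate pattern with trees as the base learner.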
The Deep Learning Revolution (The MIT Press) by Terrence J. Sejnowski
AI winter, Albert Einstein, algorithmic trading, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, autonomous vehicles, Baxter: Rethink Robotics, bioinformatics, cellular automata, Claude Shannon: information theory, cloud computing, complexity theory, computer vision, conceptual framework, constrained optimization, Conway's Game of Life, correlation does not imply causation, crowdsourcing, Danny Hillis, delayed gratification, discovery of DNA, Donald Trump, Douglas Engelbart, Drosophila, Elon Musk, en.wikipedia.org, epigenetics, Flynn Effect, Frank Gehry, future of work, Google Glasses, Google X / Alphabet X, Guggenheim Bilbao, Gödel, Escher, Bach, haute couture, Henri Poincaré, I think there is a world market for maybe five computers, industrial robot, informal economy, Internet of things, Isaac Newton, John Conway, John Markoff, John von Neumann, Mark Zuckerberg, Minecraft, natural language processing, Netflix Prize, Norbert Wiener, orbital mechanics / astrodynamics, PageRank, pattern recognition, prediction markets, randomized controlled trial, recommendation engine, Renaissance Technologies, Rodney Brooks, self-driving car, Silicon Valley, Silicon Valley startup, Socratic dialogue, speech recognition, statistical model, Stephen Hawking, theory of mind, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, traveling salesman, Turing machine, Von Neumann architecture, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra
As a consequence, there are fewer parameters to train on each epoch, and the resulting network has fewer dependencies between units than would be the case if the same large network were trained on every epoch. Dropout decreases the error rate in deep learning networks by 10 percent, which is a large improvement. In 2009, Netflix conducted an open competition, offering a prize of $1 million to the first person who could reduce the error of their recommender system by 10 percent.16 Almost every graduate student in machine learning entered the competition. Netflix probably inspired $10 million of research for the cost of the prize. And deep networks are now a core technology for online streaming.17 Intriguingly, cortical synapses drop out at a high rate. On every spike along an input, the typical excitatory synapse in the cortex has a 90 percent failure rate.18 This is like a baseball team where almost all the players are batting .100.
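The dropout mechanism mentioned at the start of this excerpt can be illustrated with a small sketch. It assumes the common "inverted dropout" variant, which rescales the surviving activations during training; that variant, the function name, and the parameters are assumptions made for illustration and are not taken from the book.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Randomly zero units during training; pass values through otherwise.

    Assumes inverted dropout: kept units are divided by the keep probability
    so the expected activation is unchanged.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)
hidden = [0.2, 1.5, 0.7, 0.9]
print(dropout(hidden, 0.5, rng))                   # some units zeroed, others doubled
print(dropout(hidden, 0.5, rng, training=False))   # unchanged at prediction time
```

Because a different random subset of units is silenced on each pass, each training step effectively updates a smaller network, which is the effect the excerpt describes.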
“Adults with brain damage make some bizarre errors when reading words. If a network of simulated neurons is trained to read and then is damaged, it produces strikingly similar behavior” (76). 15. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” Journal of Machine Learning Research 15 (2014):1929–1958. 16. “Netflix Prize,” Wikipedia, last modified, August 23, 2017, https://en.wikipedia.org/wiki/Netflix_Prize. 17. Carlos A. Gomez-Uribe, Neil Hunt, “The Netflix Recommender System: Algorithms,” ACM Transactions on Management Information Systems 6, no. 4 (2016), article no. 13. 18. T. M. Bartol Jr., C. Bromer, J. Kinney, M. A. Chirillo, J. N. Bourne, K. M. Harris, and T. J. Sejnowski, “Nanoconnectomic Upper Bound on the Variability of Synaptic Plasticity,” eLife, 4:e10778, 2015, doi:10.7554/eLife.10778. 19.
Ethics of Big Data: Balancing Risk and Innovation by Kord Davis, Doug Patterson
4chan, business process, corporate social responsibility, crowdsourcing, en.wikipedia.org, longitudinal study, Mahatma Gandhi, Mark Zuckerberg, Netflix Prize, Occupy movement, performance metric, Robert Bork, side project, smart grid, urban planning
For example, what Google considers Personally Identifiable Information (PII) may be substantially different from Microsoft’s definition. How are we to protect PII if we can’t agree on what we’re protecting? The increasing availability of open data (and increasing number of data breaches) that make cross-correlation and de-anonymization an increasingly trivial task. Let’s not forget the example of the Netflix prize. Finally, there is no mention anywhere, in any policy statement reviewed, no matter what it was called, that addressed the topic of reputation. Reputation might be considered an “aggregate” value comprised of personal information that is judged in one fashion or another. Again, however, this raises the question of what values an organization is motivated by when developing the constituent policies.
Remix: Making Art and Commerce Thrive in the Hybrid Economy by Lawrence Lessig
Amazon Web Services, Andrew Keen, Benjamin Mako Hill, Berlin Wall, Bernie Sanders, Brewster Kahle, Cass Sunstein, collaborative editing, commoditize, disintermediation, don't be evil, Erik Brynjolfsson, Internet Archive, invisible hand, Jeff Bezos, jimmy wales, Joi Ito, Kevin Kelly, Larry Wall, late fees, Mark Shuttleworth, Netflix Prize, Network effects, new economy, optical character recognition, PageRank, peer-to-peer, recommendation engine, revision control, Richard Stallman, Ronald Coase, Saturday Night Live, SETI@home, sharing economy, Silicon Valley, Skype, slashdot, Steve Jobs, The Nature of the Firm, thinkpad, transaction costs, VA Linux, yellow journalism
Functionality gets LEGO-ized: it gets turned into a block that others can add to their own Web site or their own business. Netflix does this the least among the three, but it does it nonetheless. (The company was scolded by one of the Net’s leading bloggers in 2004 for failing to offer APIs.26 It is slowly responding.) Its purpose is to “improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences.”27 To achieve this end, Netflix runs a “Netflix Prize”—offering a grand prize of $1 million to anyone who improves Netflix’s own system by more than 10 percent. To enable this competition to happen, Netflix shared “a lot of anonymous rating data.” The company also increasingly offers through RSS feeds access to ranking information about its users’ choices. Amazon does this through its Amazon Web Services.
See “Google Defies US Over Search Data,” BBC News, January 20, 2006, available at link #64; Maryclaire Dale, “Judge Throws Out Internet Blocking Law: Ruling States Parents Must Protect Children Through Less Restrictive Means,” MSNBC, March 22, 2007, available at link #65. Google prevailed in its effort to restrict the government’s search. See Gonzales v. Google, 234 F.R.D. 674 (N.D. Cal. 2006). 26. Phillip Torrone, “Netflix, Open Up or Die . . . ,” available at link #66. 27. Netflix, Netflix Prize, available at link #67 (last visited July 2, 2007). 28. Tapscott and Williams, Wikinomics, 183. 29. See Tim O’Reilly, “What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software,” O’Reilly, September 30, 2005, available at link #68. As Mary Madden summarizes the idea, it is “utilizing collective intelligence, providing network-enabled interactive services, giving users control over their own data.”
Machine Learning for Hackers by Drew Conway, John Myles White
call centre, centre right, correlation does not imply causation, Debian, Erdős number, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, Paul Erdős, recommendation engine, social graph, SpamAssassin, statistical model, text mining, the scientific method, traveling salesman
As long as we assume we’re working with rectangular arrays, we can use lots of powerful mathematical techniques without having to think very carefully about the actual mathematical operations being performed. For example, we briefly describe matrix multiplication in Chapter 9, even though almost every technique we’re going to exploit can be described in terms of matrix multiplications, whether it’s the standard linear regression model or the modern matrix factorization techniques that have become so popular lately thanks to the Netflix Prize. Because we’ll treat data rectangles, tables, and matrices interchangeably, we ask for your patience when we switch back and forth between those terms throughout this book. Whatever term we use, you should just remember that we’re thinking of something like Table 2-1 when we talk about data.
Table 2-1. Your authors
Name              Age
Drew Conway       28
John Myles White  29
Because data consists of rectangles, we can actually draw pictures of the sorts of operations we’ll perform pretty easily.
    x <- 1:10
    y <- x ^ 2
    fitted.regression <- lm(y ~ x)
    errors <- residuals(fitted.regression)
    squared.errors <- errors ^ 2
    mse <- mean(squared.errors)
    mse
    # 52.8

The solution to this problem of scale is trivial: we take the square root of the mean squared error to get the root mean squared error, which is the RMSE measurement we also tried out earlier in this chapter. RMSE is a very popular measure of performance for assessing machine learning algorithms, including algorithms that are far more sophisticated than linear regression. For just one example, the Netflix Prize was scored using RMSE as the definitive metric of how well the contestants’ algorithms performed.

    x <- 1:10
    y <- x ^ 2
    fitted.regression <- lm(y ~ x)
    errors <- residuals(fitted.regression)
    squared.errors <- errors ^ 2
    mse <- mean(squared.errors)
    rmse <- sqrt(mse)
    rmse
    # 7.266361

One complaint that people have with RMSE is that it’s not immediately clear what mediocre performance is. Perfect performance clearly gives you an RMSE of 0, but the pursuit of perfection is not a realistic goal in these tasks.
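For readers who do not use R, the same calculation can be sketched in plain Python. This is an illustrative translation under the assumption that `lm(y ~ x)` is an ordinary least-squares line fit; the helper names `fit_line` and `rmse` are made up for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


xs = list(range(1, 11))          # same data as the R snippet: x = 1..10
ys = [x ** 2 for x in xs]        # y = x^2
intercept, slope = fit_line(xs, ys)
predictions = [intercept + slope * x for x in xs]

mse = sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)
print(round(mse, 6))                     # 52.8
print(round(rmse(ys, predictions), 6))   # 7.266361
```

The fitted line is y = 11x - 22, whose residuals have a squared sum of 528 over ten points, which is where the MSE of 52.8 comes from.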
Machine Learning for Email by Drew Conway, John Myles White
As long as we assume we’re working with rectangular arrays, we’ll be able to use lots of powerful mathematical techniques without having to think very carefully about the actual mathematical operations being performed. For example, we won’t explicitly use matrix multiplication anywhere in this book, even though almost every technique we’re going to exploit can be described in terms of matrix multiplications, whether it’s the standard linear regression model or the modern matrix factorization techniques that have become so popular lately thanks to the Netflix prize. Because we’ll treat data rectangles, tables, and matrices interchangeably, we ask for your patience when we switch back and forth between those terms throughout this book. Whatever term we use, you should just remember that we’re thinking of something like Table 2-1 when we talk about data.
Table 2-1. Your authors
Name              Age
Drew Conway       28
John Myles White  29
Since data consists of rectangles, we can actually draw pictures of the sorts of operations we’ll perform pretty easily.
Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman
cloud computing, crowdsourcing, en.wikipedia.org, first-price auction, G4S, information retrieval, John Snow's cholera map, Netflix Prize, NP-complete, PageRank, pattern recognition, random walk, recommendation engine, second-price auction, sentiment analysis, social graph, statistical model, web application
C. Anderson, The Long Tail: Why the Future of Business is Selling Less of More, Hyperion Books, New York, 2006. Y. Koren, “The BellKor solution to the Netflix grand prize,” www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf 2009. G. Linden, B. Smith, and J. York, “Amazon.com recommendations: item-to-item collaborative filtering,” Internet Computing 7:1, pp. 76–80, 2003. M. Piotte and M. Chabbert, ”The Pragmatic Theory solution to the Netflix grand prize,” www.netflixprize.com/assets/GrandPrize2009_BPC_PragmaticTheory.pdf 2009. A. Toscher, M. Jahrer, and R. Bell, “The BigChaos solution to the Netflix grand prize,” www.netflixprize.com/assets/GrandPrize2009_BPC_BigChaos.pdf 2009. L. von Ahn, “Games with a purpose,” IEEE Computer Magazine, pp. 96–98, June 2006. 1 To be exact, the algorithm had to have a root-mean-square error (RMSE) that was 10% less than the RMSE of the Netflix algorithm on a test set taken from actual ratings of Netflix users.
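The footnote's criterion (an RMSE at least 10% below the baseline algorithm's RMSE on the same test set) amounts to a one-line comparison. The sketch below uses made-up RMSE values purely for illustration; the function names are assumptions and the numbers are not the actual contest figures.

```python
def relative_improvement(baseline_rmse, candidate_rmse):
    """Fractional reduction in RMSE relative to the baseline."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


def meets_target(baseline_rmse, candidate_rmse, target=0.10):
    """True when the candidate's RMSE is at least `target` below the baseline."""
    return candidate_rmse <= (1.0 - target) * baseline_rmse


# Hypothetical values, chosen only to show both outcomes.
print(meets_target(1.00, 0.89))   # True: an 11% reduction
print(meets_target(1.00, 0.92))   # False: only an 8% reduction
```

Because RMSE is an error measure, "10% better" here means a lower number, not a higher one.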
The process converges to a local optimum, although to have a good chance of obtaining a global optimum we must either repeat the process from many starting matrices, or search from the starting point in many different ways.
✦ The NetFlix Challenge: An important driver of research into recommendation systems was the NetFlix challenge. A prize of $1,000,000 was offered for a contestant who could produce an algorithm that was 10% better than NetFlix’s own algorithm at predicting movie ratings by users. The prize was awarded in Sept., 2009.
9.7 References for Chapter 9
 is a survey of recommendation systems as of 2005. The argument regarding the importance of the long tail in on-line systems is from , which was expanded into a book .  discusses the use of computer games to extract tags for items.
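The first sentence of this excerpt recommends repeating the factorization from many starting matrices because each run only reaches a local optimum. A rough sketch of that restart loop follows. It swaps in plain stochastic gradient descent for the update step, which is an assumption made for brevity and not necessarily the procedure the book describes; the toy ratings, function names, and hyperparameters are likewise hypothetical.

```python
import math
import random

# Hypothetical sparse ratings: (row, column) -> observed rating.
toy_ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1,
               (2, 1): 2, (2, 2): 5, (0, 2): 1}


def rmse(ratings, U, V):
    """RMSE of the product U*V over the observed entries only."""
    total = 0.0
    for (i, j), r in ratings.items():
        pred = sum(U[i][k] * V[k][j] for k in range(len(V)))
        total += (r - pred) ** 2
    return math.sqrt(total / len(ratings))


def factorize(ratings, n_rows, n_cols, rank, rng, steps=2000, lr=0.01):
    """One run from a random starting point (SGD is an assumed update rule)."""
    U = [[rng.random() for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.random() for _ in range(n_cols)] for _ in range(rank)]
    entries = list(ratings.items())
    for _ in range(steps):
        (i, j), r = rng.choice(entries)
        err = r - sum(U[i][k] * V[k][j] for k in range(rank))
        for k in range(rank):
            u, v = U[i][k], V[k][j]
            U[i][k] += lr * err * v
            V[k][j] += lr * err * u
    return U, V


def best_of_restarts(ratings, n_rows, n_cols, rank=2, restarts=5, seed=0):
    """Run several random starts and keep the factorization with lowest RMSE."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        U, V = factorize(ratings, n_rows, n_cols, rank, rng)
        score = rmse(ratings, U, V)
        if best is None or score < best[0]:
            best = (score, U, V)
    return best


score, U, V = best_of_restarts(toy_ratings, 3, 3)
print(score)
```

Keeping the best of several runs can never do worse than the first run alone, which is the whole argument for restarts; how many restarts are worthwhile depends on the data.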
These suggestions are not random, but are based on the purchasing decisions made by similar customers or on other techniques we shall discuss in this chapter. (2)Movie Recommendations: Netflix offers its customers recommendations of movies they might like. These recommendations are based on ratings provided by users, much like the ratings suggested in the example utility matrix of Fig. 9.1. The importance of predicting ratings accurately is so high, that Netflix offered a prize of one million dollars for the first algorithm that could beat its own recommendation system by 10%.1 The prize was finally won in 2009, by a team of researchers called “Bellkor’s Pragmatic Chaos,” after over three years of competition. (3)News Articles: News services have attempted to identify articles of interest to readers, based on the articles that they have read in the past. The similarity might be based on the similarity of important words in the documents, or on the articles that are read by people with similar reading tastes.
Avogadro Corp by William Hertling
Any sufficiently advanced technology is indistinguishable from magic, cloud computing, crowdsourcing, Hacker Ethic, hive mind, invisible hand, natural language processing, Netflix Prize, private military company, Ray Kurzweil, recommendation engine, Richard Stallman, Ruby on Rails, standardized shipping container, technological singularity, Turing test, web application, WikiLeaks
“At the heart of how this works is the field of recommendation algorithms,” David explained. “Sean hired me not because I knew anything about language analysis but because I was a leading competitor in the Netflix competition. Netflix recommends movies that you’d enjoy watching. The better Netflix can do this, the more you as a customer enjoy using Netflix’s movie rental service. Several years ago, Netflix offered a million dollar prize to anyone who could beat their own algorithm by ten percent.” “What’s amazing and even counterintuitive about recommendation algorithms is that they don’t depend on understanding anything about the movie. Netflix does not, for example, have a staff of people watching movies to categorize and rate them, just to find the latest sci-fi space action thriller that I happen to like. Instead, they rely on a technique called collaborative filtering, where they find other customers just like me, and then see how those customers rated a given movie to predict how I’ll rate it.
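The collaborative filtering idea David describes (find customers with similar tastes, then use their ratings to predict mine) can be sketched in a few lines. Everything below is a hypothetical illustration: the users, movies, and ratings are invented, and cosine similarity over co-rated movies is just one assumed way to measure how alike two customers are.

```python
import math

# Invented toy data: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"Alien": 5, "Heat": 3, "Up": 1},
    "bob": {"Alien": 4, "Heat": 3, "Up": 1, "Solaris": 5},
    "carol": {"Alien": 1, "Heat": 2, "Up": 5, "Solaris": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, movie, ratings):
    """Similarity-weighted average of other users' ratings for the movie."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


print(predict_rating("alice", "Solaris", ratings))
```

Nothing in the code knows what any movie is about. The prediction for alice leans toward bob's rating because bob's past ratings look like hers, which is the counterintuitive point the passage makes.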
While the analysis module determined goals and objectives from the email, the optimization module used fragments from thousands of other emails to create a realistic email written in a voice very similar to that of the sender. David relished the success of the team, and wished he could share with them what they had accomplished. Their project was the culmination of nearly three years of dedicated research and development. It had started with David’s work on the Netflix Prize before he was hired at Avogadro, although even that work had been built on the shoulders of geniuses that had come before him. Then there were eight months of him and Mike laboring on their own to prove out the idea enough to justify an entire team. Finally, during the last eighteen months, an entire R&D team worked on the project, building the initial architecture, and then incrementally improving the effectiveness of the system week after week.
Exponential Organizations: Why New Organizations Are Ten Times Better, Faster, and Cheaper Than Yours (And What to Do About It) by Salim Ismail, Yuri van Geest
23andMe, 3D printing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, augmented reality, autonomous vehicles, Baxter: Rethink Robotics, Ben Horowitz, bioinformatics, bitcoin, Black Swan, blockchain, Burning Man, business intelligence, business process, call centre, chief data officer, Chris Wanstrath, Clayton Christensen, clean water, cloud computing, cognitive bias, collaborative consumption, collaborative economy, commoditize, corporate social responsibility, cross-subsidies, crowdsourcing, cryptocurrency, dark matter, Dean Kamen, dematerialisation, discounted cash flows, disruptive innovation, distributed ledger, Edward Snowden, Elon Musk, en.wikipedia.org, Ethereum, ethereum blockchain, game design, Google Glasses, Google Hangouts, Google X / Alphabet X, gravity well, hiring and firing, Hyperloop, industrial robot, Innovator's Dilemma, intangible asset, Internet of things, Iridium satellite, Isaac Newton, Jeff Bezos, Joi Ito, Kevin Kelly, Kickstarter, knowledge worker, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, lifelogging, loose coupling, loss aversion, low earth orbit, Lyft, Marc Andreessen, Mark Zuckerberg, market design, means of production, minimum viable product, natural language processing, Netflix Prize, NetJets, Network effects, new economy, Oculus Rift, offshore financial centre, PageRank, pattern recognition, Paul Graham, paypal mafia, peer-to-peer, peer-to-peer model, Peter H. 
Diamandis: Planetary Resources, Peter Thiel, prediction markets, profit motive, publish or perish, Ray Kurzweil, recommendation engine, RFID, ride hailing / ride sharing, risk tolerance, Ronald Coase, Second Machine Age, self-driving car, sharing economy, Silicon Valley, skunkworks, Skype, smart contracts, Snapchat, social software, software is eating the world, speech recognition, stealth mode startup, Stephen Hawking, Steve Jobs, subscription business, supply-chain management, TaskRabbit, telepresence, telepresence robot, Tony Hsieh, transaction costs, Travis Kalanick, Tyler Cowen: Great Stagnation, uber lyft, urban planning, WikiLeaks, winner-take-all economy, X Prize, Y Combinator, zero-sum game
Internal Usage: Maps display resulting gestures for all users
SCALE Attribute: Community & Crowd

Google
Interface: AdWords
Description: User picks keywords to advertise against
Internal Usage: Google places ads against search results
SCALE Attribute: Algorithms

GitHub
Interface: Version control system
Description: Multiple coders updating software sequentially and in parallel
Internal Usage: Platform keeps all contributions in sync
SCALE Attribute: Community & Crowd

Zappos
Interface: Hiring process
Description: Incentive competitions
Internal Usage: Narrows down candidates from large pool
SCALE Attribute: Engagement

Gigwalk
Interface: Task availability
Description: Gigwalk workers receive location-based, simple tasks when available
Internal Usage: Matches task demand with supply of Gigwalkers
SCALE Attribute: Staff on Demand

One final way to think about Interfaces is that they help manage abundance. While most processes are optimized around scarcity and efficiency, SCALE elements generate large result sets, meaning Interfaces are geared towards filtering and matching. As an example, keep in mind that the Netflix prize generated 44,104 entries that needed to be filtered, ranked, prioritized and scored.

Why Important?
• Filter external abundance into internal value
• Bridge between external growth drivers and internal stabilizing factors
• Automation allows scalability

Dependencies or Prerequisites
• Standardized processes to enable automation
• Scalable externalities
• Algorithms (in most cases)

Dashboards
Given the huge amounts of data from customers and employees becoming available, ExOs need a new way to measure and manage the organization: a real-time, adaptable dashboard with all essential company and employee metrics, accessible to everyone in the organization.
A small mass allows dramatic acceleration and quick changes in direction—precisely what we’re seeing with many ExOs today. With very little internal inertia (that is, number of employees, assets or organizational structures), they demonstrate extraordinary flexibility, which is a critical quality in today’s volatile world. This remarkable characteristic has been well demonstrated by Netflix. As mentioned earlier, the company offered a $1 million prize (Engagement) to anyone who could improve its rental recommendation program. What is less well known is that Netflix never implemented the winning algorithm. Why? Because, tellingly, the market had already moved on. By the conclusion of the contest the industry had shifted away from rental DVDs; meanwhile Netflix’s streaming video business was exploding and, unfortunately, the winning algorithm didn’t apply to streaming recommendations.
Data Science from Scratch: First Principles with Python by Joel Grus
    = other_interest_id and similarity > 0]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

which suggests the following similar interests:

    [('Hadoop', 0.8164965809277261),
     ('Java', 0.6666666666666666),
     ('MapReduce', 0.5773502691896258),
     ('Spark', 0.5773502691896258),
     ('Storm', 0.5773502691896258),
     ('Cassandra', 0.4082482904638631),
     ('artificial intelligence', 0.4082482904638631),
     ('deep learning', 0.4082482904638631),
     ('neural networks', 0.4082482904638631),
     ('HBase', 0.3333333333333333)]

Now we can create recommendations for a user by summing up the similarities of the interests similar to his:

    def item_based_suggestions(user_id, include_current_interests=False):
        # add up the similar interests
        suggestions = defaultdict(float)
        user_interest_vector = user_interest_matrix[user_id]
        for interest_id, is_interested in enumerate(user_interest_vector):
            if is_interested == 1:
                similar_interests = most_similar_interests_to(interest_id)
                for interest, similarity in similar_interests:
                    suggestions[interest] += similarity

        # sort them by weight (index 1 of each (interest, weight) pair)
        suggestions = sorted(suggestions.items(),
                             key=lambda pair: pair[1],
                             reverse=True)

        if include_current_interests:
            return suggestions
        else:
            return [(suggestion, weight)
                    for suggestion, weight in suggestions
                    if suggestion not in users_interests[user_id]]

For user 0, this generates the following (seemingly reasonable) recommendations:

    [('MapReduce', 1.861807319565799),
     ('Postgres', 1.3164965809277263),
     ('MongoDB', 1.3164965809277263),
     ('NoSQL', 1.2844570503761732),
     ('programming languages', 0.5773502691896258),
     ('MySQL', 0.5773502691896258),
     ('Haskell', 0.5773502691896258),
     ('databases', 0.5773502691896258),
     ('neural networks', 0.4082482904638631),
     ('deep learning', 0.4082482904638631),
     ('C++', 0.4082482904638631),
     ('artificial intelligence', 0.4082482904638631),
     ('Python', 0.2886751345948129),
     ('R', 0.2886751345948129)]

For Further Exploration
Crab is a framework for building recommender systems in Python.
Graphlab also has a recommender toolkit. The Netflix Prize was a somewhat famous competition to build a better system to recommend movies to Netflix users.
Chapter 23. Databases and SQL
Memory is man’s greatest friend and worst enemy.
Gilbert Parker
The data you need will often live in databases, systems designed for efficiently storing and querying data. The bulk of these are relational databases, such as Oracle, MySQL, and SQL Server, which store data in tables and are typically queried using Structured Query Language (SQL), a declarative language for manipulating data.
Adapt: Why Success Always Starts With Failure by Tim Harford
Andrew Wiles, banking crisis, Basel III, Berlin Wall, Bernie Madoff, Black Swan, car-free, carbon footprint, Cass Sunstein, charter city, Clayton Christensen, clean water, cloud computing, cognitive dissonance, complexity theory, corporate governance, correlation does not imply causation, creative destruction, credit crunch, Credit Default Swap, crowdsourcing, cuban missile crisis, Daniel Kahneman / Amos Tversky, Dava Sobel, Deep Water Horizon, Deng Xiaoping, disruptive innovation, double entry bookkeeping, Edmond Halley, en.wikipedia.org, Erik Brynjolfsson, experimental subject, Fall of the Berlin Wall, Fermat's Last Theorem, Firefox, food miles, Gerolamo Cardano, global supply chain, Intergovernmental Panel on Climate Change (IPCC), Isaac Newton, Jane Jacobs, Jarndyce and Jarndyce, Jarndyce and Jarndyce, John Harrison: Longitude, knowledge worker, loose coupling, Martin Wolf, mass immigration, Menlo Park, Mikhail Gorbachev, mutually assured destruction, Netflix Prize, New Urbanism, Nick Leeson, PageRank, Piper Alpha, profit motive, Richard Florida, Richard Thaler, rolodex, Shenzhen was a fishing village, Silicon Valley, Silicon Valley startup, South China Sea, special economic zone, spectrum auction, Steve Jobs, supply-chain management, the market place, The Wisdom of Crowds, too big to fail, trade route, Tyler Cowen: Great Stagnation, web application, X Prize, zero-sum game
The better the recommendations, the happier the customer, so in March 2006 the founder and chief executive of Netflix, Reed Hastings, met some colleagues to discuss how they might improve the software that made the recommendations. Hastings had been inspired by the story of John Harrison, and suggested offering a prize of $1m to anyone who could do better than Netflix’s in-house algorithm, Cinematch. The Netflix prize, announced in October 2006, struck a chord with the Web 2.0 generation. Within days of the prize announcement, some of the best minds in the relevant fields of computer science were on the case. Within a year, the leading entries had reduced Cinematch’s recommendation errors by more than 8 per cent – close to the million-dollar hurdle of 10 per cent. Over 2,500 teams from 161 countries and comprising 27,000 competitors entered the contest.
Economy, http://online.wsj.com/article/SB10001424052748703344704574610350092009062.html 104 ‘Firms are reluctant to risk their money’: McKinstry, Spitfire, pp. 34–5. 105 There is an inconvenient tale behind this: I have drawn much of this account from Dava Sobel’s Longitude (London: Fourth Estate, 1996). 106 Compared with the typical wage of the day: Officer, ‘Purchasing power of British pounds’, cited above, n. 10. 107 In 1810 Nicolas Appert: http://en.wikipedia.org/wiki/Nicolas_Appert 107 Ultimately the Académie began to turn down: Maurice Crosland, ‘From prizes to grants in the support of scientific research in France in the nineteenth century: The Montyon legacy’, Minerva, 17(3) (1979), pp. 355–80, and Robin Hanson, ‘Patterns of patronage: why grants won over prizes in science’, University of California, Berkeley, working paper 1998, http://hanson.gmu.edu/whygrant.pdf 108 Innovation prizes were firmly supplanted: Hanson, ‘Patterns of patronage’. 109 The prize was eventually awarded in September 2009: a follow-up prize was announced and then cancelled following a lawsuit over privacy. One Netflix user alleged that the data released by Netflix didn’t sufficiently conceal her anonymity, and might allow others to discover that she was a lesbian by connecting her with ‘anonymous’ reviews. (Ryan Singel, ‘Netflix spilled your Brokeback Mountain secret, lawsuit claims’, Wired, 17 December 2009, http://www.wired.com/threatlevel/2009/12/netflix-privacy-lawsuit/) 110 ‘One of the goals of the prize’: author interview, 13 December 2007. 
110 Not everybody responds to such incentives: ‘Russian maths genius Perelman urged to take $1m prize’, BBC News, 24 March 2010, http://news.bbc.co.uk/1/hi/8585407.stm 111 The vaccine prize takes the form of an agreement: the advanced market commitment idea was developed by Michael Kremer in ‘Patent buyouts: a mechanism for encouraging innovation’, Quarterly Journal of Economics, 113:4 (1998), 1137–67; but also see http://www.vaccineamc.org/ and the Center for Global Development’s ‘Making markets for vaccines’, http://www.cgdev.org/section/initiatives/_archive/vaccinedevelopment 111 Only the very largest pharmaceutical companies spend more than: Medicines Australia, ‘Global pharmaceutical industry facts at a glance’, p. 3, http://www.medicinesaustralia.com.au/pages/images/Global%20-%20facts%20at%20a%20glance.pdf 111 Children in Nicaragua received: Amanda Glassman, ‘Break out the champagne!
.), 37, 65, 77–8, 223, 227, 228; Battle of 73 Easting and, 72–3, 79; counterinsurgency strategy and, 53–5, 57, 61, 64, 74, 75, 79, 258, 262; streak of sedition, 54–5, 56, 58, 59, 78; in Tal Afar, 53–6, 61, 64, 79; on Vietnam War, 46, 47, 50, 56, 78 McNamara, Robert, 46–7, 49–50, 60, 68, 69, 76, 78 medical profession, 120–1; clinical trials, 123–4, 125–6; in history, 121–3, 140–1; rigorous evidence and, 121, 122–3, 125–7 Medical Research Council, 100 Melvill, Mike, 112, 114 Menand, Louis, 7 Merton Rule, 169–72, 176, 177 microfinance organisations, 116, 117–18, 120 Microsoft, 11, 12, 90, 112, 241, 242 Miguel, Edward, 129, 131 Millennium prizes, 110, 114 mission command doctrine, 79 Mitchelhill, Steve, 193 Mitchell, Reginald, 88, 89, 114, 223, 262 molecular biology, 98–9; DNA research, 99–100; prizes for, 109, 110 Mondrian, Piet, 260 moon landings, 84, 113 Moore, Paul, 211, 213, 214, 250 Morse, Adair, 210, 213 Moulin, Sylvie, 127–8 Movin’ Out (ballet/musical), 247–50, 253–4, 257, 258–9 Mprize, 110 Mullainathan, Sendhil, 135 Murray, Euan, 163–5 Myers, Dave, 233 Nagl, John, 52, 61, 63, 65, 66, 76 Napoleon, 41, 107 NASA, 113 National Academy of Sciences, 6 National Bureau of Economic Research, 145 National Institutes of Health, US (NIH), 99–100, 101, 102–3 Netflix, 108–9 New Songdo City (South Korea), 152 New Zealand, 161, 176 Newsweek, 63 Newton, Sir Isaac, 105 Nobel prizes, 68, 75, 100, 108, 116, 120 nuclear industry, 184, 185, 187, 191–3, 215, 220, 227–8, 230–1 Obama, President Barack, 5, 195 Odean, Terrance, 35 Office for Financial Research, US, 195 Ofshe, Richard, 252–3 Oklahoma! 
(musical), 248 Oliver, Jamie, 29–30 Olken, Benjamin, 133–4, 142, 143 Opportunity International, organic products, 159, 160, 226 organisations: ‘bottom-up’ adaptation, 58, 60–1, 134; grandiosity and, 27–8; idealised hierarchy, 40–1, 42, 46–7, 49–50, 55; peer monitoring, 229–31, 232–3; standardisation and, 28; traditional, 29, 31, 35; see also corporations and companies; government and politics Orgel, Leslie, 174, 175, 176, 177, 178, 180 Ormerod, Paul, 18–19 outsourcing, 90 Pace, General Peter, 42, 43 Packer, George, 43 Page, Larry, 231–2 Page, Scott, 49 Palchinsky, Peter, 21–5, 26, 27, 30, 31, 49, 118, 250 ‘Palchinsky principles’, 25, 28, 29, 36, 207, 224, 235, 250; see also selection principle; survivability principle; variation principle Palmer, Geoffrey, 170–1 Palo Alto Research Center (Parc), 11 Parkinson, Elizabeth, 249 patents, 90, 91–2, 94, 95–7, 104, 110, 111, 113, 114, 179 Patriquin, Captain Travis, 58 Pentagon Papers, 62 PEPFAR, 119 Pepys, Samuel, 96 Perelman, Grigory, 110 Perrow, Charles, 185, 186, 191, 194–5 Peters, Tom, 8, 10, 244 Petraeus, General David, 37, 59–62, 63–4, 65, 74, 78, 256 Pfizer, 90 pharmaceutical industry, 94, 110–11, 114, 236–7 PhilCo, 11 Philippines, 136 Phillips, Michael, 249 Picasso, Pablo, 260 pilot schemes, 29–30 Pinochet, General Augusto, 70 Piper-Alpha disaster (July 1988), 181–3, 184, 186, 187, 208–9, 219 planning, 19, 68–9; centrally planned economies, 11, 21, 23–6, 68–9, 70; ‘effects-based operations’ (EBO), 67–8, 74; localised/fleeting information and, 21–2, 24, 25, 31, 52–3, 57–8, 66–7, 71–3, 74, 78, 79; quantitative analysis, 46–7, 69, 78 PlayPumps, 118–19, 120, 130, 142 pneumococcal infections, 110–11, 114 poker, 31–2 politics see government and politics POSCO, 152 ‘postcode lottery’ concept, 28 poverty, global, 4, 5, 115–16 printing industry, early, 10 problem solving, 4–6, 14; evolutionary theory and, 14–15, 16, 17; idealised view of, 40–1, 42, 46–7, 49–50, 55; lessons from history and, 63, 65–7; technology and, 84, 94; 
‘Toaster Project’, 1–2, 4, 12; see also decision making; innovation; ‘Palchinsky principles’; trial and error Procter & Gamble, 9, 12 public services, 28, 141, 213–14 public transport, 161–2 Pullman, 9, 15 PwC, 196–200 Pye, David, 80 Al Qaeda in Iraq (AQI), 39, 40, 43, 51, 54, 57, 77 quantitative analysis, 46–7, 69, 78 Rajan, Raghuram, 75 randomised trials, 235–8; development and, 127–9, 131, 132, 133, 134, 135–6, 137–40, 141 Raskin, Aza, 221 Reagan, President Ronald, 6 Reason, James, 184–5, 186–7, 208, 209, 218 Reinikka, Ritva, 142 renewable energy technology, 84, 91, 96, 130, 168, 169–73, 179, 245 research and development, 83–5, 87–95, 99–104, 111; see also innovation Ricks, Thomas, 61 risk, psychology of, 32–5, 253–4, 256 risk management, 183, 185, 187–90, 206–7 Roche, 97 Roger Preston Partners, 170–1 Romer, Paul, 150–1 Royal Air Force (RAF), 80–2, 88 Royal College of Physicians, 122 Royal Observatory, 105, 106, 107 Rumsfeld, Donald, 59, 61; centralisation and, 47, 69, 71, 72, 76, 196; refusal of advice/feedback, 43–4, 45, 46, 50, 57, 60, 62, 63, 65, 67, 223, 256; term ‘insurgency’ and, 42–3, 55, 63, 250 Russia, 21–7, 68–9, 250 Rutan, Burt, 112 Sachs, Jeffrey, 129–30 Saddam Hussein, 44, 45, 66, 73 Santa Fe Institute, 16, 103 Schmidt, Eric, 230, 231, 232 Schneider Trophy, 88, 89, 110, 114 Schulz, Kathryn, Being Wrong, 262 Schumacher, E.F., 181 Schwab, Charles, 243 Schwarzkopf, Norman, 67, 68 Scott, Owen, 118–19 scurvy, 122–3 Second World War, 81–2, 83, 85, 89, 124–5, 126 Securities and Exchange Commission (SEC), 210, 212–13 selection principle, 25–6, 27, 207, 224, 250, 259; charter cities and, 149, 152, 153; development aid, 117, 140–3, 149, 152, 153; evolutionary theory and, 13, 14, 15, 16–17, 23, 86; pilot schemes and, 29–30 Sepp, Kalev, 61 Sewall, Sarah, 61, 63 Shell, 9, 244–5 Shenzhen, 150, 152 Shimura, Goro, 247 Shindell, Drew, 160* Shinseki, General Eric, 43–4, 45 Shirky, Clay, 90 Shovell, Admiral Sir Clowdisley, 105 SIGMA I war game, 50 Sims, Karl, 13–14, 
174, 176 Singapore, 150 Singer, 9, 10, 15 Skunk Works division, Lockheed, 89, 93, 224, 242 Smith, Adam, 143, 147 Sobel, Dava, Longitude, 107* solar power, 84, 91, 96, 179, 245 Solidarity movement, Polish, 26 Sorkin, Andrew Ross, 193 South Africa, 147 South Korea, 146–7, 152 Soviet Union, 21–7, 68–9, 250 space tourism, 112–13, 114 Spitfire aircraft, 81–2, 83, 84–5, 87–9, 114, 262 Spock, Dr Benjamin, Baby and Child Care, 120–1 Sri Lanka, 136 Stalin, Joseph, 24, 250 Starbucks, 28, 159, 164, 165, 166 Sunstein, Cass, 177–8 Supermarine, 81, 87–9 survivability principle, 25, 36, 153, 207–8, 215, 224, 235, 243, 250 Svensson, Jakob, 142–3 Tabarrok, Alex, 96 Taiwan, 148 Taleb, Nassim, The Black Swan, 83 Target (discount retailer), 243 Taylor, A.J.P., 89 Taylor, Charles, 136 technologies, new: centralisation and, 71, 75, 76, 79, 226, 227, 228; decentralisation and, 76; ‘effects-based operations’ (EBO), 67–8, 74; evolutionary theory and, 13–14, 174; first Gulf War and, 67, 71, 72–3, 79; fraud and, 212; hi-tech start-ups, 90, 91; innovation and, 89–90, 91, 94–5, 239–40; iPhone and Android apps, 90, 93; Iraq war and, 71, 72, 74, 78–9, 196; Robert McNamara and, 47, 69; open-source software movement, 230; prizes and, 108–9; product space concept and, 145–8; Project CyberSyn, 69–72; randomised experiments and, 235–7; return on investment and, 83–4; safety systems and, 193; software, 12, 76, 90, 92–3, 230, 241–2; unpredictability and, 84–5; virtual decision making, 49; see also internet terrorism, 4, 51, 54, 57, 96–7, 192 Tesco, 75, 226 Tetlock, Philip, 6–8, 10, 16, 17, 19, 66 Thaler, Richard, 33, 34, 177–8, 254, 256 Tharp, Twyla, 247–51, 253–4, 256, 257, 258–9, 262 Thatcher, Margaret, 20 Three Mile Island disaster (1979), 36, 184, 185, 191–2, 193, 220 Thwaites, Thomas, ‘Toaster Project’, 1–2, 4, 12 Timpson, John, 226–7, 228–9, 230, 232–3 Tipton, Jennifer, 257 Toyota, 9, 159, 161, 165 Transitron, 11 Transocean, 216, 217, 218–19 Trenchard, Sir Hugh, 88 trial and error, 12, 14, 17, 19–20, 
21, 35, 36, 66, 220; decentralisation and, 31, 174–5, 232, 234; Thomas Edison and, 236, 238; individuals and, 31–5; Iraq war and, 64–5, 66–7; market system, 20; randomised experiments, 235–9; Muhammad Yunus and, 116, 117–18 Tversky, Amos, 32, 253 Tyndall, John, 154–6 Tzara, Tristan, 247 Uganda, 142–3 United Arab Emirates, 147 university students, 260–1 US Air Force, 93 US Steel, 9 USAID, 119 Utah, University of, 99 van Helmont, Jan Baptist, 121–2, 141 variation principle, 25–6, 79, 100, 117, 140, 174–5, 207, 224, 235, 250; charter cities and, 152–3; evolutionary theory and, 13, 14, 16–17, 23; grandiosity and, 27–8; pluralism and, 85; uniform standards and, 28–9 Vaze, Prashant, 177–8 Venter, Craig, 109 Vickers, 88 Vietnam war, 46–7, 49–50, 56, 62, 64, 68, 69, 78, 243–4 Virgin Group, 112, 243 Wallis, Barnes, 88 Wallstrom, Margot, 139 Wal-Mart, 3, 75, 226, 238 Warhol, Andy, 28 Waterman, Robert, 8, 10 Watson, James, 98–9 Weinstein, Jeremy, 137, 138 Weiss, Bob, 110 Westinghouse Electric, 9 whistleblowers, 211–14, 215, 218–19, 220, 229 White Knight One, 112, 114 Whole Foods Market, 224–6, 227, 228, 229, 232, 234 Wikipedia, 230 Williams, Mike, 216–17 Willumstad, Robert, 193–4 W.L.
Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Geron
Amazon Mechanical Turk, Bayesian statistics, centre right, combinatorial explosion, constrained optimization, correlation coefficient, crowdsourcing, en.wikipedia.org, iterative process, Netflix Prize, NP-complete, optical character recognition, P = NP, p-value, pattern recognition, performance metric, recommendation engine, self-driving car, SpamAssassin, speech recognition, statistical model
Such an ensemble of Decision Trees is called a Random Forest, and despite its simplicity, this is one of the most powerful Machine Learning algorithms available today. Moreover, as we discussed in Chapter 2, you will often use Ensemble methods near the end of a project, once you have already built a few good predictors, to combine them into an even better predictor. In fact, the winning solutions in Machine Learning competitions often involve several Ensemble methods (most famously in the Netflix Prize competition). In this chapter we will discuss the most popular Ensemble methods, including bagging, boosting, stacking, and a few others. We will also explore Random Forests. Voting Classifiers Suppose you have trained a few classifiers, each one achieving about 80% accuracy. You may have a Logistic Regression classifier, an SVM classifier, a Random Forest classifier, a K-Nearest Neighbors classifier, and perhaps a few more (see Figure 7-1).
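The voting idea in this excerpt can be sketched in a few lines. The following is a minimal illustration of hard (majority) voting in plain Python, not the book's own code; the three label lists are made-up stand-ins for the outputs of already-trained classifiers, and the function name `hard_vote` is hypothetical.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label for each sample.

    predictions: one list of predicted labels per classifier,
    all of the same length.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        # Count how many classifiers chose each label for sample i.
        votes = Counter(clf_preds[i] for clf_preds in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on five samples (illustrative only).
clf_a = [1, 0, 1, 1, 0]
clf_b = [1, 1, 1, 0, 0]
clf_c = [0, 0, 1, 1, 0]

ensemble = hard_vote([clf_a, clf_b, clf_c])
print(ensemble)
```

With an odd number of binary classifiers there are no ties; with an even number, or with more than two classes, a tie-breaking rule would have to be chosen, which this sketch leaves to the ordering of `Counter.most_common`.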
The Formula: How Algorithms Solve All Our Problems-And Create More by Luke Dormehl
3D printing, algorithmic trading, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, big data - Walmart - Pop Tarts, call centre, Cass Sunstein, Clayton Christensen, commoditize, computer age, death of newspapers, deferred acceptance, disruptive innovation, Edward Lorenz: Chaos theory, Erik Brynjolfsson, Filter Bubble, Flash crash, Florence Nightingale: pie chart, Frank Levy and Richard Murnane: The New Division of Labor, Google Earth, Google Glasses, High speed trading, Internet Archive, Isaac Newton, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Kevin Kelly, Kodak vs Instagram, lifelogging, Marshall McLuhan, means of production, Nate Silver, natural language processing, Netflix Prize, Panopticon Jeremy Bentham, pattern recognition, price discrimination, recommendation engine, Richard Thaler, Rosa Parks, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, Slavoj Žižek, social graph, speech recognition, Steve Jobs, Steven Levy, Steven Pinker, Stewart Brand, the scientific method, The Signal and the Noise by Nate Silver, upwardly mobile, Wall-E, Watson beat the top human players on Jeopardy!, Y Combinator
In other areas—particularly as relate to law—a reliance on algorithms might simply justify existing bias and lack of understanding, in the same way that the “filter bubble” effect described in Chapter 1 can result in some people not being presented with certain pieces of information, which may take the form of opportunities. “It’s not just you and I who don’t understand how these algorithms work—the engineers themselves don’t understand them entirely,” says scholar Ted Striphas. “If you look at the Netflix Prize, one of the things the people responsible for the winning entries said over and over again was that their algorithms worked, even though they couldn’t tell you why they worked. They might understand how they work from the point of view of mathematical principles, but that math is so complex that it is impossible for a human being to truly follow. That troubles me to some extent. The idea that we don’t know the world that we’re creating makes it very difficult for us to operate ethically and mindfully within it.”
Too Big to Know: Rethinking Knowledge Now That the Facts Aren't the Facts, Experts Are Everywhere, and the Smartest Person in the Room Is the Room by David Weinberger
airport security, Alfred Russel Wallace, Amazon Mechanical Turk, Berlin Wall, Black Swan, book scanning, Cass Sunstein, commoditize, corporate social responsibility, crowdsourcing, Danny Hillis, David Brooks, Debian, double entry bookkeeping, double helix, en.wikipedia.org, Exxon Valdez, Fall of the Berlin Wall, future of journalism, Galaxy Zoo, Hacker Ethic, Haight Ashbury, hive mind, Howard Rheingold, invention of the telegraph, jimmy wales, Johannes Kepler, John Harrison: Longitude, Kevin Kelly, linked data, Netflix Prize, New Journalism, Nicholas Carr, Norbert Wiener, openstreetmap, P = NP, Pluto: dwarf planet, profit motive, Ralph Waldo Emerson, RAND corporation, Ray Kurzweil, Republic of Letters, RFID, Richard Feynman, Ronald Reagan, semantic web, slashdot, social graph, Steven Pinker, Stewart Brand, technological singularity, Ted Nelson, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Whole Earth Catalog, X Prize
Then, of course, startups and Web 2.0 companies began holding “jellies,” which are like jams but bring together multiple smaller companies.32 Because the Net lets us form expert networks of just about any size and configuration, from twosomes to crowds to massively multiplayer games, the expertise of networks need not be equal to the expertise of its smartest member—or even cumulative. The complex, multiway interactions the Net enables means that networks of experts can be smarter than the sum of their participants. For example, BellKor’s Pragmatic Chaos was able to win the Netflix prize because the Internet not only made it feasible to assemble experts from around the world but also made it possible for those experts to collaborate. While crowdsourcing can aggregate information—people in every neighborhood of New York City can report on what their local groceries are charging for diapers—networked experts who are talking with one another can build on what they know. We see this all the time on topical mailing lists.
Bad Data Handbook by Q. Ethan McCallum
Amazon Mechanical Turk, asset allocation, barriers to entry, Benoit Mandelbrot, business intelligence, cellular automata, chief data officer, Chuck Templeton: OpenTable:, cloud computing, cognitive dissonance, combinatorial explosion, commoditize, conceptual framework, database schema, DevOps, en.wikipedia.org, Firefox, Flash crash, Gini coefficient, illegal immigration, iterative process, labor-force participation, loose coupling, natural language processing, Netflix Prize, quantitative trading / quantitative ﬁnance, recommendation engine, selection bias, sentiment analysis, statistical model, supply-chain management, survivorship bias, text mining, too big to fail, web application
As the value of the approach becomes better-known, the demand for part-time or project-based machine-learning work has grown, but it’s often hard for a traditional engineering team to effectively work with outside experts in the field. I’m going to talk about some of the things I learned while running an outsourced project through Kaggle, a community of thousands of researchers who participate in data competitions modeled on the Netflix Prize. This was an extreme example of outsourcing: we literally handed over a dataset, a short description, and a success metric to a large group of strangers. It had almost none of the traditional interactions you’d expect, but it did teach me valuable lessons that apply to any interactions with machine-learning specialists. Define the Problem My company Jetpac creates a travel magazine written by your friends, using vacation photos they’ve shared with you on Facebook and other social services.
The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future by Kevin Kelly
A Declaration of the Independence of Cyberspace, AI winter, Airbnb, Albert Einstein, Amazon Web Services, augmented reality, bank run, barriers to entry, Baxter: Rethink Robotics, bitcoin, blockchain, book scanning, Brewster Kahle, Burning Man, cloud computing, commoditize, computer age, connected car, crowdsourcing, dark matter, dematerialisation, Downton Abbey, Edward Snowden, Elon Musk, Filter Bubble, Freestyle chess, game design, Google Glasses, hive mind, Howard Rheingold, index card, indoor plumbing, industrial robot, Internet Archive, Internet of things, invention of movable type, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Kevin Kelly, Kickstarter, lifelogging, linked data, Lyft, M-Pesa, Marc Andreessen, Marshall McLuhan, means of production, megacity, Minecraft, Mitch Kapor, multi-sided market, natural language processing, Netflix Prize, Network effects, new economy, Nicholas Carr, old-boy network, peer-to-peer, peer-to-peer lending, personalized medicine, placebo effect, planetary scale, postindustrial economy, recommendation engine, RFID, ride hailing / ride sharing, Rodney Brooks, self-driving car, sharing economy, Silicon Valley, slashdot, Snapchat, social graph, social web, software is eating the world, speech recognition, Stephen Hawking, Steven Levy, Ted Nelson, the scientific method, transport as a service, two-sided market, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, Whole Earth Review, zero-sum game
loans worth more than $10 billion: Simon Cunningham, “Default Rates at Lending Club & Prosper: When Loans Go Bad,” LendingMemo, October 17, 2014; and Davey Alba, “Banks Are Betting Big on a Startup That Bypasses Banks,” Wired, April 8, 2015. GE has launched over 400 new products: Steve Lohr, “The Invention Mob, Brought to You by Quirky,” New York Times, February 14, 2015. Netflix announced an award: Preethi Dumpala, “Netflix Reveals Million-Dollar Contest Winner,” Business Insider, September 21, 2009. Forty thousand groups submitted: “Leaderboard,” Netflix Prize, 2009. 150,000 car fanatics: Gary Gastelu, “Local Motors 3-D-Printed Car Could Lead an American Manufacturing Revolution,” Fox News, July 3, 2014. 3-D-printed electric car: Paul A. Eisenstein, “Startup Plans to Begin Selling First 3-D-Printed Cars Next Year,” NBC News, July 8, 2015. 7: FILTERING 8 million new songs: Private correspondence with Richard Gooch, CTO, International Federation of the Phonographic Industry, April 15, 2015.
Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost, Tom Fawcett
Albert Einstein, Amazon Mechanical Turk, big data - Walmart - Pop Tarts, bioinformatics, business process, call centre, chief data officer, Claude Shannon: information theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, data acquisition, David Brooks, en.wikipedia.org, Erik Brynjolfsson, Gini coefficient, information retrieval, intangible asset, iterative process, Johann Wolfgang von Goethe, Louis Pasteur, Menlo Park, Nate Silver, Netflix Prize, new economy, p-value, pattern recognition, placebo effect, price discrimination, recommendation engine, Ronald Coase, selection bias, Silicon Valley, Skype, speech recognition, Steve Jobs, supply-chain management, text mining, The Signal and the Noise by Nate Silver, Thomas Bayes, transaction costs, WikiLeaks
This is a so-called log-normal distribution, which just means that the logs of the quantities in question are normally distributed.  There are some technicalities to the rules of the Netflix Challenge, which you can find on the Wikipedia page.  The winning team, Bellkor’s Pragmatic Chaos, had seven members. The history of the contest and the team evolution is complicated and fascinating. See this Wikipedia page on the Netflix Prize.  Thanks to one of the members of the winning team, Chris Volinsky, for his help here.  The debate sometimes can bear fruit. For example, thinking whether we have all the requisite information might reveal a new attribute that could be obtained that would increase the possible predictability.  It is beyond the scope of this book to explain the conditions under which this can be given a causal interpretation.
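The log-normal definition in the first note can be checked numerically. The sketch below assumes illustrative parameters (`mu = 2.0`, `sigma = 0.5`, chosen here and not taken from the book), draws samples with Python's standard `random.lognormvariate`, and confirms that the logarithms of the samples have roughly the mean and spread of the underlying normal distribution.

```python
import math
import random
import statistics

# Illustrative parameters of the underlying normal distribution (assumed values).
mu, sigma = 2.0, 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.lognormvariate(mu, sigma) for _ in range(10_000)]

# Taking logs should recover an approximately normal sample.
logs = [math.log(x) for x in samples]
log_mean = statistics.fmean(logs)
log_stdev = statistics.stdev(logs)

print(f"mean of logs:  {log_mean:.3f} (target {mu})")
print(f"stdev of logs: {log_stdev:.3f} (target {sigma})")
```

The raw samples themselves are right-skewed, so their mean sits above their median, while the logged values are roughly symmetric around `mu`.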
Superintelligence: Paths, Dangers, Strategies by Nick Bostrom
agricultural Revolution, AI winter, Albert Einstein, algorithmic trading, anthropic principle, anti-communist, artificial general intelligence, autonomous vehicles, barriers to entry, Bayesian statistics, bioinformatics, brain emulation, cloud computing, combinatorial explosion, computer vision, cosmological constant, dark matter, DARPA: Urban Challenge, data acquisition, delayed gratification, demographic transition, different worldview, Donald Knuth, Douglas Hofstadter, Drosophila, Elon Musk, en.wikipedia.org, endogenous growth, epigenetics, fear of failure, Flash crash, Flynn Effect, friendly AI, Gödel, Escher, Bach, income inequality, industrial robot, informal economy, information retrieval, interchangeable parts, iterative process, job automation, John Markoff, John von Neumann, knowledge worker, longitudinal study, Menlo Park, meta analysis, meta-analysis, mutually assured destruction, Nash equilibrium, Netflix Prize, new economy, Norbert Wiener, NP-complete, nuclear winter, optical character recognition, pattern recognition, performance metric, phenotype, prediction markets, price stability, principal–agent problem, race to the bottom, random walk, Ray Kurzweil, recommendation engine, reversible computing, social graph, speech recognition, Stanislav Petrov, statistical model, stem cell, Stephen Hawking, strong AI, superintelligent machines, supervolcano, technological singularity, technoutopianism, The Coming Technological Singularity, The Nature of the Firm, Thomas Kuhn: the structure of scientific revolutions, transaction costs, Turing machine, Vernor Vinge, Watson beat the top human players on Jeopardy!, World Values Survey, zero-sum game
This again foreshadows another later theme: the difficulty of anticipating all specific ways in which some particular plausible-seeming rule might go wrong. 73. Nilsson (2009, 319). 74. Minsky (2006); McCarthy (2007); Beal and Winston (2009). 75. Peter Norvig, personal communication. Machine-learning classes are also very popular, reflecting a somewhat orthogonal hype-wave of “big data” (inspired by e.g. Google and the Netflix Prize). 76. Armstrong and Sotala (2012). 77. Müller and Bostrom (forthcoming). 78. See Baum et al. (2011), another survey cited therein, and Sandberg and Bostrom (2011). 79. Nilsson (2009). 80. This is again conditional on no civilization-disrupting catastrophe occurring. The definition of HLMI used by Nilsson is “AI able to perform around 80% of jobs as well or better than humans perform” (Kruel 2012). 81.
Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron
Amazon Mechanical Turk, Anton Chekhov, combinatorial explosion, computer vision, constrained optimization, correlation coefficient, crowdsourcing, don't repeat yourself, Elon Musk, en.wikipedia.org, friendly AI, ImageNet competition, information retrieval, iterative process, John von Neumann, Kickstarter, natural language processing, Netflix Prize, NP-complete, optical character recognition, P = NP, p-value, pattern recognition, pull request, recommendation engine, self-driving car, sentiment analysis, SpamAssassin, speech recognition, stochastic process
Such an ensemble of Decision Trees is called a Random Forest, and despite its simplicity, this is one of the most powerful Machine Learning algorithms available today. Moreover, as we discussed in Chapter 2, you will often use Ensemble methods near the end of a project, once you have already built a few good predictors, to combine them into an even better predictor. In fact, the winning solutions in Machine Learning competitions often involve several Ensemble methods (most famously in the Netflix Prize competition). In this chapter we will discuss the most popular Ensemble methods, including bagging, boosting, stacking, and a few others. We will also explore Random Forests. Voting Classifiers Suppose you have trained a few classifiers, each one achieving about 80% accuracy. You may have a Logistic Regression classifier, an SVM classifier, a Random Forest classifier, a K-Nearest Neighbors classifier, and perhaps a few more (see Figure 7-1).
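A short calculation suggests why combining several classifiers of about 80% accuracy can pay off. The sketch below assumes the classifiers make fully independent errors, which real classifiers trained on the same data rarely do, so the result is an optimistic upper bound and not a prediction; the function name is hypothetical.

```python
from math import comb


def majority_correct_probability(n_classifiers, accuracy):
    """Probability that a strict majority of independent classifiers is right.

    Assumes n_classifiers is odd and every classifier is correct with the
    same probability, independently of the others (a strong assumption).
    """
    majority = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k)
        * accuracy**k
        * (1 - accuracy) ** (n_classifiers - k)
        for k in range(majority, n_classifiers + 1)
    )


for n in (1, 3, 5):
    print(n, round(majority_correct_probability(n, 0.8), 5))
```

Under that independence assumption the majority vote is right more often than any single member, and the gain shrinks as the members' errors become correlated.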
Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić
Albert Einstein, bioinformatics, business cycle, business intelligence, business process, butter production in bangladesh, combinatorial explosion, computer vision, conceptual framework, correlation coefficient, correlation does not imply causation, data acquisition, discrete time, El Camino Real, fault tolerance, finite state, Gini coefficient, information retrieval, Internet Archive, inventory management, iterative process, knowledge worker, linked data, loose coupling, Menlo Park, natural language processing, Netflix Prize, NP-complete, PageRank, pattern recognition, peer-to-peer, phenotype, random walk, RFID, semantic web, speech recognition, statistical model, Telecommunications Act of 1996, telemarketer, text mining, traveling salesman, web application
It can be combined with any classifier, including neural networks, decision trees, or nearest neighbor classifiers. The algorithm requires almost no parameters to tune, and is still very effective even for the most complex classification problems, but at the same time it could be sensitive to noise and outliers. The ensemble-learning approach showed its advantages in one very famous application, the Netflix $1 million competition. The Netflix prize required substantial improvement in the accuracy of predictions on how much someone is going to love a movie based on his or her previous movie preferences. Users rated movies from 1 to 5 stars; therefore, the problem was a classification task with five classes. Most of the top-ranked competitors used some variation of ensemble learning, showing its advantages in practice. The top competitor, the BellKor team, explains the ideas behind its success: “Our final solution consists of blending 107 individual predictors.
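As a rough illustration of what "blending" predictors can mean, the sketch below searches for the mixing weight between two predictors that minimises squared error on a held-out set. It is a toy grid search with hypothetical names and made-up ratings, not the BellKor team's procedure, which the excerpt does not describe.

```python
def best_blend_weight(pred_a, pred_b, actual, steps=20):
    """Grid-search the weight w in [0, 1] for the blend w*a + (1 - w)*b.

    Returns the w with the lowest summed squared error against `actual`.
    All arguments are equal-length lists of numeric ratings.
    """
    best_w, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        err = sum(
            (w * a + (1 - w) * b - y) ** 2
            for a, b, y in zip(pred_a, pred_b, actual)
        )
        if err < best_err:
            best_w, best_err = w, err
    return best_w


# Made-up held-out star ratings and two hypothetical predictors.
actual = [4, 3, 5, 2, 1]
pred_a = [3.5, 3.5, 4.0, 2.5, 2.0]
pred_b = [4.5, 2.0, 5.0, 1.5, 1.5]

w = best_blend_weight(pred_a, pred_b, actual)
print(f"weight on predictor A: {w:.2f}")
```

With many predictors, the same idea is usually expressed as a regression of the true ratings on the individual predictions; the grid search here only keeps the two-predictor case easy to follow.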
Free Speech: Ten Principles for a Connected World by Timothy Garton Ash
A Declaration of the Independence of Cyberspace, activist lawyer, Affordable Care Act / Obamacare, Andrew Keen, Apple II, Ayatollah Khomeini, battle of ideas, Berlin Wall, bitcoin, British Empire, Cass Sunstein, Chelsea Manning, citizen journalism, Clapham omnibus, colonial rule, crowdsourcing, David Attenborough, don't be evil, Donald Davies, Douglas Engelbart, Edward Snowden, Etonian, European colonialism, eurozone crisis, failed state, Fall of the Berlin Wall, Ferguson, Missouri, Filter Bubble, financial independence, Firefox, Galaxy Zoo, George Santayana, global village, index card, Internet Archive, invention of movable type, invention of writing, Jaron Lanier, jimmy wales, John Markoff, Julian Assange, Mark Zuckerberg, Marshall McLuhan, mass immigration, megacity, mutually assured destruction, national security letter, Nelson Mandela, Netflix Prize, Nicholas Carr, obamacare, Peace of Westphalia, Peter Thiel, pre–internet, profit motive, RAND corporation, Ray Kurzweil, Ronald Reagan, semantic web, Silicon Valley, Simon Singh, Snapchat, social graph, Stephen Hawking, Steve Jobs, Steve Wozniak, The Death and Life of Great American Cities, The Wisdom of Crowds, Turing test, We are Anonymous. We are Legion, WikiLeaks, World Values Survey, Yom Kippur War
DeNardis 2014, 235–36 131. see Emily Steel and April Dembosky, ‘Facebook Raises Fears with Ad Tracking’, Financial Times, 23 September 2012, http://www.ft.com/intl/cms/s/0/6cc4cf0a-0584-11e2-9ebd-00144feabdc0.html#axzz3rracetFy 132. see Mayer-Schönberger et al. 2013 133. a useful summary is given by Nate Anderson, ‘“Anonymized” Data Really Isn’t—and Here’s Why Not’, arstechnica, http://perma.cc/Z3N7-7TC3. Sweeney’s article was published in Journal of Law, Medicine and Ethics, no. 25, 1997, 98–110 134. Mayer-Schönberger et al. 2013, 154–55 135. see the paper by Arvind Narayanan and Vitaly Shmatikov, ‘Robust De-anonymization of Large Datasets (How to Break Anonymity of the Netflix Prize Dataset)’, University of Texas, 2008, and their useful FAQs at http://perma.cc/9PBE-5BW5. For even more serious examples of the reidentification of supposedly anonymised medical data, see Nuffield Council on Bioethics 2015, 66–69 136. Ghonim 2012, chapters 3 and 4. He was the anonymous administrator of the Facebook page and used Tor to conceal his IP address 137. Josh Chin, ‘China Is Requiring People to Register Real Names for Some Internet Services’, Wall Street Journal, 4 February 2015, http://www.wsj.com/articles/china-to-enforce-real-name-registration-for-internet-users-1423033973.
Data Mining: Concepts and Techniques by Jiawei Han, Micheline Kamber, Jian Pei
bioinformatics, business intelligence, business process, Claude Shannon: information theory, cloud computing, computer vision, correlation coefficient, cyber-physical system, database schema, discrete time, distributed generation, finite state, information retrieval, iterative process, knowledge worker, linked data, natural language processing, Netflix Prize, Occam's razor, pattern recognition, performance metric, phenotype, random walk, recommendation engine, RFID, semantic web, sentiment analysis, speech recognition, statistical model, stochastic process, supply-chain management, text mining, thinkpad, Thomas Bayes, web application
False positives are less desirable because they can annoy or anger consumers. Content-based recommender systems are limited by the features used to describe the items they recommend. Another challenge for both content-based and collaborative recommender systems is how to deal with new users for which a buying history is not yet available. Hybrid approaches integrate both content-based and collaborative methods to achieve further improved recommendations. The Netflix Prize was an open competition held by an online DVD-rental service, with a payout of $1,000,000 for the best recommender algorithm to predict user ratings for films, based on previous ratings. The competition and other studies have shown that the predictive accuracy of a recommender system can be substantially improved when blending multiple predictors, especially by using an ensemble of many substantially different methods, rather than refining a single technique.
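The blending claim in the excerpt above can be illustrated with a minimal sketch. It assumes root-mean-square error (RMSE) as the accuracy measure and uses invented ratings; the names `rmse` and `blend` and all numbers are hypothetical and are not taken from the book or from any competition entry. Two predictors that err in different directions are averaged, and the blend's error is compared with each predictor's own error.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def blend(predictions, weights):
    """Weighted sum of several predictors' outputs, item by item."""
    n_items = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions))
        for i in range(n_items)
    ]


# Hypothetical true ratings and two predictors whose mistakes
# only partly overlap (illustrative data, not real results).
actual = [4.0, 3.0, 4.0, 2.0, 4.0, 2.0]
pred_a = [4.5, 2.5, 4.5, 1.5, 4.5, 1.5]
pred_b = [3.5, 3.5, 4.5, 1.5, 3.5, 2.5]

# Equal-weight blend; a real system would fit the weights on held-out data.
blended = blend([pred_a, pred_b], [0.5, 0.5])

print("RMSE A:    ", rmse(pred_a, actual))
print("RMSE B:    ", rmse(pred_b, actual))
print("RMSE blend:", rmse(blended, actual))
```

In this toy data the blend helps only because the two predictors disagree on four of the six items, so their errors cancel there. Averaging two predictors that make identical mistakes would leave the error unchanged, which is consistent with the excerpt's emphasis on "substantially different methods".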
When to Rob a Bank: ...And 131 More Warped Suggestions and Well-Intended Rants by Steven D. Levitt, Stephen J. Dubner
Affordable Care Act / Obamacare, Airbus A320, airport security, augmented reality, barriers to entry, Bernie Madoff, Black Swan, Broken windows theory, Captain Sullenberger Hudson, creative destruction, Daniel Kahneman / Amos Tversky, deliberate practice, feminist movement, food miles, George Akerlof, global pandemic, information asymmetry, invisible hand, loss aversion, mental accounting, Netflix Prize, obamacare, oil shale / tar sands, Pareto efficiency, peak oil, pre–internet, price anchoring, price discrimination, principal–agent problem, profit maximization, Richard Thaler, Sam Peltzman, security theater, Ted Kaczynski, the built environment, The Chicago School, the High Line, Thorstein Veblen, transaction costs, US Airways Flight 1549
. / 53 “A comprehensive Wall Street Journal article”: Sarah Rubenstein, “Why Generic Doesn’t Always Mean Cheap,” The Wall Street Journal, March 13, 2007. 57 “FOR $25 MILLION, NO WAY . . .”: “The virtues of offering big prizes to encourage . . . curing disease”: See Levitt, “Fight Global Pandemics (or at Least Find a Good Excuse When You’re Playing Hooky),” Freakonomics.com, May 18, 2007; “or improving Netflix’s algorithms”: See Levitt, “Netflix $ Million Prize,” Freakonomics.com, October 6, 2006. / 59 “As reported by ABC News”: See Matthew Cole, “U.S. Will Not Pay $25 Million Osama Bin Laden Reward, Officials Say,” ABCNews.com, May 19, 2011. 61 “CAN WE PLEASE GET RID OF THE PENNY ALREADY?”: A “60 Minutes segment called ‘Making Cents’”: See Morley Safer, “Should We Make Cents?,” 60 Minutes, February 10, 2008. 71 “JANE SIBERRY SNAPS”: “Anybody remember when Levitt announced . . .”: See Levitt, “The Two Smartest Musicians I Ever Met,” Freakonomics.com, April 5, 2006; and Levitt, “From Now on I Will Leave the Reporting to Dubner,” Freakonomics.com, April 9, 2006. 72 “HOW MUCH TAX ARE ATHLETES . . .”: “Manny Pacquiao will probably never fight in New York”: See “Manny Pacquiao Won’t Ever Fight in New York Due to State Tax Rates,” The Wall Street Journal, August 7, 2013. / 73 “Pacquiao may never fight anywhere in the U.S. again”: See Lance Pugmire, “Promoter: Manny Pacquiao May Never Again Fight in the U.S.,” The Los Angeles Times, May 31, 2013. / 73 “Phil Mickelson . . .
The Art of Statistics: Learning From Data by David Spiegelhalter
Antoine Gombaud: Chevalier de Méré, Bayesian statistics, Carmen Reinhart, complexity theory, computer vision, correlation coefficient, correlation does not imply causation, dark matter, Edmond Halley, Estimating the Reproducibility of Psychological Science, Hans Rosling, Kenneth Rogoff, meta analysis, meta-analysis, Nate Silver, Netflix Prize, p-value, placebo effect, probability theory / Blaise Pascal / Pierre de Fermat, publication bias, randomized controlled trial, recommendation engine, replication crisis, self-driving car, speech recognition, statistical model, The Design of Experiments, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Malthus
This reflects a general concern that algorithms that win Kaggle competitions tend to be very complex in order to achieve that tiny final margin needed to win. A major problem is that these algorithms tend to be inscrutable black boxes – they come up with a prediction, but it is almost impossible to work out what is going on inside. This has three negative aspects. First, extreme complexity makes implementation and upgrading a great effort: when Netflix offered a $1m prize for prediction recommendation systems, the winner was so complicated that Netflix ended up not using it. The second negative feature is that we do not know how the conclusion was arrived at, or what confidence we should have in it: we just have to take it or leave it. Simpler algorithms can better explain themselves. Finally, if we do not know how an algorithm is producing its answer, we cannot investigate it for implicit but systematic biases against some members of the community – a point I expand on below.
The Slow Fix: Solve Problems, Work Smarter, and Live Better in a World Addicted to Speed by Carl Honore
Albert Einstein, Atul Gawande, Broken windows theory, call centre, Checklist Manifesto, clean water, clockwatching, cloud computing, crowdsourcing, Dava Sobel, delayed gratification, drone strike, Enrique Peñalosa, Erik Brynjolfsson, Ernest Rutherford, Exxon Valdez, fundamental attribution error, game design, income inequality, index card, invention of the printing press, invisible hand, Isaac Newton, Jeff Bezos, John Harrison: Longitude, lateral thinking, lone genius, medical malpractice, microcredit, Netflix Prize, planetary scale, Ralph Waldo Emerson, RAND corporation, shareholder value, Silicon Valley, Skype, stem cell, Steve Jobs, Steve Wozniak, the scientific method, The Wisdom of Crowds, ultimatum game, urban renewal, War on Poverty
John Harrison: For the whole story check out Dava Sobel, Longitude: The True Story of a Lone Genius Who Solved the Greatest Scientific Problem of His Time (London: Fourth Estate, 1998). Teenager invents method for detecting pancreatic cancer: Jack Andraka won first place at the 2012 Intel International Science and Engineering Fair (Intel ISEF), a program of the Society for Science and the Public. IBM idea jams: From the company website at https://www.collaborationjam.com/ Netflix offers $1 million prize: “Innovation Prizes – And the Winner Is,” Economist, 5 August 2010. Fiat builds first crowdsourced car: “The Case for Letting Customers Design Your Products,” Inc. Magazine, 20 September 2011. Prototype military vehicle crowdsourced: Based on interview with Ariel Ferreira of Local Motors. Steve Jobs on designing without focus groups: From interview in Business Week, 25 May 1998. Consultants compared 600 programmers across 92 companies: Based on the “Coding War Games” study by Tom DeMarco and Timothy Lister.
Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris
always be closing, big data - Walmart - Pop Tarts, business intelligence, business process, call centre, commoditize, data acquisition, digital map, en.wikipedia.org, global supply chain, high net worth, if you build it, they will come, intangible asset, inventory management, iterative process, Jeff Bezos, job satisfaction, knapsack problem, late fees, linear programming, Moneyball by Michael Lewis explains big data, Netflix Prize, new economy, performance metric, personalized medicine, quantitative hedge fund, quantitative trading / quantitative finance, recommendation engine, RFID, search inside the book, shareholder value, six sigma, statistical model, supply-chain management, text mining, the scientific method, traveling salesman, yield management
The first is a movie-recommendation “engine” called Cinematch that’s based on proprietary, algorithmically driven software. Netflix hired mathematicians with programming experience to write the algorithms and code to define clusters of movies, connect customer movie rankings to the clusters, evaluate thousands of ratings per second, and factor in current Web site behavior—all to ensure a personalized Web page for each visiting customer. Netflix has also created a $1 million prize for quantitative analysts outside the company who can improve the Cinematch algorithm by at least 10 percent. Netflix CEO Reed Hastings notes, “If the Starbucks secret is a smile when you get your latte, ours is that the Web site adapts to the individual’s taste.”1 Netflix analyzes customers’ choices and customer feedback on the movies they have rented—over 1 billion reviews of movies they liked, loved, hated, and so forth—and recommends movies in a way that optimizes both the customer’s taste and inventory conditions.
Overcomplicated: Technology at the Limits of Comprehension by Samuel Arbesman
algorithmic trading, Anton Chekhov, Apple II, Benoit Mandelbrot, citation needed, combinatorial explosion, Danny Hillis, David Brooks, digital map, discovery of the americas, en.wikipedia.org, Erik Brynjolfsson, Flash crash, friendly AI, game design, Google X / Alphabet X, Googley, HyperCard, Inbox Zero, Isaac Newton, iterative process, Kevin Kelly, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mandelbrot fractal, Minecraft, Netflix Prize, Nicholas Carr, Parkinson's law, Ray Kurzweil, recommendation engine, Richard Feynman, Richard Feynman: Challenger O-ring, Second Machine Age, self-driving car, software studies, statistical model, Steve Jobs, Steve Wozniak, Steven Pinker, Stewart Brand, superintelligent machines, Therac-25, Tyler Cowen: Great Stagnation, urban planning, Watson beat the top human players on Jeopardy!, Whole Earth Catalog, Y2K
Exceptions must be cherished, rather than discarded, for exceptions or rare instances contain a large amount of information. The sophisticated machine learning techniques used in linguistics—employing probability and a large array of parameters rather than principled rules—are increasingly being used in numerous other areas, both in science and outside it, from criminal detection to medicine, as well as in the insurance industry. Even our aesthetic tastes are rather complicated, as Netflix discovered when it awarded a prize for improvements in its recommendation engine to a team whose solution was cobbled together from a variety of different statistical techniques. The contest seemed to demonstrate that no simple algorithm could provide a significant improvement in recommendation accuracy; the winners needed to use a more complex suite of methods in order to capture and predict our personal and quirky tastes in films.
Masters of Management: How the Business Gurus and Their Ideas Have Changed the World—for Better and for Worse by Adrian Wooldridge
affirmative action, barriers to entry, Black Swan, blood diamonds, borderless world, business climate, business cycle, business intelligence, business process, carbon footprint, Cass Sunstein, Clayton Christensen, cloud computing, collaborative consumption, collapse of Lehman Brothers, collateralized debt obligation, commoditize, corporate governance, corporate social responsibility, creative destruction, credit crunch, crowdsourcing, David Brooks, David Ricardo: comparative advantage, disintermediation, disruptive innovation, don't be evil, Donald Trump, Edward Glaeser, Exxon Valdez, financial deregulation, Frederick Winslow Taylor, future of work, George Gilder, global supply chain, industrial cluster, intangible asset, job satisfaction, job-hopping, joint-stock company, Joseph Schumpeter, Just-in-time delivery, Kickstarter, knowledge economy, knowledge worker, lake wobegon effect, Long Term Capital Management, low skilled workers, Mark Zuckerberg, McMansion, means of production, Menlo Park, mobile money, Naomi Klein, Netflix Prize, Network effects, new economy, Nick Leeson, Norman Macrae, patent troll, Ponzi scheme, popular capitalism, post-industrial society, profit motive, purchasing power parity, Ralph Nader, recommendation engine, Richard Florida, Richard Thaler, risk tolerance, Ronald Reagan, science of happiness, shareholder value, Silicon Valley, Silicon Valley startup, Skype, Social Responsibility of Business Is to Increase Its Profits, Steve Jobs, Steven Levy, supply-chain management, technoutopianism, The Wealth of Nations by Adam Smith, Thomas Davenport, Tony Hsieh, too big to fail, wealth creators, women in the workforce, young professional, Zipcar
They also discover that the crowds don’t always have their best interests at heart: when Justin Bieber, a Canadian teenage pop star, asked his fans for suggestions as to what country he should visit next, the most popular answer was North Korea.9 One popular solution to the problem of oversupply is to use prizes to give crowdsourcing a focus and structure. The value of prizes being offered by corporations has more than tripled over the past decade, to $375 million.10 Netflix offers a $1 million prize to anyone who can improve its film recommendation system by 10 percent. Frito-Lay offers prizes to people who can come up with new TV ads for its products. Indeed, prizes have become businesses in their own right: InnoCentive has created a network of 170,000 scientists who stand ready to solve R&D problems for a price. Regular users include some of the world’s biggest companies, such as Eli Lilly, which helped to found the network in 2001; Boeing; DuPont; and P&G.
The Runaway Species: How Human Creativity Remakes the World by David Eagleman, Anthony Brandt
active measures, Ada Lovelace, agricultural Revolution, Albert Einstein, Andrew Wiles, Burning Man, cloud computing, computer age, creative destruction, crowdsourcing, Dava Sobel, delayed gratification, Donald Trump, Douglas Hofstadter, en.wikipedia.org, Frank Gehry, Google Glasses, haute couture, informal economy, interchangeable parts, Isaac Newton, James Dyson, John Harrison: Longitude, John Markoff, lone genius, longitudinal study, Menlo Park, microbiome, Netflix Prize, new economy, New Journalism, pets.com, QWERTY keyboard, Ray Kurzweil, reversible computing, Richard Feynman, risk tolerance, self-driving car, Simon Singh, stem cell, Stephen Hawking, Steve Jobs, Stewart Brand, the scientific method, Watson beat the top human players on Jeopardy!, wikimedia commons, X Prize
Twenty-six crafts from around the world competed, with designs from rocket fins to airplane wings. The prize eventually went to Mojave Aerospace’s SpaceShipOne. By spreading the net widely, realizing the dream of privatized space travel came a step closer. And this crowdsourcing strategy is becoming increasingly popular. When Netflix wanted to boost its algorithms for personalized movie suggestions, the company realized it would be cheaper to sponsor a $1 million global prize than do the work in-house. Netflix published a sample set of data, with the goal of a 10 percent improvement over its own high-water mark. Tens of thousands of teams competed. Most of the attempts didn’t make the cut, but two teams surpassed Netflix’s desired threshold. With a small investment, Netflix had tackled a problem by encouraging thousands of solutions.
Keeping Up With the Quants: Your Guide to Understanding and Using Analytics by Thomas H. Davenport, Jinho Kim
Black-Scholes formula, business intelligence, business process, call centre, computer age, correlation coefficient, correlation does not imply causation, Credit Default Swap, en.wikipedia.org, feminist movement, Florence Nightingale: pie chart, forensic accounting, global supply chain, Hans Rosling, hypertext link, invention of the telescope, inventory management, Jeff Bezos, Johannes Kepler, longitudinal study, margin call, Moneyball by Michael Lewis explains big data, Myron Scholes, Netflix Prize, p-value, performance metric, publish or perish, quantitative hedge fund, random walk, Renaissance Technologies, Robert Shiller, self-driving car, sentiment analysis, six sigma, Skype, statistical model, supply-chain management, text mining, the scientific method, Thomas Davenport
Many students opted to try their hand at the Netflix Challenge: to design a movie recommendations algorithm that does better than the one developed by Netflix. Here’s how the competition works. Netflix has provided a large data set that tells you how nearly half a million people have rated about 18,000 movies. Based on these ratings, you are asked to predict the ratings of these users for movies in the set that they have not rated. The first team to beat the accuracy of Netflix’s proprietary algorithm by a certain margin wins a prize of $1 million! Different student teams in my class adopted different approaches to the problem, using both published algorithms and novel ideas. Of these, the results from two of the teams illustrate a broader point. Team A came up with a very sophisticated algorithm using the Netflix data. Team B used a very simple algorithm, but they added in additional data beyond the Netflix set: information about movie genres from the Internet Movie Database (IMDB).
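The prediction task described above (estimating the rating a user would give a movie they have not yet rated) can be sketched with a very simple baseline. This is an illustrative assumption, not the approach of either student team or of Netflix: it predicts the overall mean rating, shifted by how generous the user tends to be and by how well-liked the movie tends to be. The data, the 1-to-5 scale, and the names `fit_baseline` and `predict` are all hypothetical.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Compute the global mean plus per-user and per-movie offsets.

    `ratings` maps (user, movie) pairs to a numeric rating.
    """
    global_mean = sum(ratings.values()) / len(ratings)
    by_user = defaultdict(list)
    by_movie = defaultdict(list)
    for (user, movie), rating in ratings.items():
        by_user[user].append(rating)
        by_movie[movie].append(rating)
    user_bias = {u: sum(rs) / len(rs) - global_mean for u, rs in by_user.items()}
    movie_bias = {m: sum(rs) / len(rs) - global_mean for m, rs in by_movie.items()}
    return global_mean, user_bias, movie_bias


def predict(model, user, movie):
    """Predict a rating, clamped to an assumed 1-to-5 scale.

    Unknown users or movies contribute no offset.
    """
    global_mean, user_bias, movie_bias = model
    raw = global_mean + user_bias.get(user, 0.0) + movie_bias.get(movie, 0.0)
    return min(5.0, max(1.0, raw))


# Tiny invented data set: three users, three movies, six known ratings.
known = {
    ("u1", "m1"): 5, ("u1", "m2"): 3,
    ("u2", "m1"): 4, ("u2", "m3"): 2,
    ("u3", "m2"): 4, ("u3", "m3"): 3,
}

model = fit_baseline(known)
# Predict pairs that are absent from the known ratings.
print(predict(model, "u1", "m3"))
print(predict(model, "u3", "m1"))
```

A baseline like this uses only the ratings themselves. Adding outside information, as Team B did with genre data, would mean feeding extra features into the predictor rather than changing how its accuracy is measured.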
Eat People: And Other Unapologetic Rules for Game-Changing Entrepreneurs by Andy Kessler
23andMe, Andy Kessler, bank run, barriers to entry, Berlin Wall, Bob Noyce, British Empire, business cycle, business process, California gold rush, carbon footprint, Cass Sunstein, cloud computing, collateralized debt obligation, collective bargaining, commoditize, computer age, creative destruction, disintermediation, Douglas Engelbart, Eugene Fama: efficient market hypothesis, fiat currency, Firefox, Fractional reserve banking, George Gilder, Gordon Gekko, greed is good, income inequality, invisible hand, James Watt: steam engine, Jeff Bezos, job automation, Joseph Schumpeter, Kickstarter, knowledge economy, knowledge worker, libertarian paternalism, low skilled workers, Mark Zuckerberg, McMansion, Netflix Prize, packet switching, personalized medicine, pets.com, prediction markets, pre–internet, profit motive, race to the bottom, Richard Thaler, risk tolerance, risk-adjusted returns, Silicon Valley, six sigma, Skype, social graph, Steve Jobs, The Wealth of Nations by Adam Smith, transcontinental railway, transfer pricing, wealth creators, Yogi Berra
Again, all those powerful machines at the edge and huge networks of servers in the cloud with giant repositories of all the things we’ve done, what our friends are doing, what the average twenty-seven-year-old from Sheboygan, Wisconsin, is likely to do. Amazon uses a limited version of this in their recommendations, but more as a marketing tool to get you to buy yet another book. Others who view this item bought this book. We recommend that. They look for patterns and crudely overlay them on your page views in their system. Netflix, the DVD rental and streaming video company, offered a prize of $1 million for a better algorithm to suggest movies you might like to watch. These are all early adopters of the adaptive model. But why not recommend books based on my search history? If I’m searching on Google for information on the Ottoman Empire, surely there are a dozen books that ought to pop up that will be of interest immediately, without me heading to Amazon to find out.
Unleashed by Anne Morriss, Frances Frei
"side hustle", Airbnb, Donald Trump, future of work, gig economy, glass ceiling, Grace Hopper, Jeff Bezos, Netflix Prize, Network effects, performance metric, race to the bottom, ride hailing / ride sharing, Silicon Valley, Steve Jobs, TaskRabbit, Tony Hsieh, Toyota Production System, Travis Kalanick, Uber for X, women in the workforce
Anyone who has felt more optimistic after walking into Starbucks—or felt cooler when they chose that indie coffee shop instead—knows that culture can influence anyone who interacts with it. If your ambition as a leader is maximum impact, then learn to become a culture warrior. Among the most effective culture warriors walking the planet is Patty McCord, former chief talent officer at Netflix. You won’t find empty values statements on the walls of a Netflix conference room, not on McCord’s watch. As she helped to build Netflix into a media giant, McCord articulated the behaviors the company prized most—there are nine—and then used them to drive all hiring, compensation, and exit decisions. She socialized new recruits on these behaviors in a famous hundred-slide presentation on Netflix’s unique culture, and then reinforced them constantly, for example, invoking “honesty” (number eight) if colleagues withheld feedback from each other. (Sheryl Sandberg described McCord’s presentation, known as the Netflix Culture Deck, as “the most important document ever to come out of [Silicon] Valley.”5) McCord also challenged employees to question each other’s actions if they were inconsistent with Netflix culture, an act she explicitly labeled an expression of “courage” (number six).
Singularity Rising: Surviving and Thriving in a Smarter, Richer, and More Dangerous World by James D. Miller
23andMe, affirmative action, Albert Einstein, artificial general intelligence, Asperger Syndrome, barriers to entry, brain emulation, cloud computing, cognitive bias, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, David Brooks, David Ricardo: comparative advantage, Deng Xiaoping, en.wikipedia.org, feminist movement, Flynn Effect, friendly AI, hive mind, impulse control, indoor plumbing, invention of agriculture, Isaac Newton, John von Neumann, knowledge worker, Long Term Capital Management, low skilled workers, Netflix Prize, neurotypical, Norman Macrae, pattern recognition, Peter Thiel, phenotype, placebo effect, prisoner's dilemma, profit maximization, Ray Kurzweil, recommendation engine, reversible computing, Richard Feynman, Rodney Brooks, Silicon Valley, Singularitarianism, Skype, statistical model, Stephen Hawking, Steve Jobs, supervolcano, technological singularity, The Coming Technological Singularity, the scientific method, Thomas Malthus, transaction costs, Turing test, twin studies, Vernor Vinge, Von Neumann architecture
The AI would obviously benefit from faster computers, but it could also improve its performance by using data from DNA sequencing and brain scans to conduct large statistical studies so as to better categorize people. For example, if 90 percent of people who had some unusual allele or brain microstructure enjoyed a certain cat video, then the AI recommender would suggest the video to all other viewers who had that trait. 12. Amenable to Crowdsourcing—Netflix, the rent-by-mail and streaming video distributor, offered (and eventually paid) a $1 million prize to whichever group improved its recommendation system the most, so long as at least one group improved the system by at least 10 percent. This “crowdsourcing,” which occurs when a problem is thrown open to anyone, helps a company by allowing them to draw on the talents of strangers, while only paying the strangers if they help the firm. This kind of crowdsourcing works only if, as with a video recommendation system, there is an easy and objective way of measuring progress toward the crowdsourced goal. 13. Potential Improvement All the Way Up to Superhuman Artificial General Intelligence—A recommendation AI could slowly morph into a content creator.
See also cognitive-enhancement drugs mental ability, general, 64 Methuselah Foundation, 170 Mexicans, millions of illegal, 135 microprocessors, 122 military spending, 180 military threats, 126–28 Milky Way galaxy, 45, 199 minority students, “acceptable” number of, 87 missile technology, 124 modafinil (cognitive-enhancement drug), 104 “modafinil squared” technology, 159 modern man, 77 “A Modest Proposal: Allow Women to Pay for College in Eggs” (Miller), 88 moons of Jupiter, 41 Moore, Gordon, xvi Moore’s Law, xivi 3–5, 8–9, 11, 17, 209 Muehlhauser, Luke, 7 multimillionaires, self-made, 209 murder, 39 musicians, 108 N nanobots, 10 nanosensors, 127 nanotechnology based weapons, 127 computing hardware, to improve, 8 to control our emotions and levels of happiness, 8 free energy from our galaxy stars and, 45 moons of Jupiter and, 41 office buildings, to construct large, 181 restorative, 213 virtual reality and, 171 nanotech weapons, 197 nanotubes, 4, 17 narcolepsy, 184 narcoleptics, 105 NASA, 217 National Center for Education Statistics (Washington, DC), 172 national defense, 180 National Football League (NFL), 90 natural disasters, 197 natural selection, 171 Nature, 109 Nazi Germany, 23 Netflix, 20 neuroscience, 185 neuroscientists, 13, 17, 203 Newton, Isaac, 91 New York Times, x NFL. See National Football League (NFL) Nobel science prizes, 96 nonparallel processing, 18 Normans in 1066, 187 North Korea, 187 Norvig, Peter, 35 nuclear war, xi, 197. 
See also thermonuclear war nuclear weapons, 24, 126 nutritional-supplement regime, 179 O Obama, Barack (President), 73 obsolescence, 144, 147 The Odyssey (Homer), 61 Omohundro, Steve, 25 Overcoming Bias (blog), 138, 207 P Pac-Man video game, 209 parallel processing, 18–19 Parkinson’s disease, 168–69 Pascal, Blaise, 208 patents and copyrights, 143 people, long term—oriented, 80 person, anonymous, 93 pharmaceutical product development, 183 phonetic pattern of language, 91 pirate maps, 184 placebo effect, 110–11 plagues, 36, 45, 78 plastic surgery, 89 Plath, Sylvia, 92 political correctness, 172 Polizzi, Eric, 5 population groups, 75–76, 96, 173 pornography, hard-core, 38, 195 pornography, Internet, 194 post-Singularity civilization, 199–200, 221 goods, 42 operating-system world, 41 pre-Singularity property will have value of, 188 pre-Singularity will be worthless, 187 property rights, 56, 188 race throughout the galaxy, 199 ultra-AI and chess, 132 value, 189 value of education, 192 value of money, 211 Praetorian Guard, 148 pre-Singularity destructive technologies, 201–2 investments, ultra-AI might obliterate the value of, 187 property rights, 56, 187–89 value of money, 211 prisoner-of-war camp, 31 Prisoners’ Dilemma AI development and, 47–53 annihilation of mankind, xix Chinese militaries and, 48–53 drug use and risk of schizophrenia or kidney failure, 160–62 unleaded vs. 
leaded petrol (gas), 57 US militaries and, 48–53 probe, self-replicating, 199–200 procrastination, 106 production wands, 145 pro-eugenic Chinese, ix prognosticators, 206–7 property owning, 147 property rights economic behavior and, xviii post-Singularity, 56 stable, 82 property rights, pre-Singularity, 187 property rights of bio-humans, 149 Psychology Today, 195 psychotic breakdowns, 120 Q quantum computing, 5, 17 quantum effects, 4 R rabbit population, 142 race, star-faring, 200 racial classifications, 76 racial equality, 173 rapture of the nerds, 208 Rattner, Justin, 35 real estate developer, 181–82 real estate development, 188 recessive condition, inherited, 83 Recursive Darkness (horse), 55, 57 Reed, Leonard, 204–5 religious disagreement, 43 reproductive fitness, 76 reproductive success, 75 resale value, 181 residential housing, 181 The Restaurant at the End of the Universe (Adams), 150 retirement savings, 175–76 reverse-engineering biology, 203 reversible computing, 17 Ricardian comparative advantage, 136–37, 143, 188, 190 Ricardian scenario, 189 Ricardo, David, xvii, 135, 143 rich investors, 144–45 Ritalin (cognitive-enhancement drug), xiv, 104–5 Robin.
Dawn of the New Everything: Encounters With Reality and Virtual Reality by Jaron Lanier
4chan, augmented reality, back-to-the-land, Buckminster Fuller, Burning Man, carbon footprint, cloud computing, collaborative editing, commoditize, cosmological constant, creative destruction, crowdsourcing, Donald Trump, Douglas Engelbart, Douglas Hofstadter, El Camino Real, Elon Musk, Firefox, game design, general-purpose programming language, gig economy, Google Glasses, Grace Hopper, Gödel, Escher, Bach, Hacker Ethic, Howard Rheingold, impulse control, information asymmetry, invisible hand, Jaron Lanier, John von Neumann, Kevin Kelly, Kickstarter, Kuiper Belt, lifelogging, mandelbrot fractal, Mark Zuckerberg, Marshall McLuhan, Menlo Park, Minecraft, Mitch Kapor, Mother of all demos, Murray Gell-Mann, Netflix Prize, Network effects, new economy, Norbert Wiener, Oculus Rift, pattern recognition, Paul Erdős, profit motive, Ray Kurzweil, recommendation engine, Richard Feynman, Richard Stallman, Ronald Reagan, self-driving car, Silicon Valley, Silicon Valley startup, Skype, Snapchat, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Stewart Brand, technoutopianism, Ted Nelson, telemarketer, telepresence, telepresence robot, Thorstein Veblen, Turing test, Vernor Vinge, Whole Earth Catalog, Whole Earth Review, WikiLeaks, wikimedia commons
What if you make yourself dumb to make a computer look smart?” “That could never happen.” Present-day Jaron must interject, from this doubly indented location,11 to defend his old self with more recent stories. It has happened! Now that computation runs our lives, we make ourselves dumb to make computers look smart all the time. Consider Netflix. The company claims that its smart algorithm gets to know you and then recommends movies. The company even offered a million-dollar prize for ideas to make the algorithm smarter. The thing about Netflix, though, is that it doesn’t offer a comprehensive catalog, especially of recent, hot releases. If you think of any particular movie, it might not be available for streaming. The recommendation engine is a magician’s misdirection, distracting you from the fact that not everything is available.
Everything Is Obvious: *Once You Know the Answer by Duncan J. Watts
active measures, affirmative action, Albert Einstein, Amazon Mechanical Turk, Black Swan, business cycle, butterfly effect, Carmen Reinhart, Cass Sunstein, clockwork universe, cognitive dissonance, coherent worldview, collapse of Lehman Brothers, complexity theory, correlation does not imply causation, crowdsourcing, death of newspapers, discovery of DNA, East Village, easy for humans, difficult for computers, edge city, en.wikipedia.org, Erik Brynjolfsson, framing effect, Geoffrey West, Santa Fe Institute, George Santayana, happiness index / gross national happiness, high batting average, hindsight bias, illegal immigration, industrial cluster, interest rate swap, invention of the printing press, invention of the telescope, invisible hand, Isaac Newton, Jane Jacobs, Jeff Bezos, Joseph Schumpeter, Kenneth Rogoff, lake wobegon effect, Laplace demon, Long Term Capital Management, loss aversion, medical malpractice, meta analysis, meta-analysis, Milgram experiment, natural language processing, Netflix Prize, Network effects, oil shock, packet switching, pattern recognition, performance metric, phenotype, Pierre-Simon Laplace, planetary scale, prediction markets, pre–internet, RAND corporation, random walk, RFID, school choice, Silicon Valley, social intelligence, statistical model, Steve Ballmer, Steve Jobs, Steve Wozniak, supply-chain management, The Death and Life of Great American Cities, the scientific method, The Wisdom of Crowds, too big to fail, Toyota Production System, ultimatum game, urban planning, Vincenzo Peruggia: Mona Lisa, Watson beat the top human players on Jeopardy!, X Prize
The funding agency DARPA, for example, was able to harness the collective creativity of dozens of university research labs to build self-driving robot vehicles by offering just a few million dollars in prize money—far less than it would have cost to fund the same amount of work with conventional research grants. Likewise, the $10 million Ansari X Prize elicited more than $100 million worth of research and development in pursuit of building a reusable spacecraft. And the video rental company Netflix got some of the world’s most talented computer scientists to help it improve its movie recommendation algorithms for just a $1 million prize. Inspired by these examples—along with “open innovation” companies like Innocentive, which conducts hundreds of prize competitions in engineering, computer science, math, chemistry, life sciences, physical sciences, and business—governments are wondering if the same approach can be used to solve otherwise intractable policy problems.
Range: Why Generalists Triumph in a Specialized World by David Epstein
Airbnb, Albert Einstein, Apple's 1984 Super Bowl advert, Atul Gawande, Checklist Manifesto, Claude Shannon: information theory, Clayton Christensen, clockwork universe, cognitive bias, correlation does not imply causation, Daniel Kahneman / Amos Tversky, deliberate practice, Exxon Valdez, Flynn Effect, Freestyle chess, functional fixedness, game design, Isaac Newton, Johannes Kepler, knowledge economy, lateral thinking, longitudinal study, Louis Pasteur, Mark Zuckerberg, medical residency, meta analysis, meta-analysis, Mikhail Gorbachev, Nelson Mandela, Netflix Prize, pattern recognition, Paul Graham, precision agriculture, prediction markets, premature optimization, pre–internet, random walk, randomized controlled trial, retrograde motion, Richard Feynman, Richard Feynman: Challenger O-ring, Silicon Valley, Stanford marshmallow experiment, Steve Jobs, Steve Wozniak, Steven Pinker, Walter Mischel, Watson beat the top human players on Jeopardy!, Y Combinator, young professional
Evil by Design: Interaction Design to Lead Us Into Temptation by Chris Nodder
4chan, affirmative action, Amazon Mechanical Turk, cognitive dissonance, crowdsourcing, Daniel Kahneman / Amos Tversky, Donald Trump, en.wikipedia.org, endowment effect, game design, haute couture, jimmy wales, Jony Ive, Kickstarter, late fees, loss aversion, Mark Zuckerberg, meta analysis, meta-analysis, Milgram experiment, Netflix Prize, Nick Leeson, Occupy movement, pets.com, price anchoring, recommendation engine, Rory Sutherland, Silicon Valley, Stanford prison experiment, stealth mode startup, Steve Jobs, telemarketer, Tim Cook: Apple, trickle-down economics, upwardly mobile
Average Is Over: Powering America Beyond the Age of the Great Stagnation by Tyler Cowen
Amazon Mechanical Turk, Black Swan, brain emulation, Brownian motion, business cycle, Cass Sunstein, choice architecture, complexity theory, computer age, computer vision, computerized trading, cosmological constant, crowdsourcing, dark matter, David Brooks, David Ricardo: comparative advantage, deliberate practice, Drosophila, en.wikipedia.org, endowment effect, epigenetics, Erik Brynjolfsson, eurozone crisis, experimental economics, Flynn Effect, Freestyle chess, full employment, future of work, game design, income inequality, industrial robot, informal economy, Isaac Newton, Johannes Kepler, John Markoff, Khan Academy, labor-force participation, Loebner Prize, low skilled workers, manufacturing employment, Mark Zuckerberg, meta analysis, meta-analysis, microcredit, Myron Scholes, Narrative Science, Netflix Prize, Nicholas Carr, P = NP, pattern recognition, Peter Thiel, randomized controlled trial, Ray Kurzweil, reshoring, Richard Florida, Richard Thaler, Ronald Reagan, Silicon Valley, Skype, statistical model, stem cell, Steve Jobs, Turing test, Tyler Cowen: Great Stagnation, upwardly mobile, Yogi Berra
See also artificial intelligence (AI) Mechanical Turk, 148–49 mechanization, 126–27 media, 146 median incomes, 38, 52, 60, 253 Medicaid, 234–39, 250 medical diagnosis, 87–89, 128–29 Medicare, 232–35, 237–38, 242 Medication Adherence Scores, 124 Mediterranean Europe, 174–75 memory, 151–55 meritocracy, 189–90, 230–31 meta-rationality, 82, 115 meta-studies, 224–25 Mexico, 168, 171, 177, 242–43 microcredit, 222–23 microeconomics, 212, 225 “micro-intelligibility,” 219 mid-wage occupations, 38 military, 29, 57 Millennium Prize Problems, 207–8 minimum wage, 59, 60 modes of employment, 35–36 monetarist theory, 226 MOOCs (massive open online courses), 180 Moonwalking with Einstein (Foer), 152 Moore’s law, 10, 15–16 moral issues, 26, 130–31 morale in the workplace, 30, 36 Mormon Church, 197 Morphy, Paul, 106 motivation, 197–202, 203 movie ratings, 121 Moxon’s Master, 134 Mueller, Andreas, 59 multinational corporations, 164 Murray, Charles, 231, 249 music, 146–47, 158 Myspace, 42, 209 mysticism, 153 Nakamura, Hikaru, 80 Narrative Science, 8–9 natural gas production, 177 natural language, 7, 119, 140–41 Naum (chess program), 72 negotiations in business, 12–13, 73 Netflix, 9 Nevada, 8 The New York Times, 11–12 Newton, Isaac, 153 Ng, Jennifer Hwee Kwoon, 89 Nickel, Arno, 81 Nielsen, Dagh, 80 Nobel Prizes, 187, 216 non-tradeable sectors, 176 North American Free Trade Agreement (NAFTA), 8 Northeast US, 241 “nudge” concept, 105 Obama healthcare reform, 237–38 Occupy Wall Street, 230, 251, 253, 256 O’Daniel, Karrah, 96 offshoring, 175. See also outsourcing “off-the-grid” living, 246–47 online dating, 9, 16, 95–98, 125, 144–45 online education, 179–85 opportunity cost, 184 options-pricing theory, 203 outsourcing, 162, 163–71 overseas labor markets, 59 “P vs.
Randomistas: How Radical Researchers Changed Our World by Andrew Leigh
Albert Einstein, Amazon Mechanical Turk, Anton Chekhov, Atul Gawande, basic income, Black Swan, correlation does not imply causation, crowdsourcing, David Brooks, Donald Trump, ending welfare as we know it, Estimating the Reproducibility of Psychological Science, experimental economics, Flynn Effect, germ theory of disease, Ignaz Semmelweis: hand washing, Indoor air pollution, Isaac Newton, Kickstarter, longitudinal study, loss aversion, Lyft, Marshall McLuhan, meta analysis, meta-analysis, microcredit, Netflix Prize, nudge unit, offshore financial centre, p-value, placebo effect, price mechanism, publication bias, RAND corporation, randomized controlled trial, recommendation engine, Richard Feynman, ride hailing / ride sharing, Robert Metcalfe, Ronald Reagan, statistical model, Steven Pinker, uber lyft, universal basic income, War on Poverty
McKinsey 139 medical drug trials 26–31, 199–201 and clinical trials 28 see also ‘placebo effect’; streptomycin trial medical randomised trials 3–6, 24–34, 183–4, 190, 201–2 and Archie Cochrane 27–8, 190 and control groups 13, 24 ‘double-blind’ studies 26 and ethics 13, 21, 31, 56, 182, 184 and evidence-based medicine 26–7 and exercise science 33, 201–2 and Head Injury Retrieval Trial 183–4 and Iain Chalmers 28 and James Lind 3–6, 16, 23 and scurvy treatment trials 3–5, 16 ‘single subject’ trials 168 and single-centre trials 197 and vaccinations 113 see also scurvy treatment trials meta-analysis 85 Miami Herald 60 microcredit 5, 6, 105–6, 121, 123 microtargeting 141 Minchin, Tim, and ‘Storm’ 32–3 Minneapolis Domestic Violence Experiment 90–1 see also Lawrence Sherman ‘modern synthesis’ 53 see also Ronald Fisher Moore, Evelyn 67 Morris, Nigel 128–9 motivation 107–8 ‘Moving to Opportunity’ program 39–40 Moynihan, Senator Daniel Patrick 206 Mullainathan, Sendhil 109, 209 Muralidharan, Karthik 110–11, 123–4 Murphy, Patrick 89 see also US Police Foundation Mycoskie, Blake 113 National Association for Music Education 76 National Council for Evaluation of Social Development 210 natural field experiments 176 neighbourhood project 38–40 see also Maricela Quintanar Neighbourhood Watch 94, 98, 183 Netflix 7, 143 Netscape 6 see also Jim Barksdale Newton, Isaac 203 ‘Nimechill’ 120 see also sex education No Child Left Behind Act 210 Nobel Peace Prize 105 see also Muhammad Yunus N-of-1 168–9 Nosek, Brian, and replication of psychology studies 196 Nudge Unit 170–1, 206 see also David Halpern Obama, President Barack 146–8, 159, 169 observational analysis 11 OkCupid, and the dating website trial 129–30 Okonjo-Iweala, Ngozi 102 see also YouWiN!
Olds, David 211 ‘once and done’ campaign, and Smile Train aid charity 158 O’Neill, John, and Black Saturday 2009 13–14 O’Neill, Maura 210 Oportunidades Mexico 117 see also President Vicente Fox Oregon research on health insurance 42 parachute study, and randomised evaluation of 12 Pare, Ambroise, and soldiers’ gunpowder burns 22–3 parenting programs 68–9 and Chicago ‘Parent Academy’ 9 and Incredible Years Basic Parenting Programme 69 and randomised evaluations 70 ‘Triple P’ positive parenting program 68–9 ‘partial equilibrium’ effect 191 Peirce, Charles Sanders 49–51 Perry, Rick 150–1 Perry Preschool 66–8, 71, 169, 191–2 see also David Weikart; Evelyn Moore ‘P-hacking’ 195–6 Piaget, Jean 66 Pinker, Steven 177 placebo effect 10, 29–31, 34, 138, 192 and John Haygarth 23–4 placebo surgery 18–21 see also sham surgery Planet Money 103 policing programs 91–4, 209 ‘broken windows policing’ 209 and ‘hot spots’ policing 93 and ‘problem oriented policing’ 94 and randomised evaluations 94 see also criminal justice experiments; Lawrence Sherman; Patrick Murphy; Rudi Lammers political campaign strategies and Benin political campaign 160 and control groups 148, 155 and ‘deep canvassing’ 163–4 and Harold Gosnell 148–50 and lobbying in US 162 and online campaigning 154–5 and political speeches 160–1 and ‘robocalls’ 152 and Sierra Leone election debates 161 and use of ‘social pressure’ 151–2 see also Get Out the Vote Pope Benedict XVI 119 ‘power of free’ theory 112 pragmatism 50 see also Charles Sanders Peirce ‘problem oriented policing’ 94 Programme for International Student Assessment 73 Progresa Mexico 117–18 see also President Ernesto Zedillo Project Independence 60–1 see also Ben Graber; Judith Gueron; Manpower Demonstration Research Corporation (MDRC) Project STAR experiment 81 Promise Academy 78–9 Prospera Mexico 118 psychology experiments 50–1, 143, 170, 177, 196 see also Charles Sanders Peirce; Joseph Jastrow ‘publication bias’ 199 Pyrotron 14–15 see also Andrew Sullivan 
Quintanar, Maricela 38–40 Quora 131 RAND Health Insurance Experiment 41, 169 randomised auditing 174–5 randomised trials see also A/B testing and ‘anchoring’ effect 133 and the book of Daniel 22 and Community Led Sanitation 116 and control groups 13, 67–8, 74, 78, 82 and data collection 171–2 and the driving licence experiment 109 and the ‘experimental idea’ 194 fairness of 37, 100, 177, 185 and ‘fixed mindset’ 6 and ‘general equilibrium’ effect 191 and the ‘gold standard’ 194 and ‘growth mindset’ 6 and ‘healthy cohort’ effect 12 and Highest Paid Person’s Opinion (HiPPO) 6 and Kenyan mini-bus driver experiments 115–16 and ‘natural experiments’ 193 and N-of-1 168–9 and the No Child Left Behind Act 210 and ‘the paradox of choice’ 195 and ‘partial equilibrium’ effect 191 and ‘publication bias’ 199 and replication of 90, 124, 195, 197–8 and sex education 119–20 and single-centre trials 197 and ‘virginity pledges’ in the US 46–7 randomistas, Angus Deaton Nobel laureate on 12 Read India 188 see also Rukmini Banerji Reagan, President Ronald 59, 151 Registry for International Development Impact Evaluations 199 replication 90, 195, 197–8 ‘restorative justice conferencing’ 84 restorative justice experiments 85–6, 182 Results for America 211 Rhinehart, Luke, and The Dice Man 180 Roach, William 52 ‘robocalls’ 152 Romney, Mitt 147 Rossi, Peter 190 ‘Rossi’s Law’ 190, 206 Rothamsted Experimental Centre 53 Rudder, Christian 130 see also OkCupid Sachs, Jeffrey 121 Sackett, David 27, 206 Sacred Heart Mission 36 Salk, Jonas 168 Salvation Army’s ‘Red Kettle’ Christmas drive 157 Sandberg, Sheryl 144 Saut, Fabiola Vasquez 110 see also Acayucan road experiment ‘scaling proven success,’ and ‘Development Innovation Ventures’ 210 Scared Straight 7–8, 94, 98–9, 189 see also Danny Glover; James Finckenauer Schmidt, Eric, and Google 143 Schwarzenegger, Arnold 75, 173 Science 163 ‘Science of Philanthropy Initiative’ 159 scurvy treatment trials 3–5, 16 see also Gilbert Blane; James Cook; James 
Lind; William Stark Second Chance Act 210 Seeger, Pete, and ‘The Draft Dodger Rag’ 42 Semmelweis, Ignaz 25 Sesame Street 63–5, 83 see also Joan Cooney sex education 119–20 sham surgery trials 19–20, 182 and ‘clinical equipoise’ 21 Sherman, Lawrence 91–4, 101 ‘Shoes for Better Tomorrows’ (TOMS) 113–15 see also Blake Mycoskie; Bruce Wydick Sierra Leone election debates 161 see also Saa Badabla SimCalc, and online learning tools 77 ‘single subject’ trials 168–9 see also N-of-1 Siroker, Dan 148 Sliding Doors 9 Smile Train aid charity, and ‘once and done’ campaign 158 social experiments large-scale 41 social field experiments and control groups 37, 39–41, 139 and credit card upgrades 132–3 and pay rates 136–7 and retail discounts 133 and ‘split cable’ techniques 139–40 and Western Union money transfers 130 social program trials and Kenyan electricity trial 110 and smoking deterrents 47–8 see also Acayucan road experiment; neighbourhood project social service agencies 36, 69 ‘soft targeting’ 36 ‘split cable’ technique 139–40 St.
The Future Is Asian by Parag Khanna
3D printing, Admiral Zheng, affirmative action, Airbnb, Amazon Web Services, anti-communist, Asian financial crisis, asset-backed security, augmented reality, autonomous vehicles, Ayatollah Khomeini, barriers to entry, Basel III, blockchain, Boycotts of Israel, Branko Milanovic, British Empire, call centre, capital controls, carbon footprint, cashless society, clean water, cloud computing, colonial rule, computer vision, connected car, corporate governance, crony capitalism, currency peg, deindustrialization, Deng Xiaoping, Dissolution of the Soviet Union, Donald Trump, energy security, European colonialism, factory automation, failed state, falling living standards, family office, fixed income, flex fuel, gig economy, global reserve currency, global supply chain, haute couture, haute cuisine, illegal immigration, income inequality, industrial robot, informal economy, Internet of things, Kevin Kelly, Kickstarter, knowledge worker, light touch regulation, low cost airline, low cost carrier, low skilled workers, Lyft, Malacca Straits, Mark Zuckerberg, megacity, Mikhail Gorbachev, money market fund, Monroe Doctrine, mortgage debt, natural language processing, Netflix Prize, new economy, off grid, oil shale / tar sands, open economy, Parag Khanna, payday loans, Pearl River Delta, prediction markets, purchasing power parity, race to the bottom, RAND corporation, rent-seeking, reserve currency, ride hailing / ride sharing, Ronald Reagan, Scramble for Africa, self-driving car, Silicon Valley, smart cities, South China Sea, sovereign wealth fund, special economic zone, stem cell, Steve Jobs, Steven Pinker, supply-chain management, sustainable-tourism, trade liberalization, trade route, transaction costs, Travis Kalanick, uber lyft, upwardly mobile, urban planning, Washington Consensus, working-age population, Yom Kippur War
., 49, 265, 316 Ganges region, 29, 32 Ganges River, 33, 35, 46 “Gangnam Style” (music video), 343 Gates, Bill, 317 Geely, 194 General Electric, 110, 168, 211 Genghis Khan, 39–40 Georgia, Republic of, 59 technocracy in, 307 Germany, Nazi, 50 Germany, unified: Arab refugees in, 255 Asian immigrants in, 253, 254, 256 Asia’s relations with, 242 multiparty consensus in, 284 Ginsberg, Allen, 331 Giving Pledge, 317 Global-is-Asian, 22 globalization: Asia and, 8–9, 162, 357–59; see also Asianization growth of, 14 global order, see world order Goa, 44, 89, 186 Göbekli Tepe, 28 Goguryeo Kingdom, 34 Go-Jek, 187 Golden Triangle, 123 Google, 199, 200, 208–9, 219 Gorbachev, Mikhail, 58 governance: digital technology in, 318–19 inclusive policies in, 303 governance, global: Asia and, 321–25 infrastructure and, 322 US and, 321 government: effectiveness of, 303 trust in, 291, 310 violence against minorities by, 308–9 Government Accountability Office (GAO), 293 GrabShare, 174–75 grain imports, Asian, 90 Grand Canal, China, 37, 42 Grand Trunk Road, 33 Great Britain: Asian investments in, 247 Brexit vote in, 283–84, 286, 293–94 civil service in, 293–94 colonial empire of, 46–47 industrialization in, 46 Iran and, 252 populism in, 283–84 South Asian immigrants in, 253, 254 West Asian mandates of, 49–50 Great Game, 47 Great Leap Forward, 55 Great Wall of China, 31 Greece, 60, 91, 248 Greeks, ancient, 29, 34 greenhouse gas emissions, 176–77, 182 gross domestic product (GDP), 2, 4, 150 Grupo Bimbo, 272 Guam, 50, 136 Guangdong, 42, 98 Guangzhou (Canton), 37, 48, 68 Gulf Cooperation Council (GCC), 58, 101, 102 Gulf states (Khaleej), 6, 9, 57, 62, 81 alternative energy projects in, 251 Asianization of, 100–106 China and, 101, 102 European investment in, 251 India and, 102 Israel and, 99–100, 105 Japan and, 102 oil and gas exports of, 62, 74, 100–101, 176 South Asian migrants in, 334 Southeast Asia’s trade with, 102 South Korea and, 102 technocracy in, 311–12 US arms sales to, 101 women in, 
315 see also specific countries Gulliver, Stuart, 148, 150 Gupta Empire, 35 H-1B visas, 219 Hamas, 59, 100, 139 Hamid, Mohsin, 184 Han Dynasty, 32, 33, 34, 300 Hanoi, 180 Han people, 31–32, 37, 69 Harappa, 29 Hardy, Alfredo Toro, 275 Hariri, Saad, 95 Harun al-Rashid, Caliph, 37 Harvard University, 230 Haushofer, Karl, 1 health care, 201–2 Helmand River, 107 Herberg-Rothe, Andreas, 75 Herodotus, 30 heroin, 106–7 Hezbollah, 58, 95, 96, 106 Hindus, Hinduism, 29, 31, 32, 34, 38, 70–71 in Southeast Asia, 121 in US, 220, 221 Hiroshima, atomic bombing of, 51 Hispanic Americans, 217 history, Asian view of, 75 history textbooks: Asia nationalism in, 27–28 global processes downplayed in, 28 Western focus of, 27–28, 67–68 Hitler, Adolf, 50 Ho, Peter, 289 Ho Chi Minh, 52 Ho Chi Minh City, 56 Honda, 275 Hong Kong, 56, 74 American expats in, 234 art scene in, 342 British handover of, 60, 141 civil society in, 313 Hongwu, Ming emperor, 42 honor killings, 315 Hormuz, Strait of, 103, 106 hospitality industry, 190, 214 Houthis, 106, 107 Huan, Han emperor, 33–34 Hulagu Khan, 40 Human Rights Watch, 313 human trafficking, 318 Hunayn ibn Ishaq, 37 Hungary, 40, 248, 256 Huns, 35, 76 hunter-gatherers, 28 Huntington, Samuel, 15 Hu Shih, 332 Hussein, Saddam, 58, 62, 101 Hyundai, 104 IBM, 212 I Ching, 30 Inclusive Development Index (IDI), 150 income inequality: in Asia, 183–84 in US, 228, 285 India, 101, 104 Afghanistan and, 118 Africa and, 264–66 AI research in, 200 alternative energy programs in, 178–79, 322 Asian investments of, 118 Australia and, 128 British Raj in, 46, 49 charitable giving in, 316–17 China and, 19–20, 113, 117–18, 155, 156, 332 civil society in, 313 in Cold War era, 52, 55, 56 corporate debt in, 170 corruption in, 161, 305 demonetization in, 184, 186–87 diaspora of, 333–34 early history of, 29, 30–31 economic growth of, 9, 17, 148, 185–86 elections in, 63 European trade partnerships with, 250–51 expansionist period in, 38, 41–42 failure of democracy in, 302 family-owned 
businesses in, 160 film industry in, 349–51 financial markets in, 186 foreign investment in, 192 gender imbalance in, 315 global governance in, 322–23 global image of, 331–32 Gulf states and, 102 inclusive policies in, 304 infrastructure investment in, 63, 110, 185 Iran and, 116, 118 Israel and, 98–99 IT industry in, 204, 275 Japan and, 134, 156 Latin America and, 275 manufacturing in, 192 as market for Western products and services, 207 naval forces of, 105 Northeast Asia and, 154–55 oil and gas imports of, 96, 107–8, 176 Pakistan and, 53, 55, 61, 77–78, 117–18 partitioning of, 52–53 pharmaceutical industry in, 228, 275 population of, 15, 186 in post–Cold War era, 61, 62 privatization in, 170 returnees in, 226 Russia and, 86–87 service industry in, 192 Southeast Asia and, 154–55 special economic zones in, 185 spiritual heritage of, 332 technocracy in, 304–6 technological innovation in, 186–87 territorial claims of, 11 top-down economic reform in, 305 traditional medicine of, 355 West Asia and, 155 Indian Americans, 217, 218, 219–20, 222 Indian Institutes of Technology (IIT), 205 Indian Ocean, 38, 47, 74, 105, 261, 262, 266 European voyages to, 44 Indians, in Latin America, 276 IndiaStack, 187 Indochina, 45, 50, 52 see also Southeast Asia Indo-Islamic culture, 38 Indonesia, 53, 61, 121, 125, 182 art scene in, 342 in Cold War era, 54 economic growth of, 17, 148 eco-tourism in, 340 failure of democracy in, 302 foreign investment in, 187 illiberal policies of, 306 inclusive policies of, 304 Muslims in, 71 technocracy in, 304–5 Indus River, 32, 113 Industrial and Commercial Bank of China (ICBC), 92, 159 industrialization, spread of, 22 Industrial Revolution, 2, 46, 68 Indus Valley, 29 infrastructure investment, in Asia, 6, 62, 63, 85, 88, 93, 96, 104, 108, 109, 110–11, 185, 190, 191, 243–44 see also Asian Infrastructure Investment Bank; Belt and Road Initiative Institut d’Études Politiques de Paris (Sciences Po), 257, 286–87 insurance industry, 210 intermarriage, 336, 
337–38 International Monetary Fund (IMF), 162, 163, 166, 323 International North-South Transport Corridor (INSTC), 116 International Renewable Energy Agency (IRENA), 100 International Systems in World History (Buzan), 7 Internet of Things (IoT), 134, 136, 197 Interpol, 324 Iran, 11, 15, 62, 92, 95, 98, 101, 140 China and, 101, 106–7, 116 in Cold War era, 54 European trade with, 251–52 growing opposition to theocracy in, 312 India and, 116, 118 Islamic revolution in, 57 Israel and, 99, 100 nuclear program of, 62 oil and gas exports of, 50, 94, 106, 107–8, 118, 176 in post–Cold War era, 58–59 privatization in, 170 re-Asianization of, 81, 106 Russia and, 87 Saudi Arabia and, 95–96, 100, 105–6 Syria and, 106 tourism in, 252 Turkey and, 94 US sanctions on, 87, 107, 241, 251, 252 women in, 315 Yemen and, 107 Iran-Iraq War, 58, 106 Iraq, 9, 11, 16, 49 Kuwait invaded by, 59 oil exports of, 55, 96 Sunni-Shi’a conflict in, 312 Iraq Reconstruction Conference (2018), 96 Iraq War, 3, 62, 91, 217, 240 Isfahan, 41 Islam, 40, 316 politics and, 71–72 spread of, 36, 38–39, 43, 69–72, 74 Sunni-Shi’a conflict in, 95, 312 Sunni-Shi’a division in, 36 see also Muslims; specific countries Islamic radicalism, 58, 59, 62, 65, 68, 71, 72, 115, 117, 139 see also terrorism Islamic State in Iraq and Syria (ISIS), 63, 71, 94, 96, 117 Israel, 11, 54, 96 arms sales of, 98 China and, 98–99 desalination technology of, 181 EU and, 97 Gulf states and, 99–100, 105 India and, 98–99 Iran and, 99, 100 Russia and, 88 see also Arab-Israeli conflict; Palestinian-Israeli conflict Japan, 14, 16, 63, 68, 69, 73 Africa and, 265 Allied occupation in, 51 alternative energy technologies in, 322 Asian investments of, 118, 156 Asianization of, 81 Asian migrants in, 336–37 Asian trade with, 273 capitalism in, 159 cashless economy in, 189 China and, 19–20, 77, 134, 136–37, 140–42 in Cold War era, 5, 55 corporate culture of, 132 early history of, 29, 31, 34–35 economic growth of, 55, 132, 148, 158, 163 economic 
problems of, 132, 134–35 in era of European imperialism, 47–48 EU trade agreement with, 133 expansionist period in, 38, 42, 44 foreign investment in, 135 in global economy, 133–37 global governance and, 322–23 global image of, 331 Gulf states and, 102 immigration in, 135–36 India and, 134, 156 infrastructure investment in, 110 Latin America and, 275 precision industries in, 134, 135–36 robotic technology in, 134 Russia and, 82, 86–87 Southeast Asia and, 133, 153–54, 156 South Korea and, 141–42 technological innovation in, 134, 196, 197 territorial claims of, 11 tourism in, 135 US and, 136 in World War I, 49 in World War II, 50–51 Japan International Cooperation Agency (JICA), 265 Japan-Mexico Economic Partnership Agreement, 273 Java, 35, 38, 39, 45 Javid, Sajid, 254 Jericho, 28 Jerusalem, 54, 98 Jesus Christ, 35 jihad, 38 Jinnah, Muhammad Ali, 52 Jobs, Steve, 331 Joko Widodo (Jokowi), 305, 306, 320 Jollibee, 172 Jordan, 54, 62, 97, 99 Syrian refugees in, 63 Journal of Asian Studies, 352 Journey to the West, 353 Judaism, 36 Kagame, Paul, 268 Kanishka, Kush emperor, 35 Kapur, Devesh, 218 Karachi, 113 Karakoram Highway, 113 Kashmir, 53, 55, 61, 77–78, 117–18, 119 Kazakhstan, 59, 140, 207 China and, 20, 108 economic diversification in, 190 energy investment in, 112 as hub of new Silk Road, 111–12 Kenya, 262, 263 Kerouac, Jack, 331 Khaleej, see Gulf states Khmer Empire, 70 Khmer people, 34, 38, 239 Khmer Rouge, 56 Khomeini, Ayatollah, 57, 59 Khorgas, 108 Khrushchev, Nikita, 56 Khwarizmi, Muhammad al-, 37 Kiev, 40 Kim Il Sung, 55 Kim Jong-un, 142 Kish, 28 Kissinger, Henry, 357 Koran, 316 Korea, 11, 31, 51, 68, 69 early history of, 34 expansionist period in, 38 Japanese annexation of, 48 reunification of, 142–43 see also North Korea; South Korea Korea Investment Corporation, 164 Korean Americans, 217 Korean War, 51 Kosygin, Alexei, 56 K-pop, 343 Kuala Lumpur, 121, 246 Kublai Khan, 40 Kurds, Kurdistan, 87, 94, 99, 256 Kushan Empire, 32, 35 Kuwait, 101 Iraqi invasion of, 59 
Kyrgyzstan, 59, 108, 182 language, Asian links in, 68–69 Laos, 45, 52, 60, 122, 154 Latin America: Asian immigrants in, 275–76 Asian investment in, 273–75, 276–77 Indian cultural exports to, 350 trade partnerships in, 272–73, 274, 275 US and, 271–72 Lebanon, 49, 54, 58, 95, 106 Syrian refugees in, 63 Lee, Ang, 347 Lee, Calvin Cheng Ern, 131 Lee Hsien Loong, 296–97 Lee Kuan Yew, 56, 127, 268, 288, 289, 292–93, 299, 305 voluntary retirement of, 296 Lee Kuan Yew School of Public Policy, 22, 299 Lenin, Vladimir, 49, 89 Levant (Mashriq), 81, 95, 97 LG, 275 Li & Fung, 184–85 Liang Qichao, 48–49 Liberalism Discovered (Chua), 297 Lien, Laurence, 317 life expectancies, 201 literature, Asian, global acclaim for, 353–54 Liu, Jean, 175 Liu Xiaobo, 249 logistics industry, 243 Ma, Jack, 85–86, 160, 189 Macao (Macau), 44 MacArthur, Douglas, 51 McCain, John, 285 McKinsey & Company, 160, 213 Macquarie Group, 131 Maddison, Angus, 2 Made in Africa Initiative, 262 Magadha Kingdom, 31 Magellan, Ferdinand, 43 Mahabharata, 35 Mahbubani, Kishore, 3 Mahmud of Ghazni, Abbasid sultan, 38 Malacca, 38, 43, 44, 124 Malacca, Strait of, 37, 39, 102, 103, 118, 125 Malaya, 46, 50 Malay Peninsula, 39, 53 Malaysia, 53, 61, 188 Asian foreign labor in, 335 China and, 123, 124 in Cold War era, 54 economic diversification in, 190 economic growth of, 17 technocracy in, 308 Maldives, 105 Malesky, Edmund, 308 Manchuria, 38, 48, 50, 51 Mandarin language, 229–30, 257 Manila, 121, 245 Spanish colonization of, 44 Mansur, al-, Caliph, 37 manufacturing, in Asia, 192 Mao Zedong, 51–52, 55, 56, 261, 300, 301 Marawi, 71 Marcos, Ferdinand, 53–54, 61 martial arts, mixed (MMA), 340–41 Mashriq (Levant), 81, 95, 97 Mauritius, 268 Mauryan Empire, 32–33, 68 May, Theresa, 293 Mecca, 57 media, in Asia, 314 median ages, in Asia, 148, 149, 155 Median people, 29 Mediterranean region, 1, 6, 29, 30, 33, 68, 84, 92, 95, 99, 106 see also Mashriq Mehta, Zubin, 332 Mekong River, 122 Menander, Indo-Greek king, 33 mergers and 
acquisitions, 212–13 meritocracy, 294, 301 Merkel, Angela, 242, 254 Mesopotamia, 28 Mexico, 7 Asian economic ties to, 272, 273, 274, 277 Microsoft, 208 middle class, Asian, growth of, 3, 4 Mihov, Ilian, 309 mindfulness, 332 Ming Dynasty, 42–43, 44, 69, 73, 75, 76, 105, 137, 262 mobile phones, 157, 183–84, 187, 188, 189, 193, 199, 208–9, 211 Modi, Narendra, 63, 98, 117, 119, 154–55, 161, 180, 185, 222, 265, 305, 306, 307, 320 Mohammad Reza Pahlavi, Shah of Iran, 54 Mohammed bin Salman, crown prince of Saudi Arabia, 72, 247, 310, 312, 315 Mohenjo-Daro, 29 Moluku, 45 MoneyGram, 196 Mongolia, 92, 111–12 alternative energy programs in, 112, 182 technocracy in, 307 Mongols, Mongol Empire, 39–40, 42, 44, 68, 69, 73, 76, 77, 239 religious and cultural inclusiveness of, 40, 70–71 Monroe Doctrine, 271 Moon Jae-in, 142 Moscow, 81, 82 Mossadegh, Mohammad, 54 MSCI World Index, 166, 168 Mubadala Investment Company, 88, 103, 104 Mughal Empire, 41–42, 46 religious tolerance in, 70–71 Muhammad, Prophet, 36 Mumbai, 185–86 Munich Security Conference, 241 Murakami, Haruki, 354 Murasaki Shikibu, 353 music scene, in Asia, 343 Muslim Brotherhood, 59 Muslims, 70–72 in Southeast Asia, 38–39, 43, 70–71, 121 in US, 220 see also Islam; specific countries Myanmar, 60, 63, 161 Asian investment in, 118–19 charitable giving in, 316 failure of democracy in, 302 financial reform in, 184 Rohingya genocide in, 122–23 see also Burma Nagasaki, atomic bombing of, 51 Nanjing, 42, 49 Napoleon I, emperor of the French, 1 nationalism, 11, 20, 22, 49–50, 52–55, 77, 118, 137, 138–39, 222, 312, 329, 337, 352 Natufian people, 28 natural gas, see oil and gas natural gas production, 175–76 Nazism, 200 Nehru, Jawaharlal, 52, 55 Neolithic Revolution, 28 neomercantilism, 20, 22, 158 Nepal, 46, 119–20, 333 Nestorian Christianity, 36, 70 Netanyahu, Benjamin, 97, 98, 100 Netflix, 348 New Deal, 287 New Delhi, 245 Ng, Andrew, 199 NGOs, 313 Nigeria, 265 Nisbett, Richard, 357 Nixon, Richard, 56, 101 Nobel Prize, 48, 221, 
249, 323, 353–54 nomadic cultures, 76 Non-Aligned Movement, 55 Non-Proliferation of Nuclear Weapons Treaty, 61 North America: Asian trade with, 13, 14, 207 as coherent regional system, 7 energy self-sufficiency of, 175, 272 internal trade in, 152 see also Canada; Mexico; United States North American Free Trade Agreement (NAFTA), 7 North Atlantic Treaty Organization (NATO), 2, 57, 92, 116 Northeast Asia, 141 India and, 154–55 internal trade in, 152 manufacturing in, 153 North Korea, 55, 61 aggressiveness of, 63 China and, 143 cyber surveillance by, 142 nuclear and chemical weapons program of, 142 Russia and, 143 South Korea and, 142 US and, 142–43 Obama, Barack, 18, 82, 229, 240 oil and gas: Asian imports of, 9, 62, 82–83, 84–85, 96, 102, 106, 107–8, 152, 175, 176, 207 Gulf states’ exports of, 62, 74, 100–103, 176 Iranian exports of, 50, 94, 106, 107–8, 118, 176 Iraqi exports of, 55, 96 OPEC embargo on, 57 price of, 61 Russian exports of, 82–83, 84, 87–88, 175, 176 Saudi exports of, 58, 87–88, 102, 103 US exports of, 16, 207 West Asian exports of, 9, 23, 57, 62, 152 Okakura Tenshin, 48 oligarchies, 294–95 Olympic Games, 245 Oman, East Asia and, 104 ONE Championship (MMA series), 341 OPEC (Organization of Petroleum Exporting Countries), 57 Operation Mekong (film), 123 opium, 47, 123 Organization for Security and Co-operation in Europe (OSCE), 241 Oslo Accords, 59 Osman I, Ottoman Sultan, 41 Ottoman Empire, 40–41, 43, 45, 46–47, 48, 73, 91 partitioning of, 49–50 religious tolerance in, 70–71 Out of Eden Walk, 4 Overseas Private Investment Corporation (OPIC), 111 Pacific Alliance, 272 Pacific Islands, 181–82 US territories in, 48 Pacific Rim, see East Asia Pakistan, 52–53, 58, 62, 72, 95, 102, 105 AI research in, 200 Asianization of, 81, 113–18 as Central Asia’s conduit to Arabian Sea, 113–14 China and, 20, 114–16, 117–18 corruption in, 161 failure of democracy in, 302 finance industry in, 168–69 foreign investment in, 115 GDP per capita in, 184 India and, 55, 61–62, 
117–18 intra-Asian migration from, 334 logistics industry in, 185 as market for Western products and services, 207 US and, 114–15 Pakistan Tehreek-e-Insaf (PTI), 307 Palestine, Palestinians, 49, 54, 99 Palestine Liberation Organization (PLO), 59 Palestinian-Israeli conflict, 59, 62, 97, 100 Pan-Asianism, 48, 351–52 paper, invention of, 72 Paris climate agreement, 178, 240 Paris Peace Conference (1918), 49 Park Chung-hee, 56 Park Geun-hye, 313 parliamentary democracy, 295 Parthians, 33, 76 Pawar, Rajendra, 205 Pearl Harbor, Japanese attack on, 50 peer-to-peer (P2P) lending, 169 People’s Action Party (PAP), Singapore, 294 People’s Bank of China (PBOC), 110, 188 Pepper (robot), 134 per capita income, 5, 150, 183, 186 Persia, Persian Empire, 29, 30, 42, 45, 47, 50, 68, 75 see also Iran Persian Gulf War, 61, 101, 217 Peru: Asian immigrants in, 275, 276 Asian trade with, 272 Peshawar, 32 Peter I, Tsar of Russia, 45, 90 pharmaceutical companies, 209–10 Philippines, 61, 157, 165 alternative energy programs in, 180 Asian migrants in, 333 China and, 123–24 Christianity in, 74 in Cold War era, 53–54 eco-tourism in, 340 foreign investment in, 124 illiberal policies of, 306 inclusive policies in, 304 as market for Western products and services, 207 Muslims in, 71 privatization in, 170 technocracy in, 304–5 urban development in, 190 US acquisition of, 48 US and, 123–24 philosophy, Asian vs.
MacroWikinomics: Rebooting Business and the World by Don Tapscott, Anthony D. Williams
accounting loophole / creative accounting, airport security, Andrew Keen, augmented reality, Ayatollah Khomeini, barriers to entry, Ben Horowitz, bioinformatics, Bretton Woods, business climate, business process, buy and hold, car-free, carbon footprint, Charles Lindbergh, citizen journalism, Clayton Christensen, clean water, Climategate, Climatic Research Unit, cloud computing, collaborative editing, collapse of Lehman Brothers, collateralized debt obligation, colonial rule, commoditize, corporate governance, corporate social responsibility, creative destruction, crowdsourcing, death of newspapers, demographic transition, disruptive innovation, distributed generation, don't be evil, en.wikipedia.org, energy security, energy transition, Exxon Valdez, failed state, fault tolerance, financial innovation, Galaxy Zoo, game design, global village, Google Earth, Hans Rosling, hive mind, Home mortgage interest deduction, information asymmetry, interchangeable parts, Internet of things, invention of movable type, Isaac Newton, James Watt: steam engine, Jaron Lanier, jimmy wales, Joseph Schumpeter, Julian Assange, Kevin Kelly, Kickstarter, knowledge economy, knowledge worker, Marc Andreessen, Marshall McLuhan, mass immigration, medical bankruptcy, megacity, mortgage tax deduction, Netflix Prize, new economy, Nicholas Carr, oil shock, old-boy network, online collectivism, open borders, open economy, pattern recognition, peer-to-peer lending, personalized medicine, Ray Kurzweil, RFID, ride hailing / ride sharing, Ronald Reagan, Rubik’s Cube, scientific mainstream, shareholder value, Silicon Valley, Skype, smart grid, smart meter, social graph, social web, software patent, Steve Jobs, text mining, the scientific method, The Wisdom of Crowds, transaction costs, transfer pricing, University of East Anglia, urban sprawl, value at risk, WikiLeaks, X Prize, young professional, Zipcar
Today, the X Prize Foundation is just one of many organizations that have latched on to incentivized challenges as a way to unleash fundamental breakthroughs in society. Richard Branson, the founder of Virgin, will part with $25 million of his own money in exchange for a commercially feasible way to remove greenhouse gases from Earth’s atmosphere. Netflix has issued a global challenge to anyone who can improve the company’s automated movie recommendations algorithm, while Google’s Lunar X Prize will go to the first private venture to send image-transmitting rovers to the moon. Why do these competitions work? “We are genetically bred to compete,” Diamandis explains. “It’s when we do our best business many times, we do our best sports, and I believe competition extracts the best out of individuals.” Diamandis argues that competitions also bring out the best in small teams.