book scanning

28 results

pages: 380 words: 109,724

Don't Be Evil: How Big Tech Betrayed Its Founding Principles--And All of Us by Rana Foroohar

"side hustle", accounting loophole / creative accounting, Airbnb, AltaVista, autonomous vehicles, banking crisis, barriers to entry, Bernie Madoff, Bernie Sanders, bitcoin, book scanning, Brewster Kahle, Burning Man, call centre, cashless society, cleantech, cloud computing, cognitive dissonance, Colonization of Mars, computer age, corporate governance, creative destruction, Credit Default Swap, cryptocurrency, data is the new oil, death of newspapers, Deng Xiaoping, disintermediation, don't be evil, Donald Trump, drone strike, Edward Snowden, Elon Musk, Erik Brynjolfsson, Etonian, Filter Bubble, future of work, game design, gig economy, global supply chain, Gordon Gekko, greed is good, income inequality, informal economy, information asymmetry, intangible asset, Internet Archive, Internet of things, invisible hand, Jaron Lanier, Jeff Bezos, job automation, job satisfaction, Kenneth Rogoff, life extension, light touch regulation, Lyft, Mark Zuckerberg, Marshall McLuhan, Martin Wolf, Menlo Park, move fast and break things, Network effects, new economy, offshore financial centre, PageRank, patent troll, paypal mafia, Peter Thiel, price discrimination, profit maximization, race to the bottom, recommendation engine, ride hailing / ride sharing, Robert Bork, Sand Hill Road, search engine result page, self-driving car, shareholder value, sharing economy, Shoshana Zuboff, Silicon Valley, Silicon Valley startup, smart cities, Snapchat, South China Sea, sovereign wealth fund, Steve Jobs, Steven Levy, subscription business, supply-chain management, TaskRabbit, Telecommunications Act of 1996, The Chicago School, the new new thing, Tim Cook: Apple, too big to fail, Travis Kalanick, trickle-down economics, Uber and Lyft, Uber for X, uber lyft, Upton Sinclair, WikiLeaks, zero-sum game

After all, just as no small innovator with a patent could battle Big Tech, there was no way that an individual writer or musician, for example, could wage a legal battle to try to get royalties from the likes of Facebook or Google—or even to understand how much money the companies were making by linking to the content or leading advertisers to it as part of their search or social business models.18 As an example, consider the decade-long battle between Google and myriad authors and publishers over the Google Print project, later renamed Google Books. Scanning every single page of every single book in the world had long been an obsession of Page and Brin—it was, after all, a typically Google-sized ambition. They knew that the majority of the world’s books were protected under copyright from such unauthorized copying and distribution. But the Googlers felt, in typical form, that such pesky rules didn’t apply to them. Plus, they couldn’t understand why anyone would think it was better for authors to make money on books than for the entire world to have free access to information. So in 2002, they simply began scanning pages, albeit covertly. As tech writer Steven Levy put it in his book, In the Plex, which devotes twenty pages to the book-scanning project, “The secrecy was yet another expression of the paradox of a company that sometimes embraced transparency and other times seemed to model itself on the NSA.”19 Schmidt, who had by then decided that “evil is what Sergey says is evil,”20 was all for the project, which he declared “genius.”21 The publishing industry disagreed.

Some three-fourths of the books that Google aimed to copy were still under copyright—and Varian and the other Googlers knew that many, if not most, of the authors probably wouldn’t opt in.22 So, they decided to settle, ultimately agreeing to a compromise in which Google would agree to show only snippets of books that were under copyright for free in exchange for becoming the exclusive seller of digital copies of out-of-print books for the publishing houses and authors that agreed to the settlement. Google, which was earning about $10 billion in yearly revenue at that point, would pay the relatively tiny sum of $125 million to establish a registry of book rights holders and pay lawyers to organize the system and the payouts. It was a complete coup for Big Tech. Brewster Kahle, the head of the nonprofit Internet Archive, which wanted to do its own book-scanning project, claimed (not incorrectly) that Google had become an information monopolist. Even Lawrence Lessig, the digital law expert who favors many of the policies that the platforms support, said that Google’s deal was the equivalent of a “digital bookstore, not a digital library.”23 What he means is that even as Google was presenting the entire project as being done for the benefit of users, Google itself would ultimately benefit the most.

(This paradigm has played out in a number of subsequent cases—Google and Amazon, for example, regularly do battle to try to gain more access to each other’s markets, and many of the most powerful groups that complain about monopoly power on the part of Big Tech are other corporate behemoths.) Eventually, under pressure from some 143 groups, both nonprofit and private-sector, the U.S. Department of Justice took on the issue, claiming it had granted Google too many anticompetitive rights, and that the book-scanning and -selling project was a monopoly issue. Larry Page called the legal challenge a “travesty to humanity,” while Sergey Brin wrote a sanctimonious piece in The New York Times defending Google’s efforts. At court proceedings in 2010, Google’s attorney Daralyn J. Durie argued that “copyright infringement is evil to the extent that it is not compensated and that it harms the economic interests of rights holders.”

pages: 117 words: 30,654

Kindle Formatting: The Complete Guide to Formatting Books for the Amazon Kindle by Joshua Tallent

book scanning, job automation, optical character recognition

The easiest way to get the book back into a digital format is to scan it and run it through an Optical Character Recognition (OCR) software program. There are a variety of options available to the do-it-yourself person or to the pay-someone-else person. The main benefit to doing the process yourself is saving money, but you may find that having some help in the process is easier and faster. The first step in the OCR process is to have your book scanned. This is a process where each page of your book is turned into an image that can be loaded into the OCR program. There are a variety of places that will do scanning for you, or you can tackle the process yourself. Some copy and print stores (like FedEx/Kinko’s) offer scanning services, but you will often find the best prices at companies that specialize in scanning documents onto microfiche.

Some of these companies even have machines that can automate the scanning process by automatically turning the pages of the book. Be aware that the easiest way to scan a book on regular consumer scanners is to cut off the binding, which will effectively ruin the book. If your book is rare and you want to keep it intact, you should make sure the scanning company knows to handle it gently and to not cut off the binding. There is one consumer scanner called the OpticBook 3600 that is specifically designed for book scanning. That device is built in a way that allows a good scan of the pages without cutting the binding off or breaking the binding by forcing the book into unnatural positions on a flat surface. If you decide to scan the book yourself, you will need a flatbed or feed scanner. These devices are available at most electronics and computer stores and at various retailers online. They can be inexpensive or very expensive, depending on the options included and the quality of the scanner, and you may find that the available options are overwhelming.

pages: 413 words: 106,479

Because Internet: Understanding the New Rules of Language by Gretchen McCulloch

4chan, book scanning, British Empire, citation needed, Donald Trump, Firefox, Flynn Effect, Google Hangouts, Internet Archive, invention of the printing press, invention of the telephone, moral panic, multicultural london english, natural language processing, pre–internet, QWERTY keyboard, Ray Oldenburg, Silicon Valley, Skype, Snapchat, social web, Steven Pinker, telemarketer, The Great Good Place, upwardly mobile, Watson beat the top human players on Jeopardy!

Even those of us who know that a single book isn’t the sole repository of a language and that dictionaries are records of how people are already using the language, not providers of words for us to start using—we still often think of the English language as contained within a sufficiently large quantity of books. We think of it as “the language of Shakespeare,” or the twenty volumes of the second edition of the Oxford English Dictionary, or the entire Library of Congress, or the millions of books scanned and made searchable by Google Books. This association isn’t accidental. If we look at how frequently people wrote the phrase “English language” across all the books scanned by Google, from 1500 to 2000, we see a major upswing between 1750 and 1800. It’s consistently low beforehand, and consistently high thereafter. “English” and “language” by themselves are pretty much steady—it’s just the two words together that go up. What happened in that period? Well, in 1755, Samuel Johnson published A Dictionary of the English Language, the first major English print dictionary.
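The phrase-frequency comparison described above can be illustrated with a minimal sketch. It is not how Google Books Ngram Viewer is implemented; the corpus, the function name `phrase_frequency_by_year`, and the whitespace tokenization are all assumptions made for illustration.

```python
def phrase_frequency_by_year(corpus, phrase):
    """Return {year: share of n-grams in that year's texts equal to phrase}.

    corpus is a hypothetical mapping of year -> list of text strings.
    Tokenization is a plain lowercase whitespace split (an assumption);
    punctuation is not handled.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, texts in corpus.items():
        hits = 0
        total = 0
        for text in texts:
            words = text.lower().split()
            # Every run of n consecutive words is one n-gram.
            grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
            total += len(grams)
            hits += sum(1 for gram in grams if gram == target)
        result[year] = hits / total if total else 0.0
    return result


# Invented toy data, standing in for scanned books grouped by year.
TOY_CORPUS = {
    1700: ["the english tongue is a language of trade"],
    1800: ["a dictionary of the english language", "the english language grows"],
}

frequencies = phrase_frequency_by_year(TOY_CORPUS, "English language")
```

Dividing by the total number of n-grams per year, rather than reporting raw counts, keeps years with many scanned books from dominating simply because they contain more text.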

Relicts of this setup are still in place on some commercial computer systems: teletypes are uncommon, but your grocery store receipt, bank statement, or airplane ticket might very well appear from a roll of shiny paper, printed in all caps. By the time computers did start supporting lowercase characters, we were faced with two competing standards: one group of people assumed that all caps is just how you write on a computer, while another group insisted that it stood for yelling. Ultimately, the emotional meaning won out. The shift in function happened in parallel with a shift in name: according to the millions of books scanned in Google Books, the terms “all caps” and “all uppercase” started rising sharply in the early 1990s. By contrast, in the earlier part of the century, the preferred terms were “block letters” or “block capitals.” People tended to use “all caps” to talk about the loud kind, while block capitals more often referred to the official kind, on signs and on forms. But the addition of all caps for tone of voice didn’t eliminate the official kind of capitals, which remain common on EXIT signs and CAUTION tape and CHAPTER ONE headings: they may be emphatic, but they aren’t interpreted as especially loud.

a newspaper in 1856: (No author cited.) April 17, 1856. “The Dutchman Who Had the Small Pox.” The Yorkville Enquirer (South Carolina). In Library of Congress, ed., Chronicling America: Historic American Newspapers. at one point it did: Thanks to Guy English (personal communication) for confirming that this was the case for FORTRAN and COBOL. millions of books scanned: Search for block capitals,block letters,all caps,all uppercase,caps lock in Google Books Ngram Viewer with date parameter 1800 to 2000. Jean-Baptiste Michel, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P.

pages: 666 words: 181,495

In the Plex: How Google Thinks, Works, and Shapes Our Lives by Steven Levy

23andMe, AltaVista, Anne Wojcicki, Apple's 1984 Super Bowl advert, autonomous vehicles, book scanning, Brewster Kahle, Burning Man, business process, clean water, cloud computing, crowdsourcing, Dean Kamen, discounted cash flows, don't be evil, Donald Knuth, Douglas Engelbart, El Camino Real, fault tolerance, Firefox, Gerard Salton, Google bus, Google Chrome, Google Earth, Googley, HyperCard, hypertext link, IBM and the Holocaust, informal economy, information retrieval, Internet Archive, Jeff Bezos, John Markoff, Kevin Kelly, Kickstarter, Mark Zuckerberg, Menlo Park, one-China policy, optical character recognition, PageRank, Paul Buchheit, Potemkin village, prediction markets, recommendation engine, risk tolerance, Rubik’s Cube, Sand Hill Road, Saturday Night Live, search inside the book, second-price auction, selection bias, Silicon Valley, skunkworks, Skype, slashdot, social graph, social software, social web, spectrum auction, speech recognition, statistical model, Steve Ballmer, Steve Jobs, Steven Levy, Ted Nelson, telemarketer, trade route, traveling salesman, turn-by-turn navigation, undersea cable, Vannevar Bush, web application, WikiLeaks, Y Combinator

It would require more care when handling the books, but it seemed more economical. For one thing, the books could be sold afterward. Or they could simply be borrowed in the first place. “We came up with all these numbers,” says Mayer. “We were emailing them around, the right cost per hour, the right number of pages per hour—debate, debate, debate. After one thread hinged on how many pages an hour we could do, we decided we should just scan one.” They set up a makeshift book scanning device. They tried several sizes of books, the first one, appropriately enough, being The Google Book, an illustrated children’s story by V. C. Vickers. (The “Google” in the title was an odd creature with aspects of mammal, reptile, and fish.) They then tested a photo book, Ancient Forests by David Middleton; a dense text, Algorithms in C by Robert Sedgewick; and a general-interest book, Startup, by Jerry Kaplan.

So it commissioned some of its best wizards to build a machine that, presumably, would work much more accurately and at a somewhat brisker rate than Marissa Mayer turning pages one by one. Though Google wasn’t known for actually building machines, its data center needs had generated a lot of engineering expertise in that area: remember, it was the world’s biggest manufacturer of computer servers. One of the difficulties in book scanning rested in producing high-quality images from the printed page, so that OCR software could accurately translate the shapes of the letters on the page to computer-readable text. The problem was that, on their own, books did not sit flat on the platform: they presented a 3-D problem requiring a 2-D solution. The usual workarounds—flattening the book by pressing it on the glass or removing the binding—would not work since they were time-consuming and damaged the books.

In other areas, Google had put its investments into the public domain, like the open-source Android and Chrome operating systems. And as far as user information was concerned, Google made it easy for people not to become locked into using its products. It even had an initiative called the Data Liberation Front to make sure that users could easily move information they created with Google documents off Google’s servers. It would seem that book scanning was a good candidate for similar transparency. If Google had a more efficient way to scan books, sharing the improved techniques could benefit the company in the long run—inevitably, much of the output would find its way onto the web, bolstering Google’s indexes. But in this case, paranoia and a focus on short-term gain kept the machines under wraps. “We’ve done a ton of work to try to make those machines an order of magnitude better,” AMac said.

pages: 304 words: 82,395

Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schonberger, Kenneth Cukier

23andMe, Affordable Care Act / Obamacare, airport security, barriers to entry, Berlin Wall, big data - Walmart - Pop Tarts, Black Swan, book scanning, business intelligence, business process, call centre, cloud computing, computer age, correlation does not imply causation, dark matter, double entry bookkeeping, Eratosthenes, Erik Brynjolfsson, game design, IBM and the Holocaust, index card, informal economy, intangible asset, Internet of things, invention of the printing press, Jeff Bezos, Joi Ito, lifelogging, Louis Pasteur, Mark Zuckerberg, Menlo Park, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, obamacare, optical character recognition, PageRank, paypal mafia, performance metric, Peter Thiel, post-materialism, random walk, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, smart grid, smart meter, social graph, speech recognition, Steve Jobs, Steven Levy, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!

Instead of nicely translated pages of text in two languages, Google availed itself of a larger but also much messier dataset: the entire global Internet and more. Its system sucked in every translation it could find, in order to train the computer. In went corporate websites in multiple languages, identical translations of official documents, and reports from intergovernmental bodies like the United Nations and the European Union. Even translations of books from Google’s book-scanning project were included. Where Candide had used three million carefully translated sentences, Google’s system harnessed billions of pages of translations of widely varying quality, according to the head of Google Translate, Franz Josef Och, one of the foremost authorities in the field. Its trillion-word corpus amounted to 95 billion English sentences, albeit of dubious quality. Despite the messiness of the input, Google’s service works the best.

Nevertheless culturomics has given us an entirely new lens with which to understand ourselves. Transforming words into data unleashes numerous uses. Yes, the data can be used by humans for reading and by machines for analysis. But as the paragon of a big-data company, Google knows that information has multiple potential purposes that can justify its collection and datafication. So Google cleverly used the datafied text from its book-scanning project to improve its machine-translation service. As explained in Chapter Three, the system would take books that are translations and analyze what words and phrases the translators used as alternatives from one language to another. Knowing this, it could then treat translation as a giant math problem, with the computer figuring out probabilities to determine what word best substitutes for another between languages.
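The idea of treating translation as a probability problem can be sketched very roughly as follows. This is a crude co-occurrence count over invented sentence pairs, not Google's system and not a real alignment model; the names `cooccurrence_probabilities` and `best_translation` and the data in `PAIRS` are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_probabilities(sentence_pairs):
    """Estimate P(target word | source word) from co-occurrence counts.

    Each source word is credited with every target word that appears in
    the paired sentence; counts are then normalized per source word.
    """
    counts = defaultdict(Counter)
    for source, target in sentence_pairs:
        target_words = target.lower().split()
        for source_word in source.lower().split():
            counts[source_word].update(target_words)
    probabilities = {}
    for source_word, counter in counts.items():
        total = sum(counter.values())
        probabilities[source_word] = {t: c / total for t, c in counter.items()}
    return probabilities


def best_translation(probabilities, word):
    """Return the most probable target word, or None for an unseen word."""
    candidates = probabilities.get(word.lower())
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Invented toy parallel corpus (English -> French).
PAIRS = [
    ("the house", "la maison"),
    ("the book", "le livre"),
    ("a house", "une maison"),
]

probs = cooccurrence_probabilities(PAIRS)
```

With only three sentence pairs most words end in ties; the point of the passage is that a very large, messy corpus supplies enough repeated evidence for the probabilities to become informative.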

Quantifying the world—Much of the authors’ thinking on the history of datafication has been inspired by Crosby, The Measure of Reality. Europeans were never exposed to abacuses—Ibid., 112. Calculating faster using Arabic numerals—Alexander Murray, Reason and Society in the Middle Ages (Oxford University Press, 1978), p. 166. Total number of books published and Harvard study on Google book-scanning project—Jean-Baptiste Michel et al., “Quantitative Analysis of Culture Using Millions of Digitized Books,” Science 331 (January 14, 2011), pp. 176–182. For a video lecture on the paper, see Erez Lieberman Aiden and Jean-Baptiste Michel, “What We Learned from 5 Million Books,” TEDx, Cambridge, MA, 2011. On wireless modules in cars and insurance—See Cukier, “Data, Data Everywhere.”

pages: 465 words: 109,653

Free Ride by Robert Levine

A Declaration of the Independence of Cyberspace, Anne Wojcicki, book scanning, borderless world, Buckminster Fuller, citizen journalism, commoditize, correlation does not imply causation, creative destruction, crowdsourcing, death of newspapers, Edward Lloyd's coffeehouse, Electric Kool-Aid Acid Test, Firefox, future of journalism, Googley, Hacker Ethic, informal economy, Jaron Lanier, Joi Ito, Julian Assange, Kevin Kelly, linear programming, Marc Andreessen, Mitch Kapor, moral panic, offshore financial centre, publish or perish, race to the bottom, Saturday Night Live, Silicon Valley, Silicon Valley startup, Skype, spectrum auction, Steve Jobs, Steven Levy, Stewart Brand, subscription business, Telecommunications Act of 1996, Whole Earth Catalog, WikiLeaks

Had the authors and publishers won, they would have received substantial damages but no way to sell out-of-print works. Perhaps most important, the settlement would have set an informal precedent that scanning books requires an agreement with publishers or authors. “The alternative was to take our chances on winning the lawsuit, and we probably would have,” Aiken says. “But if we didn’t, it would have been a catastrophe because [Google would have] millions of books scanned that authors and publishers would have no legal control over.” Like Amazon and Apple, Google sees books as a means to an end—in this case giving its search engine access to more information. “Probably the highest-quality knowledge is captured in books,” Sergey Brin said.16 Like record labels, publishers have become arms suppliers in a cold war between technology companies. By bringing Google into the business of selling books—and giving it enough of a selection to make it a legitimate competitor to Amazon and Apple—the proposed settlement could have given publishers more leverage.

Sergey Brin, “A Library to Last Forever,” New York Times, October 8, 2009. 11. Roy MacLeod, The Library of Alexandria: Centre of Learning in the Ancient World (New York: I. B. Tauris, 2000), p. 5. According to MacLeod, customs officials confiscated texts from passing ships, as well as visitors. They took originals for the library and returned copies to the owners. 12. There are two common views of whether Google’s book-scanning project qualifies as fair use. One, held by copyright reform activists, is that scanning books in order to create an index is no different from a card catalog, so it obviously falls under fair use. The other is that such a big project by a private company couldn’t possibly qualify. A court would probably find the issue less obvious than either side makes it out to be. On the one hand, Google’s use would further the aim of copyright law, and it could raise the value of the books in question by making them easier to find.

pages: 371 words: 108,317

The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future by Kevin Kelly

A Declaration of the Independence of Cyberspace, AI winter, Airbnb, Albert Einstein, Amazon Web Services, augmented reality, bank run, barriers to entry, Baxter: Rethink Robotics, bitcoin, blockchain, book scanning, Brewster Kahle, Burning Man, cloud computing, commoditize, computer age, connected car, crowdsourcing, dark matter, dematerialisation, Downton Abbey, Edward Snowden, Elon Musk, Filter Bubble, Freestyle chess, game design, Google Glasses, hive mind, Howard Rheingold, index card, indoor plumbing, industrial robot, Internet Archive, Internet of things, invention of movable type, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Kevin Kelly, Kickstarter, lifelogging, linked data, Lyft, M-Pesa, Marc Andreessen, Marshall McLuhan, means of production, megacity, Minecraft, Mitch Kapor, multi-sided market, natural language processing, Netflix Prize, Network effects, new economy, Nicholas Carr, old-boy network, peer-to-peer, peer-to-peer lending, personalized medicine, placebo effect, planetary scale, postindustrial economy, recommendation engine, RFID, ride hailing / ride sharing, Rodney Brooks, self-driving car, sharing economy, Silicon Valley, slashdot, Snapchat, social graph, social web, software is eating the world, speech recognition, Stephen Hawking, Steven Levy, Ted Nelson, the scientific method, transport as a service, two-sided market, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, Whole Earth Review, zero-sum game

Legal tussles over the right to sample—to remix—snippets of music, particularly when either the sampled song or the borrowing song make a lot of money, are ongoing. The appropriateness of remixing, reusing material from one news source for another is a major restraint for new journalistic media. Legal uncertainty about Google’s reuse of snippets from the books it scanned was a major reason it closed down its ambitious book scanning program (although the court belatedly ruled in Google’s favor in late 2015). Intellectual property is a slippery realm. There are many aspects of contemporary intellectual property laws that are out of whack with the reality of how the underlying technology works. For instance, U.S. copyright law gives a temporary monopoly to a creator for his or her creation in order to encourage further creation, but the monopoly has been extended for at least 70 years after the death of the creator, long after a creator’s dead body can be motivated by anything.

Index excerpt: Google: book scanning projects, 208; and intellectual property law, 208–9.

pages: 502 words: 107,657

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel

Albert Einstein, algorithmic trading, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, backtesting, Black Swan, book scanning, bounce rate, business intelligence, business process, butter production in bangladesh, call centre, Charles Lindbergh, commoditize, computer age, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data is the new oil, Erik Brynjolfsson, Everything should be made as simple as possible, experimental subject, Google Glasses, happiness index / gross national happiness, job satisfaction, Johann Wolfgang von Goethe, lifelogging, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, mass immigration, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, Norbert Wiener, personalized medicine, placebo effect, prediction markets, Ray Kurzweil, recommendation engine, risk-adjusted returns, Ronald Coase, Search for Extraterrestrial Intelligence, self-driving car, sentiment analysis, Shai Danziger, software as a service, speech recognition, statistical model, Steven Levy, text mining, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!, X Prize, Yogi Berra, zero-sum game

questions could be answered with a database lookup. The demands of open question answering reach far beyond the computer’s traditional arena of storing and accessing data for flight reservations and bank records. We’re going to need a smarter robot. The Ultimate Knowledge Source We are not scanning all those books to be read by people. We are scanning them to be read by an AI. —A Google employee regarding Google’s book scanning, as quoted by George Dyson in Turing’s Cathedral: The Origins of the Digital Universe A bit of good news: IBM didn’t need to create comprehensive databases for the Jeopardy! challenge because the ultimate knowledge source already exists: the written word. I am pleased to report that people like to report; we write down what we know in books, web pages, Wikipedia entries, blogs, and newspaper articles.

McKeown, “Learning Methods to Combine Linguistic Indicators: Improving Aspectual Classification and Revealing Linguistic Insights,” Computational Linguistics 26, issue 4 (December 2000). doi:10.1162/089120100750105957. Googling only 30 percent of the Jeopardy! questions right: Stephen Baker, Final Jeopardy: Man vs. Machine and the Quest to Know Everything (Houghton Mifflin Harcourt, 2011), 212–224. Quote about Google’s book scanning project: George Dyson, Turing’s Cathedral: The Origins of the Digital Universe (Pantheon Books, 2012). Natural language processing: Dursun Delen, Andrew Fast, Thomas Hill, Robert Nisbit, John Elder, and Gary Miner, Practical Text Mining and Statistical Analysis for Non-Structured Text Data Applications (Academic Press, 2012). James Allen, Natural Language Understanding, 2nd ed. (Addison-Wesley, 1994).

pages: 629 words: 142,393

The Future of the Internet: And How to Stop It by Jonathan Zittrain

A Declaration of the Independence of Cyberspace, Amazon Mechanical Turk, Andy Kessler, barriers to entry, book scanning, Brewster Kahle, Burning Man, call centre, Cass Sunstein, citizen journalism, Clayton Christensen, clean water, commoditize, corporate governance, Daniel Kahneman / Amos Tversky, disruptive innovation, distributed generation, Firefox, game design, Hacker Ethic, Howard Rheingold, Hush-A-Phone, illegal immigration, index card, informal economy, Internet Archive, jimmy wales, John Markoff, license plate recognition, loose coupling, mail merge, national security letter, old-boy network, packet switching, peer-to-peer, post-materialism, pre–internet, price discrimination, profit maximization, Ralph Nader, RFC: Request For Comment, RFID, Richard Stallman, Richard Thaler, risk tolerance, Robert Bork, Robert X Cringely, SETI@home, Silicon Valley, Skype, slashdot, software patent, Steve Ballmer, Steve Jobs, Ted Nelson, Telecommunications Act of 1996, The Nature of the Firm, The Wisdom of Crowds, web application, wikimedia commons, zero-sum game

Digital Millennium Copyright Act of 1998 give some protection to search engines that point customers to material that infringes copyright,113 but they do not shield the actions required to create the search database in the first place. The act of creating a search engine, like the act of surfing itself, is something so commonplace that it would be difficult to imagine deeming it illegal—but this is not to say that search engines rest on any stronger of a legal basis than the practice of using robots.txt to determine when it is and is not appropriate to copy and archive a Web site.114 Only recently, with Google’s book scanning project, have copyright holders really begun to test this kind of question.115 That challenge has arisen over the scanning of paper books, not Web sites, as Google prepares to make them searchable in the same way Google has indexed the Web.116 The long-standing practice of Web site copying, guided by robots.txt, made that kind of indexing uncontroversial even as it is, in theory, legally cloudy.

., 188–92; and procrastination principle, 152, 164, 180, 242, 245; security in, 166; stability of, 153–74; use of term, 74; as what we make them, 242, 244–46; as works in progress, 152 generative technology: accessibility of, 72–73, 93; and accountability, 162–63; adaptability of, 71–72, 93, 125; affordance theory, 78; Apple II, 2; benefits of, 64, 79–80, 84–85; blending of models for innovation, 86–90; control vs. anarchy in, 98, 150, 157–62; design features of, 43; ease of mastery, 72; end-to-end neutrality of, 165; expansion of, 34; features of, 71–73; freedom vs. security in, 3–5, 40–43, 151; free software philosophy, 77; and generative content, 245; group creativity, 94, 95; hourglass architecture, 67–71, 99; innovation as output of, 80–84, 90; input/participation in, 90–94; leverage in, 71, 92–93; non-generative compared to, 73–76; openness of, 19, 150, 156–57, 178; pattern of, 64, 67, 96–100; as platform, 2, 3; recursive, 95–96; success of, 42–43; theories of the commons, 78–79; transferability of, 73; vulnerability of, 37–51, 54–57, 60–61, 64–65 generative tools, 74–76 generativity: extra-legal solutions for, 168–73; Libertarian model of, 131; and network neutrality, 178–81; paradox of, 99; recursive, 94; reducing, and increasing security, 97, 102, 165, 167, 245; repurposing via, 212; use of term, 70; and Web 2.0, 123–26, 119, 189 GNE, 132–33, 134, 135 GNU/Linux, 64, 77, 89, 114, 190, 192 GNUpedia, 132 goldfish bowl cams, 158 “good neighbors” system, 160 Google: and advertising, 56; book scanning project of, 224–25, 242; Chinese censorship of, 113, 147; clarification available on, 230; data gathering by, 160, 221; death penalty of, 218, 220; image search on, 214–15; innovation in, 84; map service of, 124, 184, 185; privacy policy on, 306n47; and procrastination principle, 242; as search engine, 223, 226; and security, 52, 171; and spam, 170–73 Google Desktop, 185 Google News, 242 Google Pagerank, 160 Google Video, 124
governments: abuse of power by, 117–19, 187; oppressive, monitoring by, 33; PCs investigated by, 186–88; research funding from, 27, 28 GPS (Global Positioning Systems), 109, 214 graffiti, 45 Griffith, Virgil, and Wikiscanner, 151 Gulf Shipbuilding Corporation, 172 gun control legislation, 117 hackers: ethos of, 43, 45, 53; increasing skills of, 245 Harvard University, Berkman Center, 159, 170 HD-DVDs, 123 Health, Education, and Welfare (HEW) Department, U.S., privacy report of (1973), 201–5, 222, 233–34 Herdict, 160, 163, 167–68, 173, 241 Hippel, Eric von, 86–87, 98, 146 Hollerith, Herman, 11–12, 13; business model of, 17, 20, 24 Hollerith Tabulating Machine Company, 11–12 home boxes, 180–81 honor codes, 128–29 Horsley, Neal, 215 Hotmail, 169 “How’s My Driving” programs, 219, 229 HTML (hypertext markup language), 95 Hunt, Robert, 190 Hush-A-Phone, 21–22, 81, 82, 121 hyperlinks, 56, 89 hypertext, coining of term, 226 IBM (International Business Machines): antitrust suit against, 12; business model of, 12, 23, 30, 161; competitors of, 12–13; and generative technology, 64; Internet Security Systems, 47–48; mainframe computers, 12, 57; OS/2, 88; and risk aversion, 17, 57; System 360, 174 identity tokens, unsheddable, 228 image recognition, 215–16 immigration, illegal, 209 information appliances: accessibility of, 29, 232; code thickets, 188–92; content thickets, 192–93; and data portability, 176–78; generative systems compared to, 73–76; limitations of, 177; and network neutrality, 178–85; PCs as, 4, 59–61, 102, 185–88; PCs vs., 18, 29, 57–59; and perfect enforcement, 161; and privacy, 185–88; regulatory interventions in, 103–7, 125, 197; remote control of, 161; remote updates of, 106–7, 176; security dilemma of, 42, 106–7, 123–24, 150, 176–88; specific injunction, 108–9; variety of designs for, 20; Web 2.0 and, 102; See also specific information appliances information overload, 230 information services, early forms, 9 InnoCentive, 246 innovation: blending models for, 
86–90; generativity as parent of, 80–84, 90; group, 94; and idiosyncrasy, 90–91; inertia vs., 83–84; “sustaining” vs.

pages: 189 words: 57,632

Content: Selected Essays on Technology, Creativity, Copyright, and the Future of the Future by Cory Doctorow

AltaVista, book scanning, Brewster Kahle, Burning Man, informal economy, information retrieval, Internet Archive, invention of movable type, Jeff Bezos, Law of Accelerating Returns, Metcalfe's law, Mitch Kapor, moral panic, mutually assured destruction, new economy, optical character recognition, patent troll, pattern recognition, peer-to-peer, Ponzi scheme, post scarcity, QWERTY keyboard, Ray Kurzweil, RFID, Sand Hill Road, Skype, slashdot, social software, speech recognition, Steve Jobs, Thomas Bayes, Turing test, Vernor Vinge

More importantly, the free e-book skeptics have no evidence to offer in support of their position — just hand-waving and dark muttering about a mythological future when book-lovers give up their printed books for electronic book-readers (as opposed to the much more plausible future where book lovers go on buying their fetish objects and carry books around on their electronic devices). I started giving away e-books after I witnessed the early days of the "bookwarez" scene, wherein fans cut the binding off their favorite books, scanned them, ran them through optical character recognition software, and manually proofread them to eliminate the digitization errors. These fans were easily spending 80 hours to rip their favorite books, and they were only ripping their favorite books, books they loved and wanted to share. (The 80-hour figure comes from my own attempt to do this — I'm sure that rippers get faster with practice.) I thought to myself that 80 hours' free promotional effort would be a good thing to have at my disposal when my books entered the market.

pages: 173 words: 14,313

Peers, Pirates, and Persuasion: Rhetoric in the Peer-To-Peer Debates by John Logie

1960s counterculture, Berlin Wall, book scanning, cuban missile crisis, Fall of the Berlin Wall, Hacker Ethic, Isaac Newton, Marshall McLuhan, moral panic, mutually assured destruction, peer-to-peer, plutocrats, Plutocrats, pre–internet, publication bias, Richard Stallman, Search for Extraterrestrial Intelligence, search inside the book, SETI@home, Silicon Valley, slashdot, Steve Jobs, Steven Levy, Stewart Brand, Whole Earth Catalog

In late 2004, the Internet megagiant Google announced, with great fanfare, its plan to digitize the library holdings of five major universities. Google intended to display small portions of the books, limiting users to reviewing a page at a time, and blocking printing. Less than a year later some members of the American Association of University Presses were petitioning the courts, demanding the right to opt out of having their authors’ books scanned. Other publishers are now demanding that Google request and receive permissions for each book it scans. And, for good measure, free speech advocates are encouraging Google to refuse to honor the publishers’ wishes and publish everything based on a hard-line fair use claim. Once again, U.S. Copyright Law has magically transformed an attempt to build Borges’s Library of Babel into the Tower of Babel, wherein the participants are unable to communicate with one another, and progress toward lofty goals is impossible.

The Orbital Perspective: Lessons in Seeing the Big Picture From a Journey of 71 Million Miles by Astronaut Ron Garan, Muhammad Yunus

Airbnb, barriers to entry, book scanning, Buckminster Fuller, clean water, corporate social responsibility, crowdsourcing, global village, Google Earth, Indoor air pollution, jimmy wales, low earth orbit, optical character recognition, ride hailing / ride sharing, shareholder value, Silicon Valley, Skype, smart transportation, Stephen Hawking, transaction costs, Turing test, Uber for X, web of trust

When many people review and comment on a particular room for rent or an Uber driver, those evaluations start to become statistically accurate. The driver or homeowner has demonstrated a track record of living up to agreements, and the collective wisdom of the crowd can point to a high level of dependability. This is similar to Duolingo’s use of beginning language students to provide translations or ReCAPTCHA’s ability to crowdsource the accuracy of book scans. Community-Based Trust These examples relate to personal trust, but there are countless similar examples of communities that form online for a specific purpose and operate in a coordinated way for the greater good. Wikipedia, for instance, was built on the premise that people enjoy interacting within a community, which in the case of Wikipedia, is a global village documenting human knowledge.

Not That Kind of Girl: A Young Woman Tells You What She's "Learned" by Lena Dunham

book scanning, Joan Didion, Mason jar, Saturday Night Live, sexual politics, zero-sum game

Her sister, another imp with impossibly well-thought-out hair, has a funny phlegmy laugh. I know I shouldn’t drink anymore, or should at least temper it with a few handfuls of the crisps they are passing around. No one can explain how they came to live here. Nellie hops up, discarding her coat while announcing that it’s freezing. “Let me show you round,” she says. I take in every detail of the house like I’m six again and reading a picture book, scanning the illustrations carefully. Next to a marble fireplace lies an issue of Elle, a torn thigh-high stocking, an empty pack of Marlboros, a half-eaten pudding cup. And each room leads to another, like one of those New York real-estate dreams where you open a hidden door and discover massive rooms you didn’t even know you had. I spill some of my wine down the front of my dress. Nellie’s bedroom contains a freestanding claw-foot tub, and I eye all her books and clippings with a pathetic level of interest.

pages: 291 words: 77,596

Total Recall: How the E-Memory Revolution Will Change Everything by Gordon Bell, Jim Gemmell

airport security, Albert Einstein, book scanning, cloud computing, conceptual framework, Douglas Engelbart, full text search, information retrieval, invention of writing, inventory management, Isaac Newton, John Markoff, lifelogging, Menlo Park, optical character recognition, pattern recognition, performance metric, RAND corporation, RFID, semantic web, Silicon Valley, Skype, social web, statistical model, Stephen Hawking, Steve Ballmer, Ted Nelson, telepresence, Turing test, Vannevar Bush, web application

My publications papers and reports
e. My talks and presentations
f. Other publications papers and reports
g. People, references, recommendations, vitae
h. Archived company and organizational folders (X)
   i) Digital Equipment Corp. . . .
   ii) NSF
i. Archived calendars and correspondence (t)
j. Archived files (e.g., DEC WPS, e-mail)
3. My Books books authored, books scanned
4. My Voice Conversations and Notes (telephone conversations are held in MyLifeBits database)
5. My Media, i.e., song collections from ripped CDs
6. My Videos including c. 1950s 8mm movies and lectures

Psychologists have identified “lifetime periods” as an important way that autobiographical memories work. Lifetime periods are thematic and include work or jobs, educational institutions, and relationships that exist over an extended period of time.

pages: 252 words: 74,167

Thinking Machines: The Inside Story of Artificial Intelligence and Our Race to Build the Future by Luke Dormehl

Ada Lovelace, agricultural Revolution, AI winter, Albert Einstein, Alexey Pajitnov wrote Tetris, algorithmic trading, Amazon Mechanical Turk, Apple II, artificial general intelligence, Automated Insights, autonomous vehicles, book scanning, borderless world, call centre, cellular automata, Claude Shannon: information theory, cloud computing, computer vision, correlation does not imply causation, crowdsourcing, drone strike, Elon Musk, Flash crash, friendly AI, game design, global village, Google X / Alphabet X, hive mind, industrial robot, information retrieval, Internet of things, iterative process, Jaron Lanier, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, Kickstarter, Kodak vs Instagram, Law of Accelerating Returns, life extension, Loebner Prize, Marc Andreessen, Mark Zuckerberg, Menlo Park, natural language processing, Norbert Wiener, out of africa, PageRank, pattern recognition, Ray Kurzweil, recommendation engine, remote working, RFID, self-driving car, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, social intelligence, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Pinker, strong AI, superintelligent machines, technological singularity, The Coming Technological Singularity, The Future of Employment, Tim Cook: Apple, too big to fail, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!

Google has estimated that there are approximately 130 million distinct books in the world, and has made clear its intention to scan all of these by the year 2020. Compare that number to the 25,000 books read by the woman who has laid claim to the title of Britain’s most avid reader, having read around a dozen books each week since 1946. In an entire lifetime, even the most prolific reader is unable to read one-thousandth of the books Google has absorbed since it started its book-scanning project in just October 2004. With increasingly large datasets, computers are getting better and better at performing tasks like textual analysis, which is why they are being used for tasks like identifying who wrote particular books in cases where this is unknown. But generating novelty is not enough. My movie title-generating bot would be prolific, but it would simply reverse the problem a lot of screenwriters have.

pages: 369 words: 80,355

Too Big to Know: Rethinking Knowledge Now That the Facts Aren't the Facts, Experts Are Everywhere, and the Smartest Person in the Room Is the Room by David Weinberger

airport security, Alfred Russel Wallace, Amazon Mechanical Turk, Berlin Wall, Black Swan, book scanning, Cass Sunstein, commoditize, corporate social responsibility, crowdsourcing, Danny Hillis, David Brooks, Debian, double entry bookkeeping, double helix, Exxon Valdez, Fall of the Berlin Wall, future of journalism, Galaxy Zoo, Hacker Ethic, Haight Ashbury, hive mind, Howard Rheingold, invention of the telegraph, jimmy wales, Johannes Kepler, John Harrison: Longitude, Kevin Kelly, linked data, Netflix Prize, New Journalism, Nicholas Carr, Norbert Wiener, openstreetmap, P = NP, Pluto: dwarf planet, profit motive, Ralph Waldo Emerson, RAND corporation, Ray Kurzweil, Republic of Letters, RFID, Richard Feynman, Ronald Reagan, semantic web, slashdot, social graph, Steven Pinker, Stewart Brand, technological singularity, Ted Nelson, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Whole Earth Catalog, X Prize

Even before books, the hundreds of thousands of scrolls in the Library of Alexandria were more than could be carried out to safety from the great fire, much less be read in a lifetime. Only about 2 percent of the Harvard University library system’s physical holdings circulate every year, and most of those are the same works that circulated the previous year.1 The new abundance makes the old abundance look like scarcity. The Google book-scanning project alone has over 15 million scanned books, which you can search through more easily than you can look up an item in the index of the book on your night table.2 Harvard’s Robert Darnton, whom we met in Chapter 6, is among those proposing a Digital Public Library of America,3 a call that has excited interest among public and research librarians, the government, and some large Internet projects.

pages: 240 words: 109,474

Masters of Doom: How Two Guys Created an Empire and Transformed Pop Culture by David Kushner

Apple's 1984 Super Bowl advert, book scanning, Columbine, corporate governance, game design, glass ceiling, Hacker Ethic, informal economy, Marc Andreessen, market design, Marshall McLuhan, Saturday Night Live, side project, Silicon Valley, slashdot, software patent, Steve Jobs, Steven Levy, X Prize

Also to speed things up, characters and objects in the game would not be in true 3-D, they would be sprites, flat images that, if encountered in real life, would look like cardboard cutouts. Romero, in pure Melvin mode, imagined all the crazy stuff they could do in a game where the object was, as he said, “to mow down Nazis.” He wanted to have the suspense of an Apple II game pumped up with the shock and horror of storming a Nazi bunker. There would be SS soldiers and Hitler. Adrian hit the history books, scanning images of the German leader to include throughout the game. But that wasn’t enough. “How about,” Romero suggested, “we throw in guard dogs? Dogs that you can shoot! Fucking German shepherds!” Adrian cracked up, sketching out a dog that, in a death animation, could yelp back. “And there should be blood,” Romero said, “lots of blood, blood like you never see in games. And the weapons should be lethal but simple: a knife, a pistol, maybe a Gatling gun too.”

pages: 391 words: 105,382

Utopia Is Creepy: And Other Provocations by Nicholas Carr

Air France Flight 447, Airbnb, Airbus A320, AltaVista, Amazon Mechanical Turk, augmented reality, autonomous vehicles, Bernie Sanders, book scanning, Brewster Kahle, Buckminster Fuller, Burning Man, Captain Sullenberger Hudson, centralized clearinghouse, Charles Lindbergh, cloud computing, cognitive bias, collaborative consumption, computer age, corporate governance, crowdsourcing, Danny Hillis, deskilling, digital map, disruptive innovation, Donald Trump, Electric Kool-Aid Acid Test, Elon Musk, factory automation, failed state, feminist movement, Frederick Winslow Taylor, friendly fire, game design, global village, Google bus, Google Glasses, Google X / Alphabet X, Googley, hive mind, impulse control, indoor plumbing, interchangeable parts, Internet Archive, invention of movable type, invention of the steam engine, invisible hand, Isaac Newton, Jeff Bezos, jimmy wales, Joan Didion, job automation, Kevin Kelly, lifelogging, low skilled workers, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, means of production, Menlo Park, mental accounting, natural language processing, Network effects, new economy, Nicholas Carr, Norman Mailer, off grid, oil shale / tar sands, Peter Thiel, plutocrats, Plutocrats, profit motive, Ralph Waldo Emerson, Ray Kurzweil, recommendation engine, Republic of Letters, robot derives from the Czech word robota Czech, meaning slave, Ronald Reagan, self-driving car, SETI@home, side project, Silicon Valley, Silicon Valley ideology, Singularitarianism, Snapchat, social graph, social web, speech recognition, Startup school, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, technoutopianism, the medium is the message, theory of mind, Turing test, Whole Earth Catalog, Y Combinator

Internet or not, the world may still not be ready for the library of utopia. LARRY PAGE isn’t known for his literary sensibility, but he does like to think big. In 2002, the Google cofounder decided that it was time for his young company to scan all the world’s books into its database. If printed texts weren’t brought online, he feared, Google would never fulfill its mission of making the world’s information “universally accessible and useful.” After doing some book-scanning tests in his office—he manned the camera while Marissa Mayer, then a product manager, turned pages to the beat of a metronome—he concluded that Google had the smarts and the money to get the job done. He set a team of engineers and programmers to work. In a matter of months, they had invented an ingenious scanning device that used a stereoscopic infrared camera to correct for the bowing of pages that occurs when a book is opened.

pages: 380 words: 118,675

The Everything Store: Jeff Bezos and the Age of Amazon by Brad Stone

airport security, Amazon Mechanical Turk, Amazon Web Services, bank run, Bernie Madoff, big-box store, Black Swan, book scanning, Brewster Kahle, buy and hold, call centre, centre right, Chuck Templeton: OpenTable:, Clayton Christensen, cloud computing, collapse of Lehman Brothers, crowdsourcing, cuban missile crisis, Danny Hillis, Douglas Hofstadter, Elon Musk, facts on the ground, game design, housing crisis, invention of movable type, inventory management, James Dyson, Jeff Bezos, John Markoff, Kevin Kelly, Kodak vs Instagram, late fees, loose coupling, low skilled workers, Maui Hawaii, Menlo Park, Network effects, new economy, optical character recognition,, Ponzi scheme, quantitative hedge fund, recommendation engine, Renaissance Technologies, RFID, Rodney Brooks, search inside the book, shareholder value, Silicon Valley, Silicon Valley startup, six sigma, skunkworks, Skype, statistical arbitrage, Steve Ballmer, Steve Jobs, Steven Levy, Stewart Brand, Thomas L Friedman, Tony Hsieh, Whole Earth Catalog, why are manhole covers round?, zero-sum game

About two dozen Amazon employees worked on the service from January 2004 to November 2005. It was considered a Jeff project, which meant that the product manager met with Bezos every few weeks and received a constant stream of e-mail from the CEO, usually containing extraordinarily detailed recommendations and frequently arriving late at night. Amazon started using Mechanical Turk internally in 2005 to have humans do things like review Search Inside the Book scans and check product images uploaded to Amazon by customers to ensure they were not pornographic. The company also used Mechanical Turk to match the images with the corresponding commercial establishments in A9’s fledgling Block View tool. Bezos himself became consumed with this task and used it as a way to demonstrate the service. As the company prepared to introduce Mechanical Turk to the public, Amazon’s PR team and a few employees complained they were uncomfortable with the system’s reference to the Turkish people.

pages: 510 words: 120,048

Who Owns the Future? by Jaron Lanier

3D printing, 4chan, Affordable Care Act / Obamacare, Airbnb, augmented reality, automated trading system, barriers to entry, bitcoin, book scanning, Burning Man, call centre, carbon footprint, cloud computing, commoditize, computer age, crowdsourcing, David Brooks, David Graeber, delayed gratification, digital Maoism, Douglas Engelbart,, Everything should be made as simple as possible, facts on the ground, Filter Bubble, financial deregulation, Fractional reserve banking, Francis Fukuyama: the end of history, George Akerlof, global supply chain, global village, Haight Ashbury, hive mind, if you build it, they will come, income inequality, informal economy, information asymmetry, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Kevin Kelly, Khan Academy, Kickstarter, Kodak vs Instagram, life extension, Long Term Capital Management, Marc Andreessen, Mark Zuckerberg, meta analysis, meta-analysis, Metcalfe’s law, moral hazard, mutually assured destruction, Network effects, new economy, Norbert Wiener, obamacare, packet switching, Panopticon Jeremy Bentham, Peter Thiel, place-making, plutocrats, Plutocrats, Ponzi scheme, post-oil, pre–internet, race to the bottom, Ray Kurzweil, rent-seeking, reversible computing, Richard Feynman, Ronald Reagan, scientific worldview, self-driving car, side project, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, Skype, smart meter, stem cell, Steve Jobs, Steve Wozniak, Stewart Brand, Ted Nelson, The Market for Lemons, Thomas Malthus, too big to fail, trickle-down economics, Turing test, Vannevar Bush, WikiLeaks, zero-sum game

The real people from whom the initial answers were gathered deserve to be paid for each new answer given by the machine. Consider too the act of scanning a book into digital form. The historian George Dyson has written that a Google engineer once said to him: “We are not scanning all those books to be read by people. We are scanning them to be read by an AI.” While we have yet to see how Google’s book scanning will play out, a machine-centric vision of the project might encourage software that treats books as grist for the mill, decontextualized snippets in one big database, rather than separate expressions from individual writers. In this approach, the contents of books would be atomized into bits of information to be aggregated, and the authors themselves, the feeling of their voices, their differing perspectives, would be lost.

pages: 387 words: 119,409

Work Rules!: Insights From Inside Google That Will Transform How You Live and Lead by Laszlo Bock

Airbnb, Albert Einstein, AltaVista, Atul Gawande, Black Swan, book scanning, Burning Man, call centre, Cass Sunstein, Checklist Manifesto, choice architecture, citizen journalism, clean water, correlation coefficient, crowdsourcing, Daniel Kahneman / Amos Tversky, deliberate practice, experimental subject, Frederick Winslow Taylor, future of work, Google Earth, Google Glasses, Google Hangouts, Google X / Alphabet X, Googley, helicopter parent, immigration reform, Internet Archive, longitudinal study, Menlo Park, mental accounting, meta analysis, meta-analysis, Moneyball by Michael Lewis explains big data, nudge unit, PageRank, Paul Buchheit, Ralph Waldo Emerson, Rana Plaza, random walk, Richard Thaler, Rubik’s Cube, self-driving car, shareholder value, side project, Silicon Valley, six sigma, statistical model, Steve Ballmer, Steve Jobs, Steven Levy, Steven Pinker, survivorship bias, TaskRabbit, The Wisdom of Crowds, Tony Hsieh, Turing machine, winner-take-all economy, Y2K

Susan Wojcicki and Sheryl Sandberg, a sales VP at the time and now COO of Facebook, were instrumental in growing the concept behind these talks, using their networks and interests to recruit a range of speakers to Google to speak about leadership, women’s issues, and politics. Googlers first self-organized these events into a more formal program in 2006, when they noticed more and more authors visiting to speak with our book-scanning teams. The volunteers asked visiting authors to stick around for a conversation, and our first official Authors@Google guest was none other than Malcolm Gladwell. This grew into today’s broader program called Talks at Google, a speaker series where authors, scientists, business leaders, performers, politicians, and other thought-provoking figures are invited to campus to share their thoughts.

pages: 496 words: 154,363

I'm Feeling Lucky: The Confessions of Google Employee Number 59 by Douglas Edwards

Albert Einstein, AltaVista, Any sufficiently advanced technology is indistinguishable from magic, barriers to entry, book scanning, Build a better mousetrap, Burning Man, business intelligence, call centre, commoditize, crowdsourcing, don't be evil, Elon Musk, fault tolerance, Googley, gravity well, invisible hand, Jeff Bezos, job-hopping, John Markoff, Kickstarter, Marc Andreessen, Menlo Park, microcredit, music of the spheres, Network effects, PageRank, performance metric, Ralph Nader, risk tolerance, second-price auction, side project, Silicon Valley, Silicon Valley startup, slashdot, stem cell, Superbowl ad, Y2K

Larry decreed a meeting be established at which views would be heard from all corners of the Plex, disagreements would be aired, and edicts would be issued. He dubbed it "product review." Google had birthed a process. Product review met in Larry and Sergey's office. I arrived early to get a seat on the black pleather couch. Otherwise, I'd have had to balance my laptop while sitting on a three-foot rubber ball. A large metal exoskeleton, the prototype for Larry's book-scanning project, held a camera and an array of lights pointing down at the coffee table in front of me. Karen White, Marissa Mayer, Jen McGrath from the front-end team, and Craig Silverstein worked around it, connecting cables to a projector so we could display mockups against the office wall. Sergey leaned back in his desk chair across from us, reading and eating a sandwich. It was hard to tell if he was paying attention.

pages: 645 words: 184,311

American Gods by Neil Gaiman

airport security, book scanning, Brownian motion, Golden Gate Park, Lao Tzu

Hinzelmann, originally of Hildemuhlen in Bavaria, was in charge of the lake-building project, and that the city council had granted him the sum of $370 toward the project, any shortfall to be made up by public subscription. Shadow tore off a strip of a paper towel and placed it into the book as a bookmark. He could imagine Hinzelmann's pleasure in seeing the reference to his grandfather. He wondered if the old man knew that his family had been instrumental in building the lake. Shadow flipped forward through the book, scanning for more references to the lake-building project. They had dedicated the lake in a ceremony in the spring of 1876, as a precursor to the town's centennial celebrations. A vote of thanks to Mr. Hinzelmann was taken by the council. Shadow checked his watch. It was five-thirty. He went into the bathroom, shaved, combed his hair. He changed his clothes. Somehow the final fifteen minutes passed.

How to Hide an Empire: A History of the Greater United States by Daniel Immerwahr

Albert Einstein, book scanning, British Empire, Buckminster Fuller, call centre, citizen journalism, City Beautiful movement, clean water, colonial rule, deindustrialization, Deng Xiaoping, desegregation, Donald Trump, drone strike, European colonialism, friendly fire, gravity well, Haber-Bosch Process, Howard Zinn, immigration reform, land reform, Mercator projection, offshore financial centre, oil shale / tar sands, oil shock, QWERTY keyboard, Ralph Waldo Emerson, Richard Feynman, the built environment, The Wealth of Nations by Adam Smith, Thomas L Friedman, Thomas Malthus, transcontinental railway, urban planning, wikimedia commons

The 13 articles the paper ran about Albania (PLOT AGAINST KING ZOG FOILED, etc.) far outstripped the 6 it printed about Alaska. Hawai‘i appeared seven times that year, Guam not once. In contrast, the Times ran 639 articles about India, Britain’s largest colony. That was nearly three times as many as it ran about all U.S. territories combined, territories in which more than 10 percent of the U.S. population lived. It wasn’t much different in the realm of books. Scanning the library shelves, it’s easy to find high-profile books from the interwar period depicting Native Americans and the western frontier (Little House on the Prairie is one), but prominent treatments of overseas territories are rare. The only one with a truly large audience was Coming of Age in Samoa (1928) by the anthropologist Margaret Mead, a wildly popular ethnography that featured frank discussions of Samoan sexuality and launched Mead’s career as one of the most famous scholars in the country.

pages: 936 words: 85,745

Programming Ruby 1.9: The Pragmatic Programmer's Guide by Dave Thomas, Chad Fowler, Andy Hunt

book scanning, David Heinemeier Hansson, Debian, domain-specific language, Jacquard loom, Kickstarter, p-value, revision control, Ruby on Rails, slashdot, sorting algorithm, web application

In this chapter, we’ll look in more depth at creating and manipulating those classes. Let’s give ourselves a simple problem to solve. Let’s say that we’re running a secondhand bookstore. Every week, we do stock control. A gang of clerks uses portable bar-code scanners to record every book on our shelves. Each scanner generates a simple comma-separated value (CSV) file containing one row for each book scanned. The row contains (among other things) the book’s ISBN and price. An extract from one of these files looks something like this:

"Date","ISBN","Amount"
"2008-04-12","978-1-9343561-0-4",39.45
"2008-04-13","978-1-9343561-6-6",45.67
"2008-04-14","978-1-9343560-7-4",36.95

Our job is to take all the CSV files and work out how many of each title we have, as well as the total list price of the books in stock.
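The stock-control task described in this excerpt can be sketched in a few lines of Ruby. This is only a minimal illustration under stated assumptions, not the book's own solution: the helper name `summarize_stock` is hypothetical, it takes CSV text as strings rather than reading files so that it runs on its own, and it assumes every row carries the "ISBN" and "Amount" columns shown in the extract. Prices are summed as floats, which is fine for a sketch but may not be precise enough for real accounting.

```ruby
require "csv"

# Hypothetical helper (not from the book): takes an array of CSV strings,
# each with the "Date","ISBN","Amount" header shown in the extract, and
# returns [copies_per_isbn, total_price].
def summarize_stock(csv_texts)
  copies_per_isbn = Hash.new(0) # unseen ISBNs start at zero copies
  total_price = 0.0

  csv_texts.each do |text|
    # headers: true makes each row addressable by column name
    CSV.parse(text, headers: true) do |row|
      copies_per_isbn[row["ISBN"]] += 1
      total_price += Float(row["Amount"])
    end
  end

  [copies_per_isbn, total_price]
end

# The extract quoted in the excerpt, used as sample input.
sample = <<~ROWS
  "Date","ISBN","Amount"
  "2008-04-12","978-1-9343561-0-4",39.45
  "2008-04-13","978-1-9343561-6-6",45.67
  "2008-04-14","978-1-9343560-7-4",36.95
ROWS

copies, total = summarize_stock([sample])
copies.each { |isbn, count| puts "#{isbn}: #{count}" }
puts format("Total list price: %.2f", total)
```

To work from scanner files instead of strings, the same loop could read each path with `CSV.foreach(path, headers: true)`; the counting and summing logic would stay the same.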

pages: 918 words: 257,605

The Age of Surveillance Capitalism by Shoshana Zuboff

Amazon Web Services, Andrew Keen, augmented reality, autonomous vehicles, barriers to entry, Bartolomé de las Casas, Berlin Wall, bitcoin, blockchain, blue-collar work, book scanning, Broken windows theory, California gold rush, call centre, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, choice architecture, citizen journalism, cloud computing, collective bargaining, Computer Numeric Control, computer vision, connected car, corporate governance, corporate personhood, creative destruction, cryptocurrency, dogs of the Dow, don't be evil, Donald Trump, Edward Snowden, Erik Brynjolfsson, facts on the ground, Ford paid five dollars a day, future of work, game design, Google Earth, Google Glasses, Google X / Alphabet X, hive mind, impulse control, income inequality, Internet of things, invention of the printing press, invisible hand, Jean Tirole, job automation, Johann Wolfgang von Goethe, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Kevin Kelly, knowledge economy, linked data, longitudinal study, low skilled workers, Mark Zuckerberg, market bubble, means of production, multi-sided market, Naomi Klein, natural language processing, Network effects, new economy, Occupy movement, off grid, PageRank, Panopticon Jeremy Bentham, pattern recognition, Paul Buchheit, performance metric, Philip Mirowski, precision agriculture, price mechanism, profit maximization, profit motive, recommendation engine, refrigerator car, RFID, Richard Thaler, ride hailing / ride sharing, Robert Bork, Robert Mercer, Second Machine Age, self-driving car, sentiment analysis, shareholder value, Shoshana Zuboff, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, slashdot, smart cities, Snapchat, social graph, social web, software as a service, speech recognition, statistical model, Steve Jobs, Steven Levy, structural adjustment programs, The Future of Employment, The Wealth of Nations by Adam Smith, Tim Cook: Apple, two-sided market, union organizing, Watson beat the top human players on Jeopardy!, winner-take-all economy, Wolfgang Streeck

“European Commission—Press Release—Antitrust: Commission Sends Statement of Objections to Google on Android Operating System and Applications,” European Commission, April 20, 2016. 22. “Complaint of Disconnect, Inc.,” 40. 23. Marc Rotenberg, phone interview with author, June 2014. 24. Jennifer Howard, “Publishers Settle Long-Running Lawsuit Over Google’s Book-Scanning Project,” Chronicle of Higher Education, October 4, 2012; “Google Books Settlement and Privacy,” October 30, 2016; Juan Carlos Perez, “Google Books Settlement Proposal Rejected,” PCWorld, March 22, 2011; Eliot Van Buskirk, “Justice Dept. to Google Books: Close, but No Cigar,” Wired, February 5, 2010; Miguel Helft, “Opposition to Google Books Settlement Jells,” New York Times—Bits Blog, April 17, 2009; Brandon Butler, “The Google Books Settlement: Who Is Filing and What Are They Saying?”

EuroTragedy: A Drama in Nine Acts by Ashoka Mody

"Robert Solow", Andrei Shleifer, asset-backed security, availability heuristic, bank run, banking crisis, Basel III, Berlin Wall, book scanning, Bretton Woods, call centre, capital controls, Carmen Reinhart, Celtic Tiger, central bank independence, centre right, credit crunch, Daniel Kahneman / Amos Tversky, debt deflation, Donald Trump, eurozone crisis, Fall of the Berlin Wall, financial intermediation, floating exchange rates, forward guidance, George Akerlof, German hyperinflation, global supply chain, global value chain, hiring and firing, Home mortgage interest deduction, income inequality, inflation targeting, Irish property bubble, Isaac Newton, job automation, Johann Wolfgang von Goethe, Johannes Kepler, Kenneth Rogoff, Kickstarter, liberal capitalism, light touch regulation, liquidity trap, loadsamoney, London Interbank Offered Rate, Long Term Capital Management, low-wage service sector, Mikhail Gorbachev, mittelstand, money market fund, moral hazard, mortgage tax deduction, neoliberal agenda, offshore financial centre, oil shock, open borders, pension reform, premature optimization, price stability, purchasing power parity, quantitative easing, rent-seeking, Republic of Letters, Robert Gordon, Robert Shiller, short selling, Silicon Valley, The Great Moderation, The Rise and Fall of American Growth, too big to fail, total factor productivity, trade liberalization, transaction costs, urban renewal, working-age population, Yogi Berra

[Figure 1.2. Germans led the intellectual inquiry into “flexible exchange rates.” (Frequency of reference to “flexible exchange rate” in books digitized by Google.) Series: German, English, French; x-axis: 1940 to 2000; y-axis: frequency, 0.000000% to 0.000025%; annotation: December 1969, Conference at the Hague.] Note: The graph was created using the Google Books Ngram Viewer (https://ngrams/ info). It reports the frequency with which the phrase “flexible exchange rate” is mentioned in the books scanned by Google. The term “flexible Wechselkurs” was used for German books, and “taux de change flexible” was used for French books. The English variation “floating exchange rate,” the German variation “schwankender Wechselkurs,” and the French variation “taux de change flottant” yielded similar trends.

As Robert Hetzel, economist at the Federal Reserve Bank of Richmond, would later explain: “Germany’s commitment to a free market economy pushed it to reject fixed exchange rates and adopt floating exchange rates.”82 Thus, in proposing a monetary union, Pompidou was defying not only the global experience that was causing fixed-exchange-rate systems to break down, but he was also ignoring the clash between the French dirigiste temperament and the German market-oriented economic ideology.

pages: 1,205 words: 308,891

Bourgeois Dignity: Why Economics Can't Explain the Modern World by Deirdre N. McCloskey

Airbnb, Akira Okazaki, big-box store, Black Swan, book scanning, British Empire, business cycle, buy low sell high, Capital in the Twenty-First Century by Thomas Piketty, clean water, Columbian Exchange, conceptual framework, correlation does not imply causation, Costa Concordia, creative destruction, crony capitalism, dark matter, Dava Sobel, David Graeber, David Ricardo: comparative advantage, deindustrialization, demographic transition, Deng Xiaoping, Donald Trump, double entry bookkeeping, epigenetics, Erik Brynjolfsson, experimental economics, Ferguson, Missouri, fundamental attribution error, Georg Cantor, George Akerlof, George Gilder, germ theory of disease, Gini coefficient, God and Mammon, greed is good, Gunnar Myrdal, Hans Rosling, Henry Ford's grandson gave labor union leader Walter Reuther a tour of the company’s new, automated factory…, Hernando de Soto, immigration reform, income inequality, interchangeable parts, invention of agriculture, invention of writing, invisible hand, Isaac Newton, Islamic Golden Age, James Watt: steam engine, Jane Jacobs, John Harrison: Longitude, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Kenneth Arrow, knowledge economy, labor-force participation, lake wobegon effect, land reform, liberation theology, lone genius, Lyft, Mahatma Gandhi, Mark Zuckerberg, market fundamentalism, means of production, Naomi Klein, new economy, North Sea oil, Occupy movement, open economy, out of africa, Pareto efficiency, Paul Samuelson, Pax Mongolica, Peace of Westphalia, peak oil, Peter Singer: altruism, Philip Mirowski, pink-collar, plutocrats, positional goods, profit maximization, profit motive, purchasing power parity, race to the bottom, refrigerator car, rent control, rent-seeking, Republic of Letters, road to serfdom, Robert Gordon, Robert Shiller, Ronald Coase, Scientific racism, Scramble for Africa, Second Machine Age, secular stagnation, Simon Kuznets, Social Responsibility of Business Is to Increase Its Profits, spinning jenny, stakhanovite, Steve Jobs, The Chicago School, The Market for Lemons, the rule of 72, The Spirit Level, The Wealth of Nations by Adam Smith, Thomas Malthus, Thorstein Veblen, total factor productivity, Toyota Production System, transaction costs, transatlantic slave trade, Tyler Cowen: Great Stagnation, uber lyft, union organizing, very high income, wage slave, Washington Consensus, working poor, Yogi Berra

Bevington 2002, p. 485. 6. It was a convention not always exploited. In Massinger’s A New Way to Pay Old Debts (mid-1620s) everyone, high and low, speaks in blank verse. 7. Thus: “For he today that sheds his blood with me,” iambic pentameter. 8. Magnusson 1999, p. 120: the lower orders “lack the mastery to assimilate the prestige forms successfully to their actual performance.” 9. Google Books scan of the reprinted 1698 edition, p. 117. The first public edition had been 1664, well after Mun’s death. Bizarrely, this famous remark (and “One man’s necessity becomes another man’s opportunity,” p. 116) is in aid of showing that expenditure on a suit at law is a good thing, because at least the money “is still in the kingdom,” and so foreign trade is unaffected, and so all is well in the crucial matter of acquiring bullion from abroad.