linked data

51 results


Mastering Structured Data on the Semantic Web: From HTML5 Microdata to Linked Open Data by Leslie Sikos

AGPL, Amazon Web Services, bioinformatics, business process, cloud computing, create, read, update, delete, Debian, en.wikipedia.org, fault tolerance, Firefox, Google Chrome, Google Earth, information retrieval, Infrastructure as a Service, Internet of things, linked data, machine readable, machine translation, natural language processing, openstreetmap, optical character recognition, platform as a service, search engine result page, semantic web, Silicon Valley, social graph, software as a service, SPARQL, text mining, Watson beat the top human players on Jeopardy!, web application, Wikidata, wikimedia commons, Wikivoyage

For example, tabular data in HTML with RDFa annotation using URIs and semantic properties is five-star data, offering maximum reusability and machine-interpretability. The expression of rights provided by licensing is what makes free data reuse possible: Linked Data without an explicit open license (e.g., a public domain license) cannot be reused freely, but the quality of Linked Data is independent of licensing. When the specified criteria are met, all five ratings can be used both for Linked Data (Linked Data without an explicit open license) and Linked Open Data (Linked Data with an explicit open license). As a consequence, the five-star rating system can be depicted in such a way that the criteria can be read with or without the open license.
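To see what such five-star data can look like in code, here is a minimal sketch using the Python rdflib library. The staff namespace, the person, and the outbound DBpedia link are all illustrative choices of mine, not the book's example; the point is the combination of URIs, a shared vocabulary (FOAF), and a link into someone else's data:

    # Minimal five-star-data sketch with rdflib (pip install rdflib).
    # All URIs below are illustrative.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import FOAF, RDF

    EX = Namespace("http://example.org/staff/")

    g = Graph()
    g.bind("foaf", FOAF)

    alice = EX["alice"]
    g.add((alice, RDF.type, FOAF.Person))            # typed with a shared vocabulary
    g.add((alice, FOAF.name, Literal("Alice Doe")))
    # The fifth star: link out to other people's data on the Web.
    g.add((alice, FOAF.knows, URIRef("http://dbpedia.org/resource/Tim_Berners-Lee")))

    print(g.serialize(format="turtle"))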

More and more universities provide information about staff members, departments, facilities, courses, grants, and publications as Linked Data and RDF dumps, such as the University of Florida (http://vivo.ufl.edu) and Ghent University (http://data.mmlab.be/mmlab). Libraries such as the Princeton University Library (http://findingaids.princeton.edu) publish bibliographic information as Linked Data. Part of the National Digital Data Archive of Hungary is available as Linked Data at http://lod.sztaki.hu. Even Project Gutenberg is available as Linked Data (http://wifo5-03.informatik.uni-mannheim.de/gutendata/). Museums such as the British Museum publish some of their records as Linked Data (http://collection.britishmuseum.org).

Twitter Card Annotation in the Markup

    <meta name="twitter:card" content="summary" />
    <meta name="twitter:site" content="@lesliesikos" />
    <meta name="twitter:creator" content="@lesliesikos" />
    <meta property="og:url" content="http://www.lesliesikos.com/linked-data-platform-1-0-standardized/" />
    <meta property="og:title" content="Linked Data Platform 1.0 Standardized" />
    <meta property="og:description" content="The Linked Data Platform 1.0 is now a W3C Recommendation, covering a set of rules for HTTP operations on Web resources, including RDF-based Linked Data, to provide an architecture for read-write Linked Data on the Semantic Web." />
    <meta property="og:image" content="http://www.lesliesikos.com/img/LOD.svg" />

IBM Watson

IBM Watson’s DeepQA system is a question-answering system originally designed to compete with contestants of the Jeopardy!


pages: 315 words: 70,044

Learning SPARQL by Bob DuCharme

database schema, Donald Knuth, en.wikipedia.org, G4S, linked data, machine readable, semantic web, SPARQL, web application

For example, simply knowing that “spouse” is a symmetric term made it possible to find out the identity of Cindy’s spouse, even though this fact was not part of the dataset.

Linked Data

The idea of Linked Data is newer than that of the semantic web, but sometimes it’s easier to think of the semantic web as building on the ideas behind Linked Data. Linked Data is not a specification, but a set of best practices for providing a data infrastructure that makes it easier to share data across the web. You can then use semantic web technologies such as RDFS, OWL, and SPARQL to build applications around that data. Tim Berners-Lee came up with these four principles of Linked Data in 2006 (I’ve bolded his wording and added my own commentary): Use URIs as names for things.
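A sketch of that kind of inference in code, assuming the Python rdflib and owlrl packages (the pairing of Richard and Cindy and the example.org URIs here are illustrative, not the book's dataset). Declaring the property symmetric is all a reasoner needs to add the reverse triple:

    from rdflib import Graph
    import owlrl  # pip install owlrl

    g = Graph()
    g.parse(data="""
        @prefix ex:  <http://example.org/> .
        @prefix owl: <http://www.w3.org/2002/07/owl#> .

        ex:spouse a owl:SymmetricProperty .
        ex:Richard ex:spouse ex:Cindy .
    """, format="turtle")

    # Materialize OWL-RL entailments; this adds ex:Cindy ex:spouse ex:Richard.
    owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)

    q = """
    PREFIX ex: <http://example.org/>
    SELECT ?who WHERE { ex:Cindy ex:spouse ?who }
    """
    for row in g.query(q):
        print(row.who)  # -> http://example.org/Richard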

[Back-of-book index excerpt. Among its entries: langMatches(); language codes (checking, adding, and removing; filtering on); LCASE(); LIMIT; Linked Data (What Exactly Is the “Semantic Web”?; intranets and public/private endpoints; Glossary); Linked Open Data; Linked Movie Database; literal; LOAD; local name; MAX().]

[Back-of-book index excerpt covering symbols and A–G. Among its entries: property-path operators (^, |, *); ^^ datatype indicator; blank nodes; ARQ SPARQL processor; ASK; AVG(); BASE; Berners-Lee, Tim (Linked Data and); BIND; bound(); CONSTRUCT; DBpedia; DESCRIBE; DISTINCT; Dublin Core; FILTER; FOAF (Friend of a Friend); FROM and FROM NAMED; Fuseki; GRAPH; graph patterns; GROUP BY; GROUP_CONCAT().]


pages: 511 words: 111,423

Learning SPARQL by Bob DuCharme

business logic, Donald Knuth, en.wikipedia.org, G4S, hypertext link, linked data, machine readable, place-making, semantic web, SPARQL, web application

We’ll learn more about RDFS and OWL in Chapter 9.

Linked Data

The idea of Linked Data is newer than that of the semantic web, but sometimes it’s easier to think of the semantic web as building on the ideas behind Linked Data. Linked Data is not a specification, but a set of best practices for providing a data infrastructure that makes it easier to share data across the Web. You can then use semantic web technologies such as RDFS, OWL, and SPARQL to build applications around that data. Tim Berners-Lee came up with these four principles of Linked Data in 2006 (I’ve bolded his wording and added my own commentary): Use URIs as names for things.

[Back-of-book index excerpt (second edition). Among its entries: langMatches(); language codes (adding, checking, removing; filtering on); LCASE(); LIMIT; Linked Data (What Exactly Is the “Semantic Web”?; intranets and public/private endpoints; Glossary); Linked Open Data; Linked Movie Database; List All Triples query; literal; LOAD; local name; magic properties; materialization of triples; MAX().]

This means that a good understanding of the role of URIs gives you greater control over your queries.

Note: The URIs that identify RDF resources are like the unique ID fields of relational database tables, except that they’re universally unique, which lets you link data from different sources around the world instead of just linking data from different tables in the same database.

The Resource Description Framework (RDF)

In Chapter 1, we learned the following about the Resource Description Framework: It’s a data model in which the basic unit of information is known as a triple. A triple consists of a subject, a predicate, and an object.
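As a minimal illustration of that data model, here is a triple in code with rdflib; the subject URI is my own invention, while the predicate is the standard Dublin Core creator property:

    from rdflib import Graph, Literal, URIRef

    g = Graph()
    g.add((
        URIRef("http://example.org/book/learning-sparql"),  # subject
        URIRef("http://purl.org/dc/elements/1.1/creator"),  # predicate
        Literal("Bob DuCharme"),                            # object
    ))

    for s, p, o in g:
        print(s, p, o)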


The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences by Rob Kitchin

Bayesian statistics, business intelligence, business process, cellular automata, Celtic Tiger, cloud computing, collateralized debt obligation, conceptual framework, congestion charging, corporate governance, correlation does not imply causation, crowdsourcing, data science, discrete time, disruptive innovation, George Gilder, Google Earth, hype cycle, Infrastructure as a Service, Internet Archive, Internet of things, invisible hand, knowledge economy, Large Hadron Collider, late capitalism, lifelogging, linked data, longitudinal study, machine readable, Masdar, means of production, Nate Silver, natural language processing, openstreetmap, pattern recognition, platform as a service, recommendation engine, RFID, semantic web, sentiment analysis, SimCity, slashdot, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart grid, smart meter, software as a service, statistical model, supply-chain management, technological solutionism, the scientific method, The Signal and the Noise by Nate Silver, transaction costs

Conclusion

At one level, the case for open and linked data is commonsensical – open data create transparency and accountability; participation, choice and social innovation; efficiency, productivity and enhanced governance; economic innovation and wealth creation. Linked data convert information across the Internet into a semantic web from which data can be machine-read and linked together. Open and linked data thus hold much promise and value as a venture. However, the case for open and linked data is more complex, and their economic underpinnings are not at all straightforward. Open and linked data might seem to have marginal costs, but their production and the technical and institutional apparatus needed to facilitate and maintain them have real costs in terms of labour, equipment, and resources.

When documents are published in this way, information on the Internet can be rendered and repackaged as data and can be linked in an infinite number of ways depending on purpose. However, as P. Miller (2010) notes, ‘linked data may be open, and open data may be linked, but it is equally possible for linked data to carry licensing or other restrictions that prevent it being considered open’, or for open data to be made available in ways that do not easily enable linking. In general, any linked documents that are not on an intranet or behind a pay wall are also open in nature. For Berners-Lee (2009), open and linked data should ideally be synonymous and he sets out five levels of such data, each with progressively more utility and value (see Table 3.3).

Since the late 2000s the movement has noticeably gained prominence and traction, initially with the Guardian newspaper’s campaign in the UK to ‘Free Our Data’ (www.theguardian.com/technology/free-our-data), the Organization for Economic Cooperation and Development (OECD)’s call for member governments to open up their data in 2008, the launch in 2009 by the US government of data.gov, a website designed to provide access to non-sensitive and historical datasets held by US state and federal agencies, and the development of linked data and the promotion of the ‘Semantic Web’ as a standard element of future Internet technologies, in which open and linked data are often discursively conjoined (Berners-Lee 2009). Since 2010 dozens of countries and international organisations (e.g., the European Union [EU] and the United Nations Development Programme [UNDP]) have followed suit, making thousands of previously restricted datasets open in nature for non-commercial and commercial use (see DataRemixed 2013).


pages: 369 words: 80,355

Too Big to Know: Rethinking Knowledge Now That the Facts Aren't the Facts, Experts Are Everywhere, and the Smartest Person in the Room Is the Room by David Weinberger

airport security, Alfred Russel Wallace, Alvin Toffler, Amazon Mechanical Turk, An Inconvenient Truth, Berlin Wall, Black Swan, book scanning, Cass Sunstein, commoditize, Computer Lib, corporate social responsibility, crowdsourcing, Danny Hillis, David Brooks, Debian, double entry bookkeeping, double helix, Dr. Strangelove, en.wikipedia.org, Exxon Valdez, Fall of the Berlin Wall, future of journalism, Future Shock, Galaxy Zoo, Gregor Mendel, Hacker Ethic, Haight Ashbury, Herman Kahn, hive mind, Howard Rheingold, invention of the telegraph, Jeff Hawkins, jimmy wales, Johannes Kepler, John Harrison: Longitude, Kevin Kelly, Large Hadron Collider, linked data, Neil Armstrong, Netflix Prize, New Journalism, Nicholas Carr, Norbert Wiener, off-the-grid, openstreetmap, P = NP, P vs NP, PalmPilot, Pluto: dwarf planet, profit motive, Ralph Waldo Emerson, RAND corporation, Ray Kurzweil, Republic of Letters, RFID, Richard Feynman, Ronald Reagan, scientific management, semantic web, slashdot, social graph, Steven Pinker, Stewart Brand, systems thinking, technological singularity, Ted Nelson, the Cathedral and the Bazaar, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Whole Earth Catalog, X Prize

The rise of Linked Data encapsulates the transformation of knowledge we have explored throughout this book. While the original Semantic Web emphasized building ontologies that are “knowledge representations” of the world, it turns out that if we go straight to unleashing an abundance of linked but imperfect data, making it widely and openly available in standardized form, the Net becomes a dramatically improved infrastructure for knowledge. Linked Data is nevertheless itself only an example of a more expansive practice: Create metadata so your information can be reused. Linked Data is usable because it points beyond itself to information about the information.

For example, when an article in the journal Public Library of Science Medicine 43 examines “the predictors of live birth” in in vitro fertilization by analyzing 144,018 attempts, it links to the UK open government site where the source data—“the world’s oldest and most comprehensive database of fertility treatment in the UK”—is available.44 The new default is: If you’re going to cite the data, you might as well link to it. Networked facts point to where they came from and, sometimes, where they lead to. Indeed, a new standard called Linked Data is making it easier to make the facts presented in one site useful to other sites in unanticipated ways—enabling an ad hoc worldwide data commons. Key to Linked Data is the ability for a computer program not only to get the fact but to ask the resource for a link to more information about the context of the fact.45 Facts have become networked because our new information infrastructure happens also to be a hyperlinked publishing system.

We used to need trust because paper-based publishing breaks knowledge off from its source. Now, however, science—which has always had a network of inter-cited publications—occurs within a network of links. We create these links by hand, computers prowl the Web suggesting new links, and the surge of interest in the Linked Data format is making it easier than ever to create clouds of linked data just waiting for new uses. In this hyperlinked environment, we will continue to tell science’s stories, but those stories will be embedded within a system of connections. We will click to see the data. We will click to have our computers compare disparate datasets, surfacing the anomalies and disagreements that will never be entirely driven out from the data of science or from its stories.


RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

Amazon Web Services, bioinformatics, business intelligence, cloud computing, database schema, fault tolerance, folksonomy, full text search, functional programming, information retrieval, Internet Archive, Internet of things, linked data, machine readable, NP-complete, peer-to-peer, performance metric, power law, random walk, recommendation engine, RFID, semantic web, Silicon Valley, social intelligence, software as a service, SPARQL, sparse data, web application

[Back-of-book index excerpt (K–M). Among its entries: Linked Data Integration Benchmark (LODIB), 78; Linked data movement, 181; LinkedMDB, 168; Linked open data (LOD), 3; plus entries for key-value stores, the Lehigh University benchmark (LUBM), Lucene, MapReduce, and various database systems (MariaDB, MarkLogic, MemSQL, MongoDB, and others).]

The main advantages of JSON are its simplicity, flexibility (it’s schemaless), and native processing support for most Web applications due to a tight integration with the JavaScript programming language. But RDF is not without assets. For example, as a semi-structured data model, RDF data sets can be described with expressive schema languages, such as RDF Schema (RDFS) or Web Ontology Language (OWL), and can be linked to other documents present on the Web, forming the Linked Data movement. With the emergence of Linked Data, a pattern for hyperlinking machine-readable data sets that extensively uses RDF, URIs, and HTTP, we can consider that more and more data will be directly produced in or transformed into RDF. In 2013, the linked open data (LOD), a set of RDF data produced from open data sources, was considered to contain over 50 billion triples on domains as diverse as medicine, culture, and science, just to name a few.
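To make the JSON/RDF bridge concrete, here is a small sketch assuming rdflib with its bundled JSON-LD serializer; the drug example and URIs are illustrative. The same triples that link out to DBpedia can be emitted as JSON-LD, keeping JSON's familiar shape while remaining RDF:

    from rdflib import Graph

    g = Graph()
    g.parse(data="""
        @prefix schema: <http://schema.org/> .
        @prefix ex:     <http://example.org/> .

        ex:aspirin a schema:Drug ;
            schema:name "Aspirin" ;
            schema:sameAs <http://dbpedia.org/resource/Aspirin> .
    """, format="turtle")

    # Emit the graph as JSON-LD, readable by both JSON tooling and RDF stores.
    print(g.serialize(format="json-ld", indent=2))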

The FedBench Benchmark (http://fedbench.fluidops.net/) uses several data sets (around 10, among which there are DBpedia subsets, the New York Times, LinkedMDB, and Drugbank) covering cross-domain and life science data (news, movies, music, drugs, etc.). The major aim of FedBench is to test the efficiency and effectiveness of federated query processing. Other benchmarks, such as the Linked Data Integration Benchmark (LODIB) or JustBench, are designed to evaluate other properties of related systems, such as handling linked data (i.e., with real-world heterogeneities) or the OWL capabilities of reasoners.

3.8 BUILDING SEMANTIC WEB APPLICATIONS

Jena (http://jena.apache.org/) is an open-source Semantic Web framework for Java and is widely used in the Java community.
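Stepping back to the federated query processing that FedBench measures: here is a sketch of such a query, a SPARQL 1.1 SERVICE clause issued through Python's SPARQLWrapper. The endpoint URLs and the owl:sameAs links between LinkedMDB and DBpedia reflect how those datasets have historically been published, but public endpoint availability varies, so treat this as illustrative:

    from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

    endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
    endpoint.setReturnFormat(JSON)
    endpoint.setQuery("""
        PREFIX owl:  <http://www.w3.org/2002/07/owl#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX mdb:  <http://data.linkedmdb.org/resource/movie/>

        SELECT ?film ?label WHERE {
          # The SERVICE clause sends this pattern to a second endpoint.
          SERVICE <http://data.linkedmdb.org/sparql> {
            ?film a mdb:film ;
                  owl:sameAs ?dbpediaFilm .
          }
          ?dbpediaFilm rdfs:label ?label .
          FILTER (lang(?label) = "en")
        }
        LIMIT 5
    """)

    for b in endpoint.query().convert()["results"]["bindings"]:
        print(b["film"]["value"], "->", b["label"]["value"])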


pages: 245 words: 68,420

Content Everywhere: Strategy and Structure for Future-Ready Content by Sara Wachter-Boettcher

business logic, crowdsourcing, John Gruber, Kickstarter, linked data, machine readable, search engine result page, semantic web, Silicon Valley, systems thinking, TechCrunch disrupt

But a more semantic Web seems closer than ever with the recent advent of linked data, which is made possible through structured content and markup. Coined by Tim Berners-Lee—yes, the guy who invented the World Wide Web—in 2006, linked data means exactly what it sounds like: bits of information that are linked to other, equivalent sets of data elsewhere on the Internet (often referred to as “in the cloud”), as illustrated in Figure 6.1. The idea is that, as opposed to HTML links, which link one document (e.g., a page) to another, linked data connects the things those pages are about by connecting the actual data behind those two pages instead.

This gives both databases access to the information in the other, and that information then becomes more useful to both people and machines. FIGURE 6.1 Linked data connects content from different places, like between your website and Wikipedia, based on shared content attributes—and it’s getting more and more useful for connecting content across sources. For example, consider The New York Times. Since the 19th century, it’s been maintaining a tremendous index of people, organizations, places, and descriptors in the news. Starting in 1913, it began publishing that data first in a quarterly index, and later an annual one.1 Now that its collection has been digitized, the Times has opened it up as linked data at http://data.nytimes.com, making this extensive list of topics—well over 10,000 as of this writing, with plans to continually add more—accessible to anyone who wants it.

And all these pages of content are built automatically, using the content’s underlying structure to dictate what’s contextually relevant where. Finally, remember our introduction to linked data in Chapter 6, “Understanding Markup”? Well, the BBC is making use of that, too. Rather than, say, hiring writers to craft overviews of every animal the BBC has video footage about, the organization relies on content from other sources, accessible via linked data. That is, by structuring content along the same lines as sources like Wikipedia, the BBC can automatically pull in the content it doesn’t have—and isn’t invested enough in to create—from an external source.
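A minimal sketch of that "pull in what you don't have" pattern, assuming Python's requests and rdflib and DBpedia's usual content negotiation (the resource and property URIs are illustrative, and the exact data returned changes over time):

    import requests
    from rdflib import Graph, URIRef

    resource = "http://dbpedia.org/resource/Lion"

    # Ask for Turtle rather than HTML; the server redirects to the
    # machine-readable description of the resource.
    resp = requests.get(resource, headers={"Accept": "text/turtle"}, timeout=30)

    g = Graph()
    g.parse(data=resp.text, format="turtle")

    abstract = URIRef("http://dbpedia.org/ontology/abstract")
    for _, _, value in g.triples((URIRef(resource), abstract, None)):
        if getattr(value, "language", None) == "en":
            print(value[:200])  # the English abstract, pulled from an external source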


pages: 223 words: 52,808

Intertwingled: The Work and Influence of Ted Nelson (History of Computing) by Douglas R. Dechow

3D printing, Apple II, Bill Duvall, Brewster Kahle, Buckminster Fuller, Claude Shannon: information theory, cognitive dissonance, computer age, Computer Lib, conceptual framework, Douglas Engelbart, Douglas Engelbart, Dynabook, Edward Snowden, game design, HyperCard, hypertext link, Ian Bogost, information retrieval, Internet Archive, Ivan Sutherland, Jaron Lanier, knowledge worker, linked data, Marc Andreessen, Marshall McLuhan, Menlo Park, Mother of all demos, pre–internet, Project Xanadu, RAND corporation, semantic web, Silicon Valley, software studies, Steve Jobs, Steve Wozniak, Stewart Brand, Ted Nelson, TED Talk, The Home Computer Revolution, the medium is the message, Vannevar Bush, Wall-E, Whole Earth Catalog

So for me this really was a seminal conference with so many truly groundbreaking ideas emerging at the same time, apparently orthogonal to each other but actually all the same thing as time has confirmed, since the Google Knowledge Graph is the Semantic Web or ZigZag by another name. It’s all about linking data. This is a much quieter revolution than that initiated by the document Web but it will be much more far reaching. Linked data will become an integral part of the development of data-driven systems architectures that will revolutionize the way we build and maintain information management systems. Linked data architectures will supersede relational databases, make websites easier to build and unify the worlds of hypertext, document management, and databases to create rich interlinked knowledge-based systems as envisaged by the pioneers such as Ted and Doug over 50 years ago.

But the linked data revolution was very slow to take off—largely because it’s hard to explain the key concepts to people and what the benefits are. In 2004, it seemed to have completely stalled. Analyzing why this was the case is a much longer story than I have time to tell here, but as a by-product of doing this analysis at the time, Tim, Nigel Shadbolt, Danny Weitzner, and I started to look back at the factors that made the web of linked documents take off in order to try and understand why the web of linked data wasn’t. We realized that to understand the ecosystem that is the Web we have to take a socio-technical approach.

Agosti M, Ferro N (2007) A formal model of annotations of digital content. ACM Trans Inf Syst 26(1). doi:10.1145/1292591.1292594
2. Baca M (1998) Introduction to metadata: pathways to digital information. Getty Information Institute, Los Angeles
3. Bechhofer S, Buchan I, De Roure D, Missier P, Ainsworth J, Bhagat J, Goble C et al (2013) Why linked data is not enough for scientists. Futur Gener Comput Syst 29(2), Special section: Recent advances in e-Science: 599–611. doi:10.1016/j.future.2011.08.004
4. Bechhofer S, De Roure D, Gamble M, Goble C, Buchan I (2010) Research objects: towards exchange and reuse of digital knowledge. Nat Proc. doi:10.1038/npre.2010.4626.1
5.


The Art of SEO by Eric Enge, Stephan Spencer, Jessie Stricchiola, Rand Fishkin

AltaVista, barriers to entry, bounce rate, Build a better mousetrap, business intelligence, cloud computing, content marketing, dark matter, en.wikipedia.org, Firefox, folksonomy, Google Chrome, Google Earth, hypertext link, index card, information retrieval, Internet Archive, Larry Ellison, Law of Accelerating Returns, linked data, mass immigration, Metcalfe’s law, Network effects, optical character recognition, PageRank, performance metric, Quicken Loans, risk tolerance, search engine result page, self-driving car, sentiment analysis, social bookmarking, social web, sorting algorithm, speech recognition, Steven Levy, text mining, the long tail, vertical integration, Wayback Machine, web application, wikimedia commons

Figure 10-51 and Figure 10-52 depict some example graphs showing the rate of new external links (and in the last two instances, pages) created over time, with some speculation as to what the trends might indicate.

[Figure 10-51: Interpreting new external link data. Figure 10-52: More link data speculation.]

These assumptions do not necessarily hold true for every site or instance, but the graphs make it easy to see how the engines can use temporal link and content growth information to make guesses about the relevance or worthiness of a particular site. Figure 10-53 shows some guesstimates of a few real sites and how these trends have affected them.

[Figure 10-53: Wikipedia link data guesstimates.]

As you can see in Figure 10-53, Wikipedia has had tremendous growth in both pages and links from 2007 through 2011.

Google and Bing Webmaster Tools As mentioned earlier, other valuable sources of data include Google Webmaster Tools and Bing Webmaster Tools. We cover these extensively in Using Search Engine–Supplied SEO Tools. From a planning perspective, you will want to get these tools in place as soon as possible. Both tools provide valuable insight into how the search engines see your site. This includes things such as external link data, internal link data, crawl errors, high-volume search terms, and much, much more. Note Some companies will not want to set up these tools because they do not want to share their data with the search engines, but this is a nonissue as the tools do not provide the search engines with any more data about your website; rather, they let you see some of the data the search engines already have.

This plug-in provides basic link data on the fly with just a couple of mouse clicks. Figure 10-23 shows the menu you’ll see with regard to backlinks. Notice also in the figure that the SearchStatus plug-in offers an option for highlighting NoFollow links, as well as many other capabilities. It is a great tool that allows you to pull numbers such as these much more quickly than would otherwise be possible.

[Figure 10-23: Firefox SearchStatus plug-in.]

Third-party link-measuring tools

Here is a look at some of the better-known advanced third-party tools for gathering link data. Open Site Explorer was developed based on crawl data obtained by SEOmoz, plus a variety of parties engaged by SEOmoz.


Beautiful Visualization by Julie Steele

barriers to entry, correlation does not imply causation, data acquisition, data science, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta-analysis, natural language processing, Netflix Prize, no-fly zone, pattern recognition, peer-to-peer, performance metric, power law, QR code, recommendation engine, semantic web, social bookmarking, social distancing, social graph, sorting algorithm, Steve Jobs, the long tail, web application, wikimedia commons, Yochai Benkler

However, choosing an effective presentation is challenging, as not all information visualizations are created equally. Not all information visualizations highlight the patterns, gaps, and outliers important to analysts’ tasks, and furthermore, not all information visualizations “force us to notice what we never expected to see” (Tukey 1977). A growing trend in data analysis is to make sense of linked data as networks. Rather than looking solely at attributes of data, network analysts also focus on the connections between data and the resulting structures. My research focuses on understanding these networks because they are topical, emergent, and inherently challenging for analysts. Networks are difficult to visualize and navigate, and, most problematically, it is difficult to find task-relevant patterns.

If we’re starting from a graph representation of the database, as defined in Figure 14-2, this is a simple task. All we need is a nodeset and an edgeset, which can be easily produced from a relational set of tables; it might even come for free if the database is available in the form of an RDF dump (Freebase 2009) or as Linked Data (Bizer, Heath, and Berners-Lee 2009). From there, we can easily produce a node-link diagram using a graph drawing program such as Cytoscape (Shannon et al. 2003)—an open source application that has its roots in the biological networks scientific community. The resulting diagram, shown in Figure 14-3, depicts the given data model in a similar way as a regular Entity-Relationship (E-R) data structure diagram (Chen 1976), enriched with some quantitative information about the actual data.
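As a sketch of that step, assuming the rdflib and networkx Python packages (the toy triples are illustrative): every subject and object becomes a node, and every predicate becomes a labeled edge, which is exactly the nodeset and edgeset a node-link diagram needs.

    import networkx as nx
    from rdflib import Graph

    g = Graph()
    g.parse(data="""
        @prefix ex: <http://example.org/> .
        ex:alice ex:knows ex:bob .
        ex:bob   ex:likes ex:jazz .
    """, format="turtle")

    nxg = nx.DiGraph()
    for s, p, o in g:
        # Subjects and objects are nodes; the predicate labels the edge.
        nxg.add_edge(str(s), str(o), label=str(p))

    print(nxg.number_of_nodes(), "nodes,", nxg.number_of_edges(), "edges")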

[Figure caption: The CENSUS data model as a weighted node-link diagram.]

The heterogeneity of node and link type frequency evidenced in Figure 14-3 is not restricted to our example. It is observable in many datasets, including research databases (Schich and Ebert-Schifferer 2009), large bibliographies (Schich et al. 2009), Freebase, and the Linked Data cloud, regardless of whether the number of types is predefined or expandable by the curators. In all cases that I have seen so far, both the number of nodes per node type and the number of links per link type exhibit right-skewed diminishing distributions, which are widely known as long tails (Anderson 2006, Newman 2005), and lack a shared average as found in a normal Gaussian distribution.


pages: 283 words: 78,705

Principles of Web API Design: Delivering Value with APIs and Microservices by James Higginbotham

Amazon Web Services, anti-pattern, business intelligence, business logic, business process, Clayton Christensen, cognitive dissonance, cognitive load, collaborative editing, continuous integration, create, read, update, delete, database schema, DevOps, fallacies of distributed computing, fault tolerance, index card, Internet of things, inventory management, Kubernetes, linked data, loose coupling, machine readable, Metcalfe’s law, microservices, recommendation engine, semantic web, side project, single page application, Snapchat, software as a service, SQL injection, web application, WebSocket

Semantic Hypermedia Messaging

Semantic hypermedia messaging is the most comprehensive category, as it adds semantic profile and linked data support, making APIs part of the Semantic Web. By applying semantics to resource properties through linked data, more meaning is assigned to each property without requiring an explicit name to be used. Linked data usually relies on a shared vocabulary from Schema.org or other resources. With the growth of data analytics and machine learning, linking data to shared vocabularies enables automated systems to easily derive value from the data provided by APIs. Common formats that support semantic hypermedia messaging include Hydra, UBER, Hyper, JSON-LD, and OData.

    ", "label" : "Book Description", "rel" : ["https://schema.org/description"] },
    { "name" : "authors", "rel" : ["collection","http://example.org/rels/authors"],
      "data" : [
        { "id" : "author-765", "rel" : ["http://schema.org/Person"],
          "url" : "http://example.org/authors/765",
          "data" : [
            { "name" : "authorId", "value" : "765", "label" : "Author ID" },
            { "name" : "fullName", "value" : "Vaughn Vernon", "label" : "Full Name",
              "rel" : "https://schema.org/name" }
          ] } ] }, ] } ] } }

Notice how the size of the representations grows compared to the more compact resource serialization formats. With the increased size comes the addition of linked data and more powerful interactions with API clients. These representation formats offer more insight into how to navigate related resources and tap into new operations, including operations that were not available when the client was built. The goal is to enable generic clients to interact with APIs without the need for custom code or user interfaces.


pages: 100 words: 15,500

Getting Started with D3 by Mike Dewar

data science, Firefox, Google Chrome, linked data

First, we lay out the circles and edges:

    var width = 1500,
        height = 1500;
    var svg = d3.select("body")
        .append("svg")
        .attr("width", width)
        .attr("height", height);
    var node = svg.selectAll("circle.node")
        .data(data.nodes)
        .enter()
        .append("circle")
        .attr("class", "node")
        .attr("r", 12);
    var link = svg.selectAll("line.link")
        .data(data.links)
        .enter().append("line")
        .style("stroke", "black");

This populates the web page with the appropriate elements; we just need to lay them out. The force layout applies a force-directed algorithm to decide the position of each node. Here, each node feels a repulsive force from every other node, but is constrained by the edges that keep nodes connected together.

This can result in an organic layout that looks wonderfully inviting as it unfolds. D3 makes it easy; first we instantiate the algorithm:

    var force = d3.layout.force()
        .charge(-120)
        .linkDistance(30)
        .size([width, height])
        .nodes(data.nodes)
        .links(data.links)
        .start();

These methods are all custom methods for the algorithm that detail the various parameters and references the algorithm needs to compute how the position of the nodes and edges should change. We then use it to modify the appropriate attributes of our lines and circles:

    force.on("tick", function() {
        link.attr("x1", function(d) { return d.source.x; })
            .attr("y1", function(d) { return d.source.y; })
            .attr("x2", function(d) { return d.target.x; })
            .attr("y2", function(d) { return d.target.y; });
        node.attr("cx", function(d) { return d.x; })
            .attr("cy", function(d) { return d.y; });
    });

The layout algorithm generates a tick event, which corresponds to a single step of the layout algorithm.


pages: 713 words: 93,944

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement by Eric Redmond, Jim Wilson, Jim R. Wilson

AGPL, Amazon Web Services, business logic, create, read, update, delete, data is the new oil, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, full text search, general-purpose programming language, Kickstarter, Large Hadron Collider, linked data, MVC pattern, natural language processing, node package manager, random walk, recommendation engine, Ruby on Rails, seminal paper, Skype, social graph, sparse data, web application

For example, if the text of the article on Star Wars contains the string "[[Yoda|jedi master]]", we want to store that relationship twice—once as an outgoing link from Star Wars and once as an incoming link to Yoda. Storing the relationship twice means that it’s fast to look up both a page’s outgoing links and its incoming links. To store this additional link data, we’ll create a new table. Head over to the shell and enter this:

    hbase> create 'links', {
      NAME => 'to', VERSIONS => 1, BLOOMFILTER => 'ROWCOL'
    },{
      NAME => 'from', VERSIONS => 1, BLOOMFILTER => 'ROWCOL'
    }

In principle, we could have chosen to shove the link data into an existing column family or merely added one or more additional column families to the wiki table, rather than create a new one. Creating a separate table has the advantage that the tables have separate regions.

The real strength of graph databases is traversing through the nodes by following relationships. In Chapter 7, ​Neo4J​, we discuss the most popular graph database today, Neo4J. Neo4J One operation where other databases often fall flat is crawling through self-referential or otherwise intricately linked data. This is exactly where Neo4J shines. The benefit of using a graph database is the ability to quickly traverse nodes and relationships to find relevant data. Often found in social networking applications, graph databases are gaining traction for their flexibility, with Neo4j as a pinnacle implementation.

    $ curl -X PUT http://localhost:8091/riak/cages/2 \
      -H "Content-Type: application/json" \
      -H "Link:</riak/animals/ace>;riaktag=\"contains\", </riak/cages/1>;riaktag=\"next_to\"" \
      -d '{"room" : 101}'

What makes Links special in Riak is link walking (and a more powerful variant, linked mapreduce queries, which we investigate tomorrow). Getting the linked data is achieved by appending a link spec to the URL that is structured like this: /_,_,_. The underscores (_) in the URL represent wildcards to each of the link criteria: bucket, tag, keep. We’ll explain those terms shortly. First let’s retrieve all links from cage 1.

    $ curl http://localhost:8091/riak/cages/1/_,_,_
    --4PYi9DW8iJK5aCvQQrrP7mh7jZs
    Content-Type: multipart/mixed; boundary=Av1fawIA4WjypRlz5gHJtrRqklD

    --Av1fawIA4WjypRlz5gHJtrRqklD
    X-Riak-Vclock: a85hYGBgzGDKBVIcypz/fvrde/U5gymRMY+VwZw35gRfFgA=
    Location: /riak/animals/polly
    Content-Type: application/json
    Link: </riak/animals>; rel="up"
    Etag: VD0ZAfOTsIHsgG5PM3YZW
    Last-Modified: Tue, 13 Dec 2011 17:53:59 GMT

    {"nickname" : "Sweet Polly Purebred", "breed" : "Purebred"}
    --Av1fawIA4WjypRlz5gHJtrRqklD--

    --4PYi9DW8iJK5aCvQQrrP7mh7jZs--

It returns a multipart/mixed dump of headers plus bodies of all linked keys/values.


Cataloging the World: Paul Otlet and the Birth of the Information Age by Alex Wright

1960s counterculture, Ada Lovelace, barriers to entry, British Empire, business climate, business intelligence, Cape to Cairo, card file, centralized clearinghouse, Charles Babbage, Computer Lib, corporate governance, crowdsourcing, Danny Hillis, Deng Xiaoping, don't be evil, Douglas Engelbart, Douglas Engelbart, Electric Kool-Aid Acid Test, European colonialism, folksonomy, Frederick Winslow Taylor, Great Leap Forward, hive mind, Howard Rheingold, index card, information retrieval, invention of movable type, invention of the printing press, Jane Jacobs, John Markoff, Kevin Kelly, knowledge worker, Law of Accelerating Returns, Lewis Mumford, linked data, Livingstone, I presume, lone genius, machine readable, Menlo Park, military-industrial complex, Mother of all demos, Norman Mailer, out of africa, packet switching, pneumatic tube, profit motive, RAND corporation, Ray Kurzweil, scientific management, Scramble for Africa, self-driving car, semantic web, Silicon Valley, speech recognition, Steve Jobs, Stewart Brand, systems thinking, Ted Nelson, The Death and Life of Great American Cities, the scientific method, Thomas L Friedman, urban planning, Vannevar Bush, W. E. B. Du Bois, Whole Earth Catalog

One year after writing that essay, he established a company called MetaWeb that created Freebase, which he characterized as an “open, shared database of the world’s knowledge.” In 2010, he sold the company to Google, where its structured snippets now often complement traditional keyword-based search results. In recent years, the Linked Data movement has to some extent subsumed the Semantic Web initiative. Linked Data proposes more of a middle ground, in which ontologies might be derived programmatically from analyzing large data sets, rather than manually created by teams of experts.12 This middle-way approach might incorporate some of Otlet’s ideas: a topical structure further refined by automated discovery, bidirectional linking, and the ability to extract content from static documents, then synthesize and interpolate it in new ways.13

In a widely circulated 2005 essay, “Ontology Is Overrated,” Clay Shirky argues that projects like the Semantic Web were doomed to failure in the Internet age.

[Back-of-book index excerpt (L–M). Among its entries: library catalogs and card systems; Library of Congress; Licklider, J. C. R.; Linked Data movement, 278; Linotype; Literary Machines (Nelson); Lovelace, Ada; Memex; MetaWeb (company), 278; microfilm and microphotography; Mundaneum (creation, design, goals, Google partnership, World War II fate); multimedia; museums and museum exhibits.]

[Back-of-book index excerpt (W–Z). Among its entries: World War I; World War II (Nazi book seizures and burnings; occupation of Belgium; Otlet’s attempt to save the Mundaneum); World Wide Web (flatness of; fundamental disorder of; Linked Data movement, 278; origins of; Otlet’s prophetic vision of; Semantic Web, 273–276, 278–279, 305); World Wide Web Consortium (W3C); Xanadu; Xerox PARC; Zamenhof, Ludwig.]


pages: 430 words: 68,225

Blockchain Basics: A Non-Technical Introduction in 25 Steps by Daniel Drescher

bitcoin, blockchain, business process, central bank independence, collaborative editing, cryptocurrency, disintermediation, disruptive innovation, distributed ledger, Ethereum, ethereum blockchain, fiat currency, job automation, linked data, machine readable, peer-to-peer, place-making, Satoshi Nakamoto, smart contracts, transaction costs

Since broken hash references serve as evidence that data were changed after the reference was created, the whole construct stores data in a change-sensitive manner.

How It Works

There are two classical patterns of using hash references in order to store data in a change-sensitive manner:

• The chain
• The tree

The Chain

A chain of linked data, also called a linked list, is formed when each piece of data also contains a hash reference to another piece of data. Such a structure is useful for storing and linking data together that are not fully available at one given point in time but instead arrive step by step in an ongoing fashion. Figure 11-4 illustrates this idea by using the symbols introduced above. The creation of such a chain starts with the piece of data labeled Data 1 and the creation of the hash reference R1.
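A minimal sketch of such a chain in Python, using the standard hashlib module (the block layout is illustrative, not the book's code): each block stores the hash reference of its predecessor, so altering any earlier block breaks every later reference.

    import hashlib
    import json

    def hash_block(block: dict) -> str:
        # A deterministic hash of the block's contents.
        return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

    chain = []
    prev_hash = None
    for i, payload in enumerate(["Data 1", "Data 2", "Data 3"], start=1):
        block = {"index": i, "data": payload, "prev": prev_hash}
        chain.append(block)
        prev_hash = hash_block(block)

    # Tamper with the first block: the second block's stored reference
    # no longer matches, which is the change-sensitivity described above.
    chain[0]["data"] = "Data 1 (altered)"
    print(chain[1]["prev"] == hash_block(chain[0]))  # False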

[Figure caption: Architecture and its underlying concepts.]

Consensus Logic

Since all the nodes of the distributed system maintain their history of transaction data independently, their content can differ due to delays or other adversities of passing messages through a network. As a result, the data store that was meant to form a straight line of linked data blocks actually forms a tree-shaped data structure, where each branch represents a conflicting version of the transaction history. The consensus logic, as depicted in Figure 21-6, makes all nodes of the system eventually consistent by making them choose the identical version of the transaction history that unites the most collective effort.
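A toy sketch of that selection rule (the structures and the 'work' field are illustrative simplifications, not the book's code): among conflicting branches, a node picks the history embodying the most accumulated effort.

    def select_branch(branches):
        """Pick the branch whose blocks carry the most total work."""
        return max(branches, key=lambda branch: sum(block["work"] for block in branch))

    branch_a = [{"work": 1}, {"work": 1}]
    branch_b = [{"work": 1}, {"work": 1}, {"work": 1}]
    print(select_branch([branch_a, branch_b]) is branch_b)  # True: more collective effort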


pages: 193 words: 19,478

Memory Machines: The Evolution of Hypertext by Belinda Barnet

augmented reality, Benoit Mandelbrot, Bill Duvall, British Empire, Buckminster Fuller, Charles Babbage, Claude Shannon: information theory, collateralized debt obligation, computer age, Computer Lib, conceptual framework, Douglas Engelbart, Douglas Engelbart, game design, hiring and firing, Howard Rheingold, HyperCard, hypertext link, Ian Bogost, information retrieval, Internet Archive, John Markoff, linked data, mandelbrot fractal, Marshall McLuhan, Menlo Park, nonsequential writing, Norbert Wiener, Project Xanadu, publish or perish, Robert Metcalfe, semantic web, seminal paper, Steve Jobs, Stewart Brand, technoutopianism, Ted Nelson, the scientific method, Vannevar Bush, wikimedia commons

They would later have a profound influence over hypertext theory and criticism, and also the Storyspace system. From the outset, the nodes in Storyspace were called ‘writing spaces’, and it worked explicitly with topographic metaphors, incorporating a graphic ‘map view’ of the link data structure from the first version, along with a tree and an outline view (which are also visual representations of the data). ‘The tree’, Bolter tells us in Turing’s Man, ‘is a remarkably useful way of representing logical relations in spatial terms’ (Bolter 1984, 86). Also in line with the topographic metaphor, writing spaces in Storyspace acted (and still act) as containers for other writing spaces; an author literally ‘builds’ the space as she traverses it, zooming in and out to view details of the work, the map making the territory.

‘You’d tab a text and then you’d be able to associate notes with any particular word or phrase in the text […] an automated version of classical texts with notes’ (Bolter 2011). It wasn’t clickable because the IBM PC wasn’t clickable at the time; the user would move the cursor over the word and select it. This link data structure formed the basis for their future experiments ‘only in the sense that it had this quality of one text leading to another’ (Bolter 2011). In his well-researched chapter on afternoon, Matthew Kirschenbaum suggests that Storyspace has ‘significant grounding in a hierarchical data model’ (Kirschenbaum 2008, 173) that has its origins in the tree structures of ‘interactive fictions of the Adventure type’ (Kirschenbaum 2008, 175).

Guard fields are a powerful device, and one that Joyce deploys to full effect in afternoon. According to the Markle Report, Joyce ‘agitated’ for them to be included in the design of Storyspace from the outset, and Bolter quickly obliged in their fledgling program: It was just a matter of putting a field into the link data structure that would contain the guard, and then just checking that field […] against what the user did before they were allowed to follow the link […] It was [that] idea you know and it was Michael’s. (Bolter 2011) Guard fields, along with the topographic ‘spatial’ writing style, have remained integral to the Storyspace program for 30 years hence.
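A hedged sketch of the mechanism in Python, with field names of my own invention rather than Storyspace internals: the link structure carries a guard, which is checked against what the reader has already visited before the link may be followed.

    from dataclasses import dataclass, field

    @dataclass
    class Link:
        target: str
        guard: set = field(default_factory=set)  # nodes the reader must have seen

        def can_follow(self, visited: set) -> bool:
            # The guard is satisfied only if every guarded node was visited.
            return self.guard <= visited

    link = Link(target="section-2", guard={"section-1"})
    print(link.can_follow(set()))           # False: the guard blocks the path
    print(link.can_follow({"section-1"}))   # True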


pages: 201 words: 63,192

Graph Databases by Ian Robinson, Jim Webber, Emil Eifrem

Amazon Web Services, anti-pattern, bioinformatics, business logic, commoditize, corporate governance, create, read, update, delete, data acquisition, en.wikipedia.org, fault tolerance, linked data, loose coupling, Network effects, recommendation engine, semantic web, sentiment analysis, social graph, software as a service, SPARQL, the strength of weak ties, web application

Triple stores typically provide SPARQL capabilities to reason about stored RDF data.11 RDF—the lingua franca of triple stores and the Semantic Web—can be serialized several ways. The following RDF encoding of a simple three-node graph shows the RDF/XML format. Here we see how triples come together to form linked data.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.example.org/ter
  <rdf:Description rdf:about="http://www.example.org/ginger">
    <name>Ginger Rogers</name>
    <occupation>dancer</occupation>
    <partner rdf:resource="http://www.example.org/fred"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.org/fred">
    <name>Fred Astaire</name>
    <occupation>dancer</occupation>
    <likes rdf:resource="http://www.example.org/ice-cream"/>
  </rdf:Description>
</rdf:RDF>

10. http://www.w3.org/standards/semanticweb/
11. See http://www.w3.org/TR/rdf-sparql-query/ and http://www.w3.org/RDF/

W3C support

That they produce logical representations of triples doesn't mean triple stores necessarily have triple-like internal implementations. Most triple stores, however, are unified by their support for Semantic Web technology such as RDF and SPARQL. While there's nothing particularly special about RDF as a means of serializing linked data, it is endorsed by the W3C and therefore benefits from being widely understood and well documented. The query language SPARQL benefits from similar W3C patronage. In the graph database space there is a similar abundance of innovation around graph serialization formats (e.g., GEOFF) and inferencing query languages (e.g., the Cypher query language that we use throughout this book).12 The key difference is that at this point these innovations do not enjoy the patronage of a well-regarded body like the W3C, though they do benefit from strong engagement within their user and vendor communities.
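To see the triples in action, here is a small sketch that loads an equivalent graph with the Python rdflib library and runs a SPARQL query over it. rdflib is our choice for illustration and is not mentioned in the book, and because the RDF/XML above is truncated, the namespace in this Turtle version is an assumption.

from rdflib import Graph

# Same three-node graph as above, written in Turtle rather than RDF/XML.
data = """
@prefix ex: <http://www.example.org/terms#> .
<http://www.example.org/ginger> ex:name "Ginger Rogers" ;
    ex:occupation "dancer" ;
    ex:partner <http://www.example.org/fred> .
<http://www.example.org/fred> ex:name "Fred Astaire" ;
    ex:occupation "dancer" ;
    ex:likes <http://www.example.org/ice-cream> .
"""

g = Graph()
g.parse(data=data, format="turtle")

# SPARQL: who are the dancers, and whom (if anyone) do they partner?
query = """
PREFIX ex: <http://www.example.org/terms#>
SELECT ?name ?partner WHERE {
    ?person ex:occupation "dancer" ; ex:name ?name .
    OPTIONAL { ?person ex:partner ?partner }
}
"""
for row in g.query(query):
    print(row.name, row.partner)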


pages: 458 words: 116,832

The Costs of Connection: How Data Is Colonizing Human Life and Appropriating It for Capitalism by Nick Couldry, Ulises A. Mejias

"World Economic Forum" Davos, 23andMe, Airbnb, Amazon Mechanical Turk, Amazon Web Services, behavioural economics, Big Tech, British Empire, call centre, Cambridge Analytica, Cass Sunstein, choice architecture, cloud computing, colonial rule, computer vision, corporate governance, dark matter, data acquisition, data is the new oil, data science, deep learning, different worldview, digital capitalism, digital divide, discovery of the americas, disinformation, diversification, driverless car, Edward Snowden, emotional labour, en.wikipedia.org, European colonialism, Evgeny Morozov, extractivism, fake news, Gabriella Coleman, gamification, gig economy, global supply chain, Google Chrome, Google Earth, hiring and firing, income inequality, independent contractor, information asymmetry, Infrastructure as a Service, intangible asset, Internet of things, Jaron Lanier, job automation, Kevin Kelly, late capitalism, lifelogging, linked data, machine readable, Marc Andreessen, Mark Zuckerberg, means of production, military-industrial complex, move fast and break things, multi-sided market, Naomi Klein, Network effects, new economy, New Urbanism, PageRank, pattern recognition, payday loans, Philip Mirowski, profit maximization, Ray Kurzweil, RFID, Richard Stallman, Richard Thaler, Salesforce, scientific management, Scientific racism, Second Machine Age, sharing economy, Shoshana Zuboff, side hustle, Sidewalk Labs, Silicon Valley, Slavoj Žižek, smart cities, Snapchat, social graph, social intelligence, software studies, sovereign wealth fund, surveillance capitalism, techlash, The Future of Employment, the scientific method, Thomas Davenport, Tim Cook: Apple, trade liberalization, trade route, undersea cable, urban planning, W. E. B. Du Bois, wages for housework, work culture , workplace surveillance

In chapter 1 we noted the social credit system seen by the Chinese government as its route to “the modernization of social governance.”110 Meanwhile in India, the Aadhaar identity-card system is being made a requirement for access to welfare services, tax dealings, and even the online booking of train tickets.111 Through the operation of social caching, we are increasingly becoming data subjects whose responsiveness to data signals is expected, even taken as virtuous. IoT = LAC? (Operationalizing Life’s Annexation to Capital) The business opportunities from innovative extensions of social caching are multiplying, often in alliance with the state. Consider the cameras with linked data analytics now offered in the United States by Axon AI (formerly Taser) to replace law enforcement officers’ crime-scene reports; as one investor said, “Taser wants to be the Tesla or Apple of law enforcement.”112 Even in formal democracies, resource-strapped states will take advantage of these apparently risk-free methods for delegating their knowledge of hard-to-reach areas of the social world to algorithms.

To the transparent networks that slowly occlude the flow of all those aspects of nature and character that distinguish humans from elevator buttons and doorbells. . . . Haven’t you felt it? The loss of autonomy. The sense of being virtualized. All the coded impulses you depend on to guide you. All the sensors in the room that are watching you, listening to you, tracking your habits, measuring your capabilities. All the linked data designed to incorporate you into the megadata.37 Something, in other words, is going wrong with human autonomy. But, you might ask, isn’t the notion of autonomy (the self’s ability to govern its own life, deriving from the Greek words autos for self and nomos for law or rule) itself problematic?

We argued that underlying these was something even more fundamental: the drive to capitalize human life itself in all its aspects and build through this a new social and economic order that installs capitalist management as the privileged mode for governing every aspect of life. Put another way, and updating Marx for the Big Data age, human life becomes a direct factor in capitalist production. This annexation of human capital is what links data colonialism to the further expansion of capitalism. This is the fundamental cost of connection, and it is a cost being paid all over the world, in societies in which connection is increasingly imposed as the basis for participating in everyday life. The resulting order has important similarities whether we are discussing the United States, China, Europe, or Latin America.


pages: 58 words: 12,386

Big Data Glossary by Pete Warden

business intelligence, business logic, crowdsourcing, fault tolerance, functional programming, information retrieval, linked data, machine readable, natural language processing, recommendation engine, web application

It has been designed to make it easy to correct the most common errors you'll encounter in human-created datasets. For example, it's easy to spot and correct common problems like typos or inconsistencies in text values and to change cells from one format to another. There's also rich support for linking data by calling APIs with the data contained in existing rows to augment the spreadsheet with information from external sources. Refine doesn't let you do anything you can't do with other tools, but its power comes from how well it supports a typical extract and transform workflow. It feels like a good step up in abstraction, packaging processes that would typically take multiple steps in a scripting language or spreadsheet package into single operations with sensible defaults.
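The enrich-by-API workflow the excerpt credits to Refine can be mimicked outside it. The sketch below shows the same pattern in plain Python: call an API with a value from each existing row and attach the response as new columns. The endpoint URL and field names are placeholders invented for the example, not anything Refine itself uses.

import json
import urllib.parse
import urllib.request

rows = [{"city": "Dublin"}, {"city": "Ghent"}]

# Hypothetical lookup service; substitute any API that returns JSON.
ENDPOINT = "https://api.example.org/geocode?q="

for row in rows:
    url = ENDPOINT + urllib.parse.quote(row["city"])
    with urllib.request.urlopen(url) as resp:   # call the API once per row
        payload = json.load(resp)
    row["lat"] = payload.get("lat")             # augment the row in place
    row["lon"] = payload.get("lon")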


Data and the City by Rob Kitchin,Tracey P. Lauriault,Gavin McArdle

A Declaration of the Independence of Cyberspace, algorithmic management, bike sharing, bitcoin, blockchain, Bretton Woods, Chelsea Manning, citizen journalism, Claude Shannon: information theory, clean water, cloud computing, complexity theory, conceptual framework, corporate governance, correlation does not imply causation, create, read, update, delete, crowdsourcing, cryptocurrency, data science, dematerialisation, digital divide, digital map, digital rights, distributed ledger, Evgeny Morozov, fault tolerance, fiat currency, Filter Bubble, floating exchange rates, folksonomy, functional programming, global value chain, Google Earth, Hacker News, hive mind, information security, Internet of things, Kickstarter, knowledge economy, Lewis Mumford, lifelogging, linked data, loose coupling, machine readable, new economy, New Urbanism, Nicholas Carr, nowcasting, open economy, openstreetmap, OSI model, packet switching, pattern recognition, performance metric, place-making, power law, quantum entanglement, RAND corporation, RFID, Richard Florida, ride hailing / ride sharing, semantic web, sentiment analysis, sharing economy, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, smart grid, smart meter, social graph, software studies, statistical model, tacit knowledge, TaskRabbit, technological determinism, technological solutionism, text mining, The Chicago School, The Death and Life of Great American Cities, the long tail, the market place, the medium is the message, the scientific method, Toyota Production System, urban planning, urban sprawl, web application

London: Macmillan and Co. Cosgrove, D. (2001) Apollo’s Eye: A Cartographic Genealogy of the Earth in the Western Imagination. Baltimore, MD: Johns Hopkins University Press. Debruyne, C., Clinton, É., McNerney, L., Lavin, P. and O’Sullivan, D. (2017) ‘On the construction of a linked data platform for Ireland’s authoritative geospatial linked data’, available from: www.osi.ie/wp-content/uploads/2017/01/osi-eswc-2017-preprint.pdf [accessed 10 February 2017]. Dodge, M., Kitchin, R. and Perkins, C. (eds) (2009) Rethinking Maps: New Frontiers in Cartographic Theory. London: Routledge. Foucault, M. (2003) The Essential Foucault: Selections from Essential Works of Foucault, 1954–1984.


pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money by Frank J. Ohlhorst

algorithmic trading, bioinformatics, business intelligence, business logic, business process, call centre, cloud computing, create, read, update, delete, data acquisition, data science, DevOps, extractivism, fault tolerance, information security, Large Hadron Collider, linked data, machine readable, natural language processing, Network effects, pattern recognition, performance metric, personalized medicine, RFID, sentiment analysis, six sigma, smart meter, statistical model, supply-chain management, warehouse automation, Watson beat the top human players on Jeopardy!, web application

In both cases, the overarching principle is real-time data integration: data changes are reflected instantly in a data warehouse—whether they originate from a MapReduce job or from a transactional system—creating downstream analytics that have an accurate, timely view of reality. Others are turning to linked data and semantics, where data sets are created using linking methodologies that focus on the semantics of the data. This fits well into the broader notion of pointing at external sources from within a data set, which has been around for quite a long time. The ability to point to unstructured data (whether it resides in the file system or in some external source) then becomes an extension of existing capabilities: storing and processing XML and XQuery natively within an RDBMS enables combining different degrees of structure while searching and analyzing the underlying data.


Virtual Competition by Ariel Ezrachi, Maurice E. Stucke

"World Economic Forum" Davos, Airbnb, Alan Greenspan, Albert Einstein, algorithmic management, algorithmic trading, Arthur D. Levinson, barriers to entry, behavioural economics, cloud computing, collaborative economy, commoditize, confounding variable, corporate governance, crony capitalism, crowdsourcing, Daniel Kahneman / Amos Tversky, David Graeber, deep learning, demand response, Didi Chuxing, digital capitalism, disintermediation, disruptive innovation, double helix, Downton Abbey, driverless car, electricity market, Erik Brynjolfsson, Evgeny Morozov, experimental economics, Firefox, framing effect, Google Chrome, independent contractor, index arbitrage, information asymmetry, interest rate derivative, Internet of things, invisible hand, Jean Tirole, John Markoff, Joseph Schumpeter, Kenneth Arrow, light touch regulation, linked data, loss aversion, Lyft, Mark Zuckerberg, market clearing, market friction, Milgram experiment, multi-sided market, natural language processing, Network effects, new economy, nowcasting, offshore financial centre, pattern recognition, power law, prediction markets, price discrimination, price elasticity of demand, price stability, profit maximization, profit motive, race to the bottom, rent-seeking, Richard Thaler, ride hailing / ride sharing, road to serfdom, Robert Bork, Ronald Reagan, search costs, self-driving car, sharing economy, Silicon Valley, Skype, smart cities, smart meter, Snapchat, social graph, Steve Jobs, sunk-cost fallacy, supply-chain management, telemarketer, The Chicago School, The Myth of the Rational Market, The Wealth of Nations by Adam Smith, too big to fail, transaction costs, Travis Kalanick, turn-by-turn navigation, two-sided market, Uber and Lyft, Uber for X, uber lyft, vertical integration, Watson beat the top human players on Jeopardy!, women in the workforce, yield management

One possibility may be to focus on commercially sensitive information that, although publicly available, is of little or no value to customers but helps the competitors arrive at a supracompetitive price.37 Here the focus is on “cheap talk,” that is, data exchanges that facilitate conscious parallelism but are of limited use to customers. One problem, however, is in identifying such information. Part of the value of Big Data is data fusion, whereby computers link data sets, from which new insights emerge.38 Moreover, the data for some applications—such as customers sharing their inventory data with suppliers—can promote efficiency even while raising antitrust concerns.39 Even if the customers seek to limit what information can be shared, the algorithms—by analyzing a variety of data—could fill in the gaps.

cote=DSTI/ICCP(2012)9/FINAL&docLanguage=En, observing that “In some cases, big data is defined by the capacity to analyse a variety of mostly unstructured data sets from sources as diverse as web logs, social media, mobile communications, sensors and financial transactions. This requires the capability to link data sets; this can be essential as information is highly context-dependent and may not be of value out of the right context. It also requires the capability to extract information from unstructured data, i.e. data that lack a predefined (explicit or implicit) model.” 39. Stanford Graduate School of Business Staff, “Sharing Information to Boost the Bottom Line,” Insights by Stanford Business (March 1, 1999), http://www.gsb.stanford.edu/insights/sharing-information-boost-bottom-line.


pages: 262 words: 60,248

Python Tricks: The Book by Dan Bader

anti-pattern, business logic, data science, domain-specific language, don't repeat yourself, functional programming, Hacker News, higher-order functions, linked data, off-by-one error, pattern recognition, performance metric

But before we jump in, let’s cover some of the basics first. How do arrays work, and what are they used for? Arrays consist of fixed-size data records that allow each element to be efficiently located based on its index. Because arrays store information in adjoining blocks of memory, they’re considered contiguous data structures (as opposed to linked data structures like linked lists, for example). A real-world analogy for an array data structure is a parking lot: You can look at the parking lot as a whole and treat it as a single object, but inside the lot there are parking spots indexed by a unique number. Parking spots are containers for vehicles—each parking spot can either be empty or have a car, a motorbike, or some other vehicle parked on it.
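A short sketch makes the trade-off concrete. Python's built-in list plays the role of the contiguous array here (constant-time access by index), set against a hand-rolled linked list that must be walked node by node; the ListNode class is invented for illustration.

from dataclasses import dataclass
from typing import Optional

# Contiguous storage: the index maps straight to a memory offset.
parking_lot = ["car", None, "motorbike", "car"]
print(parking_lot[2])           # O(1): jump directly to spot 2

# Linked storage: reaching element i means following i pointers.
@dataclass
class ListNode:
    value: str
    next: Optional["ListNode"] = None

head = ListNode("car", ListNode("motorbike", ListNode("car")))

def nth(node: ListNode, i: int) -> str:
    for _ in range(i):          # O(i): walk the chain one node at a time
        node = node.next
    return node.value

print(nth(head, 2))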


pages: 680 words: 157,865

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design by Diomidis Spinellis, Georgios Gousios

Albert Einstein, barriers to entry, business intelligence, business logic, business process, call centre, continuous integration, corporate governance, database schema, Debian, domain-specific language, don't repeat yourself, Donald Knuth, duck typing, en.wikipedia.org, fail fast, fault tolerance, financial engineering, Firefox, Free Software Foundation, functional programming, general-purpose programming language, higher-order functions, iterative process, linked data, locality of reference, loose coupling, meta-analysis, MVC pattern, Neal Stephenson, no silver bullet, peer-to-peer, premature optimization, recommendation engine, Richard Stallman, Ruby on Rails, semantic web, smart cities, social graph, social web, SPARQL, Steve Jobs, Stewart Brand, Strategic Defense Initiative, systems thinking, the Cathedral and the Bazaar, traveling salesman, Turing complete, type inference, web application, zero-coupon bond

There is no central coordination, and we are free to document our wandering by republishing our stories, thoughts, and journeys as we go. We think of the Web as a series of one-way links between documents (see Figure 5-1).

Figure 5-1. Conventional notion of the Web

Linked documents are only part of the picture, however. The vision for the Web always included the idea of linked data as well. This content can be consumed through a rendered view or directly referenced and manipulated in preferred forms in different contexts. You can imagine a middle-tier layer asking for information as an XML document while the presentation tier prefers a JSON object via an AJAX call. The same name refers to the same data in different forms.
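That "same name, different forms" idea is HTTP content negotiation: one URL, with the client's Accept header choosing the representation. Below is a minimal sketch using Python's standard library; the URL is a placeholder, and a server that honors the negotiation is assumed.

import urllib.request

URL = "https://data.example.org/people/fred"   # one name for the resource

def fetch(accept: str) -> bytes:
    # The Accept header asks the server for a particular representation
    # of the same underlying data.
    req = urllib.request.Request(URL, headers={"Accept": accept})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

xml_form = fetch("application/xml")    # e.g., for a middle-tier consumer
json_form = fetch("application/json")  # e.g., for an AJAX call in the browser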

For the more difficult aspects of establishing the correctness of a design or implementation, the advantage of the functional approach is not so clear. For example, proving that a recursive definition has specific properties and terminates requires the equivalent of a loop invariant and variant. It is also unlikely that efficient functional programs can afford to renounce programmer-visible linked data structures, with all the resulting problems such as aliasing, which are challenging regardless of the underlying programming model. If functional programming fails to bring a significant simplification to the task of establishing correctness, there remains a major practical argument: referential transparency.


The Data Journalism Handbook by Jonathan Gray, Lucy Chambers, Liliana Bounegru

Amazon Web Services, barriers to entry, bioinformatics, business intelligence, carbon footprint, citizen journalism, correlation does not imply causation, crowdsourcing, data science, David Heinemeier Hansson, eurozone crisis, fail fast, Firefox, Florence Nightingale: pie chart, game design, Google Earth, Hans Rosling, high-speed rail, information asymmetry, Internet Archive, John Snow's cholera map, Julian Assange, linked data, machine readable, moral hazard, MVC pattern, New Journalism, openstreetmap, Ronald Reagan, Ruby on Rails, Silicon Valley, social graph, Solyndra, SPARQL, text mining, Wayback Machine, web application, WikiLeaks

While we are all either a journalist, designer, or developer “first,” we continue to work hard to increase our understanding and proficiency in each other’s areas of expertise. The core products for exploring data are Excel, Google Docs, and Fusion Tables. The team has also, but to a lesser extent, used MySQL, Access databases, and Solr to explore larger datasets; and used RDF and SPARQL to begin looking at ways in which we can model events using Linked Data technologies. Developers will also use their programming language of choice, whether that’s ActionScript, Python, or Perl, to match, parse, or generally pick apart a dataset we might be working on. Perl is used for some of the publishing. We use Google, Bing Maps, and Google Earth, along with Esri’s ArcMAP, for exploring and visualizing geographical data.


pages: 224 words: 13,238

Electronic and Algorithmic Trading Technology: The Complete Guide by Kendall Kim

algorithmic trading, automated trading system, backtesting, Bear Stearns, business logic, commoditize, computerized trading, corporate governance, Credit Default Swap, diversification, en.wikipedia.org, family office, financial engineering, financial innovation, fixed income, index arbitrage, index fund, interest rate swap, linked data, market fragmentation, money market fund, natural language processing, proprietary trading, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, short selling, statistical arbitrage, Steven Levy, transaction costs, yield curve

However, most financial services institutions do not have the ability to reach an optimal infrastructure, because resources in most of a brokerage firm’s cost centers have fallen victim to discretionary funds being applied to profit centers such as the trading area of the business. Budgets for data infrastructure have clearly been reduced in recent years, even though the need for better performance and technology has never been greater. Presumably this will change when the link between data and trading profitability becomes more evident.

8.5 Impact on Operations and Technology

Real-time transaction processing and electronic trading can result in a great deal of automation for operations. Real-time transactions move more quickly, tend to be more accurate, have fewer problems, and need less attention than manually engaged transactions.


Algorithms in C++ Part 5: Graph Algorithms by Robert Sedgewick

Erdős number, functional programming, linear programming, linked data, NP-complete, reversible computing, search costs, sorting algorithm, traveling salesman

Indeed, the first algorithms that we considered in detail, the union-find algorithms in Chapter 1, are prime examples of graph algorithms. We also used graphs in Chapter 3 as an illustration of applications of two-dimensional arrays and linked lists, and in Chapter 5 to illustrate the relationship between recursive programs and fundamental data structures. Any linked data structure is a representation of a graph, and some familiar algorithms for processing trees and other linked structures are special cases of graph algorithms. The purpose of this chapter is to provide a context for developing an understanding of graph algorithms ranging from the simple ones in Part 1 to the sophisticated ones in Chapters 18 through 22.

The primary disadvantage is that testing for the existence of specific edges can take time proportional to V, as opposed to constant time in the adjacency matrix. These differences trace, essentially, to the difference between using linked lists and vectors to represent the set of vertices incident on each vertex. Thus, we see again that an understanding of the basic properties of linked data structures and vectors is critical if we are to develop efficient graph ADT implementations. Our interest in these performance differences is that we want to avoid implementations that are inappropriately inefficient under unexpected circumstances when a wide range of operations is to be demanded of the ADT.
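The trade-off reads directly in code. Here is a minimal Python sketch (standing in for the book's C++): with an adjacency matrix, an edge test is one indexed lookup; with adjacency lists, it is a scan of one vertex's list, which in the worst case touches V entries.

V = 4
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Adjacency matrix: O(V^2) space, O(1) edge test.
matrix = [[False] * V for _ in range(V)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = True

def has_edge_matrix(u: int, v: int) -> bool:
    return matrix[u][v]                 # constant time

# Adjacency lists: O(V + E) space, edge test scans one list.
adj = [[] for _ in range(V)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

def has_edge_list(u: int, v: int) -> bool:
    return v in adj[u]                  # up to V - 1 comparisons

assert has_edge_matrix(1, 2) and has_edge_list(1, 2)
assert not has_edge_matrix(0, 3) and not has_edge_list(0, 3)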


pages: 318 words: 73,713

The Shame Machine: Who Profits in the New Age of Humiliation by Cathy O'Neil

2021 United States Capitol attack, Affordable Care Act / Obamacare, basic income, big-box store, Black Lives Matter, British Empire, call centre, cognitive dissonance, colonial rule, coronavirus, COVID-19, crack epidemic, crowdsourcing, data science, delayed gratification, desegregation, don't be evil, Edward Jenner, fake news, George Floyd, Greta Thunberg, Jon Ronson, Kickstarter, linked data, Mahatma Gandhi, mass incarceration, microbiome, microdosing, Nelson Mandela, opioid epidemic / opioid crisis, pre–internet, profit motive, QAnon, Ronald Reagan, selection bias, Silicon Valley, social distancing, Stanford marshmallow experiment, Streisand effect, TikTok, Walter Mischel, War on Poverty, working poor

Or likewise you might get punished for littering in the subway or denigrating the ruling party online. Your various infractions might also be announced, by name, on Weibo or WeChat, internet giants in China. No matter where we live, some of us fare far better than others in our relations with the expanding network linking data to shame and stigma. The easiest people to exploit tend to be the most desperate, the ones who lack the money, the knowledge, or the leisure time to tend to the digital baggage that trails them, or simply those who have traditionally been treated badly. These are folks who are disproportionately poor or otherwise marginalized and have the least control over their identities.


pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić

Albert Einstein, algorithmic bias, backpropagation, bioinformatics, business cycle, business intelligence, business process, butter production in bangladesh, combinatorial explosion, computer vision, conceptual framework, correlation coefficient, correlation does not imply causation, data acquisition, discrete time, El Camino Real, fault tolerance, finite state, Gini coefficient, information retrieval, Internet Archive, inventory management, iterative process, knowledge worker, linked data, loose coupling, Menlo Park, natural language processing, Netflix Prize, NP-complete, PageRank, pattern recognition, peer-to-peer, phenotype, random walk, RFID, semantic web, speech recognition, statistical model, Telecommunications Act of 1996, telemarketer, text mining, traveling salesman, web application

Examples of heterogeneous networks include those in medical domains describing patients, diseases, treatments, and contacts, or in bibliographic domains describing publications, authors, and venues. Graph-mining techniques explicitly consider these links when building predictive or descriptive models of the linked data. The requirement of different applications with graph-based data sets is not very uniform. Thus, graph models and mining algorithms that work well in one domain may not work well in another. For example, chemical data is often represented as graphs in which the nodes correspond to atoms, and the links correspond to bonds between the atoms.

Therefore, a labeled graph G consists of three sets of information: G(N,L,V), where the new component V = {v1, v2, … , vt} is a set of values attached to links. An example of a directed graph is given in Figure 12.2b, while the graph in Figure 12.2c is a labeled graph. Different applications use different types of graphs in modeling linked data. In this chapter the primary focus is on undirected and unlabeled graphs although the reader still has to be aware that there are numerous graph-mining algorithms for directed and/or labeled graphs. Besides a graphical representation, each graph may be presented in the form of the incidence matrix I(G) where nodes are indexing rows and links are indexing columns.
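A quick sketch of both representations side by side may help; it is illustrative Python, not the book's code. The dict maps each link to its attached value (the labeled-graph form G(N, L, V)), and the incidence matrix I(G) has one row per node and one column per link.

nodes = ["n1", "n2", "n3"]
# Labeled graph G(N, L, V): each link carries an attached value.
links = {("n1", "n2"): 0.7, ("n2", "n3"): 0.2, ("n1", "n3"): 0.5}

# Incidence matrix I(G): nodes index the rows, links index the columns;
# an entry is 1 when the node is an endpoint of the link (undirected case).
link_list = list(links)
incidence = [[1 if n in link else 0 for link in link_list] for n in nodes]

for n, row in zip(nodes, incidence):
    print(n, row)
# n1 [1, 0, 1]
# n2 [1, 1, 0]
# n3 [0, 1, 1]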


pages: 288 words: 85,073

Factfulness: Ten Reasons We're Wrong About the World – and Why Things Are Better Than You Think by Hans Rosling, Ola Rosling, Anna Rosling Rönnlund

"World Economic Forum" Davos, animal electricity, clean water, colonial rule, en.wikipedia.org, energy transition, fake news, first square of the chessboard, first square of the chessboard / second half of the chessboard, global pandemic, Hans Rosling, illegal immigration, income inequality, income per capita, Intergovernmental Panel on Climate Change (IPCC), jimmy wales, linked data, lone genius, microcredit, purchasing power parity, revenue passenger mile, Stanford marshmallow experiment, Steven Pinker, systems thinking, TED Talk, Thomas L Friedman, Walter Mischel

We presented at the ceremony for their new Open Data platform in May 2010, and since then the World Bank has become the main access point for reliable global statistics; see gapm.io/x6. This was all possible thanks to Tim Berners-Lee and other early visionaries of the free internet. Sometime after he had invented the World Wide Web, Tim Berners-Lee contacted us, asking to borrow a slide show that showed how a web of linked data sources could flourish (using an image of pretty flowers). We share all of our content for free, so of course we said yes. Tim used this “flower-powerpoint” in his 2009 TED talk—see gapm.io/x6—to help people see the beauty of “The Next Web,” and he uses Gapminder as an example of what happens when data from multiple sources come together; see Berners-Lee (2009).


pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists by Gary Marcus, Jeremy Freeman

23andMe, Albert Einstein, backpropagation, bioinformatics, bitcoin, brain emulation, cloud computing, complexity theory, computer age, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data acquisition, data science, deep learning, Drosophila, epigenetics, Geoffrey Hinton, global pandemic, Google Glasses, ITER tokamak, iterative process, language acquisition, linked data, mouse model, optical character recognition, pattern recognition, personalized medicine, phenotype, race to the bottom, Richard Feynman, Ronald Reagan, semantic web, speech recognition, stem cell, Steven Pinker, supply-chain management, synthetic biology, tacit knowledge, traumatic brain injury, Turing machine, twin studies, web application

While efforts to map the brain have begun as public, government-funded projects, this does not mean that private entities will not enter the arena and seek to compete with those projects. Although initial efforts to map the brain may be fueled by public funds, the issue of how to handle “fine-tuned” information that can be used to determine risk factors or emerging disease states in individuals’ brains, which will require linking data to genetic databases, health records, and health databases, merits discussion now. What rules will govern the sharing of detailed scans or maps about each individual’s brain? Can data be linked from a brain scan to a genome to a database without an individual’s express consent if that person’s identity is not 100 percent secure?


pages: 374 words: 94,508

Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage by Douglas B. Laney

3D printing, Affordable Care Act / Obamacare, banking crisis, behavioural economics, blockchain, book value, business climate, business intelligence, business logic, business process, call centre, carbon credits, chief data officer, Claude Shannon: information theory, commoditize, conceptual framework, crowdsourcing, dark matter, data acquisition, data science, deep learning, digital rights, digital twin, discounted cash flows, disintermediation, diversification, en.wikipedia.org, endowment effect, Erik Brynjolfsson, full employment, hype cycle, informal economy, information security, intangible asset, Internet of things, it's over 9,000, linked data, Lyft, Nash equilibrium, Neil Armstrong, Network effects, new economy, obamacare, performance metric, profit motive, recommendation engine, RFID, Salesforce, semantic web, single source of truth, smart meter, Snapchat, software as a service, source of truth, supply-chain management, tacit knowledge, technological determinism, text mining, uber lyft, Y2K, yield curve

• Information accessibility
• User request turnaround time
• User satisfaction survey

Agility: The ability to respond to external influences, and the ability to respond to marketplace changes to gain or maintain competitive advantage. SCOR agility metrics include flexibility and adaptability.
• Utility of information for a range of purposes
• Linked data, metadata, and master data measures
• Ease of integrating new types of data or changing dimensions

Costs: The cost of operating the supply chain processes. This includes labor costs, material costs, management, and transportation costs. A typical cost metric is cost of goods sold.
• Data acquisition cost
• Data management costs
• Data delivery costs
(Each includes labor- and technology-related costs)

Asset Management Efficiency (Assets): The ability to efficiently utilize assets.


Future Files: A Brief History of the Next 50 Years by Richard Watson

Abraham Maslow, Albert Einstein, bank run, banking crisis, battle of ideas, Black Swan, call centre, carbon credits, carbon footprint, carbon tax, cashless society, citizen journalism, commoditize, computer age, computer vision, congestion charging, corporate governance, corporate social responsibility, deglobalization, digital Maoism, digital nomad, disintermediation, driverless car, epigenetics, failed state, financial innovation, Firefox, food miles, Ford Model T, future of work, Future Shock, global pandemic, global supply chain, global village, hive mind, hobby farmer, industrial robot, invention of the telegraph, Jaron Lanier, Jeff Bezos, knowledge economy, lateral thinking, linked data, low cost airline, low skilled workers, M-Pesa, mass immigration, Northern Rock, Paradox of Choice, peak oil, pensions crisis, precautionary principle, precision agriculture, prediction markets, Ralph Nader, Ray Kurzweil, rent control, RFID, Richard Florida, self-driving car, speech recognition, synthetic biology, telepresence, the scientific method, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Turing test, Victor Gruen, Virgin Galactic, white flight, women in the workforce, work culture , Zipcar

5 Embedded intelligence

Cars can already be opened or started using fingerprint and iris recognition, so we’ll see more technologies linking vehicle security to user identification. We will also see mood-sensitive vehicles that adjust their behavior according to the mood of the driver or occupants. Cars will also become mobile technology platforms linking data to other services such as healthcare. For example, if your car regularly detects an abnormal heartbeat or high levels of stress, this information could be sent wirelessly to your doctor. Obviously privacy issues abound, but cars could become useful data-collection and delivery points.

Remote monitoring

Electronic data recorders are little black boxes that already sit covertly inside some cars and monitor your speed, acceleration and braking.


pages: 356 words: 102,224

Pale Blue Dot: A Vision of the Human Future in Space by Carl Sagan

Albert Einstein, anthropic principle, Apollo 11, Apollo 13, cosmological principle, dark matter, Dava Sobel, Francis Fukuyama: the end of history, germ theory of disease, invention of the telescope, Isaac Newton, Johannes Kepler, Kuiper Belt, linked data, low earth orbit, military-industrial complex, Neil Armstrong, nuclear winter, planetary scale, power law, profit motive, remunicipalization, scientific worldview, Search for Extraterrestrial Intelligence, sparse data, Stephen Hawking, telepresence, time dilation

You reach out your arm to pick up something shiny in the soil, and the robot arm does likewise. The sands of Mars trickle through your fingers. The only difficulty with this remote reality technology is that all this must occur in tedious slow motion: The round-trip travel time of the up-link commands from Earth to Mars and the down-link data returned from Mars to Earth might take half an hour or more. But this is something we can learn to do. We can learn to contain our exploratory impatience if that's the price of exploring Mars. The rover can be made smart enough to deal with routine contingencies. Anything more challenging, and it makes a dead stop, puts itself into a safeguard mode, and radios for a very patient human controller to take over.


pages: 313 words: 101,403

My Life as a Quant: Reflections on Physics and Finance by Emanuel Derman

Bear Stearns, Berlin Wall, bioinformatics, Black-Scholes formula, book value, Brownian motion, buy and hold, capital asset pricing model, Claude Shannon: information theory, Dennis Ritchie, Donald Knuth, Emanuel Derman, financial engineering, fixed income, Gödel, Escher, Bach, haute couture, hiring and firing, implied volatility, interest rate derivative, Jeff Bezos, John Meriwether, John von Neumann, Ken Thompson, law of one price, linked data, Long Term Capital Management, moral hazard, Murray Gell-Mann, Myron Scholes, PalmPilot, Paul Samuelson, pre–internet, proprietary trading, publish or perish, quantitative trading / quantitative finance, Sharpe ratio, statistical arbitrage, statistical model, Stephen Hawking, Steve Jobs, stochastic volatility, technology bubble, the new new thing, transaction costs, volatility smile, Y2K, yield curve, zero-coupon bond, zero-sum game

While I was away on a two-week beach vacation at Fire Island with my family, Ed suddenly threw himself into redesigning and then rewriting the entire system, without giving me advance notice. I returned to a fait accompli, a completely new, enhanced, and almost unrecognizable APL-flavored version of the language. Ed's version now incorporated vastly complex dynamically linked data structures, whose details I knew I would not live long enough to master. Ed had also cleverly modified HEQS so that, once you had used it interactively to develop and solve a financial model, you could then use it to generate a C program that would solve your equations many times faster. Programming came naturally to Ed in a way it never would to me, and his proficiency daunted me.


pages: 348 words: 97,277

The Truth Machine: The Blockchain and the Future of Everything by Paul Vigna, Michael J. Casey

3D printing, additive manufacturing, Airbnb, altcoin, Amazon Web Services, barriers to entry, basic income, Berlin Wall, Bernie Madoff, Big Tech, bitcoin, blockchain, blood diamond, Blythe Masters, business process, buy and hold, carbon credits, carbon footprint, cashless society, circular economy, cloud computing, computer age, computerized trading, conceptual framework, content marketing, Credit Default Swap, cross-border payments, crowdsourcing, cryptocurrency, cyber-physical system, decentralized internet, dematerialisation, disinformation, disintermediation, distributed ledger, Donald Trump, double entry bookkeeping, Dunbar number, Edward Snowden, Elon Musk, Ethereum, ethereum blockchain, failed state, fake news, fault tolerance, fiat currency, financial engineering, financial innovation, financial intermediation, Garrett Hardin, global supply chain, Hernando de Soto, hive mind, informal economy, information security, initial coin offering, intangible asset, Internet of things, Joi Ito, Kickstarter, linked data, litecoin, longitudinal study, Lyft, M-Pesa, Marc Andreessen, market clearing, mobile money, money: store of value / unit of account / medium of exchange, Network effects, off grid, pets.com, post-truth, prediction markets, pre–internet, price mechanism, profit maximization, profit motive, Project Xanadu, ransomware, rent-seeking, RFID, ride hailing / ride sharing, Ross Ulbricht, Satoshi Nakamoto, self-driving car, sharing economy, Silicon Valley, smart contracts, smart meter, Snapchat, social web, software is eating the world, supply-chain management, Ted Nelson, the market place, too big to fail, trade route, Tragedy of the Commons, transaction costs, Travis Kalanick, Turing complete, Uber and Lyft, uber lyft, unbanked and underbanked, underbanked, universal basic income, Vitalik Buterin, web of trust, work culture , zero-sum game

You could say these “cloud” services are much truer to that name than those of Amazon Web Services, Google, Dropbox, IBM, Oracle, Microsoft, and Apple, the providers with which most people associate that word. But even bigger changes are being considered, including projects to entirely re-architect the Web itself. There’s Solid, which stands for Social Linked Data, a new protocol for data storage that puts data back in the hands of the people to whom it belongs. The core idea is that we will store our data in Pods (Personalized Online Data Stores) and distribute it to applications via permissions we control. Solid is the brainchild of none other than Tim Berners-Lee, the computer scientist who perfected HTTP and gave us the World Wide Web.


pages: 352 words: 98,561

The City by Tony Norfield

accounting loophole / creative accounting, air traffic controllers' union, anti-communist, Asian financial crisis, asset-backed security, bank run, banks create money, Basel III, Berlin Wall, Big bang: deregulation of the City of London, Bretton Woods, BRICs, British Empire, capital controls, central bank independence, colonial exploitation, colonial rule, continuation of politics by other means, currency risk, dark matter, Edward Snowden, Fall of the Berlin Wall, financial innovation, financial intermediation, foreign exchange controls, Francis Fukuyama: the end of history, G4S, global value chain, Goldman Sachs: Vampire Squid, interest rate derivative, interest rate swap, Irish property bubble, Leo Hollis, linked data, London Interbank Offered Rate, London Whale, Londongrad, low interest rates, Mark Zuckerberg, Martin Wolf, means of production, Money creation, money market fund, mortgage debt, North Sea oil, Northern Rock, Occupy movement, offshore financial centre, plutocrats, purchasing power parity, quantitative easing, Real Time Gross Settlement, regulatory arbitrage, reserve currency, Ronald Reagan, seigniorage, Sharpe ratio, sovereign wealth fund, Suez crisis 1956, The Great Moderation, transaction costs, transfer pricing, zero-sum game

The City’s status as a major dealing centre is solidly based on its connections with the rest of the world and its ability to act as an intermediary for global flows of money-capital and credit. Major flows of finance in the form of deposits, loans, and the purchase and sale of securities between UK-based banks and the rest of the world are intermediated by banks outside the UK, but many of these are UK-linked. Data from the Bank of England enable these links to be examined in some detail, and they highlight a key role of the UK banking system, one that has not been analysed before. These data are shown in Table 8.6.22 The figures are in US dollars, since this is the main currency used in the transactions, and they measure the outstanding valuations of bank assets and liabilities.


pages: 350 words: 109,521

Our 50-State Border Crisis: How the Mexican Border Fuels the Drug Epidemic Across America by Howard G. Buffett

airport security, clean water, collective bargaining, defense in depth, Donald Trump, illegal immigration, immigration reform, linked data, low skilled workers, moral panic, opioid epidemic / opioid crisis, pill mill

Anderson’s work directly, but now we support it through a nonprofit called the Colibri Center for Human Rights that works with the medical examiner’s office to identify these remains and provide closure for families regardless of the origins of the deceased. For example, we funded an international geographic information system (GIS) initiative in Pima County to link data from missing person reports to postmortem reports. We agree with Anderson and Colibri that respect for the dead is one measure of a civilized society. Is it civilized to view the “mortal danger” of the desert as a deterrent? Should it give us pause that before Operation Gatekeeper funneled immigrants to the desert, there were only about twelve bodies per year recovered along the border?


pages: 371 words: 108,317

The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future by Kevin Kelly

A Declaration of the Independence of Cyberspace, Aaron Swartz, AI winter, Airbnb, Albert Einstein, Alvin Toffler, Amazon Web Services, augmented reality, bank run, barriers to entry, Baxter: Rethink Robotics, bitcoin, blockchain, book scanning, Brewster Kahle, Burning Man, cloud computing, commoditize, computer age, Computer Lib, connected car, crowdsourcing, dark matter, data science, deep learning, DeepMind, dematerialisation, Downton Abbey, driverless car, Edward Snowden, Elon Musk, Filter Bubble, Freestyle chess, Gabriella Coleman, game design, Geoffrey Hinton, Google Glasses, hive mind, Howard Rheingold, index card, indoor plumbing, industrial robot, Internet Archive, Internet of things, invention of movable type, invisible hand, Jaron Lanier, Jeff Bezos, job automation, John Markoff, John Perry Barlow, Kevin Kelly, Kickstarter, lifelogging, linked data, Lyft, M-Pesa, machine readable, machine translation, Marc Andreessen, Marshall McLuhan, Mary Meeker, means of production, megacity, Minecraft, Mitch Kapor, multi-sided market, natural language processing, Netflix Prize, Network effects, new economy, Nicholas Carr, off-the-grid, old-boy network, peer-to-peer, peer-to-peer lending, personalized medicine, placebo effect, planetary scale, postindustrial economy, Project Xanadu, recommendation engine, RFID, ride hailing / ride sharing, robo advisor, Rodney Brooks, self-driving car, sharing economy, Silicon Valley, slashdot, Snapchat, social graph, social web, software is eating the world, speech recognition, Stephen Hawking, Steven Levy, Ted Nelson, TED Talk, The future is already here, the long tail, the scientific method, transport as a service, two-sided market, Uber for X, uber lyft, value engineering, Watson beat the top human players on Jeopardy!, WeWork, Whole Earth Review, Yochai Benkler, yottabyte, zero-sum game

Slowly but surely Amazon’s cloud and Google’s cloud and Facebook’s cloud and all the other enterprise clouds are intertwining into one massive cloud that acts as a single cloud—The Cloud—to the average user or company. A counterforce resisting this merger is that an intercloud requires commercial clouds to share their data (a cloud is a network of linked data), and right now data tends to be hoarded like gold. Data hoards are seen as a competitive advantage, and sharing data freely is hampered by laws, so it will be many years (decades?) before companies learn how to share their data creatively, productively, and responsibly. There is one final step in the inexorable march toward decentralized access.


pages: 371 words: 107,141

You've Been Played: How Corporations, Governments, and Schools Use Games to Control Us All by Adrian Hon

"hyperreality Baudrillard"~20 OR "Baudrillard hyperreality", 4chan, Adam Curtis, Adrian Hon, Airbnb, Amazon Mechanical Turk, Amazon Web Services, Astronomia nova, augmented reality, barriers to entry, Bellingcat, Big Tech, bitcoin, bread and circuses, British Empire, buy and hold, call centre, computer vision, conceptual framework, contact tracing, coronavirus, corporate governance, COVID-19, crowdsourcing, cryptocurrency, David Graeber, David Sedaris, deep learning, delayed gratification, democratizing finance, deplatforming, disinformation, disintermediation, Dogecoin, electronic logging device, Elon Musk, en.wikipedia.org, Ethereum, fake news, fiat currency, Filter Bubble, Frederick Winslow Taylor, fulfillment center, Galaxy Zoo, game design, gamification, George Floyd, gig economy, GitHub removed activity streaks, Google Glasses, Hacker News, Hans Moravec, Ian Bogost, independent contractor, index fund, informal economy, Jeff Bezos, job automation, jobs below the API, Johannes Kepler, Kevin Kelly, Kevin Roose, Kickstarter, Kiva Systems, knowledge worker, Lewis Mumford, lifelogging, linked data, lockdown, longitudinal study, loss aversion, LuLaRoe, Lyft, Marshall McLuhan, megaproject, meme stock, meta-analysis, Minecraft, moral panic, multilevel marketing, non-fungible token, Ocado, Oculus Rift, One Laptop per Child (OLPC), orbital mechanics / astrodynamics, Parler "social media", passive income, payment for order flow, prisoner's dilemma, QAnon, QR code, quantitative trading / quantitative finance, r/findbostonbombers, replication crisis, ride hailing / ride sharing, Robinhood: mobile stock trading app, Ronald Coase, Rubik’s Cube, Salesforce, Satoshi Nakamoto, scientific management, shareholder value, sharing economy, short selling, short squeeze, Silicon Valley, SimCity, Skinner box, spinning jenny, Stanford marshmallow experiment, Steve Jobs, Stewart Brand, TED Talk, The Nature of the Firm, the scientific method, TikTok, Tragedy of the Commons, transaction costs, Twitter Arab Spring, Tyler Cowen, Uber and Lyft, uber lyft, urban planning, warehouse robotics, Whole Earth Catalog, why are manhole covers round?, workplace surveillance

No one would mistake the clean lines of my flowcharts for the snarl of links that makes up the Q-Web, a notorious QAnon chart crammed with hundreds of supposedly connected things like #MeToo, Monsanto, and J. Edgar Hoover, but the principles are similar: one discovery leads to the next.10 Of course, these two flowcharts are very different beasts. The Q-Web is an imaginary, retrospective description of spuriously linked data, while my flowcharts were a prescriptive network of events completely orchestrated by my team. Except that’s not quite true. In reality, Perplex City players didn’t always solve our puzzles as quickly as we intended them to, or they became convinced their incorrect solution was correct, or, embarrassingly, our puzzles were broken and had no solution at all.


Remix by John Courtenay Grimwood

clean water, delayed gratification, double helix, fear of failure, haute couture, Herbert Marcuse, Kickstarter, linked data, space junk

But Lady Clare had insisted, reeling off a list that began with the Antiguan Absolutists and ended with Zebediah Nouveau. Mind you, he didn’t hate standing inside that circle as much as he hated being there at all. But Lady Clare had insisted on that as well. Keeping her good side to the main CySat camera, Lady Clare smiled. It was amazing how much clout you carried when you’d linked data credits to gold reserves to keep the senior officers loyal, welcomed the UN Pax Force with open arms, arranged for Paris to be the first European city overflown with the new ‘dote and put some backbone into the Prince Imperial. This was the General’s payback, and as far as Lady Clare was concerned it was a small price.


pages: 404 words: 43,442

The Art of R Programming by Norman Matloff

data science, Debian, discrete time, Donald Knuth, functional programming, general-purpose programming language, linked data, sorting algorithm, statistical model

If implemented in C, a tree node would be represented by a C struct, similar to an R list, whose contents are the stored value, a pointer to the left child, and a pointer to the right child. But since R lacks pointer variables, what can we do? Our solution is to go back to the basics. In the old prepointer days in FORTRAN, linked data structures were implemented in long arrays. A pointer, which in C is a memory address, was an array index instead. Specifically, we’ll represent each node by a row in a three-column matrix. The node’s stored value will be in the third element of that row, while the first and second elements will be the left and right links.
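The same index-as-pointer trick is easy to sketch outside R. Below is an illustrative Python version (the book's code is R; this is a translation of the idea, not its code): each node is a row of a three-column table holding the left link, the right link, and the stored value, with 0 standing in for a null link.

# Binary search tree stored in a three-column table: [left, right, value].
# Row indices play the role of pointers; 0 means "no child".
tree = [[0, 0, 8]]          # row 1 (index 0) is the root, holding 8

def insert(value: int) -> None:
    i = 0
    while True:
        col = 0 if value < tree[i][2] else 1   # go left or right
        if tree[i][col] == 0:                  # empty link: attach a new row
            tree.append([0, 0, value])
            tree[i][col] = len(tree)           # store a 1-based "pointer"
            return
        i = tree[i][col] - 1                   # follow the link

for v in [5, 12, 9]:
    insert(v)
print(tree)   # [[2, 3, 8], [0, 0, 5], [4, 0, 12], [0, 0, 9]]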


pages: 400 words: 121,988

Trading at the Speed of Light: How Ultrafast Algorithms Are Transforming Financial Markets by Donald MacKenzie

algorithmic trading, automated trading system, banking crisis, barriers to entry, bitcoin, blockchain, Bonfire of the Vanities, Bretton Woods, Cambridge Analytica, centralized clearinghouse, Claude Shannon: information theory, coronavirus, COVID-19, cryptocurrency, disintermediation, diversification, en.wikipedia.org, Ethereum, ethereum blockchain, family office, financial intermediation, fixed income, Flash crash, Google Earth, Hacker Ethic, Hibernia Atlantic: Project Express, interest rate derivative, interest rate swap, inventory management, Jim Simons, level 1 cache, light touch regulation, linked data, lockdown, low earth orbit, machine readable, market design, market microstructure, Martin Wolf, proprietary trading, Renaissance Technologies, Satoshi Nakamoto, Small Order Execution System, Spread Networks laid a new fibre optics cable between New York and Chicago, statistical arbitrage, statistical model, Steven Levy, The Great Moderation, transaction costs, UUNET, zero-sum game

Interviewee CV gave, as an example of this highly demanding form of trading—“taking in the world’s information and being able to translate that to predict the next tick [price movement]”—an algorithm trading 10-year US Treasury futures in the Chicago Mercantile Exchange’s datacenter. The algorithm will take into account the pattern of bids, offers, and trades in those futures, as well as patterns in the trading of the other Treasury and interest-rate futures also traded in that datacenter. The algorithm will receive, via microwave links, data on the buying and selling of the underlying Treasurys, which are traded in the two datacenters in New Jersey shown in the map in figure 4.1. Via Hibernia Atlantic’s ultrafast transatlantic cable, it will receive data on the trading of futures on UK sovereign bonds (these futures are traded in a datacenter just outside of London) and the equivalent German futures, traded in a datacenter in Frankfurt called FR2.


pages: 424 words: 123,180

Democracy's Data: The Hidden Stories in the U.S. Census and How to Read Them by Dan Bouk

Black Lives Matter, card file, COVID-19, dark matter, data science, desegregation, digital map, Donald Trump, George Floyd, germ theory of disease, government statistician, hiring and firing, illegal immigration, index card, invisible hand, Jeff Bezos, linked data, Mahatma Gandhi, mass incarceration, public intellectual, pull request, Ralph Waldo Emerson, Scientific racism, Shoshana Zuboff, Silicon Valley, social distancing, surveillance capitalism, transcontinental railway, union organizing, W. E. B. Du Bois, Works Progress Administration, zero-sum game

Even more remarkable is the thought that it will likely continue to exist as long as there is a United States of America, maybe even longer. The census began as a relatively simple tool to tether political clout to each state’s head count. The framers of the Constitution and their Enlightenment-era values linked data to democracy and democracy to data. Each state’s say in governing the country would henceforth be proportional to its official population. By 1940, the census had developed into an extensive stocktaking of the American people, a picture of who they were, where they came from, what they did, and how they lived.


The Art of Computer Programming: Fundamental Algorithms by Donald E. Knuth

Charles Babbage, discrete time, distributed generation, Donald Knuth, fear of failure, Fermat's Last Theorem, G4S, Gerard Salton, Isaac Newton, Ivan Sutherland, Jacquard loom, Johannes Kepler, John von Neumann, linear programming, linked data, Menlo Park, probability theory / Blaise Pascal / Pierre de Fermat, sorting algorithm, stochastic process, Turing machine

The proper way to design a library is heavily dependent upon the computer used and the applications to be handled. Large modern computers require an entirely different approach to subroutine libraries. But this is a nice exercise anyway, because it involves interesting manipulations on both sequential and linked data.) The problem in this exercise is to design an algorithm for the stated task. Your allocator may transform the tape directory in any way as it prepares its answer, since the tape directory can be read in anew by the subroutine allocator on its next assignment, and the tape directory is not needed by other parts of the loading routine. 27. [25] Write a MIX program for the subroutine allocation algorithm of exercise 26. 28. [40] The following construction shows how to "solve" a fairly general type of two- person game, including chess, nim, and many simpler games: Consider a finite set of nodes, each of which represents a possible position in the game.

The first algorithm we require is one that builds the Data Table in such a form. Note the flexibility in choice of level numbers that is allowed by the COBOL rules; the left structure shown earlier is completely equivalent to

1 A
   2 B
      3 C
      3 D
   2 E
   2 F
      3 G

because level numbers do not have to be sequential.

[Table: the Symbol Table (with LINK fields) and the Data Table (with PREV, PARENT, NAME, CHILD, and SIB fields) for the entries A through H; empty boxes indicate additional information not relevant here. The table's contents are garbled in this excerpt beyond recovery.]


pages: 505 words: 133,661

Who Owns England?: How We Lost Our Green and Pleasant Land, and How to Take It Back by Guy Shrubsole

Adam Curtis, Anthropocene, back-to-the-land, Beeching cuts, Boris Johnson, Capital in the Twenty-First Century by Thomas Piketty, centre right, congestion charging, Crossrail, deindustrialization, digital map, do-ocracy, Downton Abbey, false flag, financial deregulation, fixed income, fulfillment center, Garrett Hardin, gentrification, Global Witness, Goldman Sachs: Vampire Squid, Google Earth, housing crisis, housing justice, James Dyson, Jeremy Corbyn, Kickstarter, land bank, land reform, land tenure, land value tax, linked data, loadsamoney, Londongrad, machine readable, mega-rich, mutually assured destruction, new economy, Occupy movement, offshore financial centre, oil shale / tar sands, openstreetmap, place-making, plutocrats, profit motive, rent-seeking, rewilding, Right to Buy, Ronald Reagan, Russell Brand, sceptred isle, Stewart Brand, the built environment, the map is not the territory, The Wealth of Nations by Adam Smith, Tragedy of the Commons, trickle-down economics, urban sprawl, web of trust, Yom Kippur War, zero-sum game

Part of the problem is that the data on what companies own still isn’t good enough to prove whether or not land banking is occurring. Anna has tried to map the land owned by housing developers, but has been thwarted by the lack in the Land Registry’s corporate dataset of the necessary information to link data on who owns a site with digital maps of that area. That makes it very hard to assess, for example, whether a piece of land owned by a housebuilder for decades is a prime site accruing in value or a leftover fragment of ground from a past development. Second, the scope of Letwin’s review was drawn too narrowly to examine the wider problem of land banking by landowners beyond the major housebuilders.


pages: 494 words: 142,285

The Future of Ideas: The Fate of the Commons in a Connected World by Lawrence Lessig

AltaVista, Andy Kessler, AOL-Time Warner, barriers to entry, Bill Atkinson, business process, Cass Sunstein, commoditize, computer age, creative destruction, dark matter, decentralized internet, Dennis Ritchie, disintermediation, disruptive innovation, Donald Davies, Erik Brynjolfsson, Free Software Foundation, Garrett Hardin, George Gilder, Hacker Ethic, Hedy Lamarr / George Antheil, history of Unix, Howard Rheingold, Hush-A-Phone, HyperCard, hypertext link, Innovator's Dilemma, invention of hypertext, inventory management, invisible hand, Jean Tirole, Jeff Bezos, John Gilmore, John Perry Barlow, Joseph Schumpeter, Ken Thompson, Kenneth Arrow, Larry Wall, Leonard Kleinrock, linked data, Marc Andreessen, Menlo Park, Mitch Kapor, Network effects, new economy, OSI model, packet switching, peer-to-peer, peer-to-peer model, price mechanism, profit maximization, RAND corporation, rent control, rent-seeking, RFC: Request For Comment, Richard Stallman, Richard Thaler, Robert Bork, Ronald Coase, Search for Extraterrestrial Intelligence, SETI@home, Silicon Valley, smart grid, software patent, spectrum auction, Steve Crocker, Steven Levy, Stewart Brand, systematic bias, Ted Nelson, Telecommunications Act of 1996, the Cathedral and the Bazaar, The Chicago School, tragedy of the anticommons, Tragedy of the Commons, transaction costs, vertical integration, Yochai Benkler, zero-sum game

For a time, one could find an extraordinary range of songs archived throughout the Web. Slowly these services have migrated to commercial sites. This migration means the commercial sites can support the costs of developing and maintaining this information. And in some cases, with some databases, the Internet provided a simple way to collect and link data about music in particular.8 Here the CDDB—or “CD database”—is the most famous example. As MP3 equipment became common, people needed a simple way to get information about CD titles and tracks onto the MP3 device. Of course, one could type in that information, but why should everyone have to type in that information?
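The mechanism is easy to sketch: fingerprint the disc from information the drive can read on its own (track offsets and total length) and use that fingerprint as a shared lookup key. The hash and the sample entry below are illustrative assumptions; the real CDDB disc-ID computation differs in its details.

```python
# Illustrative sketch of the CDDB idea: derive a fingerprint from the disc's
# table of contents and use it as a lookup key, so one shared database entry
# replaces everyone typing the titles themselves. The track data is made up.

import hashlib

def disc_id(track_offsets_seconds, disc_length_seconds):
    """Fingerprint a CD from its table of contents alone."""
    toc = ",".join(map(str, track_offsets_seconds)) + f";{disc_length_seconds}"
    return hashlib.sha1(toc.encode()).hexdigest()[:8]

database = {
    disc_id([0, 183, 421, 650], 2405): {
        "album": "Example Album",
        "tracks": ["Intro", "Song A", "Song B", "Song C"],
    }
}

print(database.get(disc_id([0, 183, 421, 650], 2405), "unknown disc"))
```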


pages: 528 words: 146,459

Computer: A History of the Information Machine by Martin Campbell-Kelly, William Aspray, Nathan L. Ensmenger, Jeffrey R. Yost

Ada Lovelace, air freight, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Apple's 1984 Super Bowl advert, barriers to entry, Bill Gates: Altair 8800, Bletchley Park, borderless world, Buckminster Fuller, Build a better mousetrap, Byte Shop, card file, cashless society, Charles Babbage, cloud computing, combinatorial explosion, Compatible Time-Sharing System, computer age, Computer Lib, deskilling, don't be evil, Donald Davies, Douglas Engelbart, Douglas Engelbart, Dynabook, Edward Jenner, Evgeny Morozov, Fairchild Semiconductor, fault tolerance, Fellow of the Royal Society, financial independence, Frederick Winslow Taylor, game design, garden city movement, Gary Kildall, Grace Hopper, Herman Kahn, hockey-stick growth, Ian Bogost, industrial research laboratory, informal economy, interchangeable parts, invention of the wheel, Ivan Sutherland, Jacquard loom, Jeff Bezos, jimmy wales, John Markoff, John Perry Barlow, John von Neumann, Ken Thompson, Kickstarter, light touch regulation, linked data, machine readable, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Menlo Park, Mitch Kapor, Multics, natural language processing, Network effects, New Journalism, Norbert Wiener, Occupy movement, optical character recognition, packet switching, PageRank, PalmPilot, pattern recognition, Pierre-Simon Laplace, pirate software, popular electronics, prediction markets, pre–internet, QWERTY keyboard, RAND corporation, Robert X Cringely, Salesforce, scientific management, Silicon Valley, Silicon Valley startup, Steve Jobs, Steven Levy, Stewart Brand, Ted Nelson, the market place, Turing machine, Twitter Arab Spring, Vannevar Bush, vertical integration, Von Neumann architecture, Whole Earth Catalog, William Shockley: the traitorous eight, women in the workforce, young professional

was already well established when two other Stanford University doctoral students, Larry Page and Sergey Brin, began work on the Stanford Digital Library Project (funded in part by the National Science Foundation)—research that would not only forever change the process of finding things on the Internet but also, in time, lead to an unprecedentedly successful web advertising model. Page became interested in a dissertation project on the mathematical properties of the web, and found strong support from his adviser Terry Winograd, a pioneer of artificial intelligence research on natural language processing. Using a “web crawler” to gather back-link data (that is, the websites that linked to a particular site), Page, now teamed up with Brin, created their “PageRank” algorithm based on back-links ranked by importance—the more prominent the linking site, the more influence it would have on the linked site’s page rank. They insightfully reasoned that this would provide the basis for more useful web searches than any existing tools and, moreover, that there would be no need to hire a corps of indexing staff.
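The idea compresses into a few lines. The sketch below runs the standard power-iteration form of PageRank on a made-up four-page web; the damping factor and the toy graph are assumptions for illustration, not the historical implementation's values.

```python
# A compact sketch of the PageRank idea the passage describes: a page's rank
# is fed by the ranks of the pages linking to it, so prominent linkers count
# for more.

def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}; returns {page: rank}."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            for target in outlinks:
                # Each page splits its current rank among its outlinks.
                new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
for page, r in sorted(pagerank(web).items(), key=lambda x: -x[1]):
    print(page, round(r, 3))   # C ranks highest: everyone links to it
```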


pages: 598 words: 134,339

Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World by Bruce Schneier

23andMe, Airbnb, airport security, AltaVista, Anne Wojcicki, AOL-Time Warner, augmented reality, behavioural economics, Benjamin Mako Hill, Black Swan, Boris Johnson, Brewster Kahle, Brian Krebs, call centre, Cass Sunstein, Chelsea Manning, citizen journalism, Citizen Lab, cloud computing, congestion charging, data science, digital rights, disintermediation, drone strike, Eben Moglen, Edward Snowden, end-to-end encryption, Evgeny Morozov, experimental subject, failed state, fault tolerance, Ferguson, Missouri, Filter Bubble, Firefox, friendly fire, Google Chrome, Google Glasses, heat death of the universe, hindsight bias, informal economy, information security, Internet Archive, Internet of things, Jacob Appelbaum, James Bridle, Jaron Lanier, John Gilmore, John Markoff, Julian Assange, Kevin Kelly, Laura Poitras, license plate recognition, lifelogging, linked data, Lyft, Mark Zuckerberg, moral panic, Nash equilibrium, Nate Silver, national security letter, Network effects, Occupy movement, operational security, Panopticon Jeremy Bentham, payday loans, pre–internet, price discrimination, profit motive, race to the bottom, RAND corporation, real-name policy, recommendation engine, RFID, Ross Ulbricht, satellite internet, self-driving car, Shoshana Zuboff, Silicon Valley, Skype, smart cities, smart grid, Snapchat, social graph, software as a service, South China Sea, sparse data, stealth mode startup, Steven Levy, Stuxnet, TaskRabbit, technological determinism, telemarketer, Tim Cook: Apple, transaction costs, Uber and Lyft, uber lyft, undersea cable, unit 8200, urban planning, Wayback Machine, WikiLeaks, workplace surveillance , Yochai Benkler, yottabyte, zero day

[The matched excerpt here is a stretch of the book's back-of-book index (entries running roughly from "fiduciary responsibility" through "Hoover, J."), too garbled by extraction to reproduce; the match itself is the index entry under Google: "linked data sets of, 50."]


The Art of Computer Programming: Sorting and Searching by Donald Ervin Knuth

card file, Charles Babbage, Claude Shannon: information theory, complexity theory, correlation coefficient, Donald Knuth, double entry bookkeeping, Eratosthenes, Fermat's Last Theorem, G4S, information retrieval, iterative process, John von Neumann, linked data, locality of reference, Menlo Park, Norbert Wiener, NP-complete, p-value, Paul Erdős, RAND corporation, refrigerator car, sorting algorithm, Vilfredo Pareto, Yogi Berra, Zipf's Law

[Figure caption: Example of Wheeler's tree insertion scheme.]

Changing the data structure slightly with "two-way insertion" cuts the number of moves down to about N²/8. Shellsort cuts the number of comparisons and moves to about N^(7/6), for N in a practical range; as N → ∞ this number can be lowered to order N(log N)². Another way to improve on Algorithm S, using a linked data structure, gave us the list insertion method, which does about N²/4 comparisons, 0 moves, and 2N changes of links. Is it possible to marry the best features of these methods, reducing the number of comparisons to order N log N as in binary insertion, yet reducing the number of moves as in list insertion?
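List insertion is easy to sketch: records stay put, and sorting happens entirely by rewiring successor links, which is why it does comparisons and link changes but zero record moves. In the sketch below, plain Python lists stand in for the link fields of MIX records.

```python
# The "list insertion" idea: keep records where they are and sort by
# rewiring successor links, so there are comparisons and link changes
# but zero record moves.

def list_insertion_sort(keys):
    """Return (head, link) such that following link from head visits keys in order."""
    n = len(keys)
    link = [None] * n   # link[i] = index of the successor of record i
    head = 0
    for i in range(1, n):
        if keys[i] < keys[head]:
            link[i], head = head, i         # new smallest: becomes the head
        else:
            j = head
            while link[j] is not None and keys[link[j]] <= keys[i]:
                j = link[j]                 # walk the sorted list so far
            link[i], link[j] = link[j], i   # splice record i in after j
    return head, link

keys = [503, 87, 512, 61, 908]
head, link = list_insertion_sort(keys)
i, out = head, []
while i is not None:
    out.append(keys[i])
    i = link[i]
print(out)   # [61, 87, 503, 512, 908]
```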

An alert, "modern" reader will note, however, that the whole idea of making digit counts for the storage allocation is tied to old-fashioned ideas about sequential data representation. We know that linked allocation is specifically designed to handle a set of tables of variable size, so it is natural to choose a linked data structure for radix sorting. Since we traverse each pile serially, all we need is a single link from each item to its successor.

Table 1. RADIX SORTING
Input area contents: 503 087 512 061 908 170 897 275 653 426 154 509 612 677 765 703
Counts for units digit distribution: 1 1 2 3 1 2 1 3 1 1
Storage allocations based on these counts: 1 2 4 7 8 10 11 14 15 16
Auxiliary area contents: 170 061 512 612 503 653 703 154 275 765 426 087 897 677 908 509
Counts for tens digit distribution: 4 2 1 0 0 2 2 3 1 1
Storage allocations based on these counts: 4 6 7 7 7 9 11 14 15 16
Input area contents: 503 703 908 509 512 612 426 653 154 061 765 170 275 677 087 897
Counts for hundreds digit distribution: 2 2 1 0 1 3 3 2 1 1
Storage allocations based on these counts: 2 4 5 5 6 9 12 14 15 16
Auxiliary area contents: 061 087 154 170 275 426 503 509 512 612 653 677 703 765 897 908
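The distribution counting shown in Table 1 translates directly into code. The array-based sketch below reproduces the table's three passes (units, tens, hundreds), turning digit counts into storage allocations by running totals; the passage's point is that the linked variant can dispense with this auxiliary area in favor of one link per item.

```python
# Distribution counting as in Table 1: count keys by the current digit,
# turn the counts into storage allocations (running totals), and place
# each key in its pile, filling each pile from back to front for stability.

def radix_sort(keys, digits=3):
    area = list(keys)
    for d in range(digits):                      # units, tens, hundreds
        divisor = 10 ** d
        counts = [0] * 10
        for k in area:
            counts[(k // divisor) % 10] += 1
        # Running totals give, for each digit, the last slot of its pile.
        allocations = [sum(counts[:i + 1]) for i in range(10)]
        aux = [None] * len(area)
        for k in reversed(area):
            digit = (k // divisor) % 10
            allocations[digit] -= 1
            aux[allocations[digit]] = k
        area = aux
    return area

keys = [503, 87, 512, 61, 908, 170, 897, 275, 653, 426,
        154, 509, 612, 677, 765, 703]
print(radix_sort(keys))   # fully sorted after the hundreds pass
```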


In the Age of the Smart Machine by Shoshana Zuboff

affirmative action, American ideology, blue-collar work, collective bargaining, computer age, Computer Numeric Control, conceptual framework, data acquisition, demand response, deskilling, factory automation, Ford paid five dollars a day, fudge factor, future of work, industrial robot, information retrieval, interchangeable parts, job automation, lateral thinking, linked data, Marshall McLuhan, means of production, old-boy network, optical character recognition, Panopticon Jeremy Bentham, pneumatic tube, post-industrial society, radical decentralization, RAND corporation, scientific management, Shoshana Zuboff, social web, systems thinking, tacit knowledge, The Wealth of Nations by Adam Smith, Thorstein Veblen, union organizing, vertical integration, work culture , zero-sum game

Ironically, it means creating a doubly abstract world, where the reference function of the electronic symbols becomes less problematic because of yet another layer of abstractions (mental images) called up to serve as referents. Operators did not appear equally adept at generating an inward image. Many seemed unable to link data on the screen to a referential reality. Their interactions with the data were confined to the two-dimensional space of the terminal screen; the electronic symbols were deciphered according to the varying patterns in which they were arrayed. Typically, when asked what the data on the screen meant, these operators would point to distinct data elements and discuss them in terms of their spatial relationships on the screen, as if there were no external referents.


pages: 834 words: 180,700

The Architecture of Open Source Applications by Amy Brown, Greg Wilson

8-hour work day, anti-pattern, bioinformatics, business logic, c2.com, cloud computing, cognitive load, collaborative editing, combinatorial explosion, computer vision, continuous integration, Conway's law, create, read, update, delete, David Heinemeier Hansson, Debian, domain-specific language, Donald Knuth, en.wikipedia.org, fault tolerance, finite state, Firefox, Free Software Foundation, friendly fire, functional programming, Guido van Rossum, Ken Thompson, linked data, load shedding, locality of reference, loose coupling, Mars Rover, MITM: man-in-the-middle, MVC pattern, One Laptop per Child (OLPC), peer-to-peer, Perl 6, premature optimization, recommendation engine, revision control, Ruby on Rails, side project, Skype, slashdot, social web, speech recognition, the scientific method, The Wisdom of Crowds, web application, WebSocket

If the data changes from one execution to another, a new version is checked in to the repository. Thus, the (uuid, version) tuple is a compound identifier to retrieve the data in any state. In addition, we store the hash of the data as well as the signature of the upstream portion of the workflow that generated it (if it is not an input). This allows one to link data that might be identified differently as well as reuse data when the same computation is run again. The main concern when designing this package was the way users were able to select and retrieve their data. Also, we wished to keep all data in the same repository, regardless of whether it is used as input, output, or intermediate data (an output of one workflow might be used as the input of another).
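A toy version of such a repository makes the identifiers concrete: entries are keyed by (uuid, version) tuples, and a content hash lets the store recognize and reuse identical data. The class and field names below are illustrative assumptions, not the package's actual schema or API.

```python
# A sketch of a versioned data repository: (uuid, version) is the compound
# identifier, and a hash of the data lets identical results be recognized
# and reused when the same computation is run again.

import hashlib
import uuid

class DataRepository:
    def __init__(self):
        self.entries = {}    # (uuid, version) -> record
        self.by_hash = {}    # content hash -> (uuid, version)

    def check_in(self, data_id, payload, upstream_signature=None):
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self.by_hash:               # same bytes seen before: reuse
            return self.by_hash[digest]
        versions = [v for (u, v) in self.entries if u == data_id]
        version = max(versions, default=0) + 1   # next version of this uuid
        key = (data_id, version)
        self.entries[key] = {"data": payload, "hash": digest,
                             "signature": upstream_signature}
        self.by_hash[digest] = key
        return key

repo = DataRepository()
did = uuid.uuid4()
print(repo.check_in(did, b"run 1 output"))   # (uuid, 1)
print(repo.check_in(did, b"run 2 output"))   # (uuid, 2)
print(repo.check_in(did, b"run 1 output"))   # reused: (uuid, 1)
```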


pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, business logic, combinatorial explosion, data science, database schema, Debian, domain-specific language, en.wikipedia.org, exponential backoff, fallacies of distributed computing, fault tolerance, full text search, functional programming, Grace Hopper, information retrieval, Internet Archive, Kickstarter, Large Hadron Collider, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, sparse data, web application

This information is not readily available when crawling. Also, the indexing process benefits from taking into account the anchor text on inlinks so that this text may semantically enrich the text of the current page. As mentioned earlier, Nutch collects the outlink information and then uses this data to build a LinkDb, which contains this reversed link data in the form of inlinks and anchor text. This section presents a rough outline of the implementation of the LinkDb tool—many details have been omitted (such as URL normalization and filtering) in order to present a clear picture of the process. What’s left gives a classical example of why the MapReduce paradigm fits so well with the key data transformation processes required to run a search engine.
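Stripped of Hadoop, the inversion reduces to a map step that re-keys every outlink by its target and a reduce step that groups by target. The sketch below shows that shape in plain Python; Nutch's real LinkDb tool is a Java MapReduce job, and the URL normalization and filtering the text mentions are omitted here too. The pages and anchors are invented.

```python
# A condensed, framework-free sketch of the LinkDb inversion: map each
# (page -> outlink, anchor text) to (outlink -> page, anchor), then group
# by target so every page ends up with its inlinks and their anchor text.

from collections import defaultdict

def invert_links(pages):
    """pages: {url: [(outlink_url, anchor_text), ...]} -> inlink db."""
    # Map phase: emit (target, (source, anchor)) for every outlink.
    emitted = [(target, (source, anchor))
               for source, outlinks in pages.items()
               for target, anchor in outlinks]
    # Shuffle/reduce phase: group the emitted pairs by target URL.
    linkdb = defaultdict(list)
    for target, inlink in emitted:
        linkdb[target].append(inlink)
    return dict(linkdb)

pages = {
    "http://a.example/": [("http://b.example/", "great article")],
    "http://c.example/": [("http://b.example/", "see also")],
}
print(invert_links(pages)["http://b.example/"])
# [('http://a.example/', 'great article'), ('http://c.example/', 'see also')]
```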


pages: 761 words: 231,902

The Singularity Is Near: When Humans Transcend Biology by Ray Kurzweil

additive manufacturing, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anthropic principle, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, backpropagation, Benoit Mandelbrot, Bill Joy: nanobots, bioinformatics, brain emulation, Brewster Kahle, Brownian motion, business cycle, business intelligence, c2.com, call centre, carbon-based life, cellular automata, Charles Babbage, Claude Shannon: information theory, complexity theory, conceptual framework, Conway's Game of Life, coronavirus, cosmological constant, cosmological principle, cuban missile crisis, data acquisition, Dava Sobel, David Brooks, Dean Kamen, digital divide, disintermediation, double helix, Douglas Hofstadter, en.wikipedia.org, epigenetics, factory automation, friendly AI, functional programming, George Gilder, Gödel, Escher, Bach, Hans Moravec, hype cycle, informal economy, information retrieval, information security, invention of the telephone, invention of the telescope, invention of writing, iterative process, Jaron Lanier, Jeff Bezos, job automation, job satisfaction, John von Neumann, Kevin Kelly, Law of Accelerating Returns, life extension, lifelogging, linked data, Loebner Prize, Louis Pasteur, mandelbrot fractal, Marshall McLuhan, Mikhail Gorbachev, Mitch Kapor, mouse model, Murray Gell-Mann, mutually assured destruction, natural language processing, Network effects, new economy, Nick Bostrom, Norbert Wiener, oil shale / tar sands, optical character recognition, PalmPilot, pattern recognition, phenotype, power law, precautionary principle, premature optimization, punch-card reader, quantum cryptography, quantum entanglement, radical life extension, randomized controlled trial, Ray Kurzweil, remote working, reversible computing, Richard Feynman, Robert Metcalfe, Rodney Brooks, scientific worldview, Search for Extraterrestrial Intelligence, selection bias, semantic web, seminal paper, Silicon Valley, Singularitarianism, speech recognition, statistical model, stem cell, Stephen Hawking, Stewart Brand, strong AI, Stuart Kauffman, superintelligent machines, technological singularity, Ted Kaczynski, telepresence, The Coming Technological Singularity, Thomas Bayes, transaction costs, Turing machine, Turing test, two and twenty, Vernor Vinge, Y2K, Yogi Berra

Resources and Contact Information

Singularity.com

New developments in the diverse fields discussed in this book are accumulating at an accelerating pace. To help you keep pace, I invite you to visit Singularity.com, where you will find

·Recent news stories
·A compilation of thousands of relevant news stories going back to 2001 from KurzweilAI.net (see below)
·Hundreds of articles on related topics from KurzweilAI.net
·Research links
·Data and citation for all graphs
·Material about this book
·Excerpts from this book
·Online endnotes

KurzweilAI.net

You are also invited to visit our award-winning Web site, KurzweilAI.net, which includes over six hundred articles by over one hundred "big thinkers" (many of whom are cited in this book), thousands of news articles, listings of events, and other features.


pages: 897 words: 242,580

The Temporal Void by Peter F. Hamilton

corporate governance, dark matter, forensic accounting, linked data, megacity, place-making, trade route

The Yenisey couldn’t even get an accurate quantum signature scan to determine what kind of drive it used. ‘Admiral,’ Lucian called urgently. ‘We can’t—’ The unknown ship fired. ‘What the fuck was that!’ Gore yelled as the secure link abruptly vanished. Kazimir took a second to review the TD link data, he was so surprised. His tactical staff had produced a number of scenarios, mostly incorporating the Ocisens utilizing weapons technology they’d procured from a more advanced species. This hadn’t been a remote consideration. ‘I don’t recognize that design at all,’ Ilanthe said. ‘Do we have any spherical ship on the Navy’s intelligence registry?’


pages: 903 words: 235,753

The Stack: On Software and Sovereignty by Benjamin H. Bratton

1960s counterculture, 3D printing, 4chan, Ada Lovelace, Adam Curtis, additive manufacturing, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic trading, Amazon Mechanical Turk, Amazon Robotics, Amazon Web Services, Andy Rubin, Anthropocene, augmented reality, autonomous vehicles, basic income, Benevolent Dictator For Life (BDFL), Berlin Wall, bioinformatics, Biosphere 2, bitcoin, blockchain, Buckminster Fuller, Burning Man, call centre, capitalist realism, carbon credits, carbon footprint, carbon tax, carbon-based life, Cass Sunstein, Celebration, Florida, Charles Babbage, charter city, clean water, cloud computing, company town, congestion pricing, connected car, Conway's law, corporate governance, crowdsourcing, cryptocurrency, dark matter, David Graeber, deglobalization, dematerialisation, digital capitalism, digital divide, disintermediation, distributed generation, don't be evil, Douglas Engelbart, Douglas Engelbart, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Eratosthenes, Ethereum, ethereum blockchain, Evgeny Morozov, facts on the ground, Flash crash, Frank Gehry, Frederick Winslow Taylor, fulfillment center, functional programming, future of work, Georg Cantor, gig economy, global supply chain, Google Earth, Google Glasses, Guggenheim Bilbao, High speed trading, high-speed rail, Hyperloop, Ian Bogost, illegal immigration, industrial robot, information retrieval, Intergovernmental Panel on Climate Change (IPCC), intermodal, Internet of things, invisible hand, Jacob Appelbaum, James Bridle, Jaron Lanier, Joan Didion, John Markoff, John Perry Barlow, Joi Ito, Jony Ive, Julian Assange, Khan Academy, Kim Stanley Robinson, Kiva Systems, Laura Poitras, liberal capitalism, lifelogging, linked data, lolcat, Mark Zuckerberg, market fundamentalism, Marshall McLuhan, Masdar, McMansion, means of production, megacity, megaproject, megastructure, Menlo Park, Minecraft, MITM: man-in-the-middle, Monroe Doctrine, Neal Stephenson, Network effects, new economy, Nick Bostrom, ocean acidification, off-the-grid, offshore financial centre, oil shale / tar sands, Oklahoma City bombing, OSI model, packet switching, PageRank, pattern recognition, peak oil, peer-to-peer, performance metric, personalized medicine, Peter Eisenman, Peter Thiel, phenotype, Philip Mirowski, Pierre-Simon Laplace, place-making, planetary scale, pneumatic tube, post-Fordism, precautionary principle, RAND corporation, recommendation engine, reserve currency, rewilding, RFID, Robert Bork, Sand Hill Road, scientific management, self-driving car, semantic web, sharing economy, Silicon Valley, Silicon Valley ideology, skeuomorphism, Slavoj Žižek, smart cities, smart grid, smart meter, Snow Crash, social graph, software studies, South China Sea, sovereign wealth fund, special economic zone, spectrum auction, Startup school, statistical arbitrage, Steve Jobs, Steven Levy, Stewart Brand, Stuxnet, Superbowl ad, supply-chain management, supply-chain management software, synthetic biology, TaskRabbit, technological determinism, TED Talk, the built environment, The Chicago School, the long tail, the scientific method, Torches of Freedom, transaction costs, Turing complete, Turing machine, Turing test, undersea cable, universal basic income, urban planning, Vernor Vinge, vertical integration, warehouse automation, warehouse robotics, Washington Consensus, web application, Westphalian system, WikiLeaks, working poor, Y Combinator, yottabyte

Through various combinations of open or proprietary exegetics of data, and perhaps a sequence of application programming interfaces (APIs), a query entered as “book me a ticket to New York” can activate a series of secondary inquiries to calendars, banks, flight schedules, airline databases, bank accounts, and so on and, through this, initiate the cascading programming resulting in that booking. For this, to search is also to program. Such tidy consumer use cases require enormously difficult standardizations of interoperability between competitive services (not to mention beyond-Esperanto level standardization of all Users’ conceptual taxonomies). The goal of linking data into semantically relevant and accessible structures so that “search” would also provide more actionable results, and in turn allowing queries to program those results for specific ends, remains compelling for search engines, if less so for individual down-service-stream providers, such as airlines and banks, which see their business absorbed into a handful of search platforms.20 By comparison, physical search may be based on a similar tissue of interrelation between addressable entities—in this case, a mix of physical things and data of interest—and might be a necessary condition of a really viable Internet of Things or SPIME space.
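Read mechanically, the cascade looks like the sketch below: one query fans out into dependent calls on calendar, flight, and bank services whose results feed one another. Every service, method, and field here is hypothetical; making such a chain work across real competing providers is exactly the interoperability problem the passage flags.

```python
# All services and fields below are invented stand-ins; the point is only
# to make the passage's cascade concrete: one query triggers dependent
# calls, so the search result is itself the output of a program run.

class Calendar:
    def free_dates(self):
        return ["2015-06-01", "2015-06-02"]

class Flights:
    def search(self, dest, dates):
        return [{"flight": "XY100", "dest": dest, "date": dates[0], "price": 420},
                {"flight": "XY200", "dest": dest, "date": dates[1], "price": 380}]
    def book(self, flight):
        return f"booked {flight['flight']} on {flight['date']}"

class Bank:
    def balance(self):
        return 1000

def handle_query(query):
    dest = "New York"                 # assume a query parser extracted this
    dates = Calendar().free_dates()   # secondary inquiry 1: availability
    options = Flights().search(dest, dates)   # secondary inquiry 2: schedules
    choice = min(options, key=lambda f: f["price"])
    if Bank().balance() >= choice["price"]:   # secondary inquiry 3: funds
        return Flights().book(choice)         # to search is also to program
    return "insufficient funds"

print(handle_query("book me a ticket to New York"))
```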


Data Mining: Concepts and Techniques: Concepts and Techniques by Jiawei Han, Micheline Kamber, Jian Pei

backpropagation, bioinformatics, business intelligence, business process, Claude Shannon: information theory, cloud computing, computer vision, correlation coefficient, cyber-physical system, database schema, discrete time, disinformation, distributed generation, finite state, industrial research laboratory, information retrieval, information security, iterative process, knowledge worker, linked data, machine readable, natural language processing, Netflix Prize, Occam's razor, pattern recognition, performance metric, phenotype, power law, random walk, recommendation engine, RFID, search costs, semantic web, seminal paper, sentiment analysis, sparse data, speech recognition, statistical model, stochastic process, supply-chain management, text mining, thinkpad, Thomas Bayes, web application

.; Ronkainen, P.; Toivonen, H.; Verkamo, A.I., Finding interesting rules from large sets of discovered association rules, In: Proc. 3rd Int. Conf. Information and Knowledge Management Gaithersburg, MD. (Nov. 1994), pp. 401–408. [KMS03] Kubica, J.; Moore, A.; Schneider, J., Tractable group detection on large link data sets, In: Proc. 2003 Int. Conf. Data Mining (ICDM’03) Melbourne, FL. (Nov. 2003), pp. 573–576. [KN97] Knorr, E.; Ng, R., A unified notion of outliers: Properties and computation, In: Proc. 1997 Int. Conf. Knowledge Discovery and Data Mining (KDD’97) Newport Beach, CA. (Aug. 1997), pp. 219–222. [KNNL04] Kutner, M.H.; Nachtsheim, C.J.; Neter, J.; Li, W., Applied Linear Statistical Models with Student CD. (2004) Irwin .


pages: 918 words: 257,605

The Age of Surveillance Capitalism by Shoshana Zuboff

"World Economic Forum" Davos, algorithmic bias, Amazon Web Services, Andrew Keen, augmented reality, autonomous vehicles, barriers to entry, Bartolomé de las Casas, behavioural economics, Berlin Wall, Big Tech, bitcoin, blockchain, blue-collar work, book scanning, Broken windows theory, California gold rush, call centre, Cambridge Analytica, Capital in the Twenty-First Century by Thomas Piketty, Cass Sunstein, choice architecture, citizen journalism, Citizen Lab, classic study, cloud computing, collective bargaining, Computer Numeric Control, computer vision, connected car, context collapse, corporate governance, corporate personhood, creative destruction, cryptocurrency, data science, deep learning, digital capitalism, disinformation, dogs of the Dow, don't be evil, Donald Trump, Dr. Strangelove, driverless car, Easter island, Edward Snowden, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, facts on the ground, fake news, Ford Model T, Ford paid five dollars a day, future of work, game design, gamification, Google Earth, Google Glasses, Google X / Alphabet X, Herman Kahn, hive mind, Ian Bogost, impulse control, income inequality, information security, Internet of things, invention of the printing press, invisible hand, Jean Tirole, job automation, Johann Wolfgang von Goethe, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joseph Schumpeter, Kevin Kelly, Kevin Roose, knowledge economy, Lewis Mumford, linked data, longitudinal study, low skilled workers, Mark Zuckerberg, market bubble, means of production, multi-sided market, Naomi Klein, natural language processing, Network effects, new economy, Occupy movement, off grid, off-the-grid, PageRank, Panopticon Jeremy Bentham, pattern recognition, Paul Buchheit, performance metric, Philip Mirowski, precision agriculture, price mechanism, profit maximization, profit motive, public intellectual, recommendation engine, refrigerator car, RFID, Richard Thaler, ride hailing / ride sharing, Robert Bork, Robert Mercer, Salesforce, Second Machine Age, self-driving car, sentiment analysis, shareholder value, Sheryl Sandberg, Shoshana Zuboff, Sidewalk Labs, Silicon Valley, Silicon Valley ideology, Silicon Valley startup, slashdot, smart cities, Snapchat, social contagion, social distancing, social graph, social web, software as a service, speech recognition, statistical model, Steve Bannon, Steve Jobs, Steven Levy, structural adjustment programs, surveillance capitalism, technological determinism, TED Talk, The Future of Employment, The Wealth of Nations by Adam Smith, Tim Cook: Apple, two-sided market, union organizing, vertical integration, Watson beat the top human players on Jeopardy!, winner-take-all economy, Wolfgang Streeck, work culture , Yochai Benkler, you are the product

Conlee, “How Automation and Analytics Are Changing Customer Care,” Conduent Blog, July 18, 2016, https://www.blogs.conduent.com/2016/07/18/how-automation-and-analytics-are-changing-customer-care; Ryan Knutson, “Call Centers May Know a Surprising Amount About You,” Wall Street Journal, January 6, 2017, http://www.wsj.com/articles/that-anonymous-voice-at-the-call-center-they-may-know-a-lot-about-you-1483698608. 74. Nicholas Confessore and Danny Hakim, “Bold Promises Fade to Doubts for a Trump-Linked Data Firm,” New York Times, March 6, 2017, https://www.nytimes.com/2017/03/06/us/politics/cambridge-analytica.html; Mary-Ann Russon, “Political Revolution: How Big Data Won the US Presidency for Donald Trump,” International Business Times UK, January 20, 2017, http://www.ibtimes.co.uk/political-revolution-how-big-data-won-us-presidency-donald-trump-1602269; Grassegger and Krogerus, “The Data That Turned the World Upside Down”; Carole Cadwalladr, “Revealed: How US Billionaire Helped to Back Brexit,” Guardian, February 25, 2017, https://www.theguardian.com/politics/2017/feb/26/us-billionaire-mercer-helped-back-brexit; Paul-Olivier Dehaye, “The (Dis)Information Mercenaries Now Controlling Trump’s Databases,” Medium, January 3, 2017, https://medium.com/personaldata-io/the-dis-information-mercenaries-now-controlling-trumps-databases-4f6a20d4f3e7; Harry Davies, “Ted Cruz Using Firm That Harvested Data on Millions of Unwitting Facebook Users,” Guardian, December 11, 2015, https://www.theguardian.com/us-news/2015/dec/11/senator-ted-cruz-president-campaign-facebook-user-data. 75.