fault tolerance

110 results

pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, Ethereum, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, general-purpose programming language, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Kubernetes, loose coupling, Marc Andreessen, microservices, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, undersea cable, web application, WebSocket, wikimedia commons

Protocols for making systems Byzantine fault-tolerant are quite complicated [84], and fault-tolerant embedded systems rely on support from the hardware level [81]. In most server-side data systems, the cost of deploying Byzantine fault-tolerant solutions makes them impractical. Web applications do need to expect arbitrary and malicious behavior of clients that are under end-user control, such as web browsers. This is why input validation, sanitization, and output escaping are so important: to prevent SQL injection and cross-site scripting, for example. However, we typically don’t use Byzantine fault-tolerant protocols here, but simply make the server the authority on deciding what client behavior is and isn’t allowed. In peer-to-peer networks, where there is no such central authority, Byzantine fault tolerance is more relevant.
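A minimal sketch of these two defenses, using only Python's standard library (the table name and values are hypothetical): parameter binding keeps attacker input from changing a query's structure, and output escaping keeps it from becoming live markup in a page.

```python
import html
import sqlite3

# In-memory database standing in for a real server-side store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def find_user(name: str):
    # Parameter binding: the driver treats `name` strictly as data, so
    # input like "' OR '1'='1" cannot alter the shape of the query.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

def render_comment(comment: str) -> str:
    # Output escaping: a <script> tag arrives in the page as inert text.
    return "<p>" + html.escape(comment) + "</p>"

print(find_user("' OR '1'='1"))   # injection attempt matches nothing: []
print(render_comment("<script>x</script>"))
```

Here the server decides what input means, which is exactly the "server as the authority" stance described above.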


A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user. It is impossible to reduce the probability of a fault to zero; therefore it is usually best to design fault-tolerance mechanisms that prevent faults from causing failures. In this book we cover several techniques for building reliable systems from unreliable parts.

Counterintuitively, in such fault-tolerant systems, it can make sense to increase the rate of faults by triggering them deliberately—for example, by randomly killing individual processes without warning. Many critical bugs are actually due to poor error handling [3]; by deliberately inducing faults, you ensure that the fault-tolerance machinery is continually exercised and tested, which can increase your confidence that faults will be handled correctly when they occur naturally. The Netflix Chaos Monkey [4] is an example of this approach.
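The "trigger faults deliberately" idea can be sketched at function level (a toy, nothing like the real Chaos Monkey, which terminates whole instances): inject failures at a configured rate and check that the supervision logic absorbs them.

```python
import random

def flaky(task, fault_rate, rng):
    """Wrap a task so it sometimes fails, simulating injected faults."""
    def wrapped():
        if rng.random() < fault_rate:
            raise RuntimeError("injected fault")
        return task()
    return wrapped

def supervise(task, max_restarts=10):
    """Restart the task after each fault; give up past max_restarts."""
    for attempt in range(max_restarts + 1):
        try:
            return task(), attempt  # result plus restarts consumed
        except RuntimeError:
            continue
    raise RuntimeError("too many faults")

rng = random.Random(42)  # seeded so the demo is reproducible
result, restarts = supervise(flaky(lambda: "done", 0.5, rng))
print(result, restarts)
```

The point of running this continually is the one made above: the recovery path gets exercised long before a real fault needs it.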

The biggest differences are that in 2PC the coordinator is not elected, and that fault-tolerant consensus algorithms only require votes from a majority of nodes, whereas 2PC requires a “yes” vote from every participant. Moreover, consensus algorithms define a recovery process by which nodes can get into a consistent state after a new leader is elected, ensuring that the safety properties are always met. These differences are key to the correctness and fault tolerance of a consensus algorithm.

Limitations of consensus

Consensus algorithms are a huge breakthrough for distributed systems: they bring concrete safety properties (agreement, integrity, and validity) to systems where everything else is uncertain, and they nevertheless remain fault-tolerant (able to make progress as long as a majority of nodes are working and reachable).
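A toy illustration of the voting difference (not a real protocol implementation; the node names and cluster size are made up): 2PC commits only on a unanimous "yes", while a consensus round can decide once a strict majority agrees, so it keeps making progress with a minority of nodes unreachable.

```python
def two_phase_commit(votes):
    # 2PC: a single "no" -- or any missing vote -- aborts the transaction.
    return all(v == "yes" for v in votes.values()) and len(votes) == TOTAL_NODES

def majority_decides(votes):
    # Consensus: decision possible once a strict majority of all nodes agree.
    yes = sum(1 for v in votes.values() if v == "yes")
    return yes > TOTAL_NODES // 2

TOTAL_NODES = 5
partial = {"n1": "yes", "n2": "yes", "n3": "yes"}   # n4, n5 unreachable
print(two_phase_commit(partial))   # False: 2PC blocks without every vote
print(majority_decides(partial))   # True: 3 of 5 is a majority
```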

Elixir in Action by Saša Jurić

demand response, en.wikipedia.org, fault tolerance, finite state, general-purpose programming language, place-making, Ruby on Rails, WebSocket

Summary

• When a system needs to perform various tasks, it’s often beneficial to run different tasks in separate processes. Doing so promotes the scalability and fault-tolerance of the system.
• A process is internally sequential and handles requests one by one. A single process can thus keep its state consistent, but it can also cause a performance bottleneck if it serves many clients.
• Carefully consider calls versus casts. Calls are synchronous and therefore block the caller. If the response isn’t needed, casts may improve performance at the expense of reduced guarantees, because a client process doesn’t know the outcome.
• You can use mix projects to manage more involved systems that consist of multiple modules.

8 Fault-tolerance basics

This chapter covers
• Runtime errors
• Errors in concurrent systems
• Supervisors

Fault-tolerance is a first-class concept in BEAM. The ability to develop reliable systems that can operate even when faced with runtime errors is what brought us Erlang in the first place.
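The call-versus-cast distinction is about BEAM processes, but its shape can be sketched in Python with one worker thread draining a mailbox (an illustrative analogy, not the GenServer API): a call blocks until the server replies, a cast returns immediately and the client never learns the outcome.

```python
import queue
import threading

class CounterServer:
    """One sequential 'process': requests are handled one by one."""
    def __init__(self):
        self.mailbox = queue.Queue()
        self.count = 0
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            msg, reply_to = self.mailbox.get()
            if msg == "stop":
                break
            self.count += 1
            if reply_to is not None:      # a call: send the outcome back
                reply_to.put(self.count)

    def cast_increment(self):
        # Fire-and-forget: fast, but no confirmation of the outcome.
        self.mailbox.put(("inc", None))

    def call_increment(self):
        # Synchronous: blocks the caller until the server replies.
        reply = queue.Queue()
        self.mailbox.put(("inc", reply))
        return reply.get()

server = CounterServer()
server.cast_increment()
server.cast_increment()
print(server.call_increment())  # 3: the casts were queued ahead of the call
```

Because the mailbox is FIFO and the loop is single-threaded, state stays consistent even though clients send concurrently, which is the first summary point above.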

If that happens, keep in mind all the benefits you get from Erlang. As I’ve explained, Erlang goes a long way toward making it possible to write fault-tolerant systems that can run for a long time with hardly any downtime. This is a big challenge and a specific focus of the Erlang platform. Although it’s admittedly unfortunate that the ecosystem isn’t as mature as it could be, my sentiment is that Erlang significantly helps with hard problems, even if simple problems can sometimes be more clumsy to solve. Of course, those difficult problems may not always be important. Perhaps you don’t expect a high load, or a system doesn’t need to run constantly and be extremely fault-tolerant. In such cases, you may want to consider some other technology stack with a more evolved ecosystem.

Summary

• Erlang is a technology for developing highly available systems that constantly provide service with little or no downtime.

We’ll spend some time exploring BEAM concurrency, a feature that plays a central role in Elixir’s and Erlang’s support for scalability, fault-tolerance, and distribution. In this chapter, we’ll start our tour of BEAM concurrency by looking at basic techniques and tools. Before we explore the lower-level details, we’ll take a look at higher-level principles.

5.1 Concurrency in BEAM

Erlang is all about writing highly available systems — systems that run forever and are always able to meaningfully respond to client requests. To make your system highly available, you have to tackle the following challenges:

• Fault-tolerance — Minimize, isolate, and recover from the effects of runtime errors.
• Scalability — Handle a load increase by adding more hardware resources without changing or redeploying the code.
• Distribution — Run your system on multiple machines so that others can take over if one machine crashes.

Principles of Protocol Design by Robin Sharp

accounting loophole / creative accounting, business process, discrete time, fault tolerance, finite state, Gödel, Escher, Bach, information retrieval, loose coupling, MITM: man-in-the-middle, packet switching, RFC: Request For Comment, stochastic process

The first is a practical objection: Simple languages generally do not correspond to protocols which can tolerate faults, such as missing or duplicated messages. Protocols which are fault-tolerant often require the use of state machines with enormous numbers of states, or they may define context-dependent languages. A more radical objection is that classical analysis of the protocol language from a formal language point of view traditionally concerns itself with the problems of constructing a suitable recogniser, determining the internal states of the recogniser, and so on. This does not help us to analyse or check many of the properties which we may require the protocol to have, such as the properties of fault-tolerance mentioned above. To be able to investigate this we need analytical tools which can describe the parallel operation of all the parties which use the protocol to regulate their communication. 1.2 Protocols as Processes A radically different way of looking at things has therefore gained prominence within recent years.

For generality, we define the value of the function majority(v) as being the value selected by a lieutenant receiving the values in v. If no value is received from a particular participant, the algorithm should supply some default, v_def.

5.4.1 Using unsigned messages

Solutions to this problem depend quite critically on the assumptions made about the system. Initially, we shall assume the following:

Degree of fault-tolerance: Out of the n participants, at most t are unreliable. This defines the degree of fault tolerance required of the system. We cannot expect the protocol to work correctly if this limit is overstepped.

Network properties: Every message that is sent is delivered correctly, and the receiver of a message knows who sent it. These assumptions mean that an unreliable participant cannot interfere with the message traffic between the other participants.
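One way to realize majority(v) under these assumptions (the deterministic tie-breaking rule below is my own addition; the text leaves it open): missing participants are filled in with the default value (v_def in the text) before voting, so every correct lieutenant computes the same result from the same vector.

```python
from collections import Counter

V_DEF = "retreat"  # default supplied when nothing is received

def majority(received, n):
    """Value a lieutenant selects from the vector of received values.

    `received` maps participant index -> value; absent participants
    contribute V_DEF. Ties are broken deterministically (smallest value)
    so all correct lieutenants agree on the outcome.
    """
    values = [received.get(p, V_DEF) for p in range(n)]
    counts = Counter(values)
    best = max(counts.values())
    return min(v for v, c in counts.items() if c == best)

votes = {0: "attack", 1: "attack", 2: "retreat"}  # participant 3 silent
print(majority(votes, n=4))
```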

Other features:
— Coding: Ad hoc binary coding of fixed fields in TPDUs, with TLV encoding of optional fields (‘parameters’) (Table 8.3).
— Addressing: Hierarchical addressing. T-address formed by concatenating T-selector onto N-address.
— Fault tolerance: Loss or duplication of data (DT TPDUs) or acknowledgments (AK TPDUs).

Whereas the ISO Class 0 protocol provides minimal functionality, and is therefore only suitable for use when the underlying network is comparatively reliable, the Class 4 protocol is designed to be resilient to a large range of potential disasters, including the arrival of spurious PDUs, PDU loss and PDU corruption. To ensure this degree of fault tolerance, the protocol uses a large number of timers, whose identifications and functions are summarised in Table 9.3. The persistence timer is only conceptual, as R is equal to T1 · (N − 1), where N is the maximum number of attempts to retransmit a PDU, as illustrated in Figure 4.9. (Network restart is assumed to give rise to N-DISCONNECT.ind on all connections.)
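The persistence-timer relationship quoted above is plain arithmetic; a tiny sketch with hypothetical parameter values:

```python
def persistence_time(t1, n):
    """R = T1 * (N - 1): the conceptual persistence timer, where T1 is
    the retransmission timeout and N the maximum number of transmission
    attempts -- i.e. the total time spent retrying before giving up."""
    return t1 * (n - 1)

# e.g. a 2-second retransmission timer and at most 4 attempts
print(persistence_time(t1=2.0, n=4))  # 6.0 seconds
```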

pages: 371 words: 78,103

Webbots, Spiders, and Screen Scrapers by Michael Schrenk

Amazon Web Services, corporate governance, fault tolerance, Firefox, Marc Andreessen, new economy, pre–internet, SpamAssassin, The Hackers Conference, Turing test, web application

It's better to avoid these issues by designing fault-tolerant webbots that anticipate changes in the websites they target. Fault tolerance does not mean that everything will always work perfectly. Sometimes changes in a targeted website confuse even the most fault-tolerant webbot. In these cases, the proper thing for a webbot to do is to abort its task and report an error to its owner. Essentially, you want your webbot to fail in the same manner a person using a browser might fail. For example, if a webbot is buying an airline ticket, it should not proceed with a purchase if a seat is not available on a desired flight. This action sounds silly, but it is exactly what a poorly programmed webbot may do if it is expecting an available seat and has no provision to act otherwise.

Chapter 25. WRITING FAULT-TOLERANT WEBBOTS

The biggest complaint users have about webbots is their unreliability: Your webbots will suddenly and inexplicably fail if they are not fault tolerant, or able to adapt to the changing conditions of your target websites. This chapter is devoted to helping you write webbots that are tolerant to network outages and unexpected changes in the web pages you target. Webbots that don't adapt to their changing environments are worse than nonfunctional ones because, when presented with the unexpected, they may perform in odd and unpredictable ways. For example, a non-fault-tolerant webbot may not notice that a form has changed and will continue to emulate the nonexistent form.

Types of Webbot Fault Tolerance

For a webbot, fault tolerance involves adapting to changes to URLs, HTML content (which affect parsing), forms, cookie use, and network outages and congestion. We'll examine each of these aspects of fault tolerance in the following sections.

Adapting to Changes in URLs

Possibly the most important type of webbot fault tolerance is URL tolerance, or a webbot's ability to make valid requests for web pages under changing conditions. URL tolerance ensures that your webbot does the following:
• Download pages that are available on the target site
• Follow header redirections to updated pages
• Use referer values to indicate that you followed a link from a page that is still on the website

Avoid Making Requests for Pages That Don't Exist

Before you determine that your webbot downloaded a valid web page, you should verify that you made a valid request.
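The URL-tolerance checks above can be sketched as a validation step that runs before any parsing (a hypothetical helper, not code from the book): reject error statuses, and notice when a redirect has moved the page so the webbot can update its stored URLs instead of re-requesting stale ones.

```python
def validate_fetch(status_code, requested_url, final_url):
    """Decide whether a fetched page is safe to parse.

    Returns (ok, note). ok is False for HTTP error statuses; the note
    records when a redirect delivered a different URL than requested,
    which is the webbot's cue to update the address it has on file.
    """
    if status_code >= 400:
        return False, f"request failed with status {status_code}"
    if final_url != requested_url:
        return True, f"redirected to {final_url}; update stored URL"
    return True, "ok"

print(validate_fetch(404, "http://example.com/old", "http://example.com/old"))
print(validate_fetch(200, "http://example.com/old", "http://example.com/new"))
```

Failing fast here is the "abort and report an error" behavior recommended earlier, rather than parsing a page that was never valid.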

Programming Erlang: Software for a Concurrent World by Joe Armstrong (Pragmatic Bookshelf, 2007)

Debian, en.wikipedia.org, fault tolerance, finite state, full text search, RFC: Request For Comment

Joe Asks... How Can We Make a Fault-Tolerant System?

To make something fault tolerant, we need at least two computers. One computer does the job, and another computer watches the first computer and must be ready to take over at a moment’s notice if the first computer fails. This is exactly how error recovery works in Erlang. One process does the job, and another process watches the first process and takes over if things go wrong. That’s why we need to monitor processes and to know why things fail. The examples in this chapter show you how to do this. In distributed Erlang, the process that does the job and the processes that monitor the process that does the job can be placed on physically different machines. Using this technique, we can start designing fault-tolerant software. This pattern is common.
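The two-computer pattern can be sketched at a much smaller scale (plain Python try/except standing in for Erlang's process monitors; the callables stand in for the two machines): the watcher learns why the worker failed and takes over the job.

```python
def run_with_watcher(primary, backup, job):
    """One node does the job; the watcher takes over if it fails.

    `primary` and `backup` are callables standing in for two machines.
    The 'monitor' here is simply catching the crash, whereas Erlang
    delivers a message explaining why the monitored process died.
    """
    try:
        return primary(job), "primary"
    except Exception as crash:
        print(f"primary failed ({crash}); backup taking over")
        return backup(job), "backup"

def healthy(job):
    return f"{job} done"

def crashing(job):
    raise RuntimeError("machine down")

print(run_with_watcher(crashing, healthy, "payroll"))
```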

Using behaviors, we can concentrate on the functional behavior of a component, while allowing the behavior framework to solve the nonfunctional aspects of the problem. The framework might, for example, take care of making the application fault tolerant or scalable, whereas the behavioral callback concentrates on the specific aspects of the problem. The chapter starts with a general discussion on how to build your own behaviors and then moves to describing the gen_server behavior that is part of the Erlang standard libraries.

Road Map

• Chapter 17, Mnesia: The Erlang Database, on page 313 talks about the Erlang database management system (DBMS) Mnesia. Mnesia is an integrated DBMS with extremely fast, soft, real-time response times. It can be configured to replicate its data over several physically separated nodes to provide fault-tolerant operation.
• Chapter 18, Making a System with OTP, on page 335 is the second of the OTP chapters.

Not only were the programs different, but the whole approach to programming was different. The author kept on and on about concurrency and distribution and fault tolerance and about a method of programming called concurrency-oriented programming—whatever that might mean. But some of the examples looked like fun. That evening the programmer looked at the example chat program. It was pretty small and easy to understand, even if the syntax was a bit strange. Surely it couldn’t be that easy. The basic program was simple, and with a few more lines of code, file sharing and encrypted conversations became possible. The programmer started typing.... What’s This All About? It’s about concurrency. It’s about distribution. It’s about fault tolerance. It’s about functional programming. It’s about programming a distributed concurrent system without locks and mutexes but using only pure message passing.

pages: 680 words: 157,865

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design by Diomidis Spinellis, Georgios Gousios

Albert Einstein, barriers to entry, business intelligence, business process, call centre, continuous integration, corporate governance, database schema, Debian, domain-specific language, don't repeat yourself, Donald Knuth, en.wikipedia.org, fault tolerance, Firefox, general-purpose programming language, iterative process, linked data, locality of reference, loose coupling, meta analysis, meta-analysis, MVC pattern, peer-to-peer, premature optimization, recommendation engine, Richard Stallman, Ruby on Rails, semantic web, smart cities, social graph, social web, SPARQL, Steve Jobs, Stewart Brand, traveling salesman, Turing complete, type inference, web application, zero-coupon bond

Guardian is the operating system for Tandem’s fault-tolerant “NonStop” series of computers. It was designed in parallel with the hardware to provide fault tolerance with minimal overhead cost. This chapter describes the original Tandem machine, designed between 1974 and 1976 and shipped between 1976 and 1982. It was originally called “Tandem/16,” but after the introduction of its successor, “NonStop II,” it was retrospectively renamed “NonStop I.” Tandem frequently used the term “T/16” both for the system and later for the architecture. I worked with Tandem hardware full-time from 1977 until 1991. Working with the Tandem machine was both exhilarating and unusual. In this chapter, I’d like to bring back to life some of the feeling that programmers had about the machine. The T/16 was a fault-tolerant machine, but that wasn’t its only characteristic, and in this discussion I mention many aspects that don’t directly contribute to fault tolerance—in fact, a couple detract from it!

Clients will connect to one of these servers to interact with the abstract representation of the world held by the server. Figure 3-1. Project Darkstar high-level architecture Unlike most replication schemes, the different copies of the game logic are not meant to process the same events. Instead, each copy can independently interact with the clients. Replication in this design is used primarily to allow scale rather than to ensure fault tolerance (although, as we will see later, fault tolerance is also achieved). Further, the game logic itself does not know or need to know that there are other copies of the server operating on other machines. The code written by the game programmer runs as if it were on a single machine, with coordination of the different copies done by the Project Darkstar infrastructure. Indeed, it is possible to run a Darkstar-based game on a single server if that is all the capacity the game needs.

Still, there are several workstations in every studio, and each workstation has many threads, so it pays to be careful. NIO image transfer Obviously, that leaves the problem of getting the images from the client to the server. One option we considered and rejected early was CIFS—Windows shared drives. Our main concern here was fault-tolerance, but transfer speed also worried us. These machines needed to move a lot of data back and forth, while photographers and customers were sitting around waiting. In our matrix of off-the-shelf options, nothing had the right mix of speed, parallelism, fault-tolerance, and information hiding. Reluctantly, we decided to build our own file transfer protocol, which led us into one of the most complex areas of Creation Center. Image transfer became a severe trial, but we emerged, at last, with one of the most robust features of the whole system.

Mastering Blockchain, Second Edition by Imran Bashir

3D printing, altcoin, augmented reality, autonomous vehicles, bitcoin, blockchain, business process, carbon footprint, centralized clearinghouse, cloud computing, connected car, cryptocurrency, data acquisition, Debian, disintermediation, disruptive innovation, distributed ledger, domain-specific language, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, Firefox, full stack developer, general-purpose programming language, gravity well, interest rate swap, Internet of things, litecoin, loose coupling, MITM: man-in-the-middle, MVC pattern, Network effects, new economy, node package manager, Oculus Rift, peer-to-peer, platform as a service, prediction markets, QR code, RAND corporation, Real Time Gross Settlement, reversible computing, RFC: Request For Comment, RFID, ride hailing / ride sharing, Satoshi Nakamoto, single page application, smart cities, smart contracts, smart grid, smart meter, supply-chain management, transaction costs, Turing complete, Turing machine, web application, x509 certificate

Consensus is pluggable, and currently there are two types of ordering service available in Hyperledger Fabric:
SOLO: A basic ordering service intended for development and testing purposes.
Kafka: An ordering service built on Apache Kafka. It should be noted that Kafka currently provides crash fault tolerance but not Byzantine fault tolerance. This is acceptable in a permissioned network, where the chance of malicious actors is negligible.
In addition to these mechanisms, a Simple Byzantine Fault Tolerance (SBFT)-based mechanism is also under development, which will become available in later releases of Hyperledger Fabric. Distributed ledger: Blockchain and world state are the two main elements of the distributed ledger. Blockchain is simply a cryptographically linked list of blocks (as introduced in Chapter 1, Blockchain 101), and world state is a key-value database.

The following describes these requirements:
Agreement: All honest nodes decide on the same value.
Termination: All honest nodes terminate execution of the consensus process and eventually reach a decision.
Validity: The value agreed upon by all honest nodes must be the same as the initial value proposed by at least one honest node.
Fault tolerance: The consensus algorithm should be able to run in the presence of faulty or malicious nodes (Byzantine nodes).
Integrity: No node can make a decision more than once in a single consensus cycle.
Types of consensus mechanisms: All consensus mechanisms are developed to deal with faults in a distributed system and to allow distributed systems to reach a final state of agreement. There are two general categories of consensus mechanisms, which between them deal with all types of faults (fail-stop or arbitrary). These common types of consensus mechanisms are as follows: Traditional Byzantine Fault Tolerance (BFT)-based: with no compute-intensive operations, such as partial hash inversion (as in Bitcoin PoW), this method relies on a simple scheme of nodes exchanging signed messages.
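The agreement and fault-tolerance requirements translate into concrete quorum arithmetic. A minimal sketch, assuming the classic BFT bound that n nodes tolerate at most f Byzantine nodes when n ≥ 3f + 1 (the function names are illustrative, not from any particular protocol):

```python
def max_byzantine_faults(n):
    """Classic bound: n nodes can tolerate f Byzantine nodes only if n >= 3f + 1."""
    return (n - 1) // 3

def decide(votes, n):
    """Accept a value only if at least 2f + 1 nodes report it, so honest
    reporters alone form a majority among the reporters."""
    f = max_byzantine_faults(n)
    for value in set(votes):
        if votes.count(value) >= 2 * f + 1:
            return value
    return None  # no quorum yet: keep running the protocol

# Four generals tolerate one traitor: three honest votes outweigh one lie.
outcome = decide(["attack", "attack", "attack", "retreat"], n=4)
```

This is only the counting step; a real BFT protocol also needs the signed-message exchange rounds the excerpt mentions.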

In that case, if the node accepts the update, then only that one node in the network is updated and therefore consistency is lost. Now, if the update is rejected by the node, that would result in loss of availability. In that case due to partition tolerance, both availability and consistency are unachievable. This is strange because somehow blockchain manages to achieve all of these properties—or does it? This will be explained shortly. To achieve fault tolerance, replication is used. This is a standard and widely-used method to achieve fault tolerance. Consistency is achieved using consensus algorithms in order to ensure that all nodes have the same copy of the data. This is also called state machine replication. The blockchain is a means for achieving state machine replication. In general, there are two types of faults that a node can experience. Both of these types fall under the broader category of faults that can occur in a distributed system: Fail-stop fault: This type of fault occurs when a node merely has crashed.
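State machine replication rests on a simple invariant: deterministic replicas that apply the same command log in the same order end in the same state. A hedged Python sketch (the set/del command format is invented purely for illustration):

```python
def apply_log(log):
    """Replay an ordered command log on a fresh state machine.
    Deterministic commands + identical order => identical replica state."""
    state = {}
    for op, key, value in log:
        if op == "set":
            state[key] = value
        elif op == "del":
            state.pop(key, None)
    return state

log = [("set", "balance", 100), ("set", "owner", "alice"), ("del", "owner", None)]
replica_a = apply_log(log)
replica_b = apply_log(log)   # a recovered node replays the same log
```

The hard part, which consensus algorithms solve, is getting every replica to agree on that single log order in the first place.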

pages: 931 words: 79,142

Concepts, Techniques, and Models of Computer Programming by Peter Van-Roy, Seif Haridi

computer age, Debian, discrete time, Donald Knuth, Eratosthenes, fault tolerance, G4S, general-purpose programming language, George Santayana, John von Neumann, Lao Tzu, Menlo Park, natural language processing, NP-complete, Paul Graham, premature optimization, sorting algorithm, Therac-25, Turing complete, Turing machine, type inference

This means that the called routine has to be carefully written so that its mess is always limited in extent. The routine can be inside a transaction. This solution is harder to implement, but can make the program much simpler. Raising an exception corresponds to aborting the transaction. A third motivation is fault tolerance. Lightweight transactions are important for writing fault-tolerant applications. With respect to a component, e.g., an application doing a transaction, we define a fault as incorrect behavior in one of its subcomponents. Ideally, the application should continue to behave correctly when there are faults, i.e., it should be fault tolerant. When a fault occurs, a fault-tolerant application has to take three steps: (1) detect the fault, (2) contain the fault in a limited part of the application, and (3) repair any problems caused by the fault. Lightweight transactions are a good mechanism for fault confinement.
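The detect/contain/repair steps map naturally onto a lightweight transaction that aborts on an exception. A minimal Python sketch (not Oz code; `run_transaction` and the overdraw body are illustrative):

```python
import copy

def run_transaction(state, body):
    """All-or-nothing update: mutate a private copy, commit only on success.
    Raising an exception corresponds to aborting the transaction, so any
    mess the body makes stays confined to the copy."""
    working = copy.deepcopy(state)
    try:
        body(working)
    except Exception:
        return state, False      # abort: caller keeps the old state
    return working, True         # commit: publish the new state

def overdraw(accounts):
    accounts["alice"] -= 150     # leaves a mess...
    if accounts["alice"] < 0:
        raise ValueError("insufficient funds")   # ...then faults

accounts = {"alice": 100, "bob": 0}
accounts, committed = run_transaction(accounts, overdraw)
```

The fault is detected (the exception), contained (only the working copy was touched), and repaired (the caller keeps the pre-transaction state).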

One way around this problem is to provide multiple server objects to allow serving multiple clients simultaneously.
11.8.4 Active fault tolerance
Applications sometimes need active fault tolerance, i.e., part of the application is replicated on several processes and a replication algorithm is used to keep the parts coherent with each other. Building abstractions to provide this is an active research topic. For example, in Mozart we have built a replicated transactional object store, called GlobalStore [147]. This keeps copies of a set of objects on several processes and gives access to them through a transactional protocol. The copies are kept coherent through the protocol. As long as at least one process is alive, the GlobalStore will survive. Because of the failure detection provided by the Fault module, the Mozart system lets the GlobalStore and other fault-tolerant abstractions be written completely in Oz without recompiling the system.

If one of the replicas fails, the other replica detects this, starts a new second replica using the Remote module, and informs the client. For this exercise, write an abstraction for a replicated server that hides all the fault-handling activities from the clients.
5. (advanced exercise) Fault tolerance and synchronous communication. Section 11.10 says that synchronous communication makes fault confinement easier. Section 5.7 says that asynchronous communication helps keep concurrent components independent, which is important when building fault tolerance abstractions. For this exercise, reconcile these two principles by studying the architecture of fault-tolerant applications.

pages: 496 words: 70,263

Erlang Programming by Francesco Cesarini

cloud computing, fault tolerance, finite state, loose coupling, revision control, RFC: Request For Comment, sorting algorithm, Turing test, type inference, web application

This should record enough information to enable billing for the use of the phone.
Whatever the programming language, building distributed, fault-tolerant, and scalable systems with requirements for high availability is not for the faint of heart. Erlang's reputation for handling the fault-tolerant and high-availability aspects of these systems has its foundations in the simple but powerful constructs built into the language's concurrency model. These constructs allow processes to monitor each other's behavior and to recover from software faults. They give Erlang a competitive advantage over other programming languages, as they facilitate development of the complex architecture that provides the required fault tolerance through isolating errors and ensuring nonstop operation. Attempts to develop similar frameworks in other languages have either failed or hit a major complexity barrier due to the lack of the very constructs described in this chapter.

Only then was the language deemed mature enough to use in major projects with hundreds of developers, including Ericsson’s broadband, GPRS, and ATM switching solutions. In conjunction with these projects, the OTP framework was developed and released in 1996. OTP provides a framework to structure Erlang systems, offering robustness and fault tolerance together with a set of tools and libraries. The history of Erlang is important in understanding its philosophy. Although many languages were developed before finding their niche, Erlang was developed to solve the “time-to-market” requirements of distributed, fault-tolerant, massively concurrent, soft real-time systems. The fact that web services, retail and commercial banking, computer telephony, messaging systems, and enterprise integration, to mention but a few, happen to share the same requirements as telecom systems explains why Erlang is gaining headway in these sectors.

A typical example here is a web server: if you are planning a new release of a piece of software, or you are planning to stream video of a football match in real time, distributing the server across a number of machines will make this possible without failure. This performance is given by replication of a service—in this case a web server—which is often found in the architecture of a distributed system.
• Replication also provides fault tolerance: if one of the replicated web servers fails or becomes unavailable for some reason, HTTP requests can still be served by the other servers, albeit at a slower rate. This fault tolerance allows the system to be more robust and reliable.
• Distribution allows transparent access to remote resources, and building on this, it is possible to federate a collection of different systems to provide an overall user service. Such a collection of facilities is provided by modern e-commerce systems, such as the Amazon.com website.
• Finally, distributed system architecture makes a system extensible, with other services becoming available through remote access.
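The failover behavior described here, where a failed replica reduces capacity rather than availability, can be sketched as a client that simply tries the next server. This is an illustrative Python stand-in (the server callables model HTTP requests; no real networking is involved):

```python
def fetch_with_failover(replicas, request):
    """Try each replicated server in turn: one dead replica slows the
    service down, but requests can still be served by the others."""
    errors = []
    for server in replicas:
        try:
            return server(request)
        except ConnectionError as exc:
            errors.append(repr(exc))
    raise ConnectionError(f"all {len(replicas)} replicas failed: {errors}")

def dead_server(_request):
    raise ConnectionError("server unavailable")

def live_server(request):
    return f"200 OK: {request}"

reply = fetch_with_failover([dead_server, live_server], "/match/live")
```

Real deployments usually put this retry logic in a load balancer rather than in each client, but the principle is the same.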

pages: 570 words: 115,722

The Tangled Web: A Guide to Securing Modern Web Applications by Michal Zalewski

barriers to entry, business process, defense in depth, easy for humans, difficult for computers, fault tolerance, finite state, Firefox, Google Chrome, information retrieval, RFC: Request For Comment, semantic web, Steve Jobs, telemarketer, Turing test, Vannevar Bush, web application, WebRTC, WebSocket

Vendors released their products with embedded programming languages such as JavaScript and Visual Basic, plug-ins to execute platform-independent Java or Flash applets on the user’s machine, and useful but tricky HTTP extensions such as cookies. Only a limited degree of superficial compatibility, sometimes hindered by patents and trademarks,[7] would be maintained. As the Web grew larger and more diverse, a sneaky disease spread across browser engines under the guise of fault tolerance. At first, the reasoning seemed to make perfect sense: If browser A could display a poorly designed, broken page but browser B refused to (for any reason), users would inevitably see browser B’s failure as a bug in that product and flock in droves to the seemingly more capable client, browser A. To make sure that their browsers could display almost any web page correctly, engineers developed increasingly complicated and undocumented heuristics designed to second-guess the intent of sloppy webmasters, often sacrificing security and occasionally even compatibility in the process.

In several scenarios outlined in that RFC, the desire to explicitly mandate the handling of certain corner cases led to patently absurd outcomes. One such example is the advice on parsing dates in certain HTTP headers, at the request of section 3.3 in RFC 1945. The resulting implementation (the prtime.c file in the Firefox codebase[118]) consists of close to 2,000 lines of extremely confusing and unreadable C code just to decipher the specified date, time, and time zone in a sufficiently fault-tolerant way (for uses such as deciding cache content expiration). Semicolon-Delimited Header Values Several HTTP headers, such as Cache-Control or Content-Disposition, use a semicolon-delimited syntax to cram several separate name=value pairs into a single line. The reason for allowing this nested notation is unclear, but it is probably driven by the belief that it will be a more efficient or a more intuitive approach than using several separate headers that would always have to go hand in hand.
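A strict parser for such semicolon-delimited values, one that rejects malformed parameters instead of second-guessing them, can be small. A hedged sketch (it deliberately ignores quoting and the other real-world corner cases these headers are notorious for):

```python
def parse_semicolon_header(value):
    """Split a value like 'attachment; filename=report.pdf' into the main
    token plus name=value parameters. Deliberately strict: a malformed
    parameter raises instead of being second-guessed."""
    parts = [p.strip() for p in value.split(";")]
    main, params = parts[0], {}
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        if not sep or not name:
            raise ValueError(f"malformed parameter: {part!r}")
        params[name.strip().lower()] = val.strip()
    return main, params

main, params = parse_semicolon_header("attachment; filename=report.pdf")
```

Production code should use a vetted library parser; the point of the sketch is the fail-fast posture, not completeness.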

In general, be mindful of control and high-bit characters, commas, quotes, backslashes, and semicolons; other characters or strings may be of concern on a case-by-case basis. Escape or substitute these values as appropriate. When building a new HTTP client, server, or proxy: Do not create a new implementation unless you absolutely have to. If you can’t help it, read this chapter thoroughly and aim to mimic an existing mainstream implementation closely. If possible, ignore the RFC-provided advice about fault tolerance and bail out if you encounter any syntax ambiguities. * * * [24] Public key cryptography relies on asymmetrical encryption algorithms to create a pair of keys: a private one, kept secret by the owner and required to decrypt messages, and a public one, broadcast to the world and useful only to encrypt traffic to that recipient, not to decrypt it. Chapter 4. Hypertext Markup Language The Hypertext Markup Language (HTML) is the primary method of authoring online documents.

pages: 1,085 words: 219,144

Solr in Action by Trey Grainger, Timothy Potter

business intelligence, cloud computing, commoditize, conceptual framework, crowdsourcing, data acquisition, en.wikipedia.org, failed state, fault tolerance, finite state, full text search, glass ceiling, information retrieval, natural language processing, openstreetmap, performance metric, premature optimization, recommendation engine, web application

At this point, you’ve seen that Solr has a modern, well-designed architecture that’s scalable and fault-tolerant. Although these are important aspects to consider if you’ve already decided to use Solr, you still might not be convinced that Solr is the right choice for your needs. In the next section, we describe the benefits of Solr from the perspective of different stakeholders, such as the software architect, system administrator, and CEO. 1.3. Why Solr? In this section, we provide key information to help you decide if Solr is the right technology for your organization. Let’s begin by addressing why Solr is attractive to software architects. 1.3.1. Solr for the software architect When evaluating new technology, software architects must consider a number of factors including stability, scalability, and fault tolerance. Solr scores high marks in all three categories.

We won’t advise you either way on whether this is acceptable for your organization. We only point this out because it’s a testament to the depth and breadth of automated testing in Lucene and Solr. If you have a nightly build off trunk in which all the automated tests pass, then you can be fairly confident that the core functionality is solid. We’ve touched on Solr’s approach to scalability and fault tolerance in sections 1.2.6 and 1.2.7. As an architect, you’re probably most curious about the limitations of Solr’s approach to scalability and fault tolerance. First, you should realize that the sharding and replication features in Solr have been improved in Solr 4 to be robust and easier to manage. The new approach to scaling is called SolrCloud. Under the covers, SolrCloud uses Apache ZooKeeper to distribute configurations across a cluster of Solr servers and to keep track of cluster state.

You may want to use replication either when you want to isolate indexing from searching operations to different servers within your cluster or when you need to increase available queries-per-second capacity. Fault tolerance It’s great that we can increase our overall query capacity by adding another server and replicating the index to that server, but what happens when one of our servers eventually crashes? When our application had only one server, the application clearly would have stopped. Now that multiple, redundant servers exist, one server dying will simply reduce our capacity back to the capacity of however many servers remain. If you want to build fault tolerance into your system, it’s a good idea to have additional resources (extra slave servers) in your cluster so that your system can continue functioning with enough capacity even if a single server fails.

pages: 161 words: 44,488

The Business Blockchain: Promise, Practice, and Application of the Next Internet Technology by William Mougayar

Airbnb, airport security, Albert Einstein, altcoin, Amazon Web Services, bitcoin, Black Swan, blockchain, business process, centralized clearinghouse, Clayton Christensen, cloud computing, cryptocurrency, disintermediation, distributed ledger, Edward Snowden, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, fixed income, global value chain, Innovator's Dilemma, Internet of things, Kevin Kelly, Kickstarter, market clearing, Network effects, new economy, peer-to-peer, peer-to-peer lending, prediction markets, pull request, QR code, ride hailing / ride sharing, Satoshi Nakamoto, sharing economy, smart contracts, social web, software as a service, too big to fail, Turing complete, web application

Game Theory: Analysis of Conflict, Harvard University Press.
5. Leslie Lamport, Robert Shostak, and Marshall Pease, The Byzantine Generals Problem. http://research.microsoft.com/en-us/um/people/lamport/pubs/byz.pdf.
6. IT Doesn't Matter, https://hbr.org/2003/05/it-doesnt-matter.
7. PayPal website, https://www.paypal.com/webapps/mpp/about.
8. Personal communication with Vitalik Buterin, February 2016.
9. Byzantine fault tolerance, https://en.wikipedia.org/wiki/Byzantine_fault_tolerance.
10. Proof-of-stake, https://en.wikipedia.org/wiki/Proof-of-stake.
2 HOW BLOCKCHAIN TRUST INFILTRATES
"I cannot understand why people are frightened of new ideas. I'm frightened of the old ones." –JOHN CAGE
REACHING CONSENSUS is at the heart of a blockchain's operations. But the blockchain does it in a decentralized way that breaks the old paradigm of centralized consensus, when one central database used to rule transaction validity.

In part, the continuation of some of the trends in crypto 2.0, and particularly generalized protocols that provide both computational abstraction and privacy. But equally important is the current technological elephant in the room in the blockchain sphere: scalability. Currently, all existing blockchain protocols have the property that every computer in the network must process every transaction—a property that provides extreme degrees of fault tolerance and security, but at the cost of ensuring that the network's processing power is effectively bounded by the processing power of a single node. Crypto 3.0—at least in my mind—consists of approaches that move beyond this limitation, in one of various ways to create systems that break through this limitation and actually achieve the scale needed to support mainstream adoption (technically astute readers may have heard of “lightning networks,” “state channels,” and “sharding”).

Game theory is "the study of mathematical models of conflict and cooperation between intelligent rational decision-makers."4 And this is related to the blockchain because the Bitcoin blockchain, originally conceived by Satoshi Nakamoto, had to solve a known game theory conundrum called the Byzantine Generals Problem.5 Solving that problem consists in mitigating any attempts by a small number of unethical Generals who would otherwise become traitors, and lie about coordinating their attack to guarantee victory. This is accomplished by enforcing a process for verifying the work that was put into crafting these messages, and time-limiting the requirement for seeing untampered messages in order to ensure their validity. Implementing a “Byzantine Fault Tolerance” is important because it starts with the assumption that you cannot trust anyone, and yet it delivers assurance that the transaction has traveled and arrived safely based on trusting the network during its journey, while surviving potential attacks. There are fundamental implications for this new method of reaching safety in the finality of a transaction, because it questions the existence and roles of current trusted intermediaries, who held the traditional authority on validating transactions.

Scala in Action by Nilanjan Raychaudhuri

continuous integration, create, read, update, delete, database schema, domain-specific language, don't repeat yourself, en.wikipedia.org, failed state, fault tolerance, general-purpose programming language, index card, MVC pattern, type inference, web application

When Alan Kay[7] first thought about OOP, his big idea was “message passing.”[8] In fact, working with actors is more object-oriented than you think. 7 Alan Curtis Kay, http://en.wikipedia.org/wiki/Alan_Kay. 8 Alan Kay, “Prototypes vs. classes was: Re: Sun’s HotSpot,” Oct 10, 1998, http://mng.bz/L12u. What happens if something fails? So many things can go wrong in the concurrent/parallel programming world. What if we get an IOException while reading the file? Let’s learn how to handle faults in an actor-based application. 9.3.4. Fault tolerance made easy with a supervisor Akka encourages nondefensive programming in which failure is a valid state in the lifecycle of an application. As a programmer you know you can’t prevent every error, so it’s better to prepare your application for the errors. You can easily do this through fault-tolerance support provided by Akka through the supervisor hierarchy. Think of this supervisor as an actor that links to supervised actors and restarts them when one dies. The responsibility of a supervisor is to start, stop, and monitor child actors.

Figure 9.6 shows an example of supervisor hierarchy. Figure 9.6. Supervisor hierarchy in Akka You aren’t limited to one supervisor. You can have one supervisor linked to another supervisor. That way you can supervise a supervisor in case of a crash. It’s hard to build a fault-tolerant system with one box, so I recommend having your supervisor hierarchy spread across multiple machines. That way, if a node (machine) is down, you can restart an actor in a different box. Always remember to delegate the work so that if a crash occurs, another supervisor can recover. Now let’s look into the fault-tolerant strategies available in Akka. Supervision Strategies in Akka Akka comes with two restarting strategies: One-for-One and All-for-One. In the One-for-One strategy (see figure 9.7), if one actor dies, it’s recreated. This is a great strategy if actors are independent in the system.
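The difference between the two restart strategies can be shown with a toy supervisor. This is Python, not Akka; the `ToySupervisor` class is purely illustrative of the One-for-One versus All-for-One distinction:

```python
class ToySupervisor:
    """One-for-One restarts only the failed child; All-for-One restarts
    every child whenever any one of them fails."""
    def __init__(self, factories, strategy="one_for_one"):
        self.factories = factories                      # name -> constructor
        self.strategy = strategy
        self.children = {n: f() for n, f in factories.items()}
        self.restarted = []                             # restart audit trail

    def on_failure(self, failed_name):
        names = (list(self.factories) if self.strategy == "all_for_one"
                 else [failed_name])
        for name in names:
            self.children[name] = self.factories[name]()  # fresh child
            self.restarted.append(name)

one = ToySupervisor({"a": dict, "b": dict}, strategy="one_for_one")
one.on_failure("a")
all_ = ToySupervisor({"a": dict, "b": dict}, strategy="all_for_one")
all_.on_failure("a")
```

All-for-One is the right choice when the children share state that a single crash may have corrupted; One-for-One fits independent actors.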

First I’ll talk about the philosophy behind Akka so you understand the goal behind the Akka project and the problems it tries to solve. 12.1. The philosophy behind Akka The philosophy behind Akka is simple: make it easier for developers to build correct, concurrent, scalable, and fault-tolerant applications. To that end, Akka provides a higher level of abstractions to deal with concurrency, scalability, and faults. Figure 12.1 shows the three core modules provided by Akka for concurrency, scalability, and fault tolerance. Figure 12.1. Akka core modules The concurrency module provides options to solve concurrency-related problems. By now I’m sure you’re comfortable with actors (message-oriented concurrency). But actors aren’t a be-all-end-all solution for concurrency. You need to understand alternative concurrency models available in Akka, and in the next section you’ll explore all of them.

pages: 250 words: 73,574

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers by John MacCormick, Chris Bishop

Ada Lovelace, AltaVista, Claude Shannon: information theory, fault tolerance, information retrieval, Menlo Park, PageRank, pattern recognition, Richard Feynman, Silicon Valley, Simon Singh, sorting algorithm, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, traveling salesman, Turing machine, Turing test, Vannevar Bush

At the time of writing, however, many of the systems that claim to be peer-to-peer in fact use central servers for some of their functionality and thus do not need to rely on distributed hash tables. The technique of “Byzantine fault tolerance” falls in the same category: a surprising and beautiful algorithm that can't yet be classed as great, due to lack of adoption. Byzantine fault tolerance allows certain computer systems to tolerate any type of error whatsoever (as long as there are not too many simultaneous errors). This contrasts with the more usual notion of fault tolerance, in which a system can survive more benign errors, such as the permanent failure of a disk drive or an operating system crash. CAN GREAT ALGORITHMS FADE AWAY? In addition to speculating about what algorithms might rise to greatness in the future, we might wonder whether any of our current “great” algorithms—indispensable tools that we use constantly without even thinking about it—might fade in importance.

This immense level of concurrency, together with rapid query responses via the virtual table trick, make large databases efficient. The to-do list trick also guarantees consistency in the face of failures. When combined with the prepare-then-commit trick for replicated databases, we are left with iron-clad consistency and durability for our data. The heroic triumph of databases over unreliable components, known by computer scientists as “fault-tolerance,” is the work of many researchers over many decades. But among the most important contributors was Jim Gray, a superb computer scientist who literally wrote the book on transaction processing. (The book is Transaction Processing: Concepts and Techniques, first published in 1992.) Sadly, Gray's career ended early: one day in 2007, he sailed his yacht out of San Francisco Bay, under the Golden Gate Bridge, and into the open ocean on a planned day trip to some nearby islands.
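The "prepare-then-commit trick" mentioned here is two-phase commit: no replica applies a change until every replica has promised it can. A simplified sketch (the `Replica` class is invented for illustration; real systems must also survive coordinator failure, which this sketch ignores):

```python
class Replica:
    def __init__(self, healthy=True):
        self.healthy, self.committed = healthy, []

    def prepare(self, txn):
        return self.healthy            # "can you promise to apply this?"

    def commit(self, txn):
        self.committed.append(txn)

    def abort(self, txn):
        pass                           # nothing was applied, nothing to undo

def two_phase_commit(replicas, txn):
    """Phase 1: every replica must vote yes. Phase 2: only then does
    anyone apply the change, so replicas never diverge."""
    if all(r.prepare(txn) for r in replicas):
        for r in replicas:
            r.commit(txn)
        return True
    for r in replicas:
        r.abort(txn)
    return False

a, b = Replica(), Replica()
ok = two_phase_commit([a, b], "debit $5")
bad = Replica(healthy=False)
ok2 = two_phase_commit([a, bad], "credit $5")
```

Either every replica records the transaction or none does, which is exactly the consistency-plus-durability guarantee the paragraph describes.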


pages: 305 words: 89,103

Scarcity: The True Cost of Not Having Enough by Sendhil Mullainathan

American Society of Civil Engineers: Report Card, Andrei Shleifer, Cass Sunstein, clean water, computer vision, delayed gratification, double entry bookkeeping, Exxon Valdez, fault tolerance, happiness index / gross national happiness, impulse control, indoor plumbing, inventory management, knowledge worker, late fees, linear programming, mental accounting, microcredit, p-value, payday loans, purchasing power parity, randomized controlled trial, Report Card for America’s Infrastructure, Richard Thaler, Saturday Night Live, Walter Mischel, Yogi Berra

And much of it does not sit so well with being a student. Skipping class in a training program while you’re dealing with scarcity is not the same as playing hooky in middle school. Linear classes that must not be missed can work well for the full-time student; they do not make sense for the juggling poor. It is important to emphasize that fault tolerance is not a substitute for personal responsibility. On the contrary: fault tolerance is a way to ensure that when the poor do take it on themselves, they can improve—as so many do. Fault tolerance allows the opportunities people receive to match the effort they put in and the circumstances they face. It does not take away the need for hard work; rather, it allows hard work to yield better returns for those who are up for the challenge, just as improved levers in the cockpit allow the dedicated pilot to excel.

It has also occasionally led to programs with strong incentives, such as conditional cash transfer programs, where the amount of aid one receives depends on performing assorted “good” behaviors. But why not look at the design of the cockpit rather than the workings of the pilot? Why not look at the structure of the programs rather than the failings of the clients? If we accept that pilots can fail and that cockpits need to be wisely structured so as to inhibit those failures, why can we not do the same with the poor? Why not design programs structured to be more fault tolerant? We could ask the same question of anti-poverty programs. Consider the training programs, where absenteeism is common and dropout rates are high. What happens when, loaded and depleted, a client misses a class? What happens when her mind wanders in class? The next class becomes a lot harder. Miss one or two more classes and dropping out becomes the natural outcome, perhaps even the best option, as she really no longer understands much of what is being discussed in the class.

You’re exhausted and weighed down by things more proximal, and you know that even if you go you won’t absorb a thing. Now roll forward a few more weeks. By now you’ve missed another class. And when you go, you understand less than before. Eventually you decide it’s just too much right now; you’ll drop out and sign up another time, when your financial life is more together. The program you tried was not designed to be fault tolerant. It magnified your mistakes, which were predictable, and essentially pushed you out the door. But it need not be that way. Instead of insisting on no mistakes or for behavior to change, we can redesign the cockpit. Curricula can be altered, for example, so that there are modules, staggered to start at different times and to proceed in parallel. You missed a class and fell behind? Move to a parallel session running a week or two “behind” this one.

pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt

Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bioinformatics, computer vision, correlation does not imply causation, crowdsourcing, distributed generation, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize

You cannot reason about the efficiency of fault tolerance easily; everything is complicated. And note, efficiency is just as important as correctness, because a thousand computers are worth more than your salary. It's like this: the first 10 computers are easy; the first 100 computers are hard; and the first 1,000 computers are impossible. There's really no hope. Or at least there wasn't until about eight years ago. At Google now, David uses 10,000 computers regularly.

Enter MapReduce

In 2004 Jeff and Sanjay published their paper “MapReduce: Simplified Data Processing on Large Clusters” (and here's another one on the underlying filesystem). MapReduce allows us to stop thinking about fault tolerance; it is a platform that does the fault tolerance work for us. Programming 1,000 computers is now easier than programming 100.

If we denote by X the variable that exhibits whether a given computer is working, so X = 1 means it works and X = 0 means it's broken, then we can assume P(X = 0) = p for each computer. But this means, when we have 1,000 computers, the chance that no computer is broken is (1 − p)^1000, which is generally pretty small even if p is small. So if p = 0.001 for each individual computer, then the probability that all 1,000 computers work is 0.999^1000 ≈ 0.37, less than even odds. This isn't sufficiently robust. What to do? We address this problem by talking about fault tolerance for distributed work. This usually involves replicating the input (the default is to have three copies of everything), and making the different copies available to different machines, so if one blows, another one will still have the good data. We might also embed checksums in the data, so the data itself can be audited for errors, and we will automate monitoring by a controller machine (or maybe more than one?).
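The failure arithmetic above, and the payoff from three-way replication, is easy to check numerically (function names here are illustrative):

```python
def p_all_working(n, p_fail):
    # Probability that none of n independent machines has failed,
    # given each fails independently with probability p_fail.
    return (1 - p_fail) ** n


def p_block_survives(p_fail, replicas=3):
    # Probability that at least one of the replicas of a data block
    # is on a working machine: 1 minus "all replicas lost".
    return 1 - p_fail ** replicas


# With p_fail = 0.001, a 1,000-machine cluster is fully healthy only
# about 37% of the time, yet a triply replicated block is almost
# never lost (survival probability 1 - 10^-9).
```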

Programming 1,000 computers is now easier than programming 100. It’s a library to do fancy things. To use MapReduce, you write two functions: a mapper function, and then a reducer function. It takes these functions and runs them on many machines that are local to your stored data. All of the fault tolerance is automatically done for you once you’ve placed the algorithm into the map/reduce framework. The mapper takes each data point and produces an ordered pair of the form (key, value). The framework then sorts the outputs via the “shuffle,” and in particular finds all the keys that match and puts them together in a pile. Then it sends these piles to machines that process them using the reducer function. The reducer function’s outputs are of the form (key, new value), where the new value is some aggregate function of the old values.
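The mapper/shuffle/reducer flow described above can be mimicked on a single machine. This is a toy sketch of the programming model (word count, the classic example), not Google's distributed implementation — the real framework runs the same two user-supplied functions across many machines and handles the fault tolerance:

```python
from collections import defaultdict


def mapper(line):
    # Emit (key, value) pairs; for word count, (word, 1) per word.
    for word in line.split():
        yield (word, 1)


def reducer(key, values):
    # Produce (key, new value), where the new value aggregates the old ones.
    return (key, sum(values))


def map_reduce(data, mapper, reducer):
    # The "shuffle": group every mapper output by its key into piles.
    piles = defaultdict(list)
    for record in data:
        for key, value in mapper(record):
            piles[key].append(value)
    # Each pile would be shipped to a reducer machine; here we just loop.
    return dict(reducer(key, values) for key, values in piles.items())
```

For example, `map_reduce(["a b a", "b c"], mapper, reducer)` yields `{"a": 2, "b": 2, "c": 1}`.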

pages: 589 words: 147,053

The Age of Em: Work, Love and Life When Robots Rule the Earth by Robin Hanson

8-hour work day, artificial general intelligence, augmented reality, Berlin Wall, bitcoin, blockchain, brain emulation, business cycle, business process, Clayton Christensen, cloud computing, correlation does not imply causation, creative destruction, demographic transition, Erik Brynjolfsson, Ethereum, ethereum blockchain, experimental subject, fault tolerance, financial intermediation, Flynn Effect, hindsight bias, information asymmetry, job automation, job satisfaction, John Markoff, Just-in-time delivery, lone genius, Machinery of Freedom by David Friedman, market design, meta analysis, meta-analysis, Nash equilibrium, new economy, prediction markets, rent control, rent-seeking, reversible computing, risk tolerance, Silicon Valley, smart contracts, statistical model, stem cell, Thomas Malthus, trade route, Turing test, Vernor Vinge

If emulation hardware is digital, then it could either be deterministic, so that the value and timing of output states are always exactly predictable, or it could be fault-prone and fault-tolerant in the sense of having and tolerating more frequent and larger logic errors and timing fluctuations. Most digital hardware today is deterministic, but large parallel systems are more often fault-tolerant. The design of fault-tolerant hardware and software is an active area of research today (Bogdan et al. 2007). As human brains are large, parallel, and have an intrinsically fault-tolerant design, brain emulation software is likely to need less special adaptation to run on fault-prone hardware. Such hardware is usually cheaper to design and construct, occupies less volume, and takes less energy to run. Thus em hardware is likely to often be fault-prone and fault-tolerant. Cosmic rays are high-energy particles that come from space and disrupt the operation of electronic devices.

BLS 2012. “Employee Tenure in 2012.” United States Bureau of Labor Statistics USDL-12–1887, September 18. http://www.bls.gov/news.release/archives/tenure_09182012.pdf. Boehm, Christopher. 1999. Hierarchy in the Forest: The Evolution of Egalitarian Behavior. Harvard University Press, December 1. Bogdan, Paul, Tudor Dumitras, and Radu Marculescu. 2007. “Stochastic Communication: A New Paradigm for Fault-Tolerant Networks-on-Chip.” VLSI Design 2007: 95348. Boning, Brent, Casey Ichniowski, and Kathryn Shaw. 2007. “Opportunity Counts: Teams and the Effectiveness of Production Incentives.” Journal of Labor Economics 25(4): 613–650. Bonke, Jens. 2012. “Do Morning-Type People Earn More than Evening-Type People? How Chronotypes Influence Income.” Annals of Economics and Statistics 105/106: 55–72. Boserup, Ester. 1981.


pages: 463 words: 118,936

Darwin Among the Machines by George Dyson

Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anti-communist, British Empire, carbon-based life, cellular automata, Claude Shannon: information theory, combinatorial explosion, computer age, Danny Hillis, Donald Davies, fault tolerance, Fellow of the Royal Society, finite state, IFF: identification friend or foe, invention of the telescope, invisible hand, Isaac Newton, Jacquard loom, James Watt: steam engine, John Nash: game theory, John von Neumann, low earth orbit, Menlo Park, Nash equilibrium, Norbert Wiener, On the Economy of Machinery and Manufactures, packet switching, pattern recognition, phenotype, RAND corporation, Richard Feynman, spectrum auction, strong AI, the scientific method, The Wealth of Nations by Adam Smith, Turing machine, Von Neumann architecture, zero-sum game

How could a mechanism composed of some ten billion unreliable components function reliably while computers with ten thousand components regularly failed? Von Neumann believed that entirely different logical foundations would be required to arrive at an understanding of even the simplest nervous system, let alone the human brain. His Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components (1956) explored the possibilities of parallel architecture and fault-tolerant neural nets. This approach would soon be superseded by a development that neither nature nor von Neumann had counted on: the integrated circuit, composed of logically intricate yet structurally monolithic microscopic parts. Serial architecture swept the stage. Probabilistic logics, along with vacuum tubes and acoustic delay-line memory, would scarcely be heard from again. If the development of solid-state electronics had been delayed a decade or two we might have advanced sooner rather than later into neural networks, parallel architectures, asynchronous processing, and other mechanisms by which nature, with sloppy hardware, achieves reliable results.

At one level, this language may appear to us to be money, especially the new, polymorphous E-money that circulates without reserve at the speed of light. E-money is, after all, simply a consensual definition of “electrons with meaning,” allowing other levels of meaning to freely evolve. Composed of discrete yet divisible and liquid units, digital currency resembles the pulse-frequency coding that has proved to be such a rugged and fault-tolerant characteristic of the nervous systems evolved by biology. Frequency-modulated signals that travel through the nerves are associated with chemical messages that are broadcast by diffusion through the fluid that bathes the brain. Money has a twofold nature that encompasses both kinds of behavior: it can be transmitted, like an electrical signal, from one place (or time) to another; or it can be diffused in any number of more chemical, hormonelike ways.

From the point of view of an individual packet, not only is there a huge number of physically distinct paths from A to B through the mesh of lunch boxes, but there are 162 alternative channels leading to the nearest lunch box at any given time. The packet chooses a channel that happens to be quiet at that instant and jumps to the next lamppost at the speed of light. The multiplexing of communications across the available network topology is extended to the multiplexing of network topology across the available frequency spectrum. Communication becomes more efficient, fault tolerant, and secure. The way the system works now (in a growing number of metropolitan areas—hence the name) is that you purchase or rent a small Ricochet modem, about the size of a large candy bar and transmitting at about two-thirds of a watt. Your modem establishes contact with the nearest pole-top lunch box or directly with any other modem of its species within range. Your computer sees the system as a standard modem connection or an Internet node, and the network, otherwise transparent to the users, keeps track of where all the users and all the lunch boxes are.

pages: 1,758 words: 342,766

Code Complete (Developer Best Practices) by Steve McConnell

Ada Lovelace, Albert Einstein, Buckminster Fuller, call centre, continuous integration, data acquisition, database schema, don't repeat yourself, Donald Knuth, fault tolerance, Grace Hopper, haute cuisine, if you see hoof prints, think horses—not zebras, index card, inventory management, iterative process, Larry Wall, loose coupling, Menlo Park, Perl 6, place-making, premature optimization, revision control, Sapir-Whorf hypothesis, slashdot, sorting algorithm, statistical model, Tacoma Narrows Bridge, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Turing machine, web application

The fact that an environment has a particular error-handling approach doesn't mean that it's the best approach for your requirements.

Fault Tolerance

The architecture should also indicate the kind of fault tolerance expected. Fault tolerance is a collection of techniques that increase a system's reliability by detecting errors, recovering from them if possible, and containing their bad effects if not.

Further Reading: For a good introduction to fault tolerance, see the July 2001 issue of IEEE Software. In addition to providing a good introduction, the articles cite many key books and key articles on the topic.

For example, a system could make the computation of the square root of a number fault tolerant in any of several ways: The system might back up and try again when it detects a fault. If the first answer is wrong, it would back up to a point at which it knew everything was all right and continue from there.

It might have three square-root classes that each use a different method. Each class computes the square root, and then the system compares the results. Depending on the kind of fault tolerance built into the system, it then uses the mean, the median, or the mode of the three results. The system might replace the erroneous value with a phony value that it knows to have a benign effect on the rest of the system. Other fault-tolerance approaches include having the system change to a state of partial operation or a state of degraded functionality when it detects an error. It can shut itself down or automatically restart itself. These examples are necessarily simplistic. Fault tolerance is a fascinating and complex subject—unfortunately, it's one that's outside the scope of this book. Architectural Feasibility The designers might have concerns about a system's ability to meet its performance targets, work within resource limitations, or be adequately supported by the implementation environments.
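The voting variant above — three independent square-root implementations whose results are compared — can be sketched as follows. This is an illustrative example, not code from the book; the function names are invented, and the "vote" here uses the median of the three results:

```python
import math
import statistics


def sqrt_newton(x, iters=50):
    # Independent implementation 1: Newton's method,
    # repeatedly averaging the guess with x / guess.
    if x == 0:
        return 0.0
    guess = x if x > 1 else 1.0
    for _ in range(iters):
        guess = 0.5 * (guess + x / guess)
    return guess


def sqrt_pow(x):
    # Independent implementation 2: exponentiation operator.
    return x ** 0.5


def fault_tolerant_sqrt(x):
    # Independent implementation 3 is the library routine. Compute all
    # three and vote: the median discards a single wildly wrong result.
    results = [math.sqrt(x), sqrt_newton(x), sqrt_pow(x)]
    return statistics.median(results)
```

With the median vote, one faulty implementation can return garbage and the system still produces a correct answer — the essence of this style of fault tolerance (often called N-version programming).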

Are the architecture's security requirements described?
Does the architecture set space and speed budgets for each class, subsystem, or functionality area?
Does the architecture describe how scalability will be achieved?
Does the architecture address interoperability?
Is a strategy for internationalization/localization described?
Is a coherent error-handling strategy provided?
Is the approach to fault tolerance defined (if any is needed)?
Has technical feasibility of all parts of the system been established?
Is an approach to overengineering specified?
Are necessary buy-vs.-build decisions included?
Does the architecture describe how reused code will be made to conform to other architectural objectives?
Is the architecture designed to accommodate likely changes?

General Architectural Quality

Does the architecture account for all the requirements?

pages: 480 words: 99,288

Mastering ElasticSearch by Rafal Kuc, Marek Rogozinski

Amazon Web Services, create, read, update, delete, en.wikipedia.org, fault tolerance, finite state, full text search, information retrieval

Besides the fact that ElasticSearch can automatically discover a field's type by looking at its value, sometimes (in fact, almost always) we will want to configure the mappings ourselves to avoid unpleasant surprises.

Type

Each document in ElasticSearch has its type defined. This allows us to store various document types in one index and have different mappings for different document types.

Node

A single instance of the ElasticSearch server is called a node. A single-node ElasticSearch deployment can be sufficient for many simple use cases, but when you have to think about fault tolerance, or you have more data than can fit on a single server, you should think about a multi-node ElasticSearch cluster.

Cluster

A cluster is a set of ElasticSearch nodes that work together to handle a load bigger than a single instance can handle (both in terms of handling queries and documents). This is also the solution that allows the application to keep working without interruption even if several machines (nodes) are unavailable due to an outage or administration tasks, such as an upgrade.

Finally, we've learned about segment merging, merge policies, and scheduling. In the next chapter, we'll look closely at what ElasticSearch offers us when it comes to shard control. We'll see how to choose the right number of shards and replicas for our index, we'll manipulate shard placement, and we'll see when to create more shards than we actually need. We'll discuss how the shard allocator works. Finally, we'll use all the knowledge we've gained so far to create fault-tolerant and scalable clusters.

Chapter 4. Index Distribution Architecture

In the previous chapter, we learned how to use different scoring formulas and how we can benefit from using them. We've also seen how to use different posting formats to change how the data is indexed. In addition to that, we now know how to handle near real-time searching and real-time get, and what searcher reopening means for ElasticSearch.

If we sent many queries, we would end up with the same (or almost the same) number of queries run against each of the shards and replicas. Using our knowledge As we slowly approach the end of the fourth chapter, we need something closer to what you may encounter in your everyday work. Because of that, we have decided to divide the real-life example into two sections. In this section, you'll see how to combine the knowledge we've gathered so far to build a fault-tolerant and scalable cluster based on some assumptions. Because this chapter is mostly about configuration, we will concentrate on that. The mappings and your data may be different, but with a similar amount of data and queries hitting your cluster, the following sections may be useful to you. Assumptions Before we get into the juicy configuration details, let's make some basic assumptions with which we will configure our ElasticSearch cluster.

pages: 719 words: 181,090

Site Reliability Engineering: How Google Runs Production Systems by Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy

Air France Flight 447, anti-pattern, barriers to entry, business intelligence, business process, Checklist Manifesto, cloud computing, combinatorial explosion, continuous integration, correlation does not imply causation, crowdsourcing, database schema, defense in depth, DevOps, en.wikipedia.org, fault tolerance, Flash crash, George Santayana, Google Chrome, Google Earth, information asymmetry, job automation, job satisfaction, Kubernetes, linear programming, load shedding, loose coupling, meta analysis, meta-analysis, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, revision control, risk tolerance, side project, six sigma, the scientific method, Toyota Production System, trickle-down economics, web application, zero day

The product developers have more visibility into the time and effort involved in writing and releasing their code, while the SREs have more visibility into the service’s reliability (and the state of production in general). These tensions often reflect themselves in different opinions about the level of effort that should be put into engineering practices. The following list presents some typical tensions:

Software fault tolerance: How hardened do we make the software to unexpected events? Too little, and we have a brittle, unusable product. Too much, and we have a product no one wants to use (but that runs very stably).

Testing: Again, not enough testing and you have embarrassing outages, privacy data leaks, or a number of other press-worthy events. Too much testing, and you might lose your market.

Push frequency: Every push is risky.

This amortizes the fixed costs of the disk logging and network latency over the larger number of operations, increasing throughput. Deploying Distributed Consensus-Based Systems The most critical decisions system designers must make when deploying a consensus-based system concern the number of replicas to be deployed and the location of those replicas. Number of Replicas In general, consensus-based systems operate using majority quorums, i.e., a group of 2f + 1 replicas may tolerate f failures (if Byzantine fault tolerance, in which the system is resistant to replicas returning incorrect results, is required, then 3f + 1 replicas may tolerate f failures [Cas99]). For non-Byzantine failures, the minimum number of replicas that can be deployed is three—if two are deployed, then there is no tolerance for failure of any process. Three replicas may tolerate one failure. Most system downtime is a result of planned maintenance [Ken12]: three replicas allow a system to operate normally when one replica is down for maintenance (assuming that the remaining two replicas can handle system load at an acceptable performance).
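The replica arithmetic above can be sketched as two small helpers. This is a generic quorum-size calculation, not tied to any particular consensus implementation: with majority quorums, 2f + 1 replicas tolerate f crash failures, and Byzantine fault tolerance raises the requirement to 3f + 1.

```python
def min_replicas(f, byzantine=False):
    """Smallest number of replicas that tolerates f faulty ones."""
    return 3 * f + 1 if byzantine else 2 * f + 1

def tolerated_failures(n, byzantine=False):
    """How many faults a deployment of n replicas can survive."""
    return (n - 1) // 3 if byzantine else (n - 1) // 2

# Three replicas tolerate one crash failure; two replicas tolerate none.
assert min_replicas(1) == 3
assert tolerated_failures(2) == 0
# Byzantine fault tolerance for one faulty replica needs four replicas.
assert min_replicas(1, byzantine=True) == 4
```

This is why three is the practical minimum deployment size, and why one replica can be taken down for maintenance without losing the quorum.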

Robbins, Web Operations: Keeping the Data on Time: O’Reilly, 2010. [All12] J. Allspaw, “Blameless PostMortems and a Just Culture”, blog post, 2012. [All15] J. Allspaw, “Trade-Offs Under Pressure: Heuristics and Observations of Teams Resolving Internet Service Outages”, MSc thesis, Lund University, 2015. [Ana07] S. Anantharaju, “Automating web application security testing”, blog post, July 2007. [Ana13] R. Ananthanarayanan et al., “Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams”, in SIGMOD ’13, 2013. [And05] A. Andrieux, K. Czajkowski, A. Dan, et al., “Web Services Agreement Specification (WS-Agreement)”, September 2005. [Bai13] P. Bailis and A. Ghodsi, “Eventual Consistency Today: Limitations, Extensions, and Beyond”, in ACM Queue, vol. 11, no. 3, 2013. [Bai83] L. Bainbridge, “Ironies of Automation”, in Automatica, vol. 19, no. 6, November 1983.

Reactive Messaging Patterns With the Actor Model: Applications and Integration in Scala and Akka by Vaughn Vernon

A Pattern Language, business intelligence, business process, cloud computing, cognitive dissonance, domain-specific language, en.wikipedia.org, fault tolerance, finite state, Internet of things, Kickstarter, loose coupling, remote working, type inference, web application

• If scheduling tasks is difficult and error prone, leave the task scheduling to software that is best at that job, and focus on your system’s use cases instead. • If errors happen—and errors do happen—design your system to expect errors and react to errors by being fault tolerant. These are powerful assertions. Yet, is there a way to realize these sound concurrency design principles? Or have we just identified a panacea of wishful thinking? Can we actually use multithreaded software development techniques that enable us to reason about our systems, that react to changing conditions, that are scalable and fault tolerant, and that really work? How the Actor Model Helps A system of actors helps you leverage the simultaneous use of multiple processor cores. Each core can be kept busy executing a given actor’s reaction to its current incoming message.
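The actor idea described above can be illustrated with a minimal stand-in (the books themselves use Scala and Akka; this hypothetical `CounterActor` only sketches the principle): each actor owns a mailbox and reacts to one message at a time, so its private state needs no locks.

```python
import queue
import threading

class CounterActor:
    def __init__(self):
        self.mailbox = queue.Queue()
        self.count = 0  # private state, touched only by the actor's own thread
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def tell(self, message):
        """Asynchronous, non-blocking send: just enqueue the message."""
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()   # react to one message at a time
            if message == "stop":
                break
            self.count += 1

actor = CounterActor()
for _ in range(5):
    actor.tell("increment")
actor.tell("stop")
actor.thread.join()
assert actor.count == 5
```

Because the mailbox serializes message handling, the counter increments without any explicit synchronization around `count`, which is the reasoning-about-state benefit the text describes.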

Trying to take full advantage of contemporary hardware improvements such as increasing numbers of processors and cores and growing processor cache is seriously impeded by the very tools and patterns that should be helping us. Thus, implementing event-driven, scalable, resilient, and responsive applications is often deemed too difficult and risky and as a result is generally avoided. The Akka toolkit was created to address the failings of common multithreaded programming approaches, distributed computing, and fault tolerance. It does so by using the Actor model, which provides powerful abstractions that make creating solutions around concurrency and parallelism much easier to reason about and succeed in. This is not to say that Akka removes the need to think about concurrency. It doesn’t, and you must still design for parallelism, latency, and eventually consistent application state and think of how you will prevent your application from unnecessary blocking.

• Cluster-aware routers: Distribute work across nodes in a cluster by using routers to apportion the work and routees to perform the work. Akka clustering is useful not only for peak demand but also for failover. Even if you have five machines available, clustering can make more efficient use of resources by assigning extra work to the machines least under load and by rebalancing work between machines if a machine crashes unexpectedly. Akka clustering is designed to support a multinode, fault-tolerant, distributed system of actors. It does this by creating a cluster of nodes. A node must be an ActorSystem that is exposed on a TCP port so that it has a unique identifier. Every node must share the same ActorSystem name. Every node member must use a different port number within its host server hardware; no two node members may share a socket port number on the same physical machine. There is no global state shared between ActorSystem instances, so within a single machine, you may have several nodes inside a single JVM or several JVMs with one node.

pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić

Albert Einstein, bioinformatics, business cycle, business intelligence, business process, butter production in bangladesh, combinatorial explosion, computer vision, conceptual framework, correlation coefficient, correlation does not imply causation, data acquisition, discrete time, El Camino Real, fault tolerance, finite state, Gini coefficient, information retrieval, Internet Archive, inventory management, iterative process, knowledge worker, linked data, loose coupling, Menlo Park, natural language processing, Netflix Prize, NP-complete, PageRank, pattern recognition, peer-to-peer, phenotype, random walk, RFID, semantic web, speech recognition, statistical model, Telecommunications Act of 1996, telemarketer, text mining, traveling salesman, web application

In the context of data classification, an ANN can be designed to provide information not only about which particular class to select for a given sample, but also about confidence in the decision made. This latter information may be used to reject ambiguous data, should they arise, and thereby improve the classification performance, or the performance of the other tasks modeled by the network. 5. Fault Tolerance. An ANN has the potential to be inherently fault-tolerant, or capable of robust computation. Its performance does not degrade significantly under adverse operating conditions such as disconnection of neurons, and noisy or missing data. There is some empirical evidence for robust computation, but usually it is uncontrolled. 6. Uniformity of Analysis and Design. Basically, ANNs enjoy universality as information processors.

The SOMs have been used in a large spectrum of applications such as automatic speech recognition, clinical data analysis, monitoring of the condition of industrial plants and processes, classification from satellite images, analysis of genetic information, analysis of electrical signals from the brain, and retrieval from large document collections. Illustrative examples are given in Figure 7.18. Figure 7.18. SOM applications. (a) Drugs binding to human cytochrome; (b) interest rate classification; (c) analysis of book-buying behavior. 7.8 REVIEW QUESTIONS AND PROBLEMS 1. Explain the fundamental differences between the design of an ANN and “classical” information-processing systems. 2. Why is the fault-tolerance property one of the most important characteristics and capabilities of ANNs? 3. What are the basic components of the neuron’s model? 4. Why are continuous functions such as log-sigmoid or hyperbolic tangent considered common activation functions in real-world applications of ANNs? 5. Discuss the differences between feedforward and recurrent neural networks. 6. Given a two-input neuron with the following parameters: bias b = 1.2, weight factors W = [w1, w2] = [3, 2], and input vector X = [−5, 6]T; calculate the neuron’s output for the following activation functions: (a) a symmetrical hard limit (b) a log-sigmoid (c) a hyperbolic tangent 7.
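Review problem 6 can be worked through directly. The net input is the weighted sum plus the bias, and each activation function is then applied to it; the standard textbook definitions of the three activations are used below.

```python
import math

b, W, X = 1.2, [3, 2], [-5, 6]
net = sum(w * x for w, x in zip(W, X)) + b       # 3*(-5) + 2*6 + 1.2 = -1.8

hardlims = 1 if net >= 0 else -1                 # (a) symmetrical hard limit
logsig   = 1 / (1 + math.exp(-net))              # (b) log-sigmoid
tansig   = math.tanh(net)                        # (c) hyperbolic tangent

assert abs(net + 1.8) < 1e-9
assert hardlims == -1
assert abs(logsig - 0.1419) < 1e-3
assert abs(tansig + 0.9468) < 1e-3
```

So the neuron's output is −1 under the symmetrical hard limit, about 0.142 under the log-sigmoid, and about −0.947 under the hyperbolic tangent.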

Because of the massive amount of data and the speed at which the data are generated, many data-mining applications in sensor networks require in-network processing, such as aggregation, to reduce sample size and communication overhead. Online data mining in sensor networks offers many additional challenges, including: limited communication bandwidth, constraints on local computing resources, limited power supply, need for fault tolerance, and the asynchronous nature of the network. Obviously, data-mining systems have evolved in a short period of time from stand-alone programs characterized by single algorithms with little support for the entire knowledge-discovery process to integrated systems incorporating several mining algorithms, multiple users, communications, and various and heterogeneous data formats and distributed data sources.

Industry 4.0: The Industrial Internet of Things by Alasdair Gilchrist

3D printing, additive manufacturing, Amazon Web Services, augmented reality, autonomous vehicles, barriers to entry, business intelligence, business process, chief data officer, cloud computing, connected car, cyber-physical system, deindustrialization, DevOps, digital twin, fault tolerance, global value chain, Google Glasses, hiring and firing, industrial robot, inflight wifi, Infrastructure as a Service, Internet of things, inventory management, job automation, low cost airline, low skilled workers, microservices, millennium bug, pattern recognition, peer-to-peer, platform as a service, pre–internet, race to the bottom, RFID, Skype, smart cities, smart grid, smart meter, smart transportation, software as a service, stealth mode startup, supply-chain management, trade route, undersea cable, web application, WebRTC, Y2K

Therefore, we see the following delivery mechanisms:

At most once delivery—This is commonly called fire and forget, and rides on unreliable protocols such as UDP.

At least once delivery—This is reliable delivery, such as TCP/IP, where every message is delivered to the recipient.

Exactly once delivery—This technique is used in batch jobs as a means of delivery that ensures late packets, delayed through excessive latency or jitter, do not mess up the results.

Additionally, there are many other factors that need to be taken into consideration, such as lifespan, which allows the IISs to discard old data packets, much like the time-to-live field on IP packets. There is also fault tolerance, which ensures fault survivability and that alternative routes or hardware redundancy are available, which will guarantee availability and reliability. Similarly, there is the case of security, which we will discuss in detail in a later chapter. Industry 4.0 Key Functions of the Communication Layer The communication layer functions can deliver the data to the correct address and application.
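The at-least-once mechanism above can be sketched in a few lines. The in-process "lossy channel" below is a made-up stand-in for an unreliable network: the sender retries until it sees an acknowledgment, and the receiver deduplicates by message ID so that redelivered copies do not mess up the results.

```python
class LossyChannel:
    """Simulated network that drops the first N send attempts."""
    def __init__(self, drops_before_success=1):
        self.drops_left = drops_before_success

    def send(self, receiver, msg_id, payload):
        if self.drops_left > 0:          # simulate a lost packet: no ack
            self.drops_left -= 1
            return False
        receiver.deliver(msg_id, payload)
        return True                      # ack received

class Receiver:
    def __init__(self):
        self.seen = set()
        self.log = []

    def deliver(self, msg_id, payload):
        if msg_id in self.seen:          # duplicate from a retry: ignore it
            return
        self.seen.add(msg_id)
        self.log.append(payload)

def send_at_least_once(channel, receiver, msg_id, payload, max_retries=5):
    for attempt in range(1 + max_retries):
        if channel.send(receiver, msg_id, payload):
            return attempt + 1           # number of attempts used
    raise RuntimeError("delivery failed")

rx = Receiver()
attempts = send_at_least_once(LossyChannel(), rx, msg_id=1, payload="reading")
assert attempts == 2          # first attempt dropped, second delivered
assert rx.log == ["reading"]
```

Note that at-least-once delivery plus receiver-side deduplication is one common way systems approximate exactly-once processing over unreliable transports.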

There is also considerable interest in the production of IoT devices capable of harvesting energy from solar, wind, or electromagnetic fields as a power source, as that can be a major technology advance in deploying remote M2M-style mesh networking in rural areas, for example, in a smart agriculture scenario. Energy-harvesting IoT devices would provide the means, through mesh M2M networks, for highly fault-tolerant, unattended, long-term solutions that require only minimal human intervention. However, research and technology are not focused solely on the hardware. Researchers are also keenly studying methods that would make application protocols and data formats far more efficient. For instance, low-power operation requires that devices running on minimal power levels, or harvesting energy at subsistence levels, communicate their data in a highly efficient and timely manner, and this has serious implications for protocol design.

One drawback to xDSL is that the advertised bandwidth is shared among subscribers, and service providers oversell link capacity due to the spiky nature of TCP/IP and Internet browsing habits. Therefore, contention ratios—the number of other customers you are sharing the bandwidth with—can be as high as 50:1 for residential use and 10:1 for business use. • SDH/SONET—This optic ring technology is typically deployed as the service provider’s transport core, as it provides high-speed, high-capacity, and highly reliable, fault-tolerant transport for data over sometimes vast geographical regions. However, for customers that require high-speed data links over a large geographical region, typically enterprises or large companies, fiber optic rings are high performance, highly reliable, and high cost. SONET and SDH are transport protocols that encapsulate payload data within fixed synchronous frames.

pages: 757 words: 193,541

The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2 by Thomas A. Limoncelli, Strata R. Chalup, Christina J. Hogan

active measures, Amazon Web Services, anti-pattern, barriers to entry, business process, cloud computing, commoditize, continuous integration, correlation coefficient, database schema, Debian, defense in depth, delayed gratification, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, finite state, Firefox, Google Glasses, information asymmetry, Infrastructure as a Service, intermodal, Internet of things, job automation, job satisfaction, Kickstarter, load shedding, longitudinal study, loose coupling, Malcom McLean invented shipping containers, Marc Andreessen, place-making, platform as a service, premature optimization, recommendation engine, revision control, risk tolerance, side project, Silicon Valley, software as a service, sorting algorithm, standardized shipping container, statistical model, Steven Levy, supply-chain management, Toyota Production System, web application, Yogi Berra

Sometimes services were also scaled by deploying servers for the application into several geographic regions, or business units, each of which would then use its local server. For example, when Tom first worked at AT&T, there was a different payroll processing center for each division of the company. High Availability Applications requiring high availability required “fault-tolerant” computers. These computers had multiple CPUs, error-correcting RAM, and other technologies that were extremely expensive at the time. Fault-tolerant systems were niche products. Generally only the military and Wall Street needed such systems. As a result they were usually priced out of the reach of typical companies. Costs During this era the Internet was not business-critical, and outages for internal business-critical systems could be scheduled because the customer base was a limited, known set of people.

Hardware can also fail, with the scope of the failure ranging from the smallest component to the largest network. Failure domains can be any size: a device, a computer, a rack, a datacenter, or even an entire company. The amount of capacity in a system is N + M, where N is the amount of capacity used to provide a service and M is the amount of spare capacity available, which can be used in the event of a failure. A system that is N + 1 fault tolerant can survive one unit of failure and remain operational. The most common way to route around failure is through replication of services. A service may be replicated one or more times per failure domain to provide resilience greater than the domain. Failures can also come from external sources that overload a system, and from human mistakes. There are countermeasures to nearly every failure imaginable.
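The N + M capacity rule above can be made concrete with a tiny check: N units of capacity carry the service, M are spare, and the system stays operational only while the surviving units still cover N.

```python
def surviving_units(n, m, failed):
    """Units still available after `failed` units are lost."""
    return n + m - failed

def still_operational(n, m, failed):
    """True if the service can still run at full capacity."""
    return surviving_units(n, m, failed) >= n

# An N + 1 fault-tolerant system survives one unit of failure, but not two.
assert still_operational(n=3, m=1, failed=1)
assert not still_operational(n=3, m=1, failed=2)
```

The same arithmetic applies at any failure-domain size, whether a unit is a device, a rack, or a datacenter.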

Originally based on applying Agile methodology to operations, the result is a streamlined set of principles and processes that can create reliable services. Appendix B will make the case that cloud or distributed computing was the inevitable result of the economics of hardware. DevOps is the inevitable result of needing to do efficient operations in such an environment. If hardware and software are sufficiently fault tolerant, the remaining problems are human. The seminal paper “Why Do Internet Services Fail, and What Can Be Done about It?” by Oppenheimer et al. (2003) raised awareness that if web services are to be a success in the future, operational aspects must improve: We find that (1) operator error is the largest single cause of failures in two of the three services, (2) operator errors often take a long time to repair, (3) configuration errors are the largest category of operator errors, (4) failures in custom-written front-end software are significant, and (5) more extensive online testing and more thoroughly exposing and detecting component failures would reduce failure rates in at least one service.

pages: 194 words: 49,310

Clock of the Long Now by Stewart Brand

Albert Einstein, Brewster Kahle, Buckminster Fuller, Colonization of Mars, complexity theory, Danny Hillis, Eratosthenes, Extropian, fault tolerance, George Santayana, Internet Archive, Jaron Lanier, Kevin Kelly, knowledge economy, life extension, longitudinal study, low earth orbit, Metcalfe’s law, Mitch Kapor, nuclear winter, pensions crisis, phenotype, Ray Kurzweil, Robert Metcalfe, Stephen Hawking, Stewart Brand, technological singularity, Ted Kaczynski, Thomas Malthus, Vernor Vinge, Whole Earth Catalog

Imagine a mountain range of opportunities, where the higher you get the greater the advantage. Hasty opportunists will never get past the foothills because they only pay attention to the slope of the ground under their feet, climb quickly to the immediate hilltop, and get stuck there. Patient opportunists take the longer view to the distant peaks, and toil through many ups and downs on the long trek to the heights. There are two ways to make systems fault-tolerant: One is to make them small, so that correction is local and quick; the other is to make them slow, so that correction has time to permeate the system. When you proceed too rapidly with something mistakes cascade, whereas when you proceed slowly the mistakes instruct. Gradual, incremental projects engage the full power of learning and discovery, and they are able to back out of problems. Gradually emergent processes get steadily better over time, while quickly imposed processes often get worse over time.

Diamond, Jared Digital information and core standards discontinuity of and immortality and megadata and migration preservation of Digital records, passive and active Discounting of value Drexler, Eric Drucker, Peter Dubos, René Dyson, Esther Dyson, Freeman Earth, view of from outer space Earth Day Easterbrook, Gregg Eaton Collection Eberling, Richard Ecological communities systems and change See also Environment Economic forecasting Ecotrust Egyptian civilization and time Ehrlich, Paul Electronic Frontier Foundation Eliade, Mircea Eno, Brian and ancient Egyptian woman and Clock of the Long Now ideas for participation in Clock/Library and tour of Big Ben Environment degradation of and peace, prosperity, and continuity reframing of problems of and technology See also Ecological Environmentalists and long-view Europe-America dialogue Event horizon Evolution of Cooperation, The “Experts Look Ahead, The” Extinction rate Extra-Terrestrial Intelligence programs and time-release services Extropians Family Tree Maker Fashion Fast and bad things Fault-tolerant systems Feedback and tuning of systems Feldman, Marcus Finite and Infinite Games Finite games Florescence Foresight Institute Freefall Free will Fuller, Buckminster Fundamental tracking Future configuration towards continuous of desire versus fate feeling of and nuclear armageddon one hundred years and present moment tree uses of and value Future of Industrial Man, The “Futurismists” Gabriel, Peter Galileo Galvin, Robert Gambling Games, finite and infinite Gender imbalance in Chinese babies Generations Gershenfeld, Neil Gibbon, Edward GI Bill Gibson, William Gilbert, Joseph Henry Global Business Network (GBN) Global collapse Global computer Global perspective Global warming Goebbels, Joseph Goethe, Johann Wolfgang von Goldberg, Avram “Goldberg rule, the” Goldsmith, Oliver Goodall, Jane Governance Governing the Commons Government and the long view Grand Canyon Great Year Greek tragedy Grove, Andy Hale-Bopp comet 
Hampden-Turner, Charles Hardware dependent digital experiences, preservation of Hawking, Stephen Hawthorne, Nathaniel Heinlein, Robert Herman, Arthur Hill climbing Hillis, Daniel definition of technology and design of Clock and digital discontinuity and digital preservation and extra-terrestrial intelligence programs ideas for participation in Clock/Library and Long Now Foundation and long-term responsibility and motivation to build linear Clock and the Singularity and sustained endeavors and types of time History and accessible data as a horror and warning how to apply intelligently Hitler, Adolf Holling, C.

pages: 348 words: 97,277

The Truth Machine: The Blockchain and the Future of Everything by Paul Vigna, Michael J. Casey

3D printing, additive manufacturing, Airbnb, altcoin, Amazon Web Services, barriers to entry, basic income, Berlin Wall, Bernie Madoff, bitcoin, blockchain, blood diamonds, Blythe Masters, business process, buy and hold, carbon footprint, cashless society, cloud computing, computer age, computerized trading, conceptual framework, Credit Default Swap, crowdsourcing, cryptocurrency, cyber-physical system, dematerialisation, disintermediation, distributed ledger, Donald Trump, double entry bookkeeping, Edward Snowden, Elon Musk, Ethereum, ethereum blockchain, failed state, fault tolerance, fiat currency, financial innovation, financial intermediation, global supply chain, Hernando de Soto, hive mind, informal economy, intangible asset, Internet of things, Joi Ito, Kickstarter, linked data, litecoin, longitudinal study, Lyft, M-Pesa, Marc Andreessen, market clearing, mobile money, money: store of value / unit of account / medium of exchange, Network effects, off grid, pets.com, prediction markets, pre–internet, price mechanism, profit maximization, profit motive, ransomware, rent-seeking, RFID, ride hailing / ride sharing, Ross Ulbricht, Satoshi Nakamoto, self-driving car, sharing economy, Silicon Valley, smart contracts, smart meter, Snapchat, social web, software is eating the world, supply-chain management, Ted Nelson, the market place, too big to fail, trade route, transaction costs, Travis Kalanick, Turing complete, Uber and Lyft, uber lyft, unbanked and underbanked, underbanked, universal basic income, web of trust, zero-sum game

These tweaked versions of Bitcoin shared various elements of the cryptocurrency’s powerful cryptography and network rules. However, instead of its electricity-hungry “proof-of-work” consensus model, they drew upon older, pre-Bitcoin protocols that were more efficient but which couldn’t achieve the same level of security without putting a centralized entity in charge of identifying and authorizing participants. Predominantly, the bankers’ models used a consensus algorithm known as practical byzantine fault tolerance, or PBFT, a cryptographic solution invented in 1999. It gave all approved ledger-keepers in the network confidence that each other’s actions weren’t undermining the shared record even when there was no way of knowing whether one or more had malicious intent to defraud the others. With these consensus-building systems, the computers adopted each updated version of the ledger once certain thresholds of acceptance were demonstrated across the network.

See also R3 CEV Cosmos costs-per-impression measures (CPMs) Craigslist Creative Commons credit default swap (CDS) Crowdfunder crowdfunding crypto-asset analysts crypto-assets Crytpo Company cryptocurrency and criminality and Cypherpunk movement and decentralization and fair distribution and financial sector and Fourth Industrial Revolution hoarding investors and privacy and quantum computing and regulatory challenges See also Bitcoin cryptography and blockchain technology and data storage and financial sector hashes history of and identity and math Merkle Tree practical byzantine fault tolerance (PBFT) and registers and security and privacy signatures and supply chains and tokens triple-entry bookkeeping and trust crypto-impact-economics Cryptokernel (CK) crypto-libertarians cryptomoney Cryptonomos Cuende, Luis Iván Cuomo, Jerry cyber-attacks ransom attacks cybersecurity and decentralized trust model device identity model shared-secret model Cypherpunk manifesto Cypherpunk movement and community DAO, The (The Decentralized Autonomous Organization) Dapps.

MIT Media Lab MIT Media Lab’s Digital Currency Initiative Mizrahi, Alex MME Modi, Narendra Monax Monero monetary and banking systems central bank fiat digital currency and community connections and digital counterfeiting mobile money systems money laundering See also cryptocurrency; financial sector Moore’s law Mooti Morehead, Dan Mozilla M-Pesa Nakamoto, Satoshi (pseudonymous Bitcoin creator) Nasdaq Nelson, Ted New America Foundation New York Department of Financial Services Niederauer, Duncan North American Bitcoin Conference Norway Obama, Barack Occupy Wall Street Ocean Health Coin off-chain environment Olsen, Richard open protocols open-source systems and movement and art and innovation challenges of Cryptokernel (CK) and data storage and financial sector and health care sector and honest accounting Hyperledger and identity and permissioned systems and registries and ride-sharing and tokens See also Ethereum organized crime Pacioli, Luca Pantera Capital Parity Wallet peer-to-peer commerce and economy Pentland, Alex “Sandy” Perkins Coie permissioned (private) blockchains advantages of challenges of and cryptocurrency-less systems definition of and finance sector open-source development of scalability of and security and supply chains permissionless blockchains Bitcoin and Cypherpunks Ethereum financial sector and identity information mobile money systems and scalability and trusted computing Pink Army Cooperative Plasma Polkadot Polychain Capital Poon, Joseph practical byzantine fault tolerance (PBFT) pre-mining pre-selling private blockchains. 
See permissioned (private) blockchains Procivis proof-of-stake algorithm proof of work prosumers Protocol Labs Provenance public key infrastructure (PKI) Pureswaran, Veena R3 CEV consortium ransom attacks Ravikant, Naval Realini, Carol re-architecting record keeping and proof-of-stake algorithm and supply chains and trust See also ledger-keeping Reddit refugee camps Regenor, James reputation scoring Reuschel, Peter Rhodes, Yorke ride-sharing Commuterz Lyft reputation scoring Uber Ripple Labs Rivest Co.

RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

Amazon Web Services, bioinformatics, business intelligence, cloud computing, database schema, fault tolerance, full text search, information retrieval, Internet Archive, Internet of things, linked data, NP-complete, peer-to-peer, performance metric, random walk, recommendation engine, RFID, semantic web, Silicon Valley, social intelligence, software as a service, SPARQL, web application

But abstractions to program these two functions are available using an SQL-like query language, such as Pig Latin. When writing these programs, one does not need to take care of the data distribution and parallelism aspects. In fact, the main contribution of MapReduce-based systems is to orchestrate the distribution and execution of these map and reduce operations on a cluster of machines over very large data sets. It is also fault-tolerant, meaning that if a machine of the cluster fails during the execution of a process, its job will be given to another machine automatically. Therefore, most of the hard tasks from an end-user point of view are automated and taken care of by the system: data partitioning, execution scheduling, handling machine failure, and managing intermachine communication. In this framework, the map function processes key-value pairs and outputs an intermediate set of key-value pairs. The reduce function processes the key-value pairs generated by the map function by operating over the values of the same associated keys. The framework partitions the input data over a cluster of machines and sends the map function to each machine. This supports a parallel execution of the map function.
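The map/reduce dataflow just described can be emulated on a single machine (the real contribution of MapReduce systems is distributing it over a cluster; this sketch only shows the shape of the computation) with the classic word-count example:

```python
from collections import defaultdict

def map_fn(_doc_id, text):
    """Map: emit an intermediate (key, value) pair per word."""
    for word in text.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: fold all values that share the same key."""
    yield word, sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in inputs:
        for k, v in map_fn(key, value):
            groups[k].append(v)       # "shuffle": group values by key
    output = {}
    for k, vs in groups.items():
        for rk, rv in reduce_fn(k, vs):
            output[rk] = rv
    return output

docs = [("d1", "to be or not to be"), ("d2", "to err is human")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
assert counts["to"] == 3 and counts["be"] == 2
```

In a real deployment the framework would run `map_fn` in parallel on each machine's partition of the input and reassign a failed machine's work automatically; the user writes only the two functions.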

Instead, SPARQL queries are processed (using Sesame’s query processor) to generate index lookups over the different Cassandra indexes. Because these index lookups are defined procedurally, any form of optimization is quite difficult to apply. This implies that the generated index lookups need to be optimal to ensure efficient query answering. We saw in Chapter 5 that many systems use a MapReduce approach to benefit from a parallel-processing, fault-tolerant environment. PigSPARQL, presented in Schätzle et al. (2013), is a system that maps SPARQL queries to Pig Latin queries. In a nutshell, Pig is a data analysis platform developed by Yahoo! that runs on top of the Hadoop processing framework, and Pig Latin is its query language, which abstracts the creation of the map and reduce functions using a relational algebra–like approach. Pig Latin is therefore used as an intermediate layer between SPARQL and Hadoop.

Oracle, MS SQL Server, and IBM DB2 are also supported. These two mechanisms come with support for conflict resolution, i.e., detecting whether an update has been correctly replicated to a subscriber. The second strategy is based on partitioning that is specified at the index level using a hash function on key parts. Each partition is replicated on different physical machines to ensure load balancing and fault tolerance. When triple updates are performed, all copies are updated within the same transaction. The clustering approach of the MarkLogic system distinguishes between two kinds of nodes: data managers (denoted as D-nodes) and evaluators (denoted as E-nodes). The D-nodes are responsible for the management of a data subset, while the E-nodes handle access to data and query processing. A load balancer component distributes queries across E-nodes.

Engineering Security by Peter Gutmann

active measures, algorithmic trading, Amazon Web Services, Asperger Syndrome, bank run, barriers to entry, bitcoin, Brian Krebs, business process, call centre, card file, cloud computing, cognitive bias, cognitive dissonance, combinatorial explosion, Credit Default Swap, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, Debian, domain-specific language, Donald Davies, Donald Knuth, double helix, en.wikipedia.org, endowment effect, fault tolerance, Firefox, fundamental attribution error, George Akerlof, glass ceiling, GnuPG, Google Chrome, iterative process, Jacob Appelbaum, Jane Jacobs, Jeff Bezos, John Conway, John Markoff, John von Neumann, Kickstarter, lake wobegon effect, Laplace demon, linear programming, litecoin, load shedding, MITM: man-in-the-middle, Network effects, Parkinson's law, pattern recognition, peer-to-peer, Pierre-Simon Laplace, place-making, post-materialism, QR code, race to the bottom, random walk, recommendation engine, RFID, risk tolerance, Robert Metcalfe, Ruby on Rails, Sapir-Whorf hypothesis, Satoshi Nakamoto, security theater, semantic web, Skype, slashdot, smart meter, social intelligence, speech recognition, statistical model, Steve Jobs, Steven Pinker, Stuxnet, telemarketer, text mining, the built environment, The Death and Life of Great American Cities, The Market for Lemons, the payments system, Therac-25, too big to fail, Turing complete, Turing machine, Turing test, web application, web of trust, x509 certificate, Y2K, zero day, Zimmermann PGP

For example a resolver could decide that although a particular entry may be stale, it came from an authoritative source and so it can still be used until newer information becomes available (the technical name for a resolver that provides this type of service on behalf of the user is “curated DNS”). What DNSSEC does is take the irregularity- and fault-tolerant behaviour of resolvers and turn any problem into a fatal error, since close-enough is no longer sufficient to satisfy a resolver that for security reasons can’t allow a single bit to be out of place. The DNSSEC documents describe in great detail the bits-on-the-wire representation of the packets that carry the data but say nothing about what happens to those bits once they’ve reached their destination [637]. As a result the implicit fault-tolerance of the DNS, which works because resolvers go to great lengths to tolerate any form of vaguely-acceptable (and in a number of cases unacceptable but present in widely-deployed implementations) responses [638], is seriously impacted when glitches are no longer allowed to be tolerated.

Other Threat Analysis Techniques The discussion above has focused heavily on PSMs for threat analysis because that seems to be the most useful technique to apply to product development. Another threat analysis technique that you may run into is the use of attack trees or graphs [97][98][99][100][101][102][103][104][105][106][107][108][109][110][111][112][113][114][115][116][117][118][119][120][121][122], which are derived from fault trees used in fault-tolerant computing and safety-critical systems [123][124][125][126][127][128]. The general idea behind a fault tree is shown in Figure 70 and involves starting with the general high-level concept that “a failure occurred” and then iteratively breaking it down into more and more detailed failure classes. For example in Figure 70 the abstract overall failure case can be decomposed into hardware failures, software failures, and human failures such as configuring or operating the device incorrectly.
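The iterative decomposition described for Figure 70 can be modelled as a tiny tree of AND/OR gates over basic failure events. The node names below are hypothetical, chosen to echo the hardware/software/human breakdown in the text.

```python
def evaluate(node, failed):
    # A node is either ("event", name) or ("and"/"or", label, children).
    kind = node[0]
    if kind == "event":
        return node[1] in failed
    children = node[2]
    results = [evaluate(child, failed) for child in children]
    return all(results) if kind == "and" else any(results)

# "A failure occurred" decomposes (via an OR gate) into the
# hardware, software, and human failure classes.
tree = ("or", "failure occurred", [
    ("event", "hardware failure"),
    ("event", "software failure"),
    ("event", "operator misconfiguration"),
])

print(evaluate(tree, {"software failure"}))  # True
print(evaluate(tree, set()))                 # False
```

Deeper analyses simply replace the leaf events with further AND/OR subtrees, which is exactly the "more and more detailed failure classes" refinement the text describes.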

Some battery manufacturers actually run active penetration-testing processes in which the attackers try any means possible to make the batteries fail catastrophically in order to identify possible weak points in the physical safety interlocks. The analysis process for these methods is a relatively straightforward modification of the existing FMEA one that involves identifying all of the system components that would be affected by a particular type of attack (typically a computer-based one rather than just a standard component failure) and then applying standard mitigation techniques used with fault-tolerant and safety-critical systems. So although FMEA and RA aren’t entirely useful for dealing with malicious rather than benign faults, they can at least be applied as a general tool to structuring the allocation of resources towards dealing with malicious faults. Another area where FMEA can be useful is in modelling the process of risk diversification that’s covered in “Security through Diversity” on page 315.

pages: 201 words: 63,192

Graph Databases by Ian Robinson, Jim Webber, Emil Eifrem

Amazon Web Services, anti-pattern, bioinformatics, commoditize, corporate governance, create, read, update, delete, data acquisition, en.wikipedia.org, fault tolerance, linked data, loose coupling, Network effects, recommendation engine, semantic web, sentiment analysis, social graph, software as a service, SPARQL, web application

Since the queries run slowly, the database can process fewer of them per second, which means the availability of the database to do useful work diminishes from the client’s point of view. Whatever the database, understanding the underlying storage and caching infrastructure will help you construct idiomatic—and hence, mechanically sympathetic—queries that maximise performance. Our final observation on availability is that scaling for cluster-wide replication has a positive impact, not just in terms of fault-tolerance, but also responsiveness. Since there are many machines available for a given workload, query latency is low and availability is maintained. But as we’ll now discuss, scale itself is more nuanced than simply the number of servers we deploy. Scale The topic of scale has become more important as data volumes have grown. In fact, the problems of data at scale, which have proven difficult to solve with relational databases, have been a substantial motivation for the NOSQL movement.

Though optimistic concurrency control mechanisms are useful, we also rather like transactions, and there are numerous examples of high-throughput performance transaction processing systems in the literature. Key-Value Stores Key-value stores are cousins of the document store family, but their lineage comes from Amazon’s Dynamo database. They act like large, distributed hashmap data structures that store and retrieve opaque values by key. As shown in Figure A-3, the key space of the hashmap is spread across numerous buckets on the network. For fault-tolerance reasons each bucket is replicated onto several machines. The formula for the number of replicas required is given by R = 2F + 1, where F is the number of failures we can tolerate. The replication algorithm seeks to ensure that machines aren’t exact copies of each other. This allows the system to load-balance while a machine and its buckets recover; it also helps avoid hotspots, which can cause inadvertent self denial-of-service.
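A minimal sketch of the R = 2F + 1 rule and of spreading a bucket's replicas across distinct machines. The walk-the-ring placement below is illustrative only, not Dynamo's actual algorithm, and the names are made up.

```python
import zlib

def replicas_required(f):
    # To tolerate F failed machines, keep R = 2F + 1 copies.
    return 2 * f + 1

def place_replicas(bucket, machines, r):
    # Walk the machine list from the bucket's home position so that
    # neighbouring buckets get different replica sets and no machine
    # holds an exact copy of another machine's buckets.
    start = zlib.crc32(bucket.encode()) % len(machines)
    return [machines[(start + i) % len(machines)] for i in range(r)]

machines = ["m0", "m1", "m2", "m3", "m4"]
r = replicas_required(1)                 # tolerate one failure -> 3 copies
placement = place_replicas("bucket-42", machines, r)
print(r, placement)
```

With one tolerated failure the bucket lands on three distinct machines, so losing any single machine still leaves a majority of copies reachable.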

pages: 540 words: 103,101

Building Microservices by Sam Newman

airport security, Amazon Web Services, anti-pattern, business process, call centre, continuous integration, create, read, update, delete, defense in depth, don't repeat yourself, Edward Snowden, fault tolerance, index card, information retrieval, Infrastructure as a Service, inventory management, job automation, Kubernetes, load shedding, loose coupling, microservices, MITM: man-in-the-middle, platform as a service, premature optimization, pull request, recommendation engine, social graph, software as a service, source of truth, the built environment, web application, WebSocket

If the in-house service template supports only Java, then people may be discouraged from picking alternative stacks if they have to do lots more work themselves. Netflix, for example, is especially concerned with aspects like fault tolerance, to ensure that the outage of one part of its system cannot take everything down. To handle this, a large amount of work has been done to ensure that there are client libraries on the JVM to provide teams with the tools they need to keep their services well behaved. Introducing a new technology stack would mean having to reproduce all this effort. The main concern for Netflix is less the duplicated effort, and more the fact that it is so easy to get this wrong. The risk of a service getting newly implemented fault tolerance wrong is high, as it could impact more of the system. Netflix mitigates this by using sidecar services, which communicate locally with a JVM that is using the appropriate libraries.

This means if part of your system uses DNS already and can support SRV records, you can just drop in Consul and start using it without any changes to your existing system. Consul also builds in other capabilities that you might find useful, such as the ability to perform health checks on nodes. This means that Consul could well overlap the capabilities provided by other dedicated monitoring tools, although you would more likely use Consul as a source of this information and then pull it into a more comprehensive dashboard or alerting system. Consul’s highly fault-tolerant design and focus on handling systems that make heavy use of ephemeral nodes does make me wonder, though, if it may end up replacing systems like Nagios and Sensu for some use cases. Consul uses a RESTful HTTP interface for everything from registering a service, querying the key/value store, or inserting health checks. This makes integration with different technology stacks very straightforward.

pages: 31 words: 9,168

Designing Reactive Systems: The Role of Actors in Distributed Architecture by Hugh McKee

Amazon Web Services, fault tolerance, Internet of things, microservices

Clustering provides the building blocks for building application systems that grow and contract as the processing load requires. These two features of the actor system directly impact the operational costs of your application system: you use the processing capacity that you have more efficiently and you use only the capacity that is needed at a given point in time. The main takeaways in this chapter are: Delegation of work through supervised workers allows for higher levels of concurrency and fault tolerance. Workers are asynchronous and run concurrently, never sitting idle as in synchronous systems. Efficient utilization of system resources (CPU, memory, and threads) results in reduced infrastructure costs. It’s simple to scale elastically at the actor level by increasing or decreasing workers as needed. Using clusters gives the ability to scale at the system level. Chapter 4.
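The delegation-through-supervised-workers takeaway can be sketched as a toy supervisor that routes jobs to workers and re-delegates when one fails. This is a sketch of the supervision idea only, not the Akka API; all class and worker names are invented.

```python
class Worker:
    def __init__(self, name, flaky=False):
        self.name = name
        self.flaky = flaky

    def run(self, job):
        if self.flaky:
            raise RuntimeError(f"{self.name} crashed")
        return f"{job} done by {self.name}"

class Supervisor:
    def __init__(self, workers):
        self.workers = list(workers)

    def submit(self, job):
        # Delegate the job; on a worker failure, stop routing to that
        # worker and hand the job to another one.
        for worker in list(self.workers):
            try:
                return worker.run(job)
            except RuntimeError:
                self.workers.remove(worker)
        raise RuntimeError("no workers left")

sup = Supervisor([Worker("w1", flaky=True), Worker("w2")])
print(sup.submit("job-1"))  # prints "job-1 done by w2"
```

A real actor runtime would do this asynchronously and could also restart the failed worker; here the point is only that the caller never sees the crash, which is the fault-tolerance half of the takeaway.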

pages: 923 words: 516,602

The C++ Programming Language by Bjarne Stroustrup

combinatorial explosion, conceptual framework, database schema, distributed generation, Donald Knuth, fault tolerance, general-purpose programming language, index card, iterative process, job-hopping, locality of reference, Menlo Park, Parkinson's law, premature optimization, sorting algorithm

Concrete and abstract classes (interfaces) are presented here (Chapter 10, Chapter 12), together with operator overloading (Chapter 11), polymorphism, and the use of class hierarchies (Chapter 12, Chapter 15). Chapter 13 presents templates, that is, C++’s facilities for defining families of types and functions. It demonstrates the basic techniques used to provide containers, such as lists, and to support generic programming. Chapter 14 presents exception handling, discusses techniques for error handling, and presents strategies for fault tolerance. I assume that you either aren’t well acquainted with object-oriented programming and generic programming or could benefit from an explanation of how the main abstraction techniques are supported by C++. Thus, I don’t just present the language features supporting the abstraction techniques; I also explain the techniques themselves. Part IV goes further in this direction. Part III presents the C++ standard library.

Many systems offer mechanisms, such as signals, to deal with asynchrony, but because these tend to be system-dependent, they are not described here. The exception-handling mechanism is a nonlocal control structure based on stack unwinding (§14.4) that can be seen as an alternative return mechanism. There are therefore legitimate uses of exceptions that have nothing to do with errors (§14.5). However, the primary aim of the exception-handling mechanism and the focus of this chapter is error handling and the support of fault tolerance. Standard C++ doesn’t have the notion of a thread or a process. Consequently, exceptional circumstances relating to concurrency are not discussed here. The concurrency facilities available on your system are described in its documentation. (The C++ Programming Language, Third Edition by Bjarne Stroustrup. Copyright ©1997 by AT&T. Published by Addison Wesley Longman, Inc.)

For example:

void use_file(const char* fn)
{
    FILE* f = fopen(fn, "w");
    // use f
    fclose(f);
}

This looks plausible until you realize that if something goes wrong after the call of fopen() and before the call of fclose(), an exception may cause use_file() to be exited without fclose() being called. Exactly the same problem can occur in languages that do not support exception handling. For example, the standard C library function longjmp() can cause the same problem. Even an ordinary return-statement could exit use_file() without closing f. A first attempt to make use_file() fault-tolerant looks like this:

void use_file(const char* fn)
{
    FILE* f = fopen(fn, "r");
    try {
        // use f
    }
    catch (...) {
        fclose(f);
        throw;
    }
    fclose(f);
}

The code using the file is enclosed in a try block that catches every exception, closes the file, and re-throws the exception.

pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, combinatorial explosion, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, full text search, Grace Hopper, information retrieval, Internet Archive, Kickstarter, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, web application

The storage subsystem deals with blocks, simplifying storage management (since blocks are a fixed size, it is easy to calculate how many can be stored on a given disk) and eliminating metadata concerns (blocks are just a chunk of data to be stored—file metadata such as permissions information does not need to be stored with the blocks, so another system can handle metadata separately). Furthermore, blocks fit well with replication for providing fault tolerance and availability. To insure against corrupted blocks and disk and machine failure, each block is replicated to a small number of physically separate machines (typically three). If a block becomes unavailable, a copy can be read from another location in a way that is transparent to the client. A block that is no longer available due to corruption or machine failure can be replicated from its alternative locations to other live machines to bring the replication factor back to the normal level.

The Command-Line Interface We’re going to have a look at HDFS by interacting with it from the command line. There are many other interfaces to HDFS, but the command line is one of the simplest and, to many developers, the most familiar. We are going to run HDFS on one machine, so first follow the instructions for setting up Hadoop in pseudo-distributed mode in Appendix A. Later you’ll see how to run on a cluster of machines to give us scalability and fault tolerance. There are two properties that we set in the pseudo-distributed configuration that deserve further explanation. The first is fs.default.name, set to hdfs://localhost/, which is used to set a default filesystem for Hadoop. Filesystems are specified by a URI, and here we have used an hdfs URI to configure Hadoop to use HDFS by default. The HDFS daemons will use this property to determine the host and port for the HDFS namenode.
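The fs.default.name property described above lives in Hadoop's core-site.xml. A pseudo-distributed configuration might look like the following (property name as given in the text; later Hadoop releases renamed it fs.defaultFS):

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>
```

With this in place, paths without an explicit scheme (such as /user/tom) resolve against the local HDFS namenode rather than the local filesystem.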

Reads are OK, but writes are getting slower and slower Drop secondary indexes and triggers (no indexes?). At this point, there are no clear solutions for how to solve your scaling problems. In any case, you’ll need to begin to scale horizontally. You can attempt to build some type of partitioning on your largest tables, or look into some of the commercial solutions that provide multiple master capabilities. Countless applications, businesses, and websites have successfully achieved scalable, fault-tolerant, and distributed data systems built on top of RDBMSs and are likely using many of the previous strategies. But what you end up with is something that is no longer a true RDBMS, sacrificing features and conveniences for compromises and complexities. Any form of slave replication or external caching introduces weak consistency into your now denormalized data. The inefficiency of joins and secondary indexes means almost all queries become primary key lookups.

pages: 58 words: 12,386

Big Data Glossary by Pete Warden

business intelligence, crowdsourcing, fault tolerance, information retrieval, linked data, natural language processing, recommendation engine, web application

More recent data processing systems, such as Hadoop and Cassandra, are designed to run on clusters of comparatively low-specification servers, and so the easiest way to handle more data is to add more of those machines to the cluster. This horizontal scaling approach tends to be cheaper as the number of operations and the size of the data increases, and the very largest data processing pipelines are all built on a horizontal model. There is a cost to this approach, though. Writing distributed data handling code is tricky and involves tradeoffs between speed, scalability, fault tolerance, and traditional database goals like atomicity and consistency. MapReduce MapReduce is an algorithm design pattern that originated in the functional programming world. It consists of three steps. First, you write a mapper function or script that goes through your input data and outputs a series of keys and values to use in calculating the results. The keys are used to cluster together bits of data that will be needed to calculate a single output result.

Applied Cryptography: Protocols, Algorithms, and Source Code in C by Bruce Schneier

active measures, cellular automata, Claude Shannon: information theory, complexity theory, dark matter, Donald Davies, Donald Knuth, dumpster diving, Exxon Valdez, fault tolerance, finite state, invisible hand, John von Neumann, knapsack problem, MITM: man-in-the-middle, NP-complete, P = NP, packet switching, RAND corporation, RFC: Request For Comment, software patent, telemarketer, traveling salesman, Turing machine, web of trust, Zimmermann PGP

ECB:
  Efficiency:
  + Speed is the same as the block cipher.
  - Ciphertext is up to one block longer than the plaintext, due to padding.
  - No preprocessing is possible.
  + Processing is parallelizable.
  Fault-tolerance:
  - A ciphertext error affects one full block of plaintext.
  - Synchronization error is unrecoverable.

CBC:
  Efficiency:
  + Speed is the same as the block cipher.
  - Ciphertext is up to one block longer than the plaintext, not counting the IV.
  - No preprocessing is possible.
  +/- Encryption is not parallelizable; decryption is parallelizable and has a random-access property.
  Fault-tolerance:
  - A ciphertext error affects one full block of plaintext and the corresponding bit in the next block.
  - Synchronization error is unrecoverable.

CFB:
  Security:
  + Plaintext patterns are concealed.
  + Input to the block cipher is randomized.
  + More than one message can be encrypted with the same key, provided that a different IV is used.
  +/- Plaintext is somewhat difficult to manipulate; blocks can be removed from the beginning and end of the message, bits of the first block can be changed, and repetition allows some controlled changes.
  Efficiency:
  + Some preprocessing is possible before a block is seen; the previous ciphertext block can be encrypted.
  +/- Encryption is not parallelizable; decryption is parallelizable and has a random-access property.
  Fault-tolerance:
  - A ciphertext error affects the corresponding bit of plaintext and the next full block.
  + Synchronization errors of full block sizes are recoverable. 1-bit CFB can recover from the addition or loss of single bits.

OFB/Counter:
  Security:
  + Plaintext patterns are concealed.
  + Input to the block cipher is randomized.
  + More than one message can be encrypted with the same key, provided that a different IV is used.
  - Plaintext is very easy to manipulate; any change in ciphertext directly affects the plaintext.
  Efficiency:
  + Speed is the same as the block cipher.
  - Ciphertext is the same size as the plaintext, not counting the IV.
  + Processing is possible before the message is seen.
  -/+ OFB processing is not parallelizable; counter processing is parallelizable.
  Fault-tolerance:
  + A ciphertext error affects only the corresponding bit of plaintext.
  - Synchronization error is unrecoverable.

There are other security considerations: Patterns in the plaintext should be concealed, input to the cipher should be randomized, manipulation of the plaintext by introducing errors in the ciphertext should be difficult, and encryption of more than one message with the same key should be possible. These will be discussed in detail in the next sections. Efficiency is another consideration. The mode should not be significantly less efficient than the underlying cipher. In some circumstances it is important that the ciphertext be the same size as the plaintext. A third consideration is fault-tolerance. Some applications need to parallelize encryption or decryption, while others need to be able to preprocess as much as possible. In still others it is important that the decrypting process be able to recover from bit errors in the ciphertext stream, or dropped or added bits. As we will see, different modes have different subsets of these characteristics. 9.1 Electronic Codebook Mode Electronic codebook (ECB) mode is the most obvious way to use a block cipher: A block of plaintext encrypts into a block of ciphertext.
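The ECB property just described is easy to demonstrate: because each block is encrypted independently, identical plaintext blocks produce identical ciphertext blocks, so patterns leak. The "block cipher" below is a keyed deterministic mapping faked with a hash, for illustration only; it is not invertible and not a real cipher.

```python
import hashlib

BLOCK = 8  # block size in bytes

def toy_block_encrypt(key, block):
    # Stand-in for a block cipher: a deterministic keyed mapping.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ecb_encrypt(key, plaintext):
    # ECB: every block is encrypted independently of its neighbours.
    return b"".join(toy_block_encrypt(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))

ct = ecb_encrypt(b"k", b"ATTACK!!ATTACK!!RETREAT!")
# The repeated plaintext block is visible in the ciphertext:
print(ct[0:8] == ct[8:16], ct[8:16] == ct[16:24])  # True False
```

Chaining modes such as CBC fix exactly this by mixing each plaintext block with the previous ciphertext block before encryption, which is why they conceal plaintext patterns.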

pages: 834 words: 180,700

The Architecture of Open Source Applications by Amy Brown, Greg Wilson

8-hour work day, anti-pattern, bioinformatics, c2.com, cloud computing, collaborative editing, combinatorial explosion, computer vision, continuous integration, create, read, update, delete, David Heinemeier Hansson, Debian, domain-specific language, Donald Knuth, en.wikipedia.org, fault tolerance, finite state, Firefox, friendly fire, Guido van Rossum, linked data, load shedding, locality of reference, loose coupling, Mars Rover, MITM: man-in-the-middle, MVC pattern, peer-to-peer, Perl 6, premature optimization, recommendation engine, revision control, Ruby on Rails, side project, Skype, slashdot, social web, speech recognition, the scientific method, The Wisdom of Crowds, web application, WebSocket

The coordinator distributes requests to individual CouchDB instances based on the key of the document being requested. Twitter has built the notions of sharding and replication into a coordinating framework called Gizzard. Gizzard takes standalone data stores of any type—you can build wrappers for SQL or NoSQL storage systems—and arranges them in trees of any depth to partition keys by key range. For fault tolerance, Gizzard can be configured to replicate data to multiple physical machines for the same key range. 13.4.3. Consistent Hash Rings Good hash functions distribute a set of keys in a uniform manner. This makes them a powerful tool for distributing key-value pairs among multiple servers. The academic literature on a technique called consistent hashing is extensive, and the first applications of the technique to data stores were in systems called distributed hash tables (DHTs).
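The consistent hashing technique mentioned above can be sketched in a few lines: servers are hashed onto a ring, and each key belongs to the first server clockwise from the key's hash. The class and server names, the vnode count, and the choice of MD5 are all illustrative assumptions.

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, servers, vnodes=8):
        # Place several virtual nodes per server on the ring so that
        # keys spread more uniformly and removals rebalance gently.
        self.ring = sorted(
            (self._h(f"{server}#{i}"), server)
            for server in servers for i in range(vnodes)
        )
        self.points = [point for point, _ in self.ring]

    @staticmethod
    def _h(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def lookup(self, key):
        # First ring point clockwise from the key's hash (wraps around).
        i = bisect.bisect(self.points, self._h(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["a", "b", "c"])
before = {k: ring.lookup(k) for k in ("x", "y", "z")}
smaller = HashRing(["a", "b"])  # server "c" removed
# Only the keys that lived on "c" move; everything else stays put.
```

That stability under membership change is the whole point: a plain `hash(key) % n` scheme would instead remap almost every key when n changes.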

Routing is simple in the hash partitioning scheme: for the most part, the hash function can be executed by clients to find the appropriate server. With more complicated rebalancing schemes, finding the right node for a key becomes more difficult. Range partitioning requires the upfront cost of maintaining routing and configuration nodes, which can see heavy load and become central points of failure in the absence of relatively complex fault tolerance schemes. Done well, however, range-partitioned data can be load-balanced in small chunks which can be reassigned in high-load situations. If a server goes down, its assigned ranges can be distributed to many servers, rather than loading the server's immediate neighbors during downtime. 13.5. Consistency Having spoken about the virtues of replicating data to multiple machines for durability and spreading load, it's time to let you in on a secret: keeping replicas of your data on multiple machines consistent with one another is hard.

The Architecture of Open Source Applications, Amy Brown and Greg Wilson (eds.), ISBN 978-1-257-63801-7. Chapter 15. Riak and Erlang/OTP, by Francesco Cesarini, Andy Gross, and Justin Sheehy. Riak is a distributed, fault-tolerant, open source database that illustrates how to build large-scale systems using Erlang/OTP. Thanks in large part to Erlang's support for massively scalable distributed systems, Riak offers features that are uncommon in databases, such as high availability and linear scalability of both capacity and throughput. Erlang/OTP provides an ideal platform for developing systems like Riak because it provides inter-node communication, message queues, failure detectors, and client-server abstractions out of the box.

pages: 319 words: 72,969

Nginx HTTP Server Second Edition by Clement Nedelcu

Debian, fault tolerance, Firefox, Google Chrome, Ruby on Rails, web application

Features As of the stable version 1.2.9, Nginx offers an impressive variety of features, which, contrary to what the title of this book indicates, are not all related to serving HTTP content. Here is a list of the main features of the web branch, quoted from the official website www.nginx.org: • Handling of static files, index files, and autoindexing; open file descriptor cache. • Accelerated reverse proxying with caching; simple load balancing and fault tolerance. • Accelerated support with caching of remote FastCGI servers; simple load balancing and fault tolerance. • Modular architecture. Filters include Gzipping, byte ranges, chunked responses, XSLT, SSI, and image resizing filter. Multiple SSI inclusions within a single page can be processed in parallel if they are handled by FastCGI or proxied servers. • SSL and TLS SNI support (TLS with Server Name Indication (SNI), required for using TLS on a server doing virtual hosting).

pages: 66 words: 9,247

MongoDB and Python by Niall O’Higgins

cloud computing, Debian, fault tolerance, semantic web, web application

MongoDB ObjectIds have the nice property of being almost-certainly-unique upon generation, hence no central coordination is required. This contrasts sharply with the common RDBMS idiom of using auto-increment primary keys. Guaranteeing that an auto-increment key is not already in use usually requires consulting some centralized system. When the intention is to provide a horizontally scalable, de-centralized and fault-tolerant database—as is the case with MongoDB—auto-increment keys represent an ugly bottleneck. By employing ObjectId as your _id, you leave the door open to horizontal scaling via MongoDB’s sharding capabilities. While you can in fact supply your own value for the _id property if you wish—so long as it is globally unique—this is best avoided unless there is a strong reason to do otherwise. Examples of cases where you may be forced to provide your own _id property value include migration from RDBMS systems which utilized the previously-mentioned auto-increment primary key idiom.
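A rough sketch of why ObjectIds are almost certainly unique without central coordination: each id packs a timestamp with per-process random bytes and an incrementing counter. The layout below follows the modern BSON ObjectId form (4-byte timestamp + 5 random bytes + 3-byte counter); older layouts used machine and pid fields instead, and this sketch is not the pymongo implementation.

```python
import os
import struct
import time
from itertools import count

_RANDOM = os.urandom(5)                              # fixed per process
_COUNTER = count(int.from_bytes(os.urandom(3), "big"))  # random start

def object_id():
    # 4-byte big-endian timestamp + 5 random bytes + 3-byte counter,
    # rendered as the familiar 24-character hex string.
    ts = struct.pack(">I", int(time.time()))
    ctr = (next(_COUNTER) % 0x1000000).to_bytes(3, "big")
    return (ts + _RANDOM + ctr).hex()

a, b = object_id(), object_id()
print(len(a), a != b)  # 24 True
```

No two processes are likely to share the same 5 random bytes, and within a process the counter disambiguates ids minted in the same second, so no central id-issuing service is needed.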

pages: 355 words: 81,788

Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith by Sam Newman

Airbnb, business process, continuous integration, database schema, DevOps, fault tolerance, ghettoisation, inventory management, Jeff Bezos, Kubernetes, loose coupling, microservices, MVC pattern, price anchoring, pull request, single page application, software as a service, source of truth, telepresence

Norton & Company, 2010) as an excellent overview of the part that credit derivatives played in the global financial crisis of 2007–2008. I often look back at the small part I played in this industry with a great deal of regret. It turns out not knowing what you’re doing and doing it anyway can have some pretty disastrous implications. 7 See Liming Chen and Algirdas Avizienis, “N-Version Programming: A Fault-Tolerance Approach to Reliability of Software Operation,” published in the Twenty-Fifth International Symposium on Fault-Tolerant Computing (1995). Chapter 4. Decomposing the Database As we’ve already explored, there are a host of ways to extract functionality into microservices. However, we need to address the elephant in the room: namely, what do we do about our data? Microservices work best when we practice information hiding, which in turn typically leads us toward microservices totally encapsulating their own data storage and retrieval mechanisms.

pages: 713 words: 93,944

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement by Eric Redmond, Jim Wilson, Jim R. Wilson

AGPL, Amazon Web Services, create, read, update, delete, data is the new oil, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, full text search, general-purpose programming language, Kickstarter, linked data, MVC pattern, natural language processing, node package manager, random walk, recommendation engine, Ruby on Rails, Skype, social graph, web application

Just like Riak (“Ree-ahck”), you never use only one, but the multiple parts working together make the overall system durable. Each component is cheap and expendable, but when used right, it’s hard to find a simpler or stronger structure upon which to build a foundation. Riak is a distributed key-value database where values can be anything—from plain text, JSON, or XML to images or video clips—all accessible through a simple HTTP interface. Whatever data you have, Riak can store it. Riak is also fault-tolerant. Servers can go up or down at any moment with no single point of failure. Your cluster continues humming along as servers are added, removed, or (ideally not) crash. Riak won’t keep you up nights worrying about your cluster—a failed node is not an emergency, and you can wait to deal with it in the morning. As core developer Justin Sheehy once noted, “[The Riak team] focused so hard on things like write availability…to go back to sleep.”

It is based on BigTable, a high-performance, proprietary database developed by Google and described in the 2006 white paper “Bigtable: A Distributed Storage System for Structured Data.”[26] Initially created for natural-language processing, HBase started life as a contrib package for Apache Hadoop. Since then, it has become a top-level Apache project. On the architecture front, HBase is designed to be fault tolerant. Hardware failures may be uncommon for individual machines, but in a large cluster, node failure is the norm. By using write-ahead logging and distributed configuration, HBase can quickly recover from individual server failures. Additionally, HBase lives in an ecosystem that has its own complementary benefits. HBase is built on Hadoop—a sturdy, scalable computing platform that provides a distributed file system and mapreduce capabilities.

pages: 304 words: 91,566

Bitcoin Billionaires: A True Story of Genius, Betrayal, and Redemption by Ben Mezrich

"side hustle", airport security, Albert Einstein, bank run, Ben Horowitz, bitcoin, blockchain, Burning Man, buttonwood tree, cryptocurrency, East Village, El Camino Real, Elon Musk, family office, fault tolerance, fiat currency, financial innovation, game design, Isaac Newton, Marc Andreessen, Mark Zuckerberg, Menlo Park, Metcalfe’s law, new economy, offshore financial centre, paypal mafia, peer-to-peer, Peter Thiel, Ponzi scheme, QR code, Ronald Reagan, Ross Ulbricht, Sand Hill Road, Satoshi Nakamoto, Schrödinger's Cat, self-driving car, side project, Silicon Valley, Skype, smart contracts, South of Market, San Francisco, Steve Jobs, transaction costs, zero-sum game

With this security design, a thief would have to rob three different banks—or bribe employees at three different banks—or pull off some combination thereof to gain control of the twins’ bitcoin. Either way, it would be a logistical nightmare—Mission Impossible shit that only worked in the movies—to get ahold of the three shards that made up the bitcoin private key. Moreover, the twins had replicated this model four times across different geographic regions, to build redundancy into their system—removing the final single point of failure—and improving their overall fault tolerance. This way, if a natural disaster like a major tornado decimated the Midwest, there would still be other sets of alpha, bravo, and charlie spread across other regions in the country (the Northeast, Mid-Atlantic, West, etc.) that could be assembled to form the twins’ private key. If a mega tsunami—or hell, Godzilla—hit the eastern seaboard, or a meteor hit Los Angeles, the twins’ private key would still be safe.
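The book does not describe the twins' actual splitting scheme (it was likely a Shamir-style construction); a minimal stand-in that captures the idea is XOR secret sharing, in which every shard is required to rebuild the key and any subset reveals nothing:

```python
import secrets
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key, n=3):
    """Split `key` into n shards; all n are needed to reconstruct it."""
    shards = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    shards.append(reduce(xor_bytes, shards, key))  # last shard closes the XOR
    return shards

def recombine(shards):
    return reduce(xor_bytes, shards)

key = b"illustrative-private-key"
shards = split_key(key)
assert recombine(shards) == key          # all three shards recover the key
assert recombine(shards[:2]) != key      # two shards alone are useless
```

Replicating each shard set across regions, as the passage describes, then removes the remaining single point of failure: losing one region destroys copies, not the only copies.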

Tyler corralled the security expert by the pool table, where McCaleb and Levchin were geeking out on god knows what. “Why all three?” Tyler asked. “Doesn’t one do the job?” Kaminsky shrugged. “The second one is to tell if the first one is broken. The third is to tell if the other two are lying.” It was exactly how Tyler should have expected a security engineer to think—in terms of systems and their fault tolerance and integrity. Over the next ten minutes, he interrogated Kaminsky about his hacking efforts; at first, the security expert had expected to be able to penetrate such a complex piece of code easily—the fact that it was so complex, so long, meant there should have been many weak spots to exploit. But over the days spent in his parents’ basement filled with computers, he kept coming up empty. Every time he thought he had found a bug or an exploit, he was met by a message in the code proclaiming “Attack removed.”
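The three-monitor arrangement Kaminsky describes is classic majority voting over redundant replicas. A minimal sketch (the replica answers are illustrative):

```python
from collections import Counter

def vote(replies):
    """Return the majority answer from independent replicas, or None
    if no answer reaches a majority (the replicas 'disagree')."""
    winner, count = Counter(replies).most_common(1)[0]
    return winner if count > len(replies) // 2 else None

print(vote([42, 42, 42]))   # all agree
print(vote([42, 42, 7]))    # one replica broken: majority still wins
print(vote([42, 7, 13]))    # no majority: the system knows it cannot trust anyone
```

Two replicas can only detect a disagreement; a third lets you outvote a single liar, which is why "one to tell if the first is broken" and "one to tell if the other two are lying" are different jobs.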

pages: 329 words: 95,309

Digital Bank: Strategies for Launching or Becoming a Digital Bank by Chris Skinner

algorithmic trading, AltaVista, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, bank run, Basel III, bitcoin, business cycle, business intelligence, business process, business process outsourcing, buy and hold, call centre, cashless society, clean water, cloud computing, corporate social responsibility, credit crunch, crowdsourcing, cryptocurrency, demand response, disintermediation, don't be evil, en.wikipedia.org, fault tolerance, fiat currency, financial innovation, Google Glasses, high net worth, informal economy, Infrastructure as a Service, Internet of things, Jeff Bezos, Kevin Kelly, Kickstarter, M-Pesa, margin call, mass affluent, MITM: man-in-the-middle, mobile money, Mohammed Bouazizi, new economy, Northern Rock, Occupy movement, Pingit, platform as a service, Ponzi scheme, prediction markets, pre–internet, QR code, quantitative easing, ransomware, reserve currency, RFID, Satoshi Nakamoto, Silicon Valley, smart cities, social intelligence, software as a service, Steve Jobs, strong AI, Stuxnet, trade route, unbanked and underbanked, underbanked, upwardly mobile, We are the 99%, web application, WikiLeaks, Y2K

The first category is the one that will occur more and more often, as banks have so many legacy systems across their core back office operations. It is far easier to change and add new front office systems – new trading desks, new channels or new customer service operations – than to replace core back office platforms – deposit account processing, post-trade services and payment systems. Why? Because the core processing needs to be highly resilient; 99.9999999999999999999999% and a few more 9’s fault tolerant; and running 24 by 7. In other words, these systems are non-stop and would expose the bank to serious failure if they stopped working. It is these systems, however, that cause most of the challenges for a bank. This is because, being core systems, they were often developed in the 1960s and 1970s. Back then, computing technologies were based upon lines of code fed into the machine through packs and packs of punched cards.
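The string of nines above is hyperbole, but nines translate directly into permitted downtime, and the arithmetic is worth seeing (the availability figures below are illustrative):

```python
def downtime_minutes_per_year(availability):
    """Minutes of downtime per year permitted at a given availability."""
    return (1 - availability) * 365 * 24 * 60

for label, avail in [("three nines", 0.999),
                     ("five nines", 0.99999),
                     ("six nines", 0.999999)]:
    print(f"{label}: {downtime_minutes_per_year(avail):.2f} min/year")
```

Five nines already allows barely five minutes of downtime a year, which is why a planned stop of a "non-stop" core system is such a delicate event.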

Add to this the regulatory regime change, which would force banks to respond more and more rapidly to new requirements, and the old technologies could not keep up. Finally, the technology had to change. This is why banks have been working hard to consolidate and replace their old infrastructures, and why we are seeing more and more glitches and failures. As soon as you upgrade an old, embedded, non-stop fault-tolerant machine, however, you are open to risk. The 99.9999+% non-stop machine suddenly has to stop. A competent bank derisks change by testing, testing and testing, whilst an incompetent bank may test but not enough. Luckily, most banks and exchanges are competent enough to test these things properly by planning correctly through roll forward and roll back cycles. The real issue with an upgrade or consolidation, though, is that it has to be done more and more frequently due to the combined forces of regulatory, technology and customer change.

Service Design Patterns: Fundamental Design Solutions for SOAP/WSDL and RESTful Web Services by Robert Daigneau

Amazon Web Services, business intelligence, business process, continuous integration, create, read, update, delete, en.wikipedia.org, fault tolerance, loose coupling, MITM: man-in-the-middle, MVC pattern, pull request, RFC: Request For Comment, Ruby on Rails, software as a service, web application

(e.g., reserve flight) for each retrieved request. Once a task has completed, the request would be forwarded to the next background process to perform the next task (e.g., reserve hotel), and so on. The request is therefore processed much like a baton is passed from one runner to the next in a relay race. Web server scalability is promoted because the work is off-loaded from the web servers. This pattern also provides a relatively fault-tolerant way to conduct long-running business processes. However, it can be challenging to understand the entire business process at a macro level, and it can also be difficult to change or debug control-flow logic since these rules are typically buried within individual services, configuration files, routing tables, and messages in transit. Furthermore, the status of a client’s request can be difficult to ascertain for similar reasons.

Many workflow engines save the state of tasks and variables to a database before and after tasks are executed. These Process Snapshots provide several benefits. One may query the database to determine the status of any process instance. If a process instance crashes, the database may be queried to determine the last task that completed successfully, and the process may be restarted from that step. This is one way Workflow Engines help to ensure fault tolerance. Figure 5.4 Graphical workflow design tools let developers depict control flow through UML activity diagrams and flowcharts. Information may be mapped from one task to another through Process Variables. The Workflow Connector pattern uses web services as a means to launch the business processes managed by workflow engines.
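The Process Snapshot idea can be sketched in a few lines; the task names and the in-memory "database" below are illustrative, not the book's code:

```python
# Snapshot table: process_id -> index of the last task that completed.
snapshots = {}

TASKS = ["reserve_flight", "reserve_hotel", "issue_confirmation"]

def run(process_id, tasks=TASKS, crash_before=None):
    start = snapshots.get(process_id, -1) + 1   # resume past last snapshot
    for i in range(start, len(tasks)):
        if crash_before is not None and i == crash_before:
            raise RuntimeError(f"crashed before {tasks[i]}")
        # ... perform tasks[i] here ...
        snapshots[process_id] = i               # persist snapshot after the task

# First attempt crashes before the hotel reservation:
try:
    run("booking-1", crash_before=1)
except RuntimeError:
    pass

# The snapshot shows the flight step already completed, so a restart
# resumes at the hotel step instead of re-reserving the flight:
run("booking-1")
print(snapshots["booking-1"])  # index of the final completed task
```

A real engine would write the snapshot in the same transaction as the task's side effects; this sketch only shows the resume-from-last-step logic.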

The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise by Martin L. Abbott, Michael T. Fisher

always be closing, anti-pattern, barriers to entry, Bernie Madoff, business climate, business continuity plan, business intelligence, business process, call centre, cloud computing, combinatorial explosion, commoditize, Computer Numeric Control, conceptual framework, database schema, discounted cash flows, en.wikipedia.org, fault tolerance, finite state, friendly fire, hiring and firing, Infrastructure as a Service, inventory management, new economy, packet switching, performance metric, platform as a service, Ponzi scheme, RFC: Request For Comment, risk tolerance, Rubik’s Cube, Search for Extraterrestrial Intelligence, SETI@home, shareholder value, Silicon Valley, six sigma, software as a service, the scientific method, transaction costs, Vilfredo Pareto, web application, Y2K

If we have a technology platform composed of a number of noncommunicating services, we increase the number of airports or runways for which we are managing traffic; as a result, we can have many more “landings” or changes. If the services communicate asynchronously, we would have a few more concerns, but we are also likely more willing to take risks. On the other hand, if the services all communicate synchronously with each other, there isn’t much more fault tolerance than with a monolithic system (see Chapter 21, Creating Fault Isolative Architectural Structures) and we are back to managing a single runway at a single airport. The expected result of the change is important as we want to be able to verify later that the change was successful. For instance, if a change is being made to a Web server and that change is to allow more threads of execution in the Web server, we should state that as the expected result.

Be careful here, because if you become an early adopter of software or systems, you will also be on the leading edge of finding all the bugs with that software or system. If availability and reliability are important to you and your customers, try to be an early majority or late majority adopter of those systems that are critical to the operations of your service, product, or platform. Asynchronous Design Whenever possible, systems should communicate in an asynchronous fashion. Asynchronous systems tend to be more fault tolerant to extreme load and do not easily fall prey to the multiplicative effects of failure that characterize synchronous systems. We will discuss the reasons for this in greater detail in the next section of this chapter. Stateless Systems Although some systems need state, state has a cost in terms of availability, scalability, and overall cost of your system. When you store state, you do so at a cost of memory or disk space and maybe the cost of databases.
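The asynchronous hand-off the authors recommend can be illustrated minimally: the caller finishes as soon as the request is queued, so a slow or briefly absent worker does not propagate failure back up the chain (the names below are illustrative):

```python
import queue

requests_q = queue.Queue()

def frontend_handle(request):
    """Accept the request and return immediately; work happens later."""
    requests_q.put(request)
    return "accepted"

# Even with no worker currently draining the queue, callers succeed:
for i in range(3):
    assert frontend_handle(f"req-{i}") == "accepted"

# A worker drains the backlog once it is healthy again:
processed = []
while not requests_q.empty():
    processed.append(requests_q.get())
print(processed)
```

In a synchronous design the same three callers would each have blocked on the unhealthy worker, tying up their own threads and pushing the failure upstream.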

The first factor to use in determining which services should be selected for stress testing is the criticality of each service to the overall system performance. If there is a central service such as a data abstraction layer (DAL) or user authorization, this should be included as a candidate for stress testing because the stability of the entire application depends on this service. If you have architected your application into fault tolerant “swim lanes,” which will be discussed in Chapter 21, Creating Fault Isolative Architectural Structures, you still likely have core services that have been replicated across the lanes. The second consideration for determining services to stress test is the likelihood that a service affects performance. This decision will be influenced by knowledgeable engineers but should also be somewhat scientific.

pages: 1,201 words: 233,519

Coders at Work by Peter Seibel

Ada Lovelace, bioinformatics, cloud computing, Conway's Game of Life, domain-specific language, don't repeat yourself, Donald Knuth, fault tolerance, Fermat's Last Theorem, Firefox, George Gilder, glass ceiling, Guido van Rossum, HyperCard, information retrieval, Larry Wall, loose coupling, Marc Andreessen, Menlo Park, Metcalfe's law, Perl 6, premature optimization, publish or perish, random walk, revision control, Richard Stallman, rolodex, Ruby on Rails, Saturday Night Live, side project, slashdot, speech recognition, the scientific method, Therac-25, Turing complete, Turing machine, Turing test, type inference, Valgrind, web application

It's a lot better than shared memory programming. I think that's the one thing Erlang has done—it has actually demonstrated that. When we first did Erlang and we went to conferences and said, “You should copy all your data.” And I think they accepted the arguments over fault tolerance—the reason you copy all your data is to make the system fault tolerant. They said, “It'll be terribly inefficient if you do that,” and we said, “Yeah, it will but it'll be fault tolerant.” The thing that is surprising is that it's more efficient in certain circumstances. What we did for the reasons of fault tolerance, turned out to be, in many circumstances, just as efficient or even more efficient than sharing. Then we asked the question, “Why is that?” Because it increased the concurrency. When you're sharing, you've got to lock your data when you access it.
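Armstrong's "copy all your data" point can be illustrated outside Erlang. In the sketch below, `deepcopy` stands in for Erlang's per-process heaps: a receiver gets its own copy of every message, so no locks are needed and a misbehaving receiver cannot corrupt the sender's state (this is an illustration of the idea, not Erlang's implementation):

```python
import copy

def send(mailbox, message):
    """Deliver a message by copying it, never by sharing a reference."""
    mailbox.append(copy.deepcopy(message))

sender_state = {"balance": 100}
mailbox = []
send(mailbox, sender_state)

mailbox[0]["balance"] = -1       # receiver mangles its copy...
print(sender_state["balance"])   # ...the sender's data is untouched
```

Sharing the reference instead would have required a lock around every access, which is exactly the serialization cost Armstrong says copying avoids.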

pages: 102 words: 27,769

Rework by Jason Fried, David Heinemeier Hansson

call centre, Clayton Christensen, Dean Kamen, Exxon Valdez, fault tolerance, James Dyson, Jeff Bezos, Ralph Nader, risk tolerance, Ruby on Rails, Steve Jobs, Tony Hsieh, Y Combinator

—Saul Kaplan, chief catalyst, Business Innovation Factory “Appealingly intimate, as if you’re having coffee with the authors. Rework is not just smart and succinct but grounded in the concreteness of doing rather than hard-to-apply philosophizing. This book inspired me to trust myself in defying the status quo.” —Penelope Trunk, author of Brazen Careerist: The New Rules for Success “[This book’s] assumption is that an organization is a piece of software. Editable. Malleable. Sharable. Fault-tolerant. Comfortable in Beta. Reworkable. The authors live by the credo ‘keep it simple, stupid’ and Rework possesses the same intelligence—and irreverence—of that simple adage.” —John Maeda, author of The Laws of Simplicity “Rework is like its authors: fast-moving, iconoclastic, and inspiring. It’s not just for startups. Anyone who works can learn from this.” —Jessica Livingston, partner, Y Combinator; author, Founders at Work INTRODUCTION FIRST The new reality TAKEDOWNS Ignore the real world Learning from mistakes is overrated Planning is guessing Why grow?

pages: 400 words: 94,847

Reinventing Discovery: The New Era of Networked Science by Michael Nielsen

Albert Einstein, augmented reality, barriers to entry, bioinformatics, Cass Sunstein, Climategate, Climatic Research Unit, conceptual framework, dark matter, discovery of DNA, Donald Knuth, double helix, Douglas Engelbart, Douglas Engelbart, en.wikipedia.org, Erik Brynjolfsson, fault tolerance, Fellow of the Royal Society, Firefox, Freestyle chess, Galaxy Zoo, Internet Archive, invisible hand, Jane Jacobs, Jaron Lanier, Johannes Kepler, Kevin Kelly, Magellanic Cloud, means of production, medical residency, Nicholas Carr, P = NP, publish or perish, Richard Feynman, Richard Stallman, selection bias, semantic web, Silicon Valley, Silicon Valley startup, Simon Singh, Skype, slashdot, social intelligence, social web, statistical model, Stephen Hawking, Stewart Brand, Ted Nelson, The Death and Life of Great American Cities, The Nature of the Firm, The Wisdom of Crowds, University of East Anglia, Vannevar Bush, Vernor Vinge

Lipman, and Nancy J. Cox et al. A global initiative on sharing avian flu data. Nature, 442:981, August 31, 2006. [21] John Bohannon. Gamers unravel the secret life of protein. Wired, 17(5), April 20, 2009. http://www.wired.com/medtech/genetics/magazine/17-05/ff_protein?currentPage=all. [22] Parsa Bonderson, Sankar Das Sarma, Michael Freedman, and Chetan Nayak. A blueprint for a topologically fault-tolerant quantum computer. eprint arXiv:1003.2856, 2010. [23] Christine L. Borgman. Scholarship in the Digital Age. Cambridge, MA: MIT Press, 2007. [24] Kirk D. Borne et al. Astroinformatics: A 21st century approach to astronomy. eprint arXiv: 0909.3892, 2009. Position paper for Astro2010 Decadal Survey State, available at http://arxiv.org/abs/0909.3892. [25] Todd A. Boroson and Tod R. Lauer.

Speculations on the future of science. Edge: The Third Culture, 2006. http://www.edge.org/3rd_culture/kelly06/kelly06_index.html. [109] Kevin Kelly. What Technology Wants. New York: Viking, 2010. [110] Richard A. Kerr. Recently discovered habitable world may not exist. Science Now, October 12, 2010. http://news.sciencemag.org/sciencenow/2010/10/recently-discovered-habitable-world.html. [111] A. Yu Kitaev. Fault-tolerant quantum computation by anyons. Annals of Physics, 303(1):2–30, 2003. [112] Helge Kragh. Max Planck: The reluctant revolutionary. Physics World, December 2000. http://physicsworld.com/cws/article/print/373. [113] Greg Kroah-Hartman. The Linux kernel. Online video from Google Tech Talks. http://www.youtube.com/watch?v=L2SED6sewRw. [114] Greg Kroah-Hartman, Jonathan Corbet, and Amanda McPherson.

pages: 178 words: 33,275

Ansible Playbook Essentials by Gourav Shah

Amazon Web Services, cloud computing, Debian, DevOps, fault tolerance, web application

ISBN 978-1-78439-829-3 www.packtpub.com Credits Author Gourav Shah Reviewers Ajey Gore Olivier Korver Ben Mildren Aditya Patawari Acquisition Editor Vinay Argekar Content Development Editor Amey Varangaonkar Technical Editor Abhishek R. Kotian Copy Editors Pranjali Chury Neha Vyas Project Coordinator Suzanne Coutinho Proofreader Safis Editing Indexer Monica Ajmera Mehta Graphics Jason Monteiro Production Coordinator Nilesh R. Mohite Cover Work Nilesh R. Mohite About the Author Gourav Shah (www.gouravshah.com) has extensive experience in building and managing highly available, automated, fault-tolerant infrastructure and scaling it. He started his career as a passionate Linux and open source enthusiast, transformed himself into an operations engineer, and evolved to be a cloud and DevOps expert and trainer. In his previous avatar, Gourav headed IT operations for Efficient Frontier (now Adobe), India. He founded Initcron Systems (www.initcron.com), a niche consulting firm that specializes in DevOps enablement and cloud infrastructure management.

pages: 554 words: 108,035

Scala in Depth by Tom Kleenex, Joshua Suereth

discrete time, domain-specific language, fault tolerance, MVC pattern, sorting algorithm, type inference

These aren’t discussed in the book, but can be found in Akka’s documentation at http://akka.io/docs/. This technique can be powerful when distributed and clustered. The Akka 2.0 framework is adding the ability to create actors inside a cluster and allow them to be dynamically moved around to machines as needed. 9.6. Summary Actors provide a simpler parallelization model than traditional locking and threading. A well-behaved actor system can be fault-tolerant and resistant to total system slowdown. Actors provide an excellent abstraction for designing high-performance servers, where throughput and uptime are of the utmost importance. For these systems, designing failure zones and failure handling behaviors can help keep a system running even in the event of critical failures. Splitting actors into scheduling zones can ensure that input overload to any one portion of the system won’t bring the rest of the system down.

So, while the Scala actors library is an excellent resource for creating actor applications, the Akka library provides the features and performance needed to make a production application. Akka also supports common features out of the box. Actors and actor-related system design is a rich subject. This chapter lightly covered a few of the key aspects to actor-related design. These should be enough to create a fault-tolerant, high-performance actor system. Next let’s look into a topic of great interest: Java interoperability with Scala. Chapter 10. Integrating Scala with Java In this chapter The benefits of using interfaces for Scala-Java interaction The dangers of automatic implicit conversions of Java types The complications of Java serialization in Scala How to effectively use annotations in Scala for Java libraries One of the biggest advantages of the Scala language is its ability to seamlessly interact with existing Java libraries and applications.

HBase: The Definitive Guide by Lars George

Amazon Web Services, bioinformatics, create, read, update, delete, Debian, distributed revision control, domain-specific language, en.wikipedia.org, fault tolerance, Firefox, Google Earth, Kickstarter, place-making, revision control, smart grid, web application

You may have a background in relational database theory or you want to start fresh and this “column-oriented thing” is something that seems to fit your bill. You also heard that HBase can scale without much effort, and that alone is reason enough to look at it since you are building the next web-scale system. I was at that point in late 2007 when I was facing the task of storing millions of documents in a system that needed to be fault-tolerant and scalable while still being maintainable by just me. I had decent skills in managing a MySQL database system, and was using the database to store data that would ultimately be served to our website users. This database was running on a single server, with another as a backup. The issue was that it would not be able to hold the amount of data I needed to store for this new project. I would have to either invest in serious RDBMS scalability skills, or find something else instead.

Looking at open source alternatives in the RDBMS space, you will likely have to give up many or all relational features, such as secondary indexes, to gain some level of performance. The question is, wouldn’t it be good to trade relational features permanently for performance? You could denormalize (see the next section) the data model and avoid waits and deadlocks by minimizing necessary locking. How about built-in horizontal scalability without the need to repartition as your data grows? Finally, throw in fault tolerance and data availability, using the same mechanisms that allow scalability, and what you get is a NoSQL solution—more specifically, one that matches what HBase has to offer. Database (De-)Normalization At scale, it is often a requirement that we design schema differently, and a good term to describe this principle is Denormalization, Duplication, and Intelligent Keys (DDI).[20] It is about rethinking how data is stored in Bigtable-like storage systems, and how to make use of it in an appropriate way.
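The "Intelligent Keys" leg of DDI can be sketched briefly: pack the query dimensions into the row key so that a scan over a key range replaces a relational join (the field names below are illustrative, not from the book):

```python
def row_key(user_id, timestamp_ms, event_type):
    """Compose a Bigtable-style row key from the query dimensions.
    Zero-padding the timestamp makes lexicographic order match
    chronological order, so range scans return events in time order."""
    return f"{user_id}|{timestamp_ms:013d}|{event_type}"

k1 = row_key("user42", 1300000000000, "click")
k2 = row_key("user42", 1300000000500, "view")
print(k1)
print(k1 < k2)  # one user's events sort chronologically
```

Fetching "all of user42's events in March" then becomes a single contiguous scan between two keys, with no secondary index and no join.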

These are abstractions that define higher-level features and APIs, which are then used by Hadoop to store the data. The data is eventually stored on a disk, at which point the OS filesystem is used. HDFS is the most used and tested filesystem in production. Almost all production clusters use it as the underlying storage layer. It is proven stable and reliable, so deviating from it may impose its own risks and subsequent problems. The primary reason HDFS is so popular is its built-in replication, fault tolerance, and scalability. Choosing a different filesystem should provide the same guarantees, as HBase implicitly assumes that data is stored in a reliable manner by the filesystem. It has no added means to replicate data or even maintain copies of its own storage files. This functionality must be provided by the lower-level system. You can select a different filesystem implementation by using a URI[36] pattern, where the scheme (the part of the URI before the first colon) identifies the driver to be used.

pages: 541 words: 109,698

Mining the Social Web: Finding Needles in the Social Haystack by Matthew A. Russell

Climategate, cloud computing, crowdsourcing, en.wikipedia.org, fault tolerance, Firefox, full text search, Georg Cantor, Google Earth, information retrieval, Mark Zuckerberg, natural language processing, NP-complete, Saturday Night Live, semantic web, Silicon Valley, slashdot, social graph, social web, statistical model, Steve Jobs, supply-chain management, text mining, traveling salesman, Turing test, web application

Sorting by date seems like a good idea and opens the door to certain kinds of time-series analysis, so let’s start there and see what happens. But first, we’ll need to make a small configuration change so that we can write our map/reduce functions to perform this task in Python. CouchDB is especially intriguing in that it’s written in Erlang, a language engineered to support super-high concurrency[16] and fault tolerance. The de facto out-of-the-box language you use to query and transform your data via map/reduce functions is JavaScript. Note that we could certainly opt to write map/reduce functions in JavaScript and realize some benefits from built-in JavaScript functions CouchDB offers—such as _sum, _count, and _stats. But the benefit gained from your development environment’s syntax checking/highlighting may prove more useful and easier on the eyes than staring at JavaScript functions wrapped up as triple-quoted string values that exist inside of Python code.
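A hedged sketch of what such Python map/reduce functions look like (couchdb-python-style signatures are assumed, and the document fields are illustrative): a map function yields `(key, value)` pairs, so sorting by date just means keying on the date field, and a reduce sums the values.

```python
def map_by_date(doc):
    """Emit one (date, 1) pair per document that has a creation date."""
    if "created_at" in doc:
        yield doc["created_at"], 1

def reduce_count(keys, values, rereduce):
    """Count documents per key (CouchDB's built-in _count does the same)."""
    return sum(values)

# Emulating what the view server does for two documents:
docs = [{"_id": "t1", "created_at": "2010-01-01"},
        {"_id": "t2", "created_at": "2010-01-02"}]
rows = [pair for doc in docs for pair in map_by_date(doc)]
print(reduce_count([k for k, _ in rows], [v for _, v in rows], False))
```

Because CouchDB sorts view rows by key, emitting the date as the key gives you the chronological ordering for free.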

Some steps have been made in this direction: for instance, we discussed how microformats already make this possible for certain domains in Chapter 2, and in Chapter 9 we looked at how Facebook is aggressively bootstrapping an explicit graph construct into the Web with its Open Graph protocol. But before we get too pie-in-the-sky, let’s back up for just a moment and reflect on how we got to where we are right now. The Internet is just a network of networks,[63] and what’s very fascinating about it from a technical standpoint is how layers of increasingly higher-level protocols build on top of lower-level protocols to ultimately produce a fault-tolerant worldwide computing infrastructure. In our online activity, we rely on dozens of protocols every single day, without even thinking about it. However, there is one ubiquitous protocol that is hard not to think about explicitly from time to time: HTTP, the prefix of just about every URL that you type into your browser, the enabling protocol for the extensive universe of hypertext documents (HTML pages), and the links that glue them all together into what we know as the Web.

pages: 470 words: 109,589

Apache Solr 3 Enterprise Search Server by Unknown

bioinformatics, continuous integration, database schema, en.wikipedia.org, fault tolerance, Firefox, full text search, information retrieval, natural language processing, performance metric, platform as a service, Ruby on Rails, web application

While Solr offers some impressive scaling techniques through replication and sharding of data, it assumes that you know a priori what your scaling needs are. The distributed search of Solr doesn't adapt to real time changes in indexing or query load and doesn't provide any fail-over support. SolrCloud is an ongoing effort to build a fault tolerant, centrally managed support for clusters of Solr instances and is part of the trunk development path (Solr 4.0). SolrCloud introduces the idea that a logical collection of documents (otherwise known as an index) is distributed across a number of slices. Each slice is made up of shards, which are the physical pieces of the collection. In order to support fault tolerance, there may be multiple replicas of a shard distributed across different physical nodes. To keep all this data straight, Solr embeds Apache ZooKeeper as the centralized service for managing all configuration information for the cluster of Solr instances, including mapping which shards are available on which set of nodes of the cluster.
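The collection → slice → shard → replica layout described above can be sketched in a few lines. This is an illustration of the idea, not SolrCloud's actual routing code: a document hashes onto one slice, and each slice's shards are replicated onto distinct nodes so one node failure leaves every slice reachable.

```python
import hashlib

def slice_for(doc_id, num_slices):
    """Deterministically map a document ID onto one of the slices."""
    h = int(hashlib.md5(doc_id.encode()).hexdigest(), 16)
    return h % num_slices

def replica_nodes(slice_id, nodes, replication=2):
    """Place a slice's replicas on `replication` distinct nodes."""
    return [nodes[(slice_id + r) % len(nodes)] for r in range(replication)]

nodes = ["node-a", "node-b", "node-c"]
s = slice_for("doc-123", num_slices=3)
placed = replica_nodes(s, nodes)
print(len(set(placed)))  # replicas land on distinct nodes
```

In SolrCloud itself, this slice-to-node mapping is exactly the configuration state that ZooKeeper keeps consistent across the cluster.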

pages: 210 words: 42,271

Programming HTML5 Applications by Zachary Kessin

barriers to entry, continuous integration, fault tolerance, Firefox, Google Chrome, mandelbrot fractal, QWERTY keyboard, web application, WebSocket

Ruby Event Machine web socket handler

    require 'em-websocket'

    EventMachine::WebSocket.start(:host => "", :port => 8080) do |ws|
      ws.onopen    { ws.send "Hello Client!" }
      ws.onmessage { |msg| ws.send "Pong: #{msg}" }
      ws.onclose   { puts "WebSocket closed" }
    end

Erlang Yaws Erlang is a pretty rigorously functional language that was developed several decades ago for telephone switches and has found acceptance in many other areas where massive parallelism and strong robustness are desired. The language is concurrent, fault-tolerant, and very scalable. In recent years it has moved into the web space because all of the traits that make it useful in phone switches are very useful in a web server. The Erlang Yaws web server also supports web sockets right out of the box. The documentation can be found at the Web Sockets in Yaws web page, along with code for a simple echo server.

Example 9-5. Erlang Yaws web socket handler

    out(A) ->
        case get_upgrade_header(A#arg.headers) of
            undefined ->
                {content, "text/plain", "You're not a web sockets client!

pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money by Frank J. Ohlhorst

algorithmic trading, bioinformatics, business intelligence, business process, call centre, cloud computing, create, read, update, delete, data acquisition, DevOps, fault tolerance, linked data, natural language processing, Network effects, pattern recognition, performance metric, personalized medicine, RFID, sentiment analysis, six sigma, smart meter, statistical model, supply-chain management, Watson beat the top human players on Jeopardy!, web application

Big Data analytics requires that organizations choose the data to analyze, consolidate them, and then apply aggregation methods before the data can be subjected to the ETL process. This has to occur with large volumes of data, which can be structured, unstructured, or from multiple sources, such as social networks, data logs, web sites, mobile devices, and sensors. Hadoop accomplishes that by incorporating pragmatic processes and considerations, such as a fault-tolerant clustered architecture, the ability to move computing power closer to the data, parallel and/or batch processing of large data sets, and an open ecosystem that supports enterprise architecture layers from data storage to analytics processes. Not all enterprises require what Big Data analytics has to offer; those that do must consider Hadoop’s ability to meet the challenge. However, Hadoop cannot accomplish everything on its own.
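The batch-processing model the passage attributes to Hadoop can be illustrated with a toy word count: a map phase runs in parallel near the data, and a reduce phase aggregates the shuffled intermediate pairs (this is a sketch of the MapReduce idea, not Hadoop code):

```python
from collections import defaultdict

def map_phase(record):
    """Map: each input record yields (word, 1) pairs."""
    for word in record.split():
        yield word, 1

def reduce_phase(pairs):
    """Reduce: sum the values for each key after the shuffle."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

records = ["big data", "big clusters"]
pairs = [p for r in records for p in map_phase(r)]
print(reduce_phase(pairs))
```

Fault tolerance in the real framework falls out of this shape: any map or reduce task that dies on one node can simply be re-run on another, because tasks are deterministic functions of their input splits.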

Seeking SRE: Conversations About Running Production Systems at Scale by David N. Blank-Edelman

Affordable Care Act / Obamacare, algorithmic trading, Amazon Web Services, bounce rate, business continuity plan, business process, cloud computing, cognitive bias, cognitive dissonance, commoditize, continuous integration, crowdsourcing, dark matter, database schema, Debian, defense in depth, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, fear of failure, friendly fire, game design, Grace Hopper, information retrieval, Infrastructure as a Service, Internet of things, invisible hand, iterative process, Kubernetes, loose coupling, Lyft, Marc Andreessen, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, pull request, RAND corporation, remote working, Richard Feynman, risk tolerance, Ruby on Rails, search engine result page, self-driving car, sentiment analysis, Silicon Valley, single page application, Snapchat, software as a service, software is eating the world, source of truth, the scientific method, Toyota Production System, web application, WebSocket, zero day

As we close out, you should take the following points with you: Third parties are an extension of your stack, not ancillary. If it’s critical path, treat it like a service. Consider abandonment during the life cycle of an integration. The quality of your third-party integration depends on good communication. Contributor Bio Jonathan Mercereau has spent his career leading teams and architecting resilient, fault-tolerant, and performant solutions working with the biggest players in DNS, CDN, Certificate Authority, and Synthetic Monitoring. Chances are, you’ve experienced the results of his work, from multi-CDN and optimized streaming algorithms at Netflix to multi-vendor solutions and performance enhancements at LinkedIn. In 2016, Jonathan cofounded a SaaS startup, traffiq corp, to bring big-company traffic engineering best practices, orchestration, and automation to the broader web. 1 Site Reliability Engineering, Introduction. 2 Procurement teams are the experts in handling purchase orders and subcontracts.

Modern distributed system design emphasizes the need to build systems that do not depend on maintaining state. The result is that we now have systems that lack unique state. In such a world, reverting a software change can make the system take on a more familiar appearance, but it might not restore the world to the way it once was. Special Knowledge About Complex Systems The situation facing SREs is seldom simple. The fault-tolerance mechanisms built into the design of distributed systems and related automation handle most problems that arise. Because of this, incidents represent situations that fall outside of the “most problems” boundary. Reasoning about cause and effect here is often challenging. For example, simply observing that a process is failing does not necessarily mean that fixing that process will resolve the incident.

Operations groups often take on big projects to increase MTBF and decrease MTTR, usually at the level of hardware components, because this is the only level at which assumptions of rationality hold well enough for “mean time to anything” to be well defined. It’s certainly worthwhile for a team to optimize within one “accountability domain” like this. Even so, when you’re looking at the entire system, an increase in fault tolerance beyond “barely acceptable” tends to be immediately eaten up by another layer. Suppose that you have a distributed storage system that was deployed to tolerate three simultaneous disk failures in an array. Then the hardware team, full of gumption and wishing to be promoted, takes clever measures to “guarantee” that the array will never have more than one disk down. Check back in a year, and you will find that the application has been reoptimized and can now tolerate only one failure.
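The tolerance-budget arithmetic the anecdote describes can be made concrete. A minimal sketch (mine, not from the book; function names are illustrative): an erasure-coded or parity-protected array with k redundant shards tolerates k simultaneous disk failures, and whatever an upper layer comes to assume is subtracted from that budget.

```python
# Toy illustration of a fault-tolerance "budget" being consumed by another
# layer. An array with k parity shards survives any k simultaneous disk losses.
def disk_failures_tolerated(parity_shards: int) -> int:
    """An array with k redundant (parity) shards tolerates k lost disks."""
    return parity_shards

def remaining_budget(parity_shards: int, failures_assumed_by_app: int) -> int:
    """How many failures the hardware layer can still absorb once the
    application layer has been 'reoptimized' to lean on some of the slack."""
    return max(0, disk_failures_tolerated(parity_shards) - failures_assumed_by_app)

# Deployed to tolerate three failures; a year later the app assumes two of them:
print(remaining_budget(3, 2))  # 1 failure left before data loss
```

The point of the sketch is that total system tolerance is not additive across layers: slack created at one level tends to be silently consumed at another.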

pages: 377 words: 21,687

Digital Apollo: Human and Machine in Spaceflight by David A. Mindell

1960s counterculture, Charles Lindbergh, computer age, deskilling, fault tolerance, interchangeable parts, Mars Rover, more computing power than Apollo, Norbert Wiener, Norman Mailer, orbital mechanics / astrodynamics, Silicon Valley, Stewart Brand, telepresence, telerobotics

"Apollo Experience Report: Guidance and Control Systems: Primary Guidance, Navigation, and Control System Development." NASA TN D-8227. Houston, Tex.: Johnson Space Center, 1976. Holliday, Will L., and Dale P. Hoffman. "Systems Approach to Flight Controls." Astronautics (May 1962): 36–37, 74–80. Hong, Sungook. "Man and Machine in the 1960s." Techne 7, no. 3 (2004): 49–77. Hopkins, Albert L. "A Fault-Tolerant Information Processing Concept for Space Vehicles." Cambridge, Mass.: MIT Instrumentation Laboratory, 1970. Hopkins, Albert L. "A Fault-Tolerant Information Processing System for Advanced Control, Guidance, and Navigation." Cambridge, Mass.: Charles Stark Draper Laboratories, 1970. Hopkins Jr., Albert L., Ramon Alonso, and Hugh Blair-Smith. "Logical Description for the Apollo Guidance Computer (AGC4)." Cambridge, Mass.: MIT Instrumentation Laboratory, 1963. Horner, Richard.

pages: 271 words: 52,814

Blockchain: Blueprint for a New Economy by Melanie Swan

23andMe, Airbnb, altcoin, Amazon Web Services, asset allocation, banking crisis, basic income, bioinformatics, bitcoin, blockchain, capital controls, cellular automata, central bank independence, clean water, cloud computing, collaborative editing, Conway's Game of Life, crowdsourcing, cryptocurrency, disintermediation, Edward Snowden, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, financial innovation, Firefox, friendly AI, Hernando de Soto, intangible asset, Internet Archive, Internet of things, Khan Academy, Kickstarter, lifelogging, litecoin, Lyft, M-Pesa, microbiome, Network effects, new economy, peer-to-peer, peer-to-peer lending, peer-to-peer model, personalized medicine, post scarcity, prediction markets, QR code, ride hailing / ride sharing, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, SETI@home, sharing economy, Skype, smart cities, smart contracts, smart grid, software as a service, technological singularity, Turing complete, uber lyft, unbanked and underbanked, underbanked, web application, WikiLeaks

Consensus without mining is another area being explored, such as in Tendermint’s modified version of DLS (the solution to the Byzantine Generals’ Problem by Dwork, Lynch, and Stockmeyer), with bonded coins belonging to byzantine participants.184 Another idea for consensus without mining or proof of work is through a consensus algorithm such as Hyperledger’s, which is based on the Practical Byzantine Fault Tolerance algorithm. Only focus on the most recent or unspent outputs Many blockchain operations could be based on surface calculations of the most recent or unspent outputs, similar to how credit card transactions operate. “Thin wallets” operate this way, as opposed to querying a full Bitcoind node, and this is how Bitcoin ewallets work on cellular telephones. A related proposal is Cryptonite, which has a “mini-blockchain” abbreviated data scheme.
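The classic result behind Practical Byzantine Fault Tolerance is worth stating: n replicas can tolerate f arbitrary (Byzantine) faults only when n ≥ 3f + 1, and agreement quorums need 2f + 1 votes so any two quorums overlap in at least one honest replica. A small sketch of that arithmetic (function names are mine, not Hyperledger's API):

```python
# PBFT-style bound: n replicas tolerate f Byzantine faults only if n >= 3f + 1.
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 still holds."""
    return (n - 1) // 3

def quorum_size(n: int) -> int:
    """Votes needed so that any two quorums intersect in an honest replica."""
    f = max_byzantine_faults(n)
    return 2 * f + 1

print(max_byzantine_faults(4))  # 1: four replicas survive one traitor
print(quorum_size(4))           # 3
```

This is why PBFT-derived systems are typically deployed with 4, 7, or more replicas: 3 replicas tolerate zero Byzantine faults.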

Programming Android by Zigurd Mednieks, Laird Dornin, G. Blake Meike, Masumi Nakamura

anti-pattern, business process, conceptual framework, create, read, update, delete, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, Google Earth, interchangeable parts, iterative process, loose coupling, MVC pattern, revision control, RFID, web application

When someone is related to multiple things (such as multiple addresses), relational databases have ways of handling that too, but we won't go into such detail in this chapter. SQLite Android uses the SQLite database engine, a self-contained, transactional database engine that requires no separate server process. Many applications and environments beyond Android make use of it, and a large open source community actively develops SQLite. In contrast to desktop-oriented or enterprise databases, which provide a plethora of features related to fault tolerance and concurrent access to data, SQLite aggressively strips out features that are not absolutely necessary in order to achieve a small footprint. For example, many database systems use static typing, but SQLite does not store database type information. Instead, it pushes the responsibility for keeping type information onto high-level languages, such as Java, that map database structures into high-level types.
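SQLite's dynamic typing can be demonstrated directly. A small sketch (mine, using Python's built-in sqlite3 module rather than Android's Java API, but the engine behavior is the same): a column declared INTEGER will still store text, because SQLite records a type per value rather than enforcing one per column.

```python
import sqlite3

# Demonstrate SQLite's dynamic typing: the declared column type is only
# an "affinity", not a constraint, so mismatched values are stored as-is.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (42)")
conn.execute("INSERT INTO t VALUES ('not a number')")  # accepted, kept as text
rows = conn.execute("SELECT x, typeof(x) FROM t").fetchall()
print(rows)  # [(42, 'integer'), ('not a number', 'text')]
```

This is exactly the behavior that pushes type bookkeeping up into the application language.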

You can also explicitly start and end a transaction so that it encompasses multiple statements. For a given transaction, SQLite does not modify the database until all statements in the transaction have completed successfully. Given the volatility of the Android mobile environment, we recommend that in addition to meeting the needs for consistency in your app, you also make liberal use of transactions to support fault tolerance in your application. Example Database Manipulation Using sqlite3 Now that you understand the basics of SQL as it pertains to SQLite, let’s have a look at a simple database for storing video metadata using the sqlite3 command-line tool and the Android debug shell, which you can start by using the adb command. Using the command line will allow us to view database changes right away, and will provide some simple examples of how to work with this useful database debugging tool.
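The all-or-nothing property of a transaction can be shown in a few lines. A sketch (mine; Python's sqlite3 module stands in for Android's SQLiteDatabase, whose beginTransaction/endTransaction methods behave analogously): if any statement in the transaction fails, none of them touch the database.

```python
import sqlite3

# Wrap multiple statements in one transaction: a mid-sequence failure
# rolls back everything, leaving the database in its prior state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (title TEXT NOT NULL)")

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("INSERT INTO videos VALUES ('intro.mp4')")
        conn.execute("INSERT INTO videos VALUES (NULL)")  # violates NOT NULL
except sqlite3.IntegrityError:
    pass  # the error aborted the whole transaction

count = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
print(count)  # 0 -- the first insert was rolled back along with the failed one
```

On a device that may lose power or be killed at any moment, this guarantee is what makes liberal use of transactions a fault-tolerance tool.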

Beautiful Data: The Stories Behind Elegant Data Solutions by Toby Segaran, Jeff Hammerbacher

23andMe, airport security, Amazon Mechanical Turk, bioinformatics, Black Swan, business intelligence, card file, cloud computing, computer vision, correlation coefficient, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, DARPA: Urban Challenge, data acquisition, database schema, double helix, en.wikipedia.org, epigenetics, fault tolerance, Firefox, Hans Rosling, housing crisis, information retrieval, lake wobegon effect, longitudinal study, Mars Rover, natural language processing, openstreetmap, prediction markets, profit motive, semantic web, sentiment analysis, Simon Singh, social graph, SPARQL, speech recognition, statistical model, supply-chain management, text mining, Vernor Vinge, web application

In conclusion, as more data is available online, and as computing capacity increases, I believe that the probabilistic data-driven methodology will become a major approach for solving complex problems in uncertain domains. Acknowledgments Thanks to Darius Bacon, Thorsten Brants, Andy Golding, Mark Paskin, Franco Salvetti, and Casey Whitelaw for comments, corrections, and code. Chapter 15: Life in Data: The Story of DNA, by Matt Wood and Ben Blackburne. DNA is a biological building block, a concise, schema-less, fault-tolerant database of an organism’s chemical makeup, designed and implemented by a population over millions of years. Over the past 20 years, biologists have begun to move from the study of individual genes to whole genomes, with genomic approaches forming an increasingly large part of modern biomedical research. In recent years, however, biologists have been learning to handle DNA as both a data store and a data source.

DNA As a Data Store A genome is the database for an organism. It is written in the molecules of DNA, copies of which are stored in each cell of the human body (with a few exceptions). This pattern is repeated across nature, right down to the simplest forms of life. The information encoded within the genome contains the directions to build the proteins that make up the molecular machinery that runs the chemistry of the cell. Now that’s what I call fault-tolerant and redundant storage. Almost every cell in your body contains a central data center, called the nucleus, which stores these genomic databases. Within this are the chromosomes. Like all humans, you are diploid, with two copies of each chromosome, one from your father and one from your mother. Added to these are the sex chromosomes: two X chromosomes for a female, or an X and a Y chromosome for a male.

pages: 211 words: 58,677

Philosophy of Software Design by John Ousterhout

conceptual framework, fault tolerance, iterative process, move fast and break things, move fast and break things, MVC pattern, revision control, Silicon Valley

For example, an I/O operation may fail, or a required resource may not be available. In a distributed system, network packets may be lost or delayed, servers may not respond in a timely fashion, or peers may communicate in unexpected ways. The code may detect bugs, internal inconsistencies, or situations it is not prepared to handle. Large systems have to deal with many exceptional conditions, particularly if they are distributed or need to be fault-tolerant. Exception handling can account for a significant fraction of all the code in a system. Exception handling code is inherently more difficult to write than normal-case code. An exception disrupts the normal flow of the code; it usually means that something didn’t work as expected. When an exception occurs, the programmer can deal with it in two ways, each of which can be complicated. The first approach is to move forward and complete the work in progress in spite of the exception.
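The passage's first strategy, completing the work in progress in spite of the exception, can be contrasted with simply letting the exception propagate. A minimal sketch (example and names are mine, not Ousterhout's):

```python
# Two broad ways to handle an exceptional condition, per the passage.

def read_config_or_default(path: str) -> str:
    """Strategy 1: move forward despite the exception, completing the
    work with a fallback value (here, an empty configuration)."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""  # masked: the caller never sees the failure

def read_config_or_fail(path: str) -> str:
    """Alternative: abort the work in progress and let the exceptional
    condition propagate upward for the caller to handle."""
    with open(path) as f:  # OSError escapes to the caller
        return f.read()

print(repr(read_config_or_default("/nonexistent/app.cfg")))  # ''
```

Masking keeps the complexity local but risks hiding real problems; propagation is simpler here but spreads exception-handling obligations through the callers.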

The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal by M. Mitchell Waldrop

Ada Lovelace, air freight, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anti-communist, Apple II, battle of ideas, Berlin Wall, Bill Duvall, Bill Gates: Altair 8800, Byte Shop, Claude Shannon: information theory, computer age, conceptual framework, cuban missile crisis, Donald Davies, double helix, Douglas Engelbart, Douglas Engelbart, Dynabook, experimental subject, fault tolerance, Frederick Winslow Taylor, friendly fire, From Mathematics to the Technologies of Life and Death, Haight Ashbury, Howard Rheingold, information retrieval, invisible hand, Isaac Newton, James Watt: steam engine, Jeff Rulifson, John von Neumann, Leonard Kleinrock, Marc Andreessen, Menlo Park, New Journalism, Norbert Wiener, packet switching, pink-collar, popular electronics, RAND corporation, RFC: Request For Comment, Robert Metcalfe, Silicon Valley, Steve Crocker, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Ted Nelson, Turing machine, Turing test, Vannevar Bush, Von Neumann architecture, Wiener process, zero-sum game

He and his colleagues would have to give up every engineer's first instinct, which was to control things so that problems could not happen, and instead design a system that was guaranteed to fail, but that would keep running anyhow. Nowadays this is known as a fault-tolerant system, and designing one is still considered a cutting-edge challenge. It means giving the system some of the same quality possessed by a superbly trained military unit, or a talented football team, or, for that matter, any living organism, namely an ability to react to the unexpected. But in the early 1960s, with CTSS, Corbato and his colleagues had to pioneer fault-tolerant design even as they were pioneering time-sharing itself: for example, among their early innovations were "firewalls," or software barriers that kept each user's area of computer memory isolated from its neighbors, so that a flameout in one program wouldn't necessarily consume the others.

., 197-98 electrical engineering, 82 Electrical Engineering, 113 electric power networks, 25-26 electroencephalography (EEG), 11-12 electronic commons idea, 413-14, 420 Electronic Discrete Variable Automatic Computer (EDVAC), 47, 100-101 von Neumann's report on, 59-65 Electronic News, 338 Electronic Numerical Integrator and Calculator (ENIAC), 43, 45-47, 87-88, 101, 102, 103, 339 drawbacks of, 46-47 patent dispute over, 63 programming of, 46-47 electronic office idea, 363-64, 407 Elias, Peter, 220 ELIZA, 229 Elkind, Jerry, 110, 111, 152, 175-76, 194, 295, 345, 351, 354, 368, 371, 399, 438, 444, 446, 447 Ellenby, John, 382, 408 Ellis, Jim, 427 E-mail, 231, 324-26, 384, 420, 465 Engelbart, Douglas, 210-17, 241-43, 255, 261, 273, 278, 285, 342, 358, 360n, 364, 406, 465, 470 at Fall Joint Computer Conference, 5, 287-94 English, Bill, 242, 243, 289-90, 293-94, 354, 355, 361-62, 365n, 366, 368 ENIAC, see Electronic Numerical Integrator and Calculator Enigma machines, 80 entropy, 81 error-checking codes, 271 error-correcting codes, 79-80, 94n Ethernet, 5, 374-75, 382, 385, 386, 439-40, 452 Ethernet-Alto-RCG-SLOT (EARS), 385 Euclid, 137 Evans, David, 239, 261, 274, 282, 303, 343, 357, 358 Everett, Robert, 102-3, 108 expectation, in behavioral theory, 74, 97 expert systems, 397-98, 406 Extrapolation, Interpolation, and Smoothing of Stationary Time Series (Wiener), 54 facsimile machines, 347-48 Fahlman, Scott, 438 Fairchild Semiconductor, 339 Fall Joint Computer Conference, 5, 287-94 Fano, Robert, 19, 75, 94-95, 107, 174, 193, 217-24, 227-36, 243, 244, 249-51, 252-53, 257, 281, 307, 310, 317, 453 Fantasia, 338 Farley, Belmont, 144 fault-tolerant systems, 234 Federal Research Internet Coordinating Committee (FRICC), 462 feedback, 55-57, 92, 138 Feigenbaum, Edward, 210, 281, 396, 397-98, 403, 405-6 Fiala, Ed, 346 file systems, hierarchical, 230 File Transfer Protocol (FTP), 301 firewalls, 234 "First Draft of a Report on the EDVAC" (von Neumann), 59-65, 68, 86, 102 flat-panel displays, 359 Flegal, Bob, 345 FLEX machine, 358, 359, 361 Flexowriter, 166, 188 flight simulators, 101-2 floppy disks, CP/M software and, 434 Ford Motor Company, 334, 335, 336, 337, 389 Forrester, Jay, 102-3, 113, 114-15, 117, 173, 230-31 Fortran, 165, 168, 169, 171-72, 246 Fortune, 27, 93 Fossum, Bob, 418, 420 Foster, John, 278, 279, 330 Frankston, Bob, 315 Fredkin, Edward, 152-56, 179, 194, 208, 313-14, 323, 412 Freeman, Greydon, 457 Freyman, Monsieur, 83 FRICC (Federal Research Internet Coordinating Committee), 462 Frick, Frederick, 97, 128, 201-2, 203n Fubini, Gene, 202 Fuchs, Ira, 457 Fuji Xerox, 409 Fumbling the Future (Smith and Alexander), 382n, 446 functions, in list processing, 169-70 Galanter, Eugene, 139 Galley, Stuart, 319-20 games, computer, 188, 320, 435 game theory, 85-86, 91 Garner, W.

PostgreSQL Cookbook by Chitij Chauhan

database schema, Debian, fault tolerance, GnuPG, Google Glasses, index card

If you have purchased Premium support from vendors such as 2ndQuadrant and EnterpriseDB, you can log tickets with their support team concerning PostgreSQL issues. Chapter 7. High Availability and Replication In this chapter, we will cover the following recipes: Setting up hot streaming replication Replication using Slony-I Replication using Londiste Replication using Bucardo Replication using DRBD Setting up the Postgres-XC cluster Introduction The most important requirements for any production database are fault tolerance, 24/7 availability, and redundancy. It is for this purpose that we have different high availability and replication solutions available for PostgreSQL. From a business perspective, it is important to ensure 24/7 data availability in the event of a disaster situation or a database crash due to disk or hardware failure. In such situations, it becomes critical to ensure that a duplicate copy of the data is available on a different server or a different database, so that seamless failover can be achieved even when the primary server/database is unavailable.
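For orientation, hot streaming replication of the kind the first recipe covers boils down to a few settings. A minimal configuration sketch (mine, for the 9.x-era PostgreSQL this cookbook targets; hostnames, the user repuser, and addresses are placeholders, and a base backup of the primary must be taken before the standby starts):

```
# Primary: postgresql.conf
wal_level = hot_standby        # write WAL with enough detail for a standby
max_wal_senders = 3            # allow WAL-streaming connections

# Primary: pg_hba.conf -- let the standby connect for replication
# host  replication  repuser  192.168.0.2/32  md5

# Standby: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=192.168.0.1 port=5432 user=repuser'

# Standby: postgresql.conf -- accept read-only queries during recovery
hot_standby = on
```

With this in place, the standby continuously replays the primary's WAL stream and can be promoted if the primary is lost.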

pages: 391 words: 71,600

Hit Refresh: The Quest to Rediscover Microsoft's Soul and Imagine a Better Future for Everyone by Satya Nadella, Greg Shaw, Jill Tracie Nichols

"Robert Solow", 3D printing, Amazon Web Services, anti-globalists, artificial general intelligence, augmented reality, autonomous vehicles, basic income, Bretton Woods, business process, cashless society, charter city, cloud computing, complexity theory, computer age, computer vision, corporate social responsibility, crowdsourcing, Deng Xiaoping, Donald Trump, Douglas Engelbart, Edward Snowden, Elon Musk, en.wikipedia.org, equal pay for equal work, everywhere but in the productivity statistics, fault tolerance, Gini coefficient, global supply chain, Google Glasses, Grace Hopper, industrial robot, Internet of things, Jeff Bezos, job automation, John Markoff, John von Neumann, knowledge worker, Mars Rover, Minecraft, Mother of all demos, NP-complete, Oculus Rift, pattern recognition, place-making, Richard Feynman, Robert Gordon, Ronald Reagan, Second Machine Age, self-driving car, side project, Silicon Valley, Skype, Snapchat, special economic zone, speech recognition, Stephen Hawking, Steve Ballmer, Steve Jobs, telepresence, telerobotics, The Rise and Fall of American Growth, Tim Cook: Apple, trade liberalization, two-sided market, universal basic income, Wall-E, Watson beat the top human players on Jeopardy!, young professional, zero-sum game

The same can be said of a dozen other areas in which technology is “stuck”—high temperature superconductors, energy efficient fertilizer production, string theory. A quantum computer would allow a new look at our most compelling problems. Computer scientist Krysta Svore is at the heart of our quest to solve problems on a quantum computer. Krysta received her PhD from Columbia University focusing on fault tolerance and scalable quantum computing, and she spent a year at MIT working with an experimentalist designing the software needed to control a quantum computer. Her team is designing an exotic software architecture that assumes our math, physics, and superconducting experts succeed in building a quantum computer. To decide which problems her software should go after first, she invited quantum chemists from around the world to make presentations and to brainstorm.

pages: 218 words: 68,648

Confessions of a Crypto Millionaire: My Unlikely Escape From Corporate America by Dan Conway

Affordable Care Act / Obamacare, Airbnb, bank run, basic income, bitcoin, blockchain, buy and hold, cloud computing, cognitive dissonance, corporate governance, crowdsourcing, cryptocurrency, disruptive innovation, distributed ledger, double entry bookkeeping, Ethereum, ethereum blockchain, fault tolerance, financial independence, gig economy, Gordon Gekko, Haight Ashbury, high net worth, job satisfaction, litecoin, Marc Andreessen, Mitch Kapor, obamacare, offshore financial centre, Ponzi scheme, prediction markets, rent control, reserve currency, Ronald Coase, Satoshi Nakamoto, Silicon Valley, smart contracts, Steve Jobs, supercomputer in your pocket, Turing complete, Uber for X, universal basic income, upwardly mobile

Some were very old, some were very young. Imagine the cantina scene in Star Wars. It was a delightfully odd group, without a single big personality pushing business cards. The first question was something like this: “Vitalik, don’t you think that the Byzantine general’s dilemma could be exploited by the various geographic nodes in a proof of stake architecture? Is there a way to compile the blockchain that is fault tolerant and aligns incentives with the miners?” I had no idea what they were talking about. I especially didn’t understand Vitalik’s response, which he delivered in an even voice seasoned with small bursts of energy, as if he were connected to a gentle electrical current that gave his face a stutter step every so often. I could read people, and it was obvious that his words allowed this guy who asked the question, and others in the room nodding their heads, to understand something that had previously been elusive.

Mastering Structured Data on the Semantic Web: From HTML5 Microdata to Linked Open Data by Leslie Sikos

AGPL, Amazon Web Services, bioinformatics, business process, cloud computing, create, read, update, delete, Debian, en.wikipedia.org, fault tolerance, Firefox, Google Chrome, Google Earth, information retrieval, Infrastructure as a Service, Internet of things, linked data, natural language processing, openstreetmap, optical character recognition, platform as a service, search engine result page, semantic web, Silicon Valley, social graph, software as a service, SPARQL, text mining, Watson beat the top human players on Jeopardy!, web application, wikimedia commons

Blazegraph Blazegraph is the flagship graph database product of SYSTAP, the vendor of the graph database previously known as Bigdata. It is a highly scalable, open source storage and computing platform [11]. Suitable for Big Data applications and selected for the Wikidata Query Service, Blazegraph is specifically designed to support big graphs, offering Semantic Web (RDF/SPARQL) and graph database (tinkerpop, blueprints, vertex-centric) APIs. The robust, scalable, fault-tolerant, enterprise-class storage and query features are combined with high availability, online backup, failover, and self-healing. Blazegraph features an ultra-high performance RDF graph database that supports RDFS and OWL Lite reasoning, as well as SPARQL 1.1 querying. Designed for huge amounts of information, the Blazegraph RDF graph database can load 1 billion graph edges in less than an hour on a 15-node cluster.

pages: 237 words: 76,486

Mars Rover Curiosity: An Inside Account From Curiosity's Chief Engineer by Rob Manning, William L. Simon

Elon Musk, fault tolerance, fear of failure, Kickstarter, Kuiper Belt, Mars Rover

Once I had my diploma in hand, JPL changed my status from draftsman to engineer, but my role as an engineer was slow going. My first job was as an apprentice electronics tester, helping run tests on what would become the brains of the Galileo spacecraft. I quickly discovered that building spacecraft included many extremely tedious jobs. After Galileo, I worked on Magellan (to Venus) and Cassini (to Saturn), becoming expert in the design of spacecraft computers, computer memory, computer architectures, and fault-tolerant systems. In 1993, after thirteen years at JPL, my career took a sudden leap forward. Brian Muirhead, the most inspiring and level-headed spacecraft leader I have ever met, had recently been named spacecraft manager for a funky little mission to Mars called Pathfinder. We had a conversation in which he explained that he was a master of mechanical systems but had not had much experience with electronics.

pages: 313 words: 75,583

Ansible for DevOps: Server and Configuration Management for Humans by Jeff Geerling

AGPL, Amazon Web Services, cloud computing, continuous integration, database schema, Debian, defense in depth, DevOps, fault tolerance, Firefox, full text search, Google Chrome, inventory management, loose coupling, microservices, Minecraft, MITM: man-in-the-middle, Ruby on Rails, web application

Have one master SAN that’s mounted on each of the servers. Use a distributed file system, like Gluster, Lustre, Fraunhofer, or Ceph. Some options are easier to set up than others, and all have benefits—and drawbacks. Rsync, git, or NFS offer simple initial setup, and low impact on filesystem performance (in many scenarios). But if you need more flexibility and scalability, less network overhead, and greater fault tolerance, you will have to consider something that requires more configuration (e.g. a distributed file system) and/or more hardware (e.g. a SAN). GlusterFS is licensed under the AGPL license, has good documentation, and a fairly active support community (especially in the #gluster IRC channel). But to someone new to distributed file systems, it can be daunting to set it up the first time. Configuring Gluster - Basic Overview To get Gluster working on a basic two-server setup (so you can have one folder that’s synchronized and replicated across the two servers—allowing one server to go down completely, and the other to still have access to the files), you need to do the following: Install Gluster server and client on each server, and start the server daemon.
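As a rough outline, those steps map onto a handful of Gluster commands. This is a configuration sketch only (hostnames server1/server2, the volume name shared, and the brick paths are placeholders; package and service names vary by distribution):

```
# On both servers: install Gluster and start the server daemon (Debian-style).
apt-get install -y glusterfs-server
service glusterfs-server start

# On server1: join the two servers into a trusted storage pool.
gluster peer probe server2

# On server1: create and start a two-way replicated volume, one brick per server.
gluster volume create shared replica 2 \
    server1:/bricks/shared server2:/bricks/shared
gluster volume start shared

# On each server: mount the replicated volume where the app expects its files.
mount -t glusterfs localhost:/shared /mnt/shared
```

Because every file is replicated to both bricks, either server can fail completely while the other keeps serving the full file set.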

pages: 619 words: 197,256

Apollo by Charles Murray, Catherine Bly Cox

cuban missile crisis, fault tolerance, index card, low earth orbit, old-boy network, orbital mechanics / astrodynamics, The Bell Curve by Richard Herrnstein and Charles Murray, War on Poverty, white flight

He thought that the obsession with the S.P.S. as a one-shot, nonredundant system was a lot of hype. “When they say ‘no redundancy,’ that’s a misnomer,” he said later. “There was only one engine bell, of course, and only one combustion chamber, but all the avionics that fed the signals to that engine and all the mechanical components that had to work, like the little valves that had to be pressurized to open the ball valves, and so forth, were at least single-fault tolerant and usually two-fault tolerant. . . . There were a heck of a lot of ways to start that engine.” And of course they had indeed checked it out carefully before the flight, but nothing they didn’t do for any other mission. All this was still correct as of Christmas Eve, 1968. And yet it ultimately didn’t make any difference to the way many of the people in Apollo felt. Caldwell Johnson, speaking as a designer of the spacecraft, explained it.

pages: 1,380 words: 190,710

Building Secure and Reliable Systems: Best Practices for Designing, Implementing, and Maintaining Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Ana Oprea, Piotr Lewandowski, Adam Stubblefield

anti-pattern, barriers to entry, bash_history, business continuity plan, business process, Cass Sunstein, cloud computing, continuous integration, correlation does not imply causation, create, read, update, delete, cryptocurrency, cyber-physical system, database schema, Debian, defense in depth, DevOps, Edward Snowden, fault tolerance, fear of failure, general-purpose programming language, Google Chrome, Internet of things, Kubernetes, load shedding, margin call, microservices, MITM: man-in-the-middle, performance metric, pull request, ransomware, revision control, Richard Thaler, risk tolerance, self-driving car, Skype, slashdot, software as a service, source of truth, Stuxnet, Turing test, undersea cable, uranium enrichment, Valgrind, web application, Y2K, zero day

You might also want to solicit an expert to review the assessment to identify risks with hidden factors or dependencies. Your risk assessment may vary depending on where your organization’s assets are located. For example, a site in Japan or Taiwan should account for typhoons, while a site in the Southeastern US should account for hurricanes. Risk ratings may also change as an organization matures and incorporates fault-tolerant systems, like redundant internet circuits and backup power supplies, into its systems. Large organizations should perform risk assessments on both global and per-site levels, and review and update these assessments periodically as the operating environment changes. Equipped with a risk assessment that identifies which systems need protection, you’re ready to create a response team prepared with tools, procedures, and training.

To facilitate rapid response to incidents, an IR team should predetermine appropriate levels of access for incident response and establish escalation procedures ahead of time so the process to obtain emergency access isn’t slow and convoluted. The IR team should have read access to logs for analysis and event reconstruction, as well as access to tools for analyzing data, sending reports, and conducting forensic examinations. Configuring Systems You can make a number of adjustments to systems before a disaster or incident to reduce an IR team’s initial response time. For example: Build fault tolerance into local systems and create failovers. For more information on this topic, see Chapters 8 and 9. Deploy forensic agents, such as GRR agents or EnCase Remote Agents, across the network with logs enabled. This will aid both your response and later forensic analysis. Be aware that security logs may require a lengthy retention period, as discussed in Chapter 15 (the industry average for detecting intrusions is approximately 200 days, and logs deleted before an incident is detected cannot be used to investigate it).

pages: 275 words: 84,980

Before Babylon, Beyond Bitcoin: From Money That We Understand to Money That Understands Us (Perspectives) by David Birch

agricultural Revolution, Airbnb, bank run, banks create money, bitcoin, blockchain, Bretton Woods, British Empire, Broken windows theory, Burning Man, business cycle, capital controls, cashless society, Clayton Christensen, clockwork universe, creative destruction, credit crunch, cross-subsidies, crowdsourcing, cryptocurrency, David Graeber, dematerialisation, Diane Coyle, disruptive innovation, distributed ledger, double entry bookkeeping, Ethereum, ethereum blockchain, facts on the ground, fault tolerance, fiat currency, financial exclusion, financial innovation, financial intermediation, floating exchange rates, Fractional reserve banking, index card, informal economy, Internet of things, invention of the printing press, invention of the telegraph, invention of the telephone, invisible hand, Irish bank strikes, Isaac Newton, Jane Jacobs, Kenneth Rogoff, knowledge economy, Kuwabatake Sanjuro: assassination market, large denomination, M-Pesa, market clearing, market fundamentalism, Marshall McLuhan, Martin Wolf, mobile money, money: store of value / unit of account / medium of exchange, new economy, Northern Rock, Pingit, prediction markets, price stability, QR code, quantitative easing, railway mania, Ralph Waldo Emerson, Real Time Gross Settlement, reserve currency, Satoshi Nakamoto, seigniorage, Silicon Valley, smart contracts, social graph, special drawing rights, technoutopianism, the payments system, The Wealth of Nations by Adam Smith, too big to fail, transaction costs, tulip mania, wage slave, Washington Consensus, wikimedia commons

At the time of writing, the ‘market cap’ of Ethereum is significantly higher than that of Ethereum Classic. Ripple After Bitcoin and Ethereum, the third biggest cryptocurrency is Ripple, which unlike those first two has its roots in local exchange trading systems (Peck 2013). It is a protocol for value exchange that uses a shared ledger but it does not use a Bitcoin-like blockchain, preferring another kind of what is known as a ‘Byzantine fault-tolerant consensus-forming process’. Ripple signs every transaction that parties submit to the network with a digital signature. Each user selects a list, called a ‘unique node list’, comprising other users that it trusts as what are known as ‘validating nodes’. Each validating node independently verifies every proposed transaction within its network to determine if it is valid. A transaction is valid if the correct signature appears on the transaction, i.e. the signature of the funds’ owner, and if the parties have enough funds to make the transaction.
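The per-node validity rule described here (correct owner signature plus sufficient funds) can be sketched as a toy check. Everything below is illustrative: the function name `is_valid`, the transaction fields, and the flat `balances`/`sigs` dictionaries are assumptions for exposition, not Ripple's actual data structures or protocol.

```python
# Toy sketch of the validity check a validating node performs:
# a transaction is valid if it carries the funds owner's signature
# and the owner has enough funds to cover it. Illustrative only.

def is_valid(tx, balances, signatures):
    """Return True if tx is signed by its owner and is fully funded."""
    signed_by_owner = signatures.get(tx["id"]) == tx["owner"]
    has_funds = balances.get(tx["owner"], 0) >= tx["amount"]
    return signed_by_owner and has_funds

balances = {"alice": 50}
sigs = {"tx1": "alice", "tx2": "mallory"}  # tx2 was signed by the wrong party

print(is_valid({"id": "tx1", "owner": "alice", "amount": 30}, balances, sigs))  # True
print(is_valid({"id": "tx2", "owner": "alice", "amount": 30}, balances, sigs))  # False: wrong signer
print(is_valid({"id": "tx1", "owner": "alice", "amount": 99}, balances, sigs))  # False: insufficient funds
```

In the real network, each validating node on a user's unique node list runs this kind of check independently before the transaction is accepted.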

pages: 362 words: 86,195

Fatal System Error: The Hunt for the New Crime Lords Who Are Bringing Down the Internet by Joseph Menn

Brian Krebs, dumpster diving, fault tolerance, Firefox, John Markoff, Menlo Park, offshore financial centre, pirate software, plutocrats, Plutocrats, popular electronics, profit motive, RFID, Silicon Valley, zero day

Cerf, who has a generally upbeat tone about most things, gives the impression that he remains pleasantly surprised that the Internet has continued to function and thrive—even though, as he put it, “We never got to do the production engineering,” the version ready for prime time. Even after his years on the front line, Barrett found such statements amazing. “It’s incredibly disturbing,” he said. “The engine of the world economy is based on this really cool experiment that is not designed for security, it’s designed for fault-tolerance,” which is a system’s ability to withstand some failures. “You can reduce your risks, but the naughty truth is that the Net is just not a secure place for business or society.” Cerf listed a dozen things that could be done to make the Internet safer. Among them: encouraging research into “hardware-assisted security mechanisms,” limiting the enormous damage that Web browsers can wreak on operating systems, and hiring more and better trained federal cybercrime agents while pursuing international legal frameworks.

pages: 669 words: 210,153

Tools of Titans: The Tactics, Routines, and Habits of Billionaires, Icons, and World-Class Performers by Timothy Ferriss

Airbnb, Alexander Shulgin, artificial general intelligence, asset allocation, Atul Gawande, augmented reality, back-to-the-land, Ben Horowitz, Bernie Madoff, Bertrand Russell: In Praise of Idleness, Black Swan, blue-collar work, Boris Johnson, Buckminster Fuller, business process, Cal Newport, call centre, Charles Lindbergh, Checklist Manifesto, cognitive bias, cognitive dissonance, Colonization of Mars, Columbine, commoditize, correlation does not imply causation, David Brooks, David Graeber, diversification, diversified portfolio, Donald Trump, effective altruism, Elon Musk, fault tolerance, fear of failure, Firefox, follow your passion, future of work, Google X / Alphabet X, Howard Zinn, Hugh Fearnley-Whittingstall, Jeff Bezos, job satisfaction, Johann Wolfgang von Goethe, John Markoff, Kevin Kelly, Kickstarter, Lao Tzu, lateral thinking, life extension, lifelogging, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, Mason jar, Menlo Park, Mikhail Gorbachev, MITM: man-in-the-middle, Nelson Mandela, Nicholas Carr, optical character recognition, PageRank, passive income, pattern recognition, Paul Graham, peer-to-peer, Peter H. Diamandis: Planetary Resources, Peter Singer: altruism, Peter Thiel, phenotype, PIHKAL and TIHKAL, post scarcity, post-work, premature optimization, QWERTY keyboard, Ralph Waldo Emerson, Ray Kurzweil, recommendation engine, rent-seeking, Richard Feynman, risk tolerance, Ronald Reagan, selection bias, sharing economy, side project, Silicon Valley, skunkworks, Skype, Snapchat, social graph, software as a service, software is eating the world, stem cell, Stephen Hawking, Steve Jobs, Stewart Brand, superintelligent machines, Tesla Model S, The Wisdom of Crowds, Thomas L Friedman, Wall-E, Washington Consensus, Whole Earth Catalog, Y Combinator, zero-sum game

The most successful computer company of the seventies and eighties, next to IBM, was Digital Equipment Corporation. IBM was first in computers. DEC was first in minicomputers. Many other computer companies (and their entrepreneurial owners) became rich and famous by following a simple principle: If you can’t be first in a category, set up a new category you can be first in. Tandem was first in fault-tolerant computers and built a $1.9 billion business. So Stratus stepped down with the first fault-tolerant minicomputer. Are the laws of marketing difficult? No, they are quite simple. Working things out in practice is another matter, however. Cray Research went over the top with the first supercomputer. So Convex put two and two together and launched the first mini supercomputer. Sometimes you can also turn an also-ran into a winner by inventing a new category.

pages: 722 words: 90,903

Practical Vim: Edit Text at the Speed of Thought by Drew Neil

Bram Moolenaar, don't repeat yourself, en.wikipedia.org, fault tolerance, finite state, place-making, QWERTY keyboard, web application

That means any bulb can go out, and the rest will be unaffected. I’ve borrowed the expressions in series and in parallel from the field of electronics to differentiate between two techniques for executing a macro multiple times. The technique for executing a macro in series is brittle. Like cheap Christmas tree lights, it breaks easily. The technique for executing a macro in parallel is more fault tolerant. Execute the Macro in Series Picture a robotic arm and a conveyor belt containing a series of items for the robot to manipulate (Figure 4, ​Vim's macros make quick work of repetitive tasks​). Recording a macro is like programming the robot to do a single unit of work. As a final step, we instruct the robot to move the conveyor belt and bring the next item within reach. In this manner, we can have a single robot carry out a series of repetitive tasks on similar items
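The series-versus-parallel distinction applies beyond Vim. A minimal Python analogy (the `edit` operation and the item names are hypothetical, and this is a sketch of the idea rather than of Vim's macro machinery) shows why the parallel form is the fault-tolerant one:

```python
# In series: one failure halts the whole run, like `10@a` in Vim,
# which stops replaying the macro at the first error.
# In parallel: each item succeeds or fails on its own, like running
# the macro once per line so one bad line doesn't stop the rest.

def edit(item):
    """A stand-in for one unit of macro work; fails on 'bad' items."""
    if "bad" in item:
        raise ValueError(f"cannot edit {item!r}")
    return item.upper()

def run_in_series(items):
    out = []
    for item in items:          # the first failure aborts everything
        out.append(edit(item))
    return out

def run_in_parallel(items):
    out = []
    for item in items:          # failures are isolated per item
        try:
            out.append(edit(item))
        except ValueError:
            out.append(item)    # leave the failing item untouched
    return out

items = ["one", "bad two", "three"]
print(run_in_parallel(items))   # ['ONE', 'bad two', 'THREE']
```

Running `run_in_series(items)` on the same list would raise at the second item and never reach the third, just as a broken bulb in a series circuit takes out the whole string.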

Scratch Monkey by Stross, Charles

carbon-based life, defense in depth, fault tolerance, gravity well, Kuiper Belt, packet switching, phenotype, telepresence

I realise that I may never hear them again. I'm probably grinning like a corpse but I don't care -- she must know by now that blind people often smile. It's easier to grin than to frown; the facial muscles contract into a smirk more easily. Even when you're about to die. "It takes a lot of stress to unbalance a network processor the size of a small moon," she replies calmly; "it shows a remarkable degree of fault tolerance. As for physical assault, the automatic defences are still armed ... as they always have been. So if we want to take it for ourselves, we must overwhelm it by frontal assault, sending uploaded minds out into the simulation space until it overloads and drops into NP-stasis. They do that if you feed them faster than they can transfer capacity elsewhere, you know. It's happened before, and it's what the Superbrights are most afraid of.

pages: 352 words: 96,532

Where Wizards Stay Up Late: The Origins of the Internet by Katie Hafner, Matthew Lyon

air freight, Bill Duvall, computer age, conceptual framework, Donald Davies, Douglas Engelbart, fault tolerance, Hush-A-Phone, information retrieval, John Markoff, Kevin Kelly, Leonard Kleinrock, Marc Andreessen, Menlo Park, natural language processing, packet switching, RAND corporation, RFC: Request For Comment, Robert Metcalfe, Ronald Reagan, Silicon Valley, speech recognition, Steve Crocker, Steven Levy

But imagine a local post office somewhere that decided to go it alone, making up its own rules for addressing, packaging, stamping, and sorting mail. Imagine if that rogue post office decided to invent its own set of ZIP codes. Imagine any number of post offices taking it upon themselves to invent new rules. Imagine widespread confusion. Mail handling begs for a certain amount of conformity, and because computers are less fault-tolerant than human beings, e-mail begs loudly. The early wrangling on the ARPANET over attempts to impose standard message headers was typical of other debates over computer industry standards that came later. But because the struggle over e-mail standards was one of the first sources of real tension in the community, it stood out. In 1973 an ad hoc committee led by MIT’s Bhushan tried bringing some order to the implementation of new e-mail programs.

Data and the City by Rob Kitchin,Tracey P. Lauriault,Gavin McArdle

A Declaration of the Independence of Cyberspace, bike sharing scheme, bitcoin, blockchain, Bretton Woods, Chelsea Manning, citizen journalism, Claude Shannon: information theory, clean water, cloud computing, complexity theory, conceptual framework, corporate governance, correlation does not imply causation, create, read, update, delete, crowdsourcing, cryptocurrency, dematerialisation, digital map, distributed ledger, fault tolerance, fiat currency, Filter Bubble, floating exchange rates, global value chain, Google Earth, hive mind, Internet of things, Kickstarter, knowledge economy, lifelogging, linked data, loose coupling, new economy, New Urbanism, Nicholas Carr, open economy, openstreetmap, packet switching, pattern recognition, performance metric, place-making, RAND corporation, RFID, Richard Florida, ride hailing / ride sharing, semantic web, sentiment analysis, sharing economy, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, smart grid, smart meter, social graph, software studies, statistical model, TaskRabbit, text mining, The Chicago School, The Death and Life of Great American Cities, the market place, the medium is the message, the scientific method, Toyota Production System, urban planning, urban sprawl, web application

Such data include public administrative records, operational management information, as well as that produced by sensors, transponders and cameras that make up the internet of things, smartphones, wearables, social media, loyalty cards and commercial sources. In many cases, cities are turning to big data technologies and their novel distributed computational infrastructure for the reliable and fault tolerant storage, analysis and dissemination of data from various sources. In such systems, processing is generally brought to the data, rather than bringing data to the processing. Since each organization uses different platforms, operating systems and software to generate and analyse data, data sharing mechanisms should ideally be provided as platform-independent services so that they can be utilized by various users for different purposes, for example, for research, business, improving existing services of city authorities and organizations, and for facilitating communication between people and policymakers.

pages: 354 words: 26,550

High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems by Irene Aldridge

algorithmic trading, asset allocation, asset-backed security, automated trading system, backtesting, Black Swan, Brownian motion, business cycle, business process, buy and hold, capital asset pricing model, centralized clearinghouse, collapse of Lehman Brothers, collateralized debt obligation, collective bargaining, computerized trading, diversification, equity premium, fault tolerance, financial intermediation, fixed income, high net worth, implied volatility, index arbitrage, information asymmetry, interest rate swap, inventory management, law of one price, Long Term Capital Management, Louis Bachelier, margin call, market friction, market microstructure, martingale, Myron Scholes, New Journalism, p-value, paper trading, performance metric, profit motive, purchasing power parity, quantitative trading / quantitative finance, random walk, Renaissance Technologies, risk tolerance, risk-adjusted returns, risk/return, Sharpe ratio, short selling, Small Order Execution System, statistical arbitrage, statistical model, stochastic process, stochastic volatility, systematic trading, trade route, transaction costs, value at risk, yield curve, zero-sum game

New York–based MarketFactory provides a suite of software tools to help automated traders get an extra edge in the market, help their models scale, increase their fill ratios, reduce slippage, and thereby improve profitability (P&L). Chapter 18 discusses optimization of execution. Run-time risk management applications ensure that the system stays within prespecified behavioral and P&L bounds. Such applications may also be known as system-monitoring and fault-tolerance software. Mobile applications suitable for monitoring the performance of high-frequency trading systems alert administrators to any issues. Real-time third-party research can stream advanced information and forecasts. Legal, Accounting, and Other Professional Services: Like any business in the financial sector, high-frequency trading needs to make sure that “all i’s are dotted and all t’s are crossed” in the legal and accounting departments.
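The kind of run-time bound checking such system-monitoring software performs can be sketched minimally. The thresholds, the function name `within_bounds`, and the numbers below are illustrative assumptions, not any vendor's API:

```python
# Minimal sketch of a run-time risk check: halt trading when
# cumulative P&L or position size leaves prespecified bounds.
# Thresholds are illustrative only.

MAX_DRAWDOWN = -10_000     # halt if cumulative P&L falls below this
MAX_POSITION = 5_000       # halt if absolute position exceeds this

def within_bounds(pnl, position):
    """Return True while the system is inside its prespecified bounds."""
    return pnl > MAX_DRAWDOWN and abs(position) <= MAX_POSITION

print(within_bounds(pnl=-2_500, position=1_200))    # True: keep trading
print(within_bounds(pnl=-12_000, position=1_200))   # False: drawdown breach
print(within_bounds(pnl=500, position=-7_000))      # False: position breach
```

A production system would evaluate such checks on every fill and flatten positions (or page an operator) on the first breach, which is the fault-tolerance role the text describes.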

pages: 406 words: 105,602

The Startup Way: Making Entrepreneurship a Fundamental Discipline of Every Enterprise by Eric Ries

activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, Airbnb, autonomous vehicles, barriers to entry, basic income, Ben Horowitz, Black-Scholes formula, call centre, centralized clearinghouse, Clayton Christensen, cognitive dissonance, connected car, corporate governance, DevOps, Elon Musk, en.wikipedia.org, fault tolerance, Frederick Winslow Taylor, global supply chain, index card, Jeff Bezos, Kickstarter, Lean Startup, loss aversion, Marc Andreessen, Mark Zuckerberg, means of production, minimum viable product, moral hazard, move fast and break things, obamacare, peer-to-peer, place-making, rent-seeking, Richard Florida, Sam Altman, Sand Hill Road, secular stagnation, shareholder value, Silicon Valley, Silicon Valley startup, six sigma, skunkworks, Steve Jobs, the scientific method, time value of money, Toyota Production System, Uber for X, universal basic income, web of trust, Y Combinator

But it was anathema to everything he knew from the private sector. “We spent three or four weeks when the only visible thing we were doing was making everybody come to one place,” he recalls. “When things went wrong, we just went and found the person who was responsible.” In addition, the site architecture was so bad that the slightest problem had the potential to knock the whole thing out. There was no way to track issues, and none of the fault tolerance or resistance that such a massive system should have had in place, as a matter of course, existed. Faced with this quagmire, the team asked a single question: “Why is the site not working on October 22?” Then they worked backward, applying the management and technological practices that by now should sound familiar: small teams, rapid iteration, accountability metrics, and a culture of transparency without fear of recrimination.

pages: 648 words: 108,814

Solr 1.4 Enterprise Search Server by David Smiley, Eric Pugh

Amazon Web Services, bioinformatics, cloud computing, continuous integration, database schema, domain-specific language, en.wikipedia.org, fault tolerance, Firefox, information retrieval, Ruby on Rails, web application, Y Combinator

Obviously, this is a fairly complex setup and requires a fairly sophisticated load balancer to frontend this whole collection, but it does allow Solr to handle extremely large data sets. Where next for Solr scaling? There has been a fair amount of discussion on Solr mailing lists about setting up distributed Solr on a robust foundation that adapts to a changing environment. There has been some investigation regarding using Apache Hadoop, a platform for reliable, distributed computing, as a foundation for Solr that would provide a robust fault-tolerant filesystem. Another interesting subproject of Hadoop is ZooKeeper, which aims to be a service for centralizing the management required by distributed applications. There has been some development work on integrating ZooKeeper as the management interface for Solr. Keep an eye on the Hadoop homepage for more information about these efforts at http://hadoop.apache.org/ and ZooKeeper at http://hadoop.apache.org/zookeeper/.

pages: 1,266 words: 278,632

Backup & Recovery by W. Curtis Preston

Berlin Wall, business intelligence, business process, database schema, Debian, dumpster diving, failed state, fault tolerance, full text search, job automation, Kickstarter, side project, Silicon Valley, web application

These two steps are Sybase’s “ounce of prevention.” In addition to these dbcc tasks, you need to choose a transaction log archive strategy. If you follow these tasks, you will help maintain the database, keeping it running smoothly and ready for proper backups. dbcc: The Database Consistency Checker Even though Sybase’s dataserver products are very robust and much effort has gone into making them fault-tolerant, there is always the chance that a problem will occur. For very large tables, some of these problems might not show until very specific queries are run. This is one of the reasons for the database consistency checker, dbcc. This set of SQL commands can review all the database page allocations, linkages, and data pointers, finding problems and, in many cases, fixing them before they become insurmountable.

You can achieve ACID compliance in a MySQL database if you use the InnoDB or the NDB storage engines. (As of this writing, the MySQL team is developing other ACID-compliant storage engines.) With PostgreSQL, all data is stored in an ACID-compliant fashion. PostgreSQL also offers sophisticated features such as point-in-time recovery, tablespaces, checkpoints, hot backups, and write ahead logging for fault tolerance. These are all very good things from a data-protection and data-integrity standpoint. PostgreSQL Architecture From a power-user standpoint, PostgreSQL is like any other database. The following terms mean essentially the same in PostgreSQL as they do in any other relational database: Database Table Index Row Attribute Extent Partition Transaction Clusters A PostgreSQL cluster is analogous to an instance in other RDBMSs, and each cluster works with one or more databases.

pages: 302 words: 82,233

Beautiful security by Andy Oram, John Viega

Albert Einstein, Amazon Web Services, business intelligence, business process, call centre, cloud computing, corporate governance, credit crunch, crowdsourcing, defense in depth, Donald Davies, en.wikipedia.org, fault tolerance, Firefox, loose coupling, Marc Andreessen, market design, MITM: man-in-the-middle, Monroe Doctrine, new economy, Nicholas Carr, Nick Leeson, Norbert Wiener, optical character recognition, packet switching, peer-to-peer, performance metric, pirate software, Robert Bork, Search for Extraterrestrial Intelligence, security theater, SETI@home, Silicon Valley, Skype, software as a service, statistical model, Steven Levy, The Wisdom of Crowds, Upton Sinclair, web application, web of trust, zero day, Zimmermann PGP

After installation, the infected bot machine contacts the bot server to download additional components or obtain the latest commands, such as denial-of-service attacks or spam to send out. With this dynamic control and command infrastructure, the botnet owner can mobilize a massive amount of computing resources from one corner of the Internet to another within a matter of minutes. It should be noted that the control server itself might not be static. Botnets have evolved from a static control infrastructure to a peer-to-peer structure for the purposes of fault tolerance and evading detection. When one server is detected and blocked, other servers can step in and take over. It is also common for the control server to run on a compromised machine or by proxy, so that the botnet’s owner is unlikely to be identified. Botnets commonly communicate through the same method as their creators’ public IRC servers. Recently, however, we have seen botnets branch out to P2P, HTTPS, SMTP, and other protocols.

pages: 424 words: 114,905

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again by Eric Topol

23andMe, Affordable Care Act / Obamacare, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, artificial general intelligence, augmented reality, autonomous vehicles, bioinformatics, blockchain, cloud computing, cognitive bias, Colonization of Mars, computer age, computer vision, conceptual framework, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, dark matter, David Brooks, digital twin, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, fault tolerance, George Santayana, Google Glasses, ImageNet competition, Jeff Bezos, job automation, job satisfaction, Joi Ito, Mark Zuckerberg, medical residency, meta analysis, meta-analysis, microbiome, natural language processing, new economy, Nicholas Carr, nudge unit, pattern recognition, performance metric, personalized medicine, phenotype, placebo effect, randomized controlled trial, recommendation engine, Rubik’s Cube, Sam Altman, self-driving car, Silicon Valley, speech recognition, Stephen Hawking, text mining, the scientific method, Tim Cook: Apple, War on Poverty, Watson beat the top human players on Jeopardy!, working-age population

Our energy-efficient brain uses only about 10 watts of power, less than a household light bulb, in a tiny space less than 2 liters, or smaller than a shoebox. The K supercomputer in Japan, by contrast, requires about 10 megawatts of power and occupies more than 1.3 million liters.56 Where our brain’s estimated 100 billion neurons and 100 trillion connections give it a high tolerance for failure—not to mention its astonishing ability to learn both with and without a teacher, from very few examples—even the most powerful computers have poor fault tolerance for any lost circuitry, and they certainly require plenty of programming before they can begin to learn, and then only from millions of examples. Another major difference is that our brain is relatively slow, with computation speeds 10 million times slower than machines, so a machine can respond to a stimulus much faster than we can. For example, when we see something, it takes about 200 milliseconds from the time light hits the retina to go through brain processing and get to conscious perception.57 Another important difference between computers and humans is that machines don’t generally know how to update their memories and overwrite information that isn’t useful.

pages: 960 words: 125,049

Mastering Ethereum: Building Smart Contracts and DApps by Andreas M. Antonopoulos, Gavin Wood Ph. D.

Amazon Web Services, bitcoin, blockchain, continuous integration, cryptocurrency, Debian, domain-specific language, don't repeat yourself, Edward Snowden, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, Firefox, Google Chrome, intangible asset, Internet of things, litecoin, move fast and break things, node package manager, peer-to-peer, Ponzi scheme, prediction markets, pull request, QR code, Ruby on Rails, Satoshi Nakamoto, sealed-bid auction, sharing economy, side project, smart contracts, transaction costs, Turing complete, Turing machine, Vickrey auction, web application, WebSocket

While providing high availability, auditability, transparency, and neutrality, it also reduces or eliminates censorship and reduces certain counterparty risks. Compared to Bitcoin Many people will come to Ethereum with some prior experience of cryptocurrencies, specifically Bitcoin. Ethereum shares many common elements with other open blockchains: a peer-to-peer network connecting participants, a Byzantine fault–tolerant consensus algorithm for synchronization of state updates (a proof-of-work blockchain), the use of cryptographic primitives such as digital signatures and hashes, and a digital currency (ether). Yet in many ways, both the purpose and construction of Ethereum are strikingly different from those of the open blockchains that preceded it, including Bitcoin. Ethereum’s purpose is not primarily to be a digital currency payment network.

pages: 571 words: 124,448

Building Habitats on the Moon: Engineering Approaches to Lunar Settlements by Haym Benaroya

3D printing, biofilm, Black Swan, Brownian motion, Buckminster Fuller, carbon-based life, centre right, clean water, Colonization of Mars, Computer Numeric Control, conceptual framework, data acquisition, Elon Musk, fault tolerance, gravity well, inventory management, Johannes Kepler, low earth orbit, orbital mechanics / astrodynamics, performance metric, RAND corporation, risk tolerance, Ronald Reagan, stochastic process, telepresence, telerobotics, the scientific method, urban planning, X Prize, zero-sum game

An additional assumption regarding the calculation of reliability for redundant systems is that the extra elements, the so-called mediating elements that establish the need for the redundant system to activate in the event of a failure of a primary system, must themselves be completely reliable. Such mediating elements do not have their own redundant systems. “Despite being critical to the reliability of redundant systems, however, mediating systems cannot be redundant themselves, as then they would need mediating, leading to an infinite regress. ‘The daunting truth,’ to quote a 1993 report to the FAA, ‘is that some of the core [mediating] mechanisms in fault-tolerant systems are single points of failure: they just have to work correctly’.” The assumption of independence of elements in the system, whether for purposes of redundancy or as part of the system model, can also be a cause of failure. Interdependencies (correlations) exist in complex systems at the least because they are operating in, and are driven by, the same environment. Also, “seemingly redundant and isolated elements frequently link to each other at the level of the people who operate and maintain them. … Technological failures open a window for human error.”
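The point about mediating elements can be made numerically. The small model below is an illustrative assumption for exposition (it is not from the FAA report quoted): n independent redundant elements fail together only if all n fail, yet the non-redundant mediator multiplies the whole result and so caps system reliability no matter how much redundancy is added.

```python
# Illustrative reliability model: n redundant elements, each with
# reliability r, behind a single non-redundant mediating element
# with reliability m (the single point of failure discussed above).

def system_reliability(r, n, m):
    """Reliability of n redundant elements gated by one mediator."""
    redundant_part = 1 - (1 - r) ** n   # fails only if all n elements fail
    return m * redundant_part           # the mediator must also work

# Three 90%-reliable elements reach ~99.9% with a perfect mediator,
# but a 95%-reliable mediator caps the whole system below 95%.
print(system_reliability(r=0.9, n=3, m=1.0))
print(system_reliability(r=0.9, n=3, m=0.95))
```

This also assumes the elements fail independently, which, as the passage goes on to note, is itself often violated in complex systems sharing an environment and operators.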

pages: 448 words: 117,325

Click Here to Kill Everybody: Security and Survival in a Hyper-Connected World by Bruce Schneier

23andMe, 3D printing, autonomous vehicles, barriers to entry, bitcoin, blockchain, Brian Krebs, business process, cloud computing, cognitive bias, computer vision, connected car, corporate governance, crowdsourcing, cryptocurrency, cuban missile crisis, Daniel Kahneman / Amos Tversky, David Heinemeier Hansson, Donald Trump, drone strike, Edward Snowden, Elon Musk, fault tolerance, Firefox, Flash crash, George Akerlof, industrial robot, information asymmetry, Internet of things, invention of radio, job automation, job satisfaction, John Markoff, Kevin Kelly, license plate recognition, loose coupling, market design, medical malpractice, Minecraft, MITM: man-in-the-middle, move fast and break things, national security letter, Network effects, pattern recognition, profit maximization, Ralph Nader, RAND corporation, ransomware, Rodney Brooks, Ross Ulbricht, security theater, self-driving car, Shoshana Zuboff, Silicon Valley, smart cities, smart transportation, Snapchat, Stanislav Petrov, Stephen Hawking, Stuxnet, The Market for Lemons, too big to fail, Uber for X, Unsafe at Any Speed, uranium enrichment, Valery Gerasimov, web application, WikiLeaks, zero day

In 2014, the Turkish government used this technique to censor parts of the Internet. In 2017, traffic to and from several major US ISPs was briefly routed to an obscure Russian Internet provider. And don’t think this kind of attack is limited to nation-states; a 2008 talk at the DefCon hackers conference showed how anyone can do it. When the Internet was developed, what security there was focused on physical attacks against the network. Its fault-tolerant architecture can handle servers and connections failing or being destroyed. What it can’t handle is systemic attacks against the underlying protocols. The base Internet protocols were developed without security in mind, and many of them remain insecure to this day. There’s no security in the “From” line of an e-mail: anyone can pretend to be anyone. There’s no security in the Domain Name Service that translates Internet addresses from human-readable names to computer-readable numeric addresses, or the Network Time Protocol that keeps everything in synch.

pages: 587 words: 117,894

Cybersecurity: What Everyone Needs to Know by P. W. Singer, Allan Friedman

4chan, A Declaration of the Independence of Cyberspace, Apple's 1984 Super Bowl advert, barriers to entry, Berlin Wall, bitcoin, blood diamonds, borderless world, Brian Krebs, business continuity plan, Chelsea Manning, cloud computing, crowdsourcing, cuban missile crisis, data acquisition, do-ocracy, drone strike, Edward Snowden, energy security, failed state, Fall of the Berlin Wall, fault tolerance, global supply chain, Google Earth, Internet of things, invention of the telegraph, John Markoff, Julian Assange, Khan Academy, M-Pesa, MITM: man-in-the-middle, mutually assured destruction, Network effects, packet switching, Peace of Westphalia, pre–internet, profit motive, RAND corporation, ransomware, RFC: Request For Comment, risk tolerance, rolodex, Silicon Valley, Skype, smart grid, Steve Jobs, Stuxnet, uranium enrichment, We are Anonymous. We are Legion, web application, WikiLeaks, zero day, zero-sum game

There are three elements behind the concept. One is the importance of building in “the intentional capacity to work under degraded conditions.” Beyond that, resilient systems must also recover quickly, and, finally, learn lessons to deal better with future threats. For decades, most major corporations have had business continuity plans for fires or natural disasters, while the electronics industry has measured what it thinks of as fault tolerance, and the communications industry has talked about reliability and redundancy in its operations. All of these fit into the idea of resilience, but most assume some natural disaster, accident, failure, or crisis rather than deliberate attack. This is where cybersecurity must go in a very different direction: if you are only thinking in terms of reliability, a network can be made resilient merely by creating redundancies.

pages: 461 words: 125,845

This Machine Kills Secrets: Julian Assange, the Cypherpunks, and Their Fight to Empower Whistleblowers by Andy Greenberg

Apple II, Ayatollah Khomeini, Berlin Wall, Bill Gates: Altair 8800, Burning Man, Chelsea Manning, computerized markets, crowdsourcing, cryptocurrency, domain-specific language, drone strike, en.wikipedia.org, fault tolerance, hive mind, Jacob Appelbaum, Julian Assange, Mahatma Gandhi, Mitch Kapor, MITM: man-in-the-middle, Mohammed Bouazizi, nuclear winter, offshore financial centre, pattern recognition, profit motive, Ralph Nader, Richard Stallman, Robert Hanssen: Double agent, Silicon Valley, Silicon Valley ideology, Skype, social graph, statistical model, stem cell, Steve Jobs, Steve Wozniak, Steven Levy, undersea cable, Vernor Vinge, We are Anonymous. We are Legion, We are the 99%, WikiLeaks, X Prize, Zimmermann PGP

And Nick Mathewson, Tor’s grinning, round-faced, ponytailed chief architect and codirector, had kicked off the day by dropping the room into the deep end of the cryptographic swimming pool. The geekery had gotten so thick that even some of Tor’s modern-day cypherpunks and volunteer coders, loath as they might have been to admit it, might just have gotten lost. Within minutes, Mathewson, wearing a sport jacket over a Tor T-shirt over a dwarfish potbelly, was delving into security issues like “epistemic attacks” and “Byzantine fault tolerances.” By the time he sat down, still grinning, a growing fraction of the room seemed baffled or possibly bored. Appelbaum’s presence, on the other hand, is as much guerrilla as geek. He’s Tor’s field researcher, unofficial revolutionary, and man on the ground in countries from Qatar to Brazil. And he knows the appeal of a sexy piece of hardware. After instantly acquiring the room’s attention, Appelbaum explains that the device his small audience is ogling is a satellite modem, one that he’s just rented with the aim of figuring out how to make Tor accessible to those in the Middle East who need to use satellite connections to access the Internet.

Autonomous Driving: How the Driverless Revolution Will Change the World by Andreas Herrmann, Walter Brenner, Rupert Stadler

Airbnb, Airbus A320, augmented reality, autonomous vehicles, blockchain, call centre, carbon footprint, cleantech, computer vision, conceptual framework, connected car, crowdsourcing, cyber-physical system, DARPA: Urban Challenge, data acquisition, demand response, digital map, disruptive innovation, Elon Musk, fault tolerance, fear of failure, global supply chain, industrial cluster, intermodal, Internet of things, Jeff Bezos, Lyft, manufacturing employment, market fundamentalism, Mars Rover, Masdar, megacity, Pearl River Delta, peer-to-peer rental, precision agriculture, QWERTY keyboard, RAND corporation, ride hailing / ride sharing, self-driving car, sensor fusion, sharing economy, Silicon Valley, smart cities, smart grid, smart meter, Steve Jobs, Tesla Model S, Tim Cook: Apple, uber lyft, upwardly mobile, urban planning, Zipcar

Toyota and the RAND Corporation have published calculations of the number of miles self-driving cars have to be tested before they can be assessed as roadworthy, because the algorithms required for driverless cars undergo self-learning in multiple road traffic situations. The more traffic situations these algorithms are exposed to, the better prepared they are to master a new situation. Designing this training process so that the accuracy demanded by Jen-Hsun Huang is obtained will be the crucial challenge in the development of autonomous vehicles. When discussing what fault tolerance might be acceptable, it should be borne in mind that people are more likely to forgive mistakes made by other people than mistakes made by machines. This also applies to driving errors, which are more likely to be overlooked if they were committed by a human driver and not by a machine. This means that autonomous vehicles will only be accepted if they cause significantly fewer errors than human drivers.

pages: 482 words: 125,973

Competition Demystified by Bruce C. Greenwald

additive manufacturing, airline deregulation, AltaVista, asset allocation, barriers to entry, business cycle, creative destruction, cross-subsidies, deindustrialization, discounted cash flows, diversified portfolio, Everything should be made as simple as possible, fault tolerance, intangible asset, John Nash: game theory, Nash equilibrium, Network effects, new economy, oil shock, packet switching, pets.com, price discrimination, price stability, selective serotonin reuptake inhibitor (SSRI), shareholder value, Silicon Valley, six sigma, Steve Jobs, transaction costs, yield management, zero-sum game

TABLE 6.1 Compaq and Dell, 1990 and 1995 ($ million, costs as a percentage of sales)

FIGURE 6.4 Compaq’s return on invested capital and operating income margin, 1990–2001

For a time, the approach was successful, as the company combined strong sales growth with decent operating margins and high return on invested capital (figure 6.4).* But ingrained cultures are difficult to uproot. The engineering mentality and love of technology that was part of Compaq’s tradition did not disappear, even after Rod Canion left. In 1997 the company bought Tandem Computers, a firm that specialized in producing fault-tolerant machines designed for uninterruptible transaction processing. A year later it bought Digital Equipment Corporation, a former engineering star in the computing world that had fallen from grace as its minicomputer bastion was undermined by the personal computer revolution. At the time of the purchase, Compaq wanted DEC for its consulting business, its AltaVista Internet search engine, and some in-process research.

pages: 448 words: 71,301

Programming Scala by Unknown

domain-specific language, en.wikipedia.org, fault tolerance, general-purpose programming language, loose coupling, type inference, web application

Miscellaneous Scala libraries:

Kestrel: A tiny, very fast queue system (http://github.com/robey/kestrel/tree/master).
ScalaModules: Scala DSL to ease OSGi development (http://code.google.com/p/scalamodules/).
Configgy: Managing configuration files and logging for “daemons” written in Scala (http://www.lag.net/configgy/).
scouchdb: Scala interface to CouchDB (http://code.google.com/p/scouchdb/).
Akka: A project to implement a platform for building fault-tolerant, distributed applications based on REST, Actors, etc. (http://akkasource.org/).
scala-query: A type-safe database query API for Scala (http://github.com/szeiger/scala-query/tree/master).

We’ll discuss using Scala with several well-known Java libraries after we discuss Java interoperability, next.

Java Interoperability

Of all the alternative JVM languages, Scala’s interoperability with Java source code is among the most seamless.

pages: 458 words: 137,960

Ready Player One by Ernest Cline

Albert Einstein, call centre, dematerialisation, fault tolerance, financial independence, game design, late fees, pre–internet, Rubik’s Cube, side project, telemarketer, walking around money

It managed to overcome limitations that had plagued previous simulated realities. In addition to restricting the overall size of their virtual environments, earlier MMOs had been forced to limit their virtual populations, usually to a few thousand users per server. If too many people were logged in at the same time, the simulation would slow to a crawl and avatars would freeze in midstride as the system struggled to keep up. But the OASIS utilized a new kind of fault-tolerant server array that could draw additional processing power from every computer connected to it. At the time of its initial launch, the OASIS could handle up to five million simultaneous users, with no discernible latency and no chance of a system crash. A massive marketing campaign promoted the launch of the OASIS. The pervasive television, billboard, and Internet ads featured a lush green oasis, complete with palm trees and a pool of crystal blue water, surrounded on all sides by a vast barren desert.

pages: 559 words: 130,949

Learn You a Haskell for Great Good!: A Beginner's Guide by Miran Lipovaca

fault tolerance, loose coupling, type inference

With the sum operator, we return a stack that has only one element, which is the sum of the stack so far.

ghci> solveRPN "2.7 ln"
0.9932517730102834
ghci> solveRPN "10 10 10 10 sum 4 /"
10.0
ghci> solveRPN "10 10 10 10 10 sum 4 /"
12.5
ghci> solveRPN "10 2 ^"
100.0

I think that making a function that can calculate arbitrary floating-point RPN expressions and has the option to be easily extended in 10 lines is pretty awesome. Note: This RPN calculation solution is not really fault tolerant. When given input that doesn’t make sense, it might result in a runtime error. But don’t worry, you’ll learn how to make this function more robust in Chapter 14.

Heathrow to London

Suppose that we’re on a business trip. Our plane has just landed in England, and we rent a car. We have a meeting really soon, and we need to get from Heathrow Airport to London as fast as we can (but safely!).
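The stack-based RPN evaluation the excerpt demonstrates can also be sketched in Python (a rough transliteration of the idea only; the book's actual solveRPN is written in Haskell, and the function name here is illustrative):

```python
import math

def solve_rpn(expr: str) -> float:
    """Evaluate a whitespace-separated RPN expression using a stack."""
    stack = []
    for tok in expr.split():
        if tok == "+":
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif tok == "-":
            b, a = stack.pop(), stack.pop(); stack.append(a - b)
        elif tok == "*":
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
        elif tok == "/":
            b, a = stack.pop(), stack.pop(); stack.append(a / b)
        elif tok == "^":
            b, a = stack.pop(), stack.pop(); stack.append(a ** b)
        elif tok == "ln":
            stack.append(math.log(stack.pop()))
        elif tok == "sum":
            # The sum operator collapses the whole stack into one element.
            stack = [sum(stack)]
        else:
            stack.append(float(tok))  # any other token is parsed as a number
    return stack[0]
```

Like the book's version, this sketch is not fault tolerant: a malformed token simply raises a runtime error.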

pages: 474 words: 130,575

Surveillance Valley: The Rise of the Military-Digital Complex by Yasha Levine

23andMe, activist fund / activist shareholder / activist investor, Airbnb, AltaVista, Amazon Web Services, Anne Wojcicki, anti-communist, Apple's 1984 Super Bowl advert, bitcoin, borderless world, British Empire, call centre, Chelsea Manning, cloud computing, collaborative editing, colonial rule, computer age, computerized markets, corporate governance, crowdsourcing, cryptocurrency, digital map, don't be evil, Donald Trump, Douglas Engelbart, drone strike, Edward Snowden, El Camino Real, Electric Kool-Aid Acid Test, Elon Musk, fault tolerance, George Gilder, ghettoisation, global village, Google Chrome, Google Earth, Google Hangouts, Howard Zinn, hypertext link, IBM and the Holocaust, index card, Jacob Appelbaum, Jeff Bezos, jimmy wales, John Markoff, John von Neumann, Julian Assange, Kevin Kelly, Kickstarter, life extension, Lyft, Mark Zuckerberg, market bubble, Menlo Park, Mitch Kapor, natural language processing, Network effects, new economy, Norbert Wiener, packet switching, PageRank, Paul Buchheit, peer-to-peer, Peter Thiel, Philip Mirowski, plutocrats, Plutocrats, private military company, RAND corporation, Ronald Reagan, Ross Ulbricht, Satoshi Nakamoto, self-driving car, sentiment analysis, shareholder value, side project, Silicon Valley, Silicon Valley startup, Skype, slashdot, Snapchat, speech recognition, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Telecommunications Act of 1996, telepresence, telepresence robot, The Bell Curve by Richard Herrnstein and Charles Murray, The Hackers Conference, uber lyft, Whole Earth Catalog, Whole Earth Review, WikiLeaks

Paul Syverson wrote the NRL Review article along with two other cocreators of onion routing, David Goldschlag and Michael Reed, mathematicians and computer systems researchers working for the US Navy. NRL Review was an in-house navy magazine that showcased all the cool gadgets cooked up by the lab over the previous year. D. M. Goldschlag, M. G. Reed, and P. F. Syverson, “Internet Communication Resistant to Traffic Analysis,” NRL Review, April 1997.

13. This last stage of development was funded by both the Office of Naval Research and DARPA under its Fault Tolerant Networks Program. The amount of the DARPA funding is unknown. “Onion Routing: Brief Selected History,” website formerly operated by the Center for High Assurance Computer Systems in the Information Technology Division of the US Naval Research Lab, 2005, accessed July 6, 2017, https://www.onion-router.net/History.html.

14. Paul Syverson, email message sent to [tor-talk], “Iran cracks down on web dissident technology,” Tor Project, March 21, 2011, http://web.archive.org/web/20170521144023/https:/lists.torproject.org/pipermail/tor-talk/2011-March/019868.html.

15.

pages: 1,025 words: 150,187

ZeroMQ by Pieter Hintjens

AGPL, anti-pattern, carbon footprint, cloud computing, Debian, distributed revision control, domain-specific language, factory automation, fault tolerance, fear of failure, finite state, Internet of things, iterative process, premature optimization, profit motive, pull request, revision control, RFC: Request For Comment, Richard Stallman, Skype, smart transportation, software patent, Steve Jobs, Valgrind, WebSocket

Postface: Tales from Out There

I asked some of the contributors to this book to tell us what they were doing with ØMQ. Here are their stories.

Rob Gagnon’s Story

“We use ØMQ to assist in aggregating thousands of events occurring every minute across our global network of telecommunications servers so that we can accurately report and monitor for situations that require our attention. ØMQ made the development of the system not only easier, but faster to develop and more robust and fault-tolerant than we had originally planned in our original design.

“We’re able to easily add and remove clients from the network without the loss of any message. If we need to enhance the server portion of our system, we can stop and restart it as well, without having to worry about stopping all of the clients first. The built-in buffering of ØMQ makes this all possible.”

Tom van Leeuwen’s Story

“I was looking at creating some kind of service bus connecting all kinds of services together.

pages: 470 words: 144,455

Secrets and Lies: Digital Security in a Networked World by Bruce Schneier

Ayatollah Khomeini, barriers to entry, business process, butterfly effect, cashless society, Columbine, defense in depth, double entry bookkeeping, fault tolerance, game design, IFF: identification friend or foe, John von Neumann, knapsack problem, MITM: man-in-the-middle, moral panic, mutually assured destruction, pez dispenser, pirate software, profit motive, Richard Feynman, risk tolerance, Silicon Valley, Simon Singh, slashdot, statistical model, Steve Ballmer, Steven Levy, the payments system, Y2K, Yogi Berra

Availability has been defined by various security standards as “the property that a product’s services are accessible when needed and without undue delay,” or “the property of being accessible and usable upon demand by an authorized entity.” These definitions have always struck me as being somewhat circular. We know intuitively what we mean by availability with respect to computers: We want the computer to work when we expect it to as we expect it to. Lots of software doesn’t work when and as we expect it to, and there are entire areas of computer science research in reliability and fault-tolerant computing and software quality ... none of which has anything to do with security. In the context of security, availability is about ensuring that an attacker can’t prevent legitimate users from having reasonable access to their systems. For example, availability is about ensuring that denial-of-service attacks are not possible.

ACCESS CONTROL

Confidentiality, availability, and integrity all boil down to access control.

pages: 598 words: 134,339

Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World by Bruce Schneier

23andMe, Airbnb, airport security, AltaVista, Anne Wojcicki, augmented reality, Benjamin Mako Hill, Black Swan, Boris Johnson, Brewster Kahle, Brian Krebs, call centre, Cass Sunstein, Chelsea Manning, citizen journalism, cloud computing, congestion charging, disintermediation, drone strike, Edward Snowden, experimental subject, failed state, fault tolerance, Ferguson, Missouri, Filter Bubble, Firefox, friendly fire, Google Chrome, Google Glasses, hindsight bias, informal economy, Internet Archive, Internet of things, Jacob Appelbaum, Jaron Lanier, John Markoff, Julian Assange, Kevin Kelly, license plate recognition, lifelogging, linked data, Lyft, Mark Zuckerberg, moral panic, Nash equilibrium, Nate Silver, national security letter, Network effects, Occupy movement, Panopticon Jeremy Bentham, payday loans, pre–internet, price discrimination, profit motive, race to the bottom, RAND corporation, recommendation engine, RFID, Ross Ulbricht, self-driving car, Shoshana Zuboff, Silicon Valley, Skype, smart cities, smart grid, Snapchat, social graph, software as a service, South China Sea, stealth mode startup, Steven Levy, Stuxnet, TaskRabbit, telemarketer, Tim Cook: Apple, transaction costs, Uber and Lyft, uber lyft, undersea cable, urban planning, WikiLeaks, zero day

Advancing technology adds new perturbations into existing systems, creating instabilities. If systemic imperfections are inevitable, we have to accept them—in laws, in government institutions, in corporations, in individuals, in society. We have to design systems that expect them and can work despite them. If something is going to fail or break, we need it to fail in a predictable way. That’s resilience. In systems design, resilience comes from a combination of elements: fault-tolerance, mitigation, redundancy, adaptability, recoverability, and survivability. It’s what we need in the complex and ever-changing threat landscape I’ve described in this book. I am advocating for several flavors of resilience for both our systems of surveillance and our systems that control surveillance: resilience to hardware and software failure, resilience to technological innovation, resilience to political change, and resilience to coercion.

pages: 496 words: 154,363

I'm Feeling Lucky: The Confessions of Google Employee Number 59 by Douglas Edwards

Albert Einstein, AltaVista, Any sufficiently advanced technology is indistinguishable from magic, barriers to entry, book scanning, Build a better mousetrap, Burning Man, business intelligence, call centre, commoditize, crowdsourcing, don't be evil, Elon Musk, fault tolerance, Googley, gravity well, invisible hand, Jeff Bezos, job-hopping, John Markoff, Kickstarter, Marc Andreessen, Menlo Park, microcredit, music of the spheres, Network effects, PageRank, performance metric, pets.com, Ralph Nader, risk tolerance, second-price auction, side project, Silicon Valley, Silicon Valley startup, slashdot, stem cell, Superbowl ad, Y2K

A global shortage of RAM (memory) made it worse, and Google's system, which had never been all that robust, started wheezing asthmatically. Part of the problem was that Google had built its system to fail. "Build machines so cheap that we don't care if they fail. And if they fail, just ignore them until we get around to fixing them." That was Google's strategy, according to hardware designer Will Whitted, who joined the company in 2001. "That concept of using commodity parts and of being extremely fault tolerant, of writing the software in a way that the hardware didn't have to be very good, was just brilliant." But only if you could get the parts to fix the broken computers and keep adding new machines. Or if you could improve the machines' efficiency so you didn't need so many of them. The first batch of Google servers had been so hastily assembled that the solder points on the motherboards touched the metal of the trays beneath them, so the engineers added corkboard liners as insulation.

pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths

4chan, Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, algorithmic trading, anthropic principle, asset allocation, autonomous vehicles, Bayesian statistics, Berlin Wall, Bill Duvall, bitcoin, Community Supported Agriculture, complexity theory, constrained optimization, cosmological principle, cryptocurrency, Danny Hillis, David Heinemeier Hansson, delayed gratification, dematerialisation, diversification, Donald Knuth, double helix, Elon Musk, fault tolerance, Fellow of the Royal Society, Firefox, first-price auction, Flash crash, Frederick Winslow Taylor, George Akerlof, global supply chain, Google Chrome, Henri Poincaré, information retrieval, Internet Archive, Jeff Bezos, Johannes Kepler, John Nash: game theory, John von Neumann, Kickstarter, knapsack problem, Lao Tzu, Leonard Kleinrock, linear programming, martingale, Nash equilibrium, natural language processing, NP-complete, P = NP, packet switching, Pierre-Simon Laplace, prediction markets, race to the bottom, RAND corporation, RFC: Request For Comment, Robert X Cringely, Sam Altman, sealed-bid auction, second-price auction, self-driving car, Silicon Valley, Skype, sorting algorithm, spectrum auction, Stanford marshmallow experiment, Steve Jobs, stochastic process, Thomas Bayes, Thomas Malthus, traveling salesman, Turing machine, urban planning, Vickrey auction, Vilfredo Pareto, Walter Mischel, Y Combinator, zero-sum game

The winner of that particular honor is an algorithm called Comparison Counting Sort. In this algorithm, each item is compared to all the others, generating a tally of how many items it is bigger than. This number can then be used directly as the item’s rank. Since it compares all pairs, Comparison Counting Sort is a quadratic-time algorithm, like Bubble Sort. Thus it’s not a popular choice in traditional computer science applications, but it’s exceptionally fault-tolerant. This algorithm’s workings should sound familiar. Comparison Counting Sort operates exactly like a Round-Robin tournament. In other words, it strongly resembles a sports team’s regular season—playing every other team in the division and building up a win-loss record by which they are ranked. That Comparison Counting Sort is the single most robust sorting algorithm known, quadratic or better, should offer something very specific to sports fans: if your team doesn’t make the playoffs, don’t whine.
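The procedure the passage describes can be written down directly; here is a minimal Python sketch (assuming distinct items, since with duplicates two items would tally the same rank):

```python
def comparison_counting_sort(items):
    """Sort by comparing every pair: each item's rank is the number of
    other items it is bigger than (a quadratic number of comparisons)."""
    ranks = [sum(other < x for other in items) for x in items]
    out = [None] * len(items)
    for x, r in zip(items, ranks):
        out[r] = x  # an item's tally is exactly its final position
    return out
```

The robustness the authors point to falls out of the tally: each misjudged comparison can shift an item's rank by at most one place, unlike comparison sorts whose later decisions compound an early mistake.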

pages: 590 words: 152,595

Army of None: Autonomous Weapons and the Future of War by Paul Scharre

active measures, Air France Flight 447, algorithmic trading, artificial general intelligence, augmented reality, automated trading system, autonomous vehicles, basic income, brain emulation, Brian Krebs, cognitive bias, computer vision, cuban missile crisis, dark matter, DARPA: Urban Challenge, DevOps, drone strike, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, facts on the ground, fault tolerance, Flash crash, Freestyle chess, friendly fire, IFF: identification friend or foe, ImageNet competition, Internet of things, Johann Wolfgang von Goethe, John Markoff, Kevin Kelly, Loebner Prize, loose coupling, Mark Zuckerberg, moral hazard, mutually assured destruction, Nate Silver, pattern recognition, Rodney Brooks, Rubik’s Cube, self-driving car, sensor fusion, South China Sea, speech recognition, Stanislav Petrov, Stephen Hawking, Steve Ballmer, Steve Wozniak, Stuxnet, superintelligent machines, Tesla Model S, The Signal and the Noise by Nate Silver, theory of mind, Turing test, universal basic income, Valery Gerasimov, Wall-E, William Langewiesche, Y2K, zero day

The organization that enables high reliability is not available—the machine is on its own, at least for some period of time. Safety under these conditions requires something more than high-reliability organizations. It requires high-reliability fully autonomous complex machines, and there is no precedent for such systems. This would require a vastly different kind of machine from Aegis, one that was exceptionally predictable to the user but not to the enemy, and with a fault-tolerant design that defaulted to safe operations in the event of failures. Given the state of technology today, no one knows how to build a complex system that is 100 percent fail-safe. It is tempting to think that future systems will change this dynamic. The promise of “smarter” machines is seductive: they will be more advanced, more intelligent, and therefore able to account for more variables and avoid failures.

pages: 528 words: 146,459

Computer: A History of the Information Machine by Martin Campbell-Kelly, William Aspray, Nathan L. Ensmenger, Jeffrey R. Yost

Ada Lovelace, air freight, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Apple's 1984 Super Bowl advert, barriers to entry, Bill Gates: Altair 8800, borderless world, Buckminster Fuller, Build a better mousetrap, Byte Shop, card file, cashless society, cloud computing, combinatorial explosion, computer age, deskilling, don't be evil, Donald Davies, Douglas Engelbart, Dynabook, fault tolerance, Fellow of the Royal Society, financial independence, Frederick Winslow Taylor, game design, garden city movement, Grace Hopper, informal economy, interchangeable parts, invention of the wheel, Jacquard loom, Jeff Bezos, jimmy wales, John Markoff, John von Neumann, Kickstarter, light touch regulation, linked data, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Menlo Park, Mitch Kapor, natural language processing, Network effects, New Journalism, Norbert Wiener, Occupy movement, optical character recognition, packet switching, PageRank, pattern recognition, Pierre-Simon Laplace, pirate software, popular electronics, prediction markets, pre–internet, QWERTY keyboard, RAND corporation, Robert X Cringely, Silicon Valley, Silicon Valley startup, Steve Jobs, Steven Levy, Stewart Brand, Ted Nelson, the market place, Turing machine, Vannevar Bush, Von Neumann architecture, Whole Earth Catalog, William Shockley: the traitorous eight, women in the workforce, young professional

Although computer technology is at the heart of the Internet, its importance is economic and social: the Internet gives computer users the ability to communicate, to gain access to information sources, and to conduct business.

I. From the World Brain to the World Wide Web

The Internet sprang from a confluence of three desires, two that emerged in the 1960s and one that originated much further back in time. First, there was the rather utilitarian desire for an efficient, fault-tolerant networking technology, suitable for military communications, that would never break down. Second, there was a wish to unite the world’s computer networks into a single system. Just as the telephone would never have become the dominant person-to-person communications medium if users had been restricted to the network of their particular provider, so the world’s isolated computer networks would be far more useful if they were joined together.

Turing's Cathedral by George Dyson

1919 Motor Transport Corps convoy, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anti-communist, Benoit Mandelbrot, British Empire, Brownian motion, cellular automata, cloud computing, computer age, Danny Hillis, dark matter, double helix, fault tolerance, Fellow of the Royal Society, finite state, Georg Cantor, Henri Poincaré, housing crisis, IFF: identification friend or foe, indoor plumbing, Isaac Newton, Jacquard loom, John von Neumann, mandelbrot fractal, Menlo Park, Murray Gell-Mann, Norbert Wiener, Norman Macrae, packet switching, pattern recognition, Paul Erdős, Paul Samuelson, phenotype, planetary scale, RAND corporation, random walk, Richard Feynman, SETI@home, social graph, speech recognition, Thorstein Veblen, Turing complete, Turing machine, Von Neumann architecture

“If the only demerit of the digital expansion system were its greater logical complexity, nature would not, for this reason alone, have rejected it,” von Neumann admitted in 1948.48 Search engines and social networks are analog computers of unprecedented scale. Information is being encoded (and operated upon) as continuous (and noise-tolerant) variables such as frequencies (of connection or occurrence) and the topology of what connects where, with location being increasingly defined by a fault-tolerant template rather than by an unforgiving numerical address. Pulse-frequency coding for the Internet is one way to describe the working architecture of a search engine, and PageRank for neurons is one way to describe the working architecture of the brain. These computational structures use digital components, but the analog computing being performed by the system as a whole exceeds the complexity of the digital code on which it runs.

pages: 552 words: 168,518

MacroWikinomics: Rebooting Business and the World by Don Tapscott, Anthony D. Williams

accounting loophole / creative accounting, airport security, Andrew Keen, augmented reality, Ayatollah Khomeini, barriers to entry, Ben Horowitz, bioinformatics, Bretton Woods, business climate, business process, buy and hold, car-free, carbon footprint, Charles Lindbergh, citizen journalism, Clayton Christensen, clean water, Climategate, Climatic Research Unit, cloud computing, collaborative editing, collapse of Lehman Brothers, collateralized debt obligation, colonial rule, commoditize, corporate governance, corporate social responsibility, creative destruction, crowdsourcing, death of newspapers, demographic transition, disruptive innovation, distributed generation, don't be evil, en.wikipedia.org, energy security, energy transition, Exxon Valdez, failed state, fault tolerance, financial innovation, Galaxy Zoo, game design, global village, Google Earth, Hans Rosling, hive mind, Home mortgage interest deduction, information asymmetry, interchangeable parts, Internet of things, invention of movable type, Isaac Newton, James Watt: steam engine, Jaron Lanier, jimmy wales, Joseph Schumpeter, Julian Assange, Kevin Kelly, Kickstarter, knowledge economy, knowledge worker, Marc Andreessen, Marshall McLuhan, mass immigration, medical bankruptcy, megacity, mortgage tax deduction, Netflix Prize, new economy, Nicholas Carr, oil shock, old-boy network, online collectivism, open borders, open economy, pattern recognition, peer-to-peer lending, personalized medicine, Ray Kurzweil, RFID, ride hailing / ride sharing, Ronald Reagan, Rubik’s Cube, scientific mainstream, shareholder value, Silicon Valley, Skype, smart grid, smart meter, social graph, social web, software patent, Steve Jobs, text mining, the scientific method, The Wisdom of Crowds, transaction costs, transfer pricing, University of East Anglia, urban sprawl, value at risk, WikiLeaks, X Prize, young professional, Zipcar

To make it work, you’ll need to reveal your IP in an appropriate network, socializing it with participants and letting it spawn new knowledge and invention. You’ll need to stay plugged into the community so that you can leverage new contributions as they come in. You’ll also need to dedicate some resources to filtering and aggregating contributions. It can be a lot of work, but these types of collaborations can produce more robust, user-defined, fault-tolerant products in less time and for less expense than the conventional closed approach.

3. LET GO

Leaders in business and society who are attempting to transform their organizations have many understandable concerns about moving forward. One of the biggest is a fear of losing control. I can’t open up, it’s too risky. Our lawyers would go berserk. There are too many obstacles. I can’t empower others to make decisions because I’ll get all the blame if they get it wrong.

pages: 604 words: 161,455

The Moral Animal: Evolutionary Psychology and Everyday Life by Robert Wright

"Robert Solow", agricultural Revolution, Andrei Shleifer, Asian financial crisis, British Empire, centre right, cognitive dissonance, double entry bookkeeping, double helix, fault tolerance, Francis Fukuyama: the end of history, George Gilder, global village, invention of gunpowder, invention of movable type, invention of the telegraph, invention of writing, invisible hand, John Nash: game theory, John von Neumann, Marshall McLuhan, Norbert Wiener, planetary scale, pre–internet, profit motive, Ralph Waldo Emerson, random walk, Richard Thaler, rising living standards, Silicon Valley, social intelligence, social web, Steven Pinker, talking drums, the medium is the message, The Wealth of Nations by Adam Smith, trade route, your tax dollars at work, zero-sum game

The iron horseshoe and the windpipe-friendly harness seem to have been invented in Asia and then to have leapt from person to person to person—maybe hitching a ride with nomads for a time—all the way to the Atlantic Ocean. One key to the resilience of this giant multicultural brain is its multiculturalness. No one culture is in charge, so no one culture controls the memes (though some try in vain). This decentralization makes epic social setbacks of reliably limited duration; the system is “fault-tolerant,” as computer engineers say. While Europe fell into its slough of despond, Byzantium and southern China stayed standing, India had ups and downs, and the newborn Islamic civilization flourished. These cultures performed two key services: inventing neat new things that would eventually spread into Europe (the spinning wheel probably arose somewhere in the Orient); and conserving useful old things that were now scarce in Europe (the astrolabe, a Greek invention, came to Europe via Islam, as did Ptolemy’s astronomy—which, though ultimately wrong, worked for navigational purposes).

pages: 666 words: 181,495

In the Plex: How Google Thinks, Works, and Shapes Our Lives by Steven Levy

23andMe, AltaVista, Anne Wojcicki, Apple's 1984 Super Bowl advert, autonomous vehicles, book scanning, Brewster Kahle, Burning Man, business process, clean water, cloud computing, crowdsourcing, Dean Kamen, discounted cash flows, don't be evil, Donald Knuth, Douglas Engelbart, El Camino Real, fault tolerance, Firefox, Gerard Salton, Google bus, Google Chrome, Google Earth, Googley, HyperCard, hypertext link, IBM and the Holocaust, informal economy, information retrieval, Internet Archive, Jeff Bezos, John Markoff, Kevin Kelly, Kickstarter, Mark Zuckerberg, Menlo Park, one-China policy, optical character recognition, PageRank, Paul Buchheit, Potemkin village, prediction markets, recommendation engine, risk tolerance, Rubik’s Cube, Sand Hill Road, Saturday Night Live, search inside the book, second-price auction, selection bias, Silicon Valley, skunkworks, Skype, slashdot, social graph, social software, social web, spectrum auction, speech recognition, statistical model, Steve Ballmer, Steve Jobs, Steven Levy, Ted Nelson, telemarketer, trade route, traveling salesman, turn-by-turn navigation, undersea cable, Vannevar Bush, web application, WikiLeaks, Y Combinator

“We’re going to build hundreds and thousands of cheap servers knowing from the get-go that a certain percentage, maybe 10 percent, are going to fail,” says Reese. Google’s first CIO, Douglas Merrill, once noted that the disk drives Google purchased were “poorer quality than you would put into your kid’s computer at home.” But Google designed around the flaws. “We built capabilities into the software, the hardware, and the network—the way we hook them up, the load balancing, and so on—to build in redundancy, to make the system fault-tolerant,” says Reese. The Google File System, written by Jeff Dean and Sanjay Ghemawat, was invaluable in this process: it was designed to manage failure by “sharding” data, distributing it to multiple servers. If Google search called for certain information at one server and didn’t get a reply after a couple of milliseconds, there were two other Google servers that could fulfill the request. “The Google business model was constrained by cost, especially at the very beginning,” says Erik Teetzel, who worked with Google’s data centers.
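The fallback scheme the excerpt describes, where a search request that gets no reply from one server within a couple of milliseconds is retried against another server holding a copy of the same shard, can be sketched as below. This is an illustrative stand-in, not GFS's actual API; the `fetch_from_replica` helper and the in-memory "replicas" are hypothetical.

```python
import concurrent.futures

def fetch_from_replica(replica, key):
    # Hypothetical lookup: in a real system this would be a network
    # request to one of the servers holding a copy of the shard.
    return replica[key]

def redundant_fetch(replicas, key, timeout=0.002):
    """Ask each replica in turn; if a reply does not arrive within the
    timeout (2 ms here, echoing the excerpt), fall back to the next copy."""
    # One worker per replica, so a hung request cannot block later attempts.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        for replica in replicas:
            future = pool.submit(fetch_from_replica, replica, key)
            try:
                return future.result(timeout=timeout)
            except (concurrent.futures.TimeoutError, KeyError):
                continue  # replica is slow, down, or missing the key
    raise LookupError(f"no replica answered for {key!r}")

# Three in-memory "servers" stand in for the three copies of a shard.
replicas = [{"query": "result"}, {"query": "result"}, {"query": "result"}]
print(redundant_fetch(replicas, "query"))  # prints "result"
```

The point of the sketch is that fault tolerance lives in the calling code, not the hardware: any one replica can fail outright and the request still succeeds.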

pages: 798 words: 240,182

The Transhumanist Reader by Max More, Natasha Vita-More

23andMe, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, augmented reality, Bill Joy: nanobots, bioinformatics, brain emulation, Buckminster Fuller, cellular automata, clean water, cloud computing, cognitive bias, cognitive dissonance, combinatorial explosion, conceptual framework, Conway's Game of Life, cosmological principle, data acquisition, discovery of DNA, Douglas Engelbart, Drosophila, en.wikipedia.org, endogenous growth, experimental subject, Extropian, fault tolerance, Flynn Effect, Francis Fukuyama: the end of history, Frank Gehry, friendly AI, game design, germ theory of disease, hypertext link, impulse control, index fund, John von Neumann, joint-stock company, Kevin Kelly, Law of Accelerating Returns, life extension, lifelogging, Louis Pasteur, Menlo Park, meta-analysis, moral hazard, Network effects, Norbert Wiener, pattern recognition, Pepto Bismol, phenotype, positional goods, prediction markets, presumed consent, Ray Kurzweil, reversible computing, RFID, Ronald Reagan, scientific worldview, silicon-based life, Singularitarianism, social intelligence, stem cell, stochastic process, superintelligent machines, supply-chain management, supply-chain management software, technological singularity, Ted Nelson, telepresence, telepresence robot, telerobotics, the built environment, The Coming Technological Singularity, the scientific method, The Wisdom of Crowds, transaction costs, Turing machine, Turing test, Upton Sinclair, Vernor Vinge, Von Neumann architecture, Whole Earth Review, women in the workforce, zero-sum game

The mind continues to depend on a substrate to exist and to operate, of course, but there are substrate choices. The goal of substrate-independence is to continue personality, individual characteristics, a manner of experiencing, and a personal way of processing those experiences (Koene 2011a, 2011b). Your identity, your memories can then be embodied physically in many ways. They can also be backed up and operate robustly on fault-tolerant hardware with redundancy schemes. Achieving substrate-independence will allow us to optimize the operational framework, the hardware, to challenges posed by novel circumstances and different environments. Think: instead of sending extremophile bacteria to slowly terraform another world into a habitat, we ourselves can be the extremophiles. Substrate-independent minds are a well-described objective.

pages: 778 words: 239,744

Gnomon by Nick Harkaway

Albert Einstein, back-to-the-land, banking crisis, Burning Man, choice architecture, clean water, cognitive dissonance, fault tolerance, fear of failure, gravity well, high net worth, impulse control, Isaac Newton, Khartoum Gordon, lifelogging, neurotypical, pattern recognition, place-making, post-industrial society, Potemkin village, Richard Feynman, Scramble for Africa, self-driving car, side project, Silicon Valley, skunkworks, the market place, trade route, urban planning, urban sprawl

And of course there’s choice architecture: the very thing we use at Tidal Flow to smooth your journey through London or to design serendipitous social spaces in the new developments of the capital. Effectively deployed bad practice under the System is a disaster. It would place the most absolute surveillance machine in history in the hands of villainous actors or mob instincts.’ ‘And you stop that from happening?’ ‘Oh no. Not us. The System itself, as designed by its original architects. Firespine is not a back door. It is a fault-tolerant architecture – a protocol of desperation. It adjusts where necessary, pushes people to vote when they are wise and not when they are foolish. It organises instants in time, perfect moments that unlock our better selves, serendipitous encounters to correct negative ones that make us less than we should be. The System knows us all. It knows intimately when we are struggling, when we are sad, and when we are wrong.

pages: 1,164 words: 309,327

Trading and Exchanges: Market Microstructure for Practitioners by Larry Harris

active measures, Andrei Shleifer, asset allocation, automated trading system, barriers to entry, Bernie Madoff, business cycle, buttonwood tree, buy and hold, compound rate of return, computerized trading, corporate governance, correlation coefficient, data acquisition, diversified portfolio, fault tolerance, financial innovation, financial intermediation, fixed income, floating exchange rates, High speed trading, index arbitrage, index fund, information asymmetry, information retrieval, interest rate swap, invention of the telegraph, job automation, law of one price, London Interbank Offered Rate, Long Term Capital Management, margin call, market bubble, market clearing, market design, market fragmentation, market friction, market microstructure, money market fund, Myron Scholes, Nick Leeson, open economy, passive investing, pattern recognition, Ponzi scheme, post-materialism, price discovery process, price discrimination, principal–agent problem, profit motive, race to the bottom, random walk, rent-seeking, risk tolerance, risk-adjusted returns, selection bias, shareholder value, short selling, Small Order Execution System, speech recognition, statistical arbitrage, statistical model, survivorship bias, the market place, transaction costs, two-sided market, winner-take-all economy, yield curve, zero-coupon bond, zero-sum game

To build reliable trading systems, markets must make substantial investments in redundant hardware and software systems. They must eliminate all single points of failure. Since failures are inevitable, given current technologies, markets also must invest in systems that allow them to recover from service interruptions. Markets—as well as brokers and dealers—employ many of the following processes to create reliable trading systems:

• They use fault-tolerant computer hardware.
• They build redundant computer systems.
• They build redundant network connections.

* * *

▶ Some Examples of the Risks of Trading Through Unreliable Data Networks

• A trader submits a limit order to an electronic market. After the order is accepted, but before it trades, the trader’s network connection fails. The trader does not know whether she has traded. If she knew that she had not traded, she would do the trade in another market.

pages: 945 words: 292,893

Seveneves by Neal Stephenson

clean water, Colonization of Mars, Danny Hillis, digital map, double helix, epigenetics, fault tolerance, Fellow of the Royal Society, Filipino sailors, gravity well, Isaac Newton, Jeff Bezos, kremlinology, Kuiper Belt, low earth orbit, microbiome, orbital mechanics / astrodynamics, phenotype, Potemkin village, pre–internet, random walk, remote working, selection bias, side project, Silicon Valley, Skype, statistical model, Stewart Brand, supervolcano, the scientific method, Tunguska event, zero day, éminence grise

Ammonia worked better, but it was dangerous, and you couldn’t easily get more of it in space. If the Cloud Ark survived, it would survive on a water-based economy. A hundred years from now everything in space would be cooled by circulating water systems. But for now they had to keep the ammonia-based equipment running as well. Further complications, as if any were wanted, came from the fact that the systems had to be fault tolerant. If one of them got bashed by a hurtling piece of moon shrapnel and began to leak, it needed to be isolated from the rest of the system before too much of the precious water, or ammonia, leaked into space. So, the system as a whole possessed vast hierarchies of check valves, crossover switches, and redundancies that had saturated even Ivy’s brain, normally an infinite sink for detail. She’d had to delegate all cooling-related matters to a working group that was about three-quarters Russian and one-quarter American.