fault tolerance


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann (O’Reilly, 2017)

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, general-purpose programming language, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, loose coupling, Marc Andreessen, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, web application, WebSocket, wikimedia commons

Protocols for making systems Byzantine fault-tolerant are quite complicated [84], and fault-tolerant embedded systems rely on support from the hardware level [81]. In most server-side data systems, the cost of deploying Byzantine fault-tolerant solutions makes them impractical. Web applications do need to expect arbitrary and malicious behavior of clients that are under end-user control, such as web browsers. This is why input validation, sanitization, and output escaping are so important: to prevent SQL injection and cross-site scripting, for example. However, we typically don’t use Byzantine fault-tolerant protocols here, but simply make the server the authority on deciding what client behavior is and isn’t allowed. In peer-to-peer networks, where there is no such central authority, Byzantine fault tolerance is more relevant.

A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user. It is impossible to reduce the probability of a fault to zero; therefore it is usually best to design fault-tolerance mechanisms that prevent faults from causing failures. In this book we cover several techniques for building reliable systems from unreliable parts. Counterintuitively, in such fault-tolerant systems, it can make sense to increase the rate of faults by triggering them deliberately—for example, by randomly killing individual processes without warning. Many critical bugs are actually due to poor error handling [3]; by deliberately inducing faults, you ensure that the fault-tolerance machinery is continually exercised and tested, which can increase your confidence that faults will be handled correctly when they occur naturally. The Netflix Chaos Monkey [4] is an example of this approach.
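The deliberate-fault-injection idea can be sketched in a few lines of Python; `flaky_call` and the retry loop below are hypothetical illustrations of the principle, not Netflix's actual tooling:

```python
import random

def flaky_call(fn, failure_rate, rng):
    """Invoke fn, deliberately injecting a fault with the given probability."""
    if rng.random() < failure_rate:
        raise RuntimeError("injected fault")
    return fn()

def supervised(fn, failure_rate, rng, max_retries=10):
    """Retry on failure, so the recovery path is exercised on every run."""
    for _ in range(max_retries):
        try:
            return flaky_call(fn, failure_rate, rng)
        except RuntimeError:
            continue  # fault tolerated: try again
    raise RuntimeError("gave up after repeated injected faults")

result = supervised(lambda: "ok", failure_rate=0.5, rng=random.Random(42))
print(result)  # ok
```

Even at a 50% injected-fault rate the supervised call still succeeds, which is the point: the error-handling path runs constantly instead of only on rare natural failures.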

The biggest differences are that in 2PC the coordinator is not elected, and that fault-tolerant consensus algorithms only require votes from a majority of nodes, whereas 2PC requires a “yes” vote from every participant. Moreover, consensus algorithms define a recovery process by which nodes can get into a consistent state after a new leader is elected, ensuring that the safety properties are always met. These differences are key to the correctness and fault tolerance of a consensus algorithm.

Limitations of consensus

Consensus algorithms are a huge breakthrough for distributed systems: they bring concrete safety properties (agreement, integrity, and validity) to systems where everything else is uncertain, and they nevertheless remain fault-tolerant (able to make progress as long as a majority of nodes are working and reachable).
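The voting difference described above can be sketched directly (a toy model of the decision rules, not a full 2PC or consensus implementation):

```python
def two_phase_commit_decides(votes):
    """2PC commits only if every participant votes yes."""
    return all(votes)

def consensus_decides(acks, cluster_size):
    """Majority-quorum consensus makes progress once a strict majority acknowledges."""
    return acks > cluster_size // 2

# A 5-node cluster: 2PC is blocked by a single dissenting or crashed node...
print(two_phase_commit_decides([True, True, True, True, False]))  # False
# ...while consensus proceeds with any 3 of 5 nodes working and reachable.
print(consensus_decides(3, cluster_size=5))  # True
```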

Principles of Protocol Design by Robin Sharp

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

accounting loophole / creative accounting, business process, discrete time, fault tolerance, finite state, Gödel, Escher, Bach, information retrieval, loose coupling, packet switching, RFC: Request For Comment, stochastic process, x509 certificate

The first is a practical objection: Simple languages generally do not correspond to protocols which can tolerate faults, such as missing or duplicated messages. Protocols which are fault-tolerant often require the use of state machines with enormous numbers of states, or they may define context-dependent languages. A more radical objection is that classical analysis of the protocol language from a formal language point of view traditionally concerns itself with the problems of constructing a suitable recogniser, determining the internal states of the recogniser, and so on. This does not help us to analyse or check many of the properties which we may require the protocol to have, such as the properties of fault-tolerance mentioned above. To be able to investigate this we need analytical tools which can describe the parallel operation of all the parties which use the protocol to regulate their communication.

1.2 Protocols as Processes

A radically different way of looking at things has therefore gained prominence within recent years.

For generality, we define the value of the function majority(v) as being the value selected by a lieutenant receiving the values in v. If no value is received from a particular participant, the algorithm should supply some default, v_def.

5.4.1 Using unsigned messages

Solutions to this problem depend quite critically on the assumptions made about the system. Initially, we shall assume the following:

Degree of fault-tolerance: Out of the n participants, at most t are unreliable. This defines the degree of fault tolerance required of the system. We cannot expect the protocol to work correctly if this limit is overstepped.

Network properties: Every message that is sent is delivered correctly, and the receiver of a message knows who sent it.

These assumptions mean that an unreliable participant cannot interfere with the message traffic between the other participants.
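A minimal sketch of the majority(v) function with the default v_def, under the stated assumptions (the deterministic tie-breaking rule is an assumption of this sketch, not part of the text):

```python
from collections import Counter

V_DEF = "retreat"  # the default value v_def, supplied when nothing is received

def majority(v):
    """Value selected by a lieutenant from the received values v.

    Missing values (None) are replaced by the default before voting; ties
    are broken deterministically by sorting, an assumption of this sketch.
    """
    filled = [V_DEF if x is None else x for x in v]
    counts = Counter(filled)
    top = max(counts.values())
    return sorted(val for val, c in counts.items() if c == top)[0]

print(majority(["attack", "attack", None]))  # attack
print(majority(["attack", None, None]))      # retreat
```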

Other features:

Coding: Ad hoc binary coding of fixed fields in TPDUs, with TLV encoding of optional fields (‘parameters’) (Table 8.3).

Addressing: Hierarchical addressing. T-address formed by concatenating T-selector onto N-address.

Fault tolerance: Loss or duplication of data (DT TPDUs) or acknowledgments (AK TPDUs).

Whereas the ISO Class 0 protocol provides minimal functionality, and is therefore only suitable for use when the underlying network is comparatively reliable, the Class 4 protocol is designed to be resilient to a large range of potential disasters, including the arrival of spurious PDUs, PDU loss, and PDU corruption. To ensure this degree of fault tolerance, the protocol uses a large number of timers, whose identifications and functions are summarised in Table 9.3. The persistence timer is only conceptual, as R is equal to T1 · (N − 1), where N is the maximum number of attempts to retransmit a PDU, as illustrated in Figure 4.9. (Network restart is assumed to give rise to N-DISCONNECT.ind on all connections.)
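The persistence-timer relation R = T1 · (N − 1) can be checked with a one-line helper (the parameter names are illustrative, not from the protocol specification):

```python
def persistence_time(t1, n):
    """R = T1 * (N - 1): time covered by retransmission before giving up,
    where t1 is the retransmission timeout and n the maximum number of
    attempts to retransmit a PDU."""
    return t1 * (n - 1)

# With a 2-second retransmission timeout and at most 4 attempts:
print(persistence_time(t1=2.0, n=4))  # 6.0
```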

pages: 371 words: 78,103

Webbots, Spiders, and Screen Scrapers by Michael Schrenk

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

Amazon Web Services, corporate governance, fault tolerance, Firefox, Marc Andreessen, new economy, pre–internet, SpamAssassin, Turing test, web application

It's better to avoid these issues by designing fault-tolerant webbots that anticipate changes in the websites they target. Fault tolerance does not mean that everything will always work perfectly. Sometimes changes in a targeted website confuse even the most fault-tolerant webbot. In these cases, the proper thing for a webbot to do is to abort its task and report an error to its owner. Essentially, you want your webbot to fail in the same manner a person using a browser might fail. For example, if a webbot is buying an airline ticket, it should not proceed with a purchase if a seat is not available on a desired flight. This action sounds silly, but it is exactly what a poorly programmed webbot may do if it is expecting an available seat and has no provision to act otherwise.

Types of Webbot Fault Tolerance

For a webbot, fault tolerance involves adapting to changes to URLs, HTML content (which affect parsing), forms, cookie use, and network outages and congestion.

[68] See Chapter 28 for more information about trespass to chattels.

[69] You can find the owner of an IP address at http://www.arin.net.

Chapter 25. WRITING FAULT-TOLERANT WEBBOTS

The biggest complaint users have about webbots is their unreliability: your webbots will suddenly and inexplicably fail if they are not fault tolerant, or able to adapt to the changing conditions of your target websites. This chapter is devoted to helping you write webbots that are tolerant of network outages and unexpected changes in the web pages you target. Webbots that don't adapt to their changing environments are worse than nonfunctional ones because, when presented with the unexpected, they may perform in odd and unpredictable ways. For example, a non-fault-tolerant webbot may not notice that a form has changed and will continue to emulate the nonexistent form.

Types of Webbot Fault Tolerance

For a webbot, fault tolerance involves adapting to changes to URLs, HTML content (which affect parsing), forms, cookie use, and network outages and congestion. We'll examine each of these aspects of fault tolerance in the following sections.

Adapting to Changes in URLs

Possibly the most important type of webbot fault tolerance is URL tolerance, or a webbot's ability to make valid requests for web pages under changing conditions. URL tolerance ensures that your webbot does the following:

• Download pages that are available on the target site
• Follow header redirections to updated pages
• Use referer values to indicate that you followed a link from a page that is still on the website

Avoid Making Requests for Pages That Don't Exist

Before you determine that your webbot downloaded a valid web page, you should verify that you made a valid request.
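The request-validation step can be sketched as follows (in Python's standard library rather than the book's PHP; `is_valid_response` and `fetch_page` are hypothetical helper names):

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

def is_valid_response(status, body):
    """A download counts as valid only if the server said 2xx and sent content."""
    return 200 <= status < 300 and bool(body)

def fetch_page(url, referer=None, timeout=10):
    """Fetch a page the way a browser would: send a Referer value, let header
    redirections be followed automatically, and report failure rather than
    handing a bad response to the parser."""
    headers = {"Referer": referer} if referer else {}
    try:
        with urlopen(Request(url, headers=headers), timeout=timeout) as resp:
            body = resp.read()
            if not is_valid_response(resp.status, body):
                return None  # abort the task and report, as the text advises
            return resp.geturl(), body  # final URL after redirections + content
    except (HTTPError, URLError):
        return None  # network outage or missing page: fail like a browser would
```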

Programming Erlang: Software for a Concurrent World by Joe Armstrong (Pragmatic Bookshelf, 2007)

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

Chuck Templeton: OpenTable, Debian, en.wikipedia.org, fault tolerance, finite state, full text search, RFC: Request For Comment, sorting algorithm

Joe Asks... How Can We Make a Fault-Tolerant System?

To make something fault tolerant, we need at least two computers. One computer does the job, and another computer watches the first computer and must be ready to take over at a moment’s notice if the first computer fails. This is exactly how error recovery works in Erlang. One process does the job, and another process watches the first process and takes over if things go wrong. That’s why we need to monitor processes and to know why things fail. The examples in this chapter show you how to do this. In distributed Erlang, the process that does the job and the processes that monitor the process that does the job can be placed on physically different machines. Using this technique, we can start designing fault-tolerant software. This pattern is common.
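The watch-and-take-over pattern can be modelled synchronously in a few lines (a single-machine Python simulation for illustration; real Erlang uses separate, possibly remote, monitored processes):

```python
restarts = []

def run_supervised(job, max_restarts=3):
    """One "process" does the job; the supervisor watches, records why it
    failed, and starts a replacement, up to a restart limit."""
    for attempt in range(max_restarts + 1):
        try:
            return job(attempt)
        except Exception as why:
            restarts.append(str(why))  # the supervisor learns why things fail
    raise RuntimeError("restart limit exceeded")

def crashy_job(attempt):
    if attempt < 2:
        raise ValueError(f"crash #{attempt}")  # simulated fault
    return "done"

print(run_supervised(crashy_job))  # done
print(restarts)                    # ['crash #0', 'crash #1']
```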

Using behaviors, we can concentrate on the functional behavior of a component, while allowing the behavior framework to solve the nonfunctional aspects of the problem. The framework might, for example, take care of making the application fault tolerant or scalable, whereas the behavioral callback concentrates on the specific aspects of the problem. The chapter starts with a general discussion on how to build your own behaviors and then moves to describing the gen_server behavior that is part of the Erlang standard libraries.

• Chapter 17, Mnesia: The Erlang Database, on page 313 talks about the Erlang database management system (DBMS) Mnesia. Mnesia is an integrated DBMS with extremely fast, soft real-time response times. It can be configured to replicate its data over several physically separated nodes to provide fault-tolerant operation.

• Chapter 18, Making a System with OTP, on page 335 is the second of the OTP chapters.

Not only were the programs different, but the whole approach to programming was different. The author kept on and on about concurrency and distribution and fault tolerance and about a method of programming called concurrency-oriented programming—whatever that might mean. But some of the examples looked like fun. That evening the programmer looked at the example chat program. It was pretty small and easy to understand, even if the syntax was a bit strange. Surely it couldn’t be that easy. The basic program was simple, and with a few more lines of code, file sharing and encrypted conversations became possible. The programmer started typing.... What’s This All About? It’s about concurrency. It’s about distribution. It’s about fault tolerance. It’s about functional programming. It’s about programming a distributed concurrent system without locks and mutexes but using only pure message passing.

pages: 496 words: 70,263

Erlang Programming by Francesco Cesarini

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

cloud computing, fault tolerance, finite state, loose coupling, revision control, RFC: Request For Comment, sorting algorithm, Turing test, type inference, web application

This should record enough information to enable billing for the use of the phone.

CHAPTER 6. Process Error Handling

Whatever the programming language, building distributed, fault-tolerant, and scalable systems with requirements for high availability is not for the faint of heart. Erlang’s reputation for handling the fault-tolerant and high-availability aspects of these systems has its foundations in the simple but powerful constructs built into the language’s concurrency model. These constructs allow processes to monitor each other’s behavior and to recover from software faults. They give Erlang a competitive advantage over other programming languages, as they facilitate development of the complex architecture that provides the required fault tolerance through isolating errors and ensuring nonstop operation. Attempts to develop similar frameworks in other languages have either failed or hit a major complexity barrier due to the lack of the very constructs described in this chapter.

Only then was the language deemed mature enough to use in major projects with hundreds of developers, including Ericsson’s broadband, GPRS, and ATM switching solutions. In conjunction with these projects, the OTP framework was developed and released in 1996. OTP provides a framework to structure Erlang systems, offering robustness and fault tolerance together with a set of tools and libraries. The history of Erlang is important in understanding its philosophy. Although many languages were developed before finding their niche, Erlang was developed to solve the “time-to-market” requirements of distributed, fault-tolerant, massively concurrent, soft real-time systems. The fact that web services, retail and commercial banking, computer telephony, messaging systems, and enterprise integration, to mention but a few, happen to share the same requirements as telecom systems explains why Erlang is gaining headway in these sectors.

A typical example here is a web server: if you are planning a new release of a piece of software, or you are planning to stream video of a football match in real time, distributing the server across a number of machines will make this possible without failure. This performance is given by replication of a service—in this case a web server—which is often found in the architecture of a distributed system.

• Replication also provides fault tolerance: if one of the replicated web servers fails or becomes unavailable for some reason, HTTP requests can still be served by the other servers, albeit at a slower rate. This fault tolerance allows the system to be more robust and reliable.

• Distribution allows transparent access to remote resources, and building on this, it is possible to federate a collection of different systems to provide an overall user service. Such a collection of facilities is provided by modern e-commerce systems, such as the Amazon.com website.

• Finally, distributed system architecture makes a system extensible, with other services becoming available through remote access.
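The failover behaviour that replication buys can be sketched as follows (the `dead`/`alive` stand-ins are hypothetical servers, not a real HTTP stack):

```python
def serve_request(replicas, request):
    """Try each replicated server in turn; with replication, one dead
    replica only degrades capacity, it does not stop service."""
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            continue  # replica down: fail over to the next one
    raise ConnectionError("all replicas unavailable")

def dead(request):
    raise ConnectionError("server crashed")

def alive(request):
    return f"200 OK: {request}"

print(serve_request([dead, alive], "/index.html"))  # 200 OK: /index.html
```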

pages: 161 words: 44,488

The Business Blockchain: Promise, Practice, and Application of the Next Internet Technology by William Mougayar

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

Airbnb, airport security, Albert Einstein, altcoin, Amazon Web Services, bitcoin, Black Swan, blockchain, business process, centralized clearinghouse, Clayton Christensen, cloud computing, cryptocurrency, disintermediation, distributed ledger, Edward Snowden, en.wikipedia.org, ethereum blockchain, fault tolerance, fiat currency, fixed income, global value chain, Innovator's Dilemma, Internet of things, Kevin Kelly, Kickstarter, market clearing, Network effects, new economy, peer-to-peer, peer-to-peer lending, prediction markets, pull request, QR code, ride hailing / ride sharing, Satoshi Nakamoto, sharing economy, smart contracts, social web, software as a service, too big to fail, Turing complete, web application

Game Theory: Analysis of Conflict, Harvard University Press.

5. Leslie Lamport, Robert Shostak, and Marshall Pease, The Byzantine Generals Problem. http://research.microsoft.com/en-us/um/people/lamport/pubs/byz.pdf.

6. IT Doesn't Matter, https://hbr.org/2003/05/it-doesnt-matter.

7. PayPal website, https://www.paypal.com/webapps/mpp/about.

8. Personal communication with Vitalik Buterin, February 2016.

9. Byzantine fault tolerance, https://en.wikipedia.org/wiki/Byzantine_fault_tolerance.

10. Proof-of-stake, https://en.wikipedia.org/wiki/Proof-of-stake.

2. HOW BLOCKCHAIN TRUST INFILTRATES

“I cannot understand why people are frightened of new ideas. I’m frightened of the old ones.” –JOHN CAGE

REACHING CONSENSUS is at the heart of a blockchain’s operations. But the blockchain does it in a decentralized way that breaks the old paradigm of centralized consensus, when one central database used to rule transaction validity.

In part, the continuation of some of the trends in crypto 2.0, and particularly generalized protocols that provide both computational abstraction and privacy. But equally important is the current technological elephant in the room in the blockchain sphere: scalability. Currently, all existing blockchain protocols have the property that every computer in the network must process every transaction—a property that provides extreme degrees of fault tolerance and security, but at the cost of ensuring that the network's processing power is effectively bounded by the processing power of a single node. Crypto 3.0—at least in my mind—consists of approaches that move beyond this limitation, in one of various ways to create systems that break through this limitation and actually achieve the scale needed to support mainstream adoption (technically astute readers may have heard of “lightning networks,” “state channels,” and “sharding”).

Game theory is “the study of mathematical models of conflict and cooperation between intelligent rational decision-makers.”[4] And this is related to the blockchain because the Bitcoin blockchain, originally conceived by Satoshi Nakamoto, had to solve a known game theory conundrum called the Byzantine Generals Problem.[5] Solving that problem consists in mitigating any attempts by a small number of unethical Generals who would otherwise become traitors, and lie about coordinating their attack to guarantee victory. This is accomplished by enforcing a process for verifying the work that was put into crafting these messages, and time-limiting the requirement for seeing untampered messages in order to ensure their validity. Implementing “Byzantine Fault Tolerance” is important because it starts with the assumption that you cannot trust anyone, and yet it delivers assurance that the transaction has traveled and arrived safely based on trusting the network during its journey, while surviving potential attacks. There are fundamental implications for this new method of reaching safety in the finality of a transaction, because it questions the existence and roles of current trusted intermediaries, who held the traditional authority on validating transactions.

pages: 1,085 words: 219,144

Solr in Action by Trey Grainger, Timothy Potter

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

business intelligence, cloud computing, commoditize, conceptual framework, crowdsourcing, data acquisition, en.wikipedia.org, failed state, fault tolerance, finite state, full text search, glass ceiling, information retrieval, natural language processing, performance metric, premature optimization, recommendation engine, web application

At this point, you’ve seen that Solr has a modern, well-designed architecture that’s scalable and fault-tolerant. Although these are important aspects to consider if you’ve already decided to use Solr, you still might not be convinced that Solr is the right choice for your needs. In the next section, we describe the benefits of Solr from the perspective of different stakeholders, such as the software architect, system administrator, and CEO.

1.3. Why Solr?

In this section, we provide key information to help you decide if Solr is the right technology for your organization. Let’s begin by addressing why Solr is attractive to software architects.

1.3.1. Solr for the software architect

When evaluating new technology, software architects must consider a number of factors including stability, scalability, and fault tolerance. Solr scores high marks in all three categories.

We won’t advise you either way on whether this is acceptable for your organization. We only point this out because it’s a testament to the depth and breadth of automated testing in Lucene and Solr. If you have a nightly build off trunk in which all the automated tests pass, then you can be fairly confident that the core functionality is solid. We’ve touched on Solr’s approach to scalability and fault tolerance in sections 1.2.6 and 1.2.7. As an architect, you’re probably most curious about the limitations of Solr’s approach to scalability and fault tolerance. First, you should realize that the sharding and replication features in Solr have been improved in Solr 4 to be robust and easier to manage. The new approach to scaling is called SolrCloud. Under the covers, SolrCloud uses Apache ZooKeeper to distribute configurations across a cluster of Solr servers and to keep track of cluster state.

You may want to use replication either when you want to isolate indexing from searching operations to different servers within your cluster or when you need to increase available queries-per-second capacity.

Fault tolerance

It’s great that we can increase our overall query capacity by adding another server and replicating the index to that server, but what happens when one of our servers eventually crashes? When our application had only one server, the application clearly would have stopped. Now that multiple, redundant servers exist, one server dying will simply reduce our capacity back to the capacity of however many servers remain. If you want to build fault tolerance into your system, it’s a good idea to have additional resources (extra slave servers) in your cluster so that your system can continue functioning with enough capacity even if a single server fails.
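The capacity arithmetic behind that advice can be sketched as N+1 provisioning (the function name and parameters are illustrative, not part of Solr):

```python
import math

def replicas_needed(peak_qps, per_server_qps, spare=1):
    """Servers required so the cluster still covers peak query load even
    after `spare` of them fail (N+1 provisioning when spare is 1)."""
    return math.ceil(peak_qps / per_server_qps) + spare

# 2,500 queries/sec at peak, 1,000 queries/sec per server:
# 3 servers cover the load, a 4th lets one die without losing capacity.
print(replicas_needed(peak_qps=2500, per_server_qps=1000))  # 4
```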

pages: 570 words: 115,722

The Tangled Web: A Guide to Securing Modern Web Applications by Michal Zalewski

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

barriers to entry, business process, defense in depth, easy for humans, difficult for computers, fault tolerance, finite state, Firefox, Google Chrome, information retrieval, RFC: Request For Comment, semantic web, Steve Jobs, telemarketer, Turing test, Vannevar Bush, web application, WebRTC, WebSocket

Vendors released their products with embedded programming languages such as JavaScript and Visual Basic, plug-ins to execute platform-independent Java or Flash applets on the user’s machine, and useful but tricky HTTP extensions such as cookies. Only a limited degree of superficial compatibility, sometimes hindered by patents and trademarks,[7] would be maintained. As the Web grew larger and more diverse, a sneaky disease spread across browser engines under the guise of fault tolerance. At first, the reasoning seemed to make perfect sense: If browser A could display a poorly designed, broken page but browser B refused to (for any reason), users would inevitably see browser B’s failure as a bug in that product and flock in droves to the seemingly more capable client, browser A. To make sure that their browsers could display almost any web page correctly, engineers developed increasingly complicated and undocumented heuristics designed to second-guess the intent of sloppy webmasters, often sacrificing security and occasionally even compatibility in the process.

In several scenarios outlined in that RFC, the desire to explicitly mandate the handling of certain corner cases led to patently absurd outcomes. One such example is the advice on parsing dates in certain HTTP headers, at the request of section 3.3 in RFC 1945. The resulting implementation (the prtime.c file in the Firefox codebase[118]) consists of close to 2,000 lines of extremely confusing and unreadable C code just to decipher the specified date, time, and time zone in a sufficiently fault-tolerant way (for uses such as deciding cache content expiration).

Semicolon-Delimited Header Values

Several HTTP headers, such as Cache-Control or Content-Disposition, use a semicolon-delimited syntax to cram several separate name=value pairs into a single line. The reason for allowing this nested notation is unclear, but it is probably driven by the belief that it will be a more efficient or a more intuitive approach than using several separate headers that would always have to go hand in hand.
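A naive Python sketch of the semicolon-delimited syntax shows how little it takes to parse the happy path, and hints at where lenient parsers start to disagree (`parse_semicolon_header` is a hypothetical helper, not a library API, and it deliberately ignores quoted strings that contain semicolons):

```python
def parse_semicolon_header(value):
    """Split a 'Content-Disposition'-style value into (main_value, params).

    A naive sketch: real headers also allow quoted strings containing
    semicolons, which is exactly where fault-tolerant implementations
    begin to diverge from one another.
    """
    parts = [p.strip() for p in value.split(";")]
    main, params = parts[0], {}
    for p in parts[1:]:
        name, _, val = p.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return main, params

print(parse_semicolon_header('attachment; filename="report.pdf"'))
# ('attachment', {'filename': 'report.pdf'})
```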

In general, be mindful of control and high-bit characters, commas, quotes, backslashes, and semicolons; other characters or strings may be of concern on a case-by-case basis. Escape or substitute these values as appropriate. When building a new HTTP client, server, or proxy: Do not create a new implementation unless you absolutely have to. If you can’t help it, read this chapter thoroughly and aim to mimic an existing mainstream implementation closely. If possible, ignore the RFC-provided advice about fault tolerance and bail out if you encounter any syntax ambiguities.

[24] Public key cryptography relies on asymmetrical encryption algorithms to create a pair of keys: a private one, kept secret by the owner and required to decrypt messages, and a public one, broadcast to the world and useful only to encrypt traffic to that recipient, not to decrypt it.

Chapter 4. Hypertext Markup Language

The Hypertext Markup Language (HTML) is the primary method of authoring online documents.

Scala in Action by Nilanjan Raychaudhuri

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

continuous integration, create, read, update, delete, database schema, domain-specific language, don't repeat yourself, en.wikipedia.org, failed state, fault tolerance, general-purpose programming language, index card, MVC pattern, type inference, web application

When Alan Kay[7] first thought about OOP, his big idea was “message passing.”[8] In fact, working with actors is more object-oriented than you think.

[7] Alan Curtis Kay, http://en.wikipedia.org/wiki/Alan_Kay.

[8] Alan Kay, “Prototypes vs. classes was: Re: Sun’s HotSpot,” Oct 10, 1998, http://mng.bz/L12u.

What happens if something fails? So many things can go wrong in the concurrent/parallel programming world. What if we get an IOException while reading the file? Let’s learn how to handle faults in an actor-based application.

9.3.4. Fault tolerance made easy with a supervisor

Akka encourages nondefensive programming in which failure is a valid state in the lifecycle of an application. As a programmer you know you can’t prevent every error, so it’s better to prepare your application for the errors. You can easily do this through fault-tolerance support provided by Akka through the supervisor hierarchy. Think of this supervisor as an actor that links to supervised actors and restarts them when one dies. The responsibility of a supervisor is to start, stop, and monitor child actors.

Figure 9.6 shows an example of supervisor hierarchy. Figure 9.6. Supervisor hierarchy in Akka You aren’t limited to one supervisor. You can have one supervisor linked to another supervisor. That way you can supervise a supervisor in case of a crash. It’s hard to build a fault-tolerant system with one box, so I recommend having your supervisor hierarchy spread across multiple machines. That way, if a node (machine) is down, you can restart an actor in a different box. Always remember to delegate the work so that if a crash occurs, another supervisor can recover. Now let’s look into the fault-tolerant strategies available in Akka. Supervision Strategies in Akka Akka comes with two restarting strategies: One-for-One and All-for-One. In the One-for-One strategy (see figure 9.7), if one actor dies, it’s recreated. This is a great strategy if actors are independent in the system.
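The two restarting strategies can be modelled in a few lines (an illustrative model of the supervisor's decision, not the Akka API):

```python
def restart_plan(strategy, actors, failed):
    """Which actors a supervisor recreates when `failed` dies.

    One-for-One recreates only the failed actor (good for independent
    actors); All-for-One restarts every child, for the case where siblings
    depend on one another's state.
    """
    if strategy == "one_for_one":
        return [failed]
    if strategy == "all_for_one":
        return list(actors)
    raise ValueError(f"unknown strategy: {strategy}")

actors = ["reader", "parser", "writer"]
print(restart_plan("one_for_one", actors, "parser"))  # ['parser']
print(restart_plan("all_for_one", actors, "parser"))  # ['reader', 'parser', 'writer']
```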

First I’ll talk about the philosophy behind Akka so you understand the goal behind the Akka project and the problems it tries to solve. 12.1. The philosophy behind Akka The philosophy behind Akka is simple: make it easier for developers to build correct, concurrent, scalable, and fault-tolerant applications. To that end, Akka provides a higher level of abstractions to deal with concurrency, scalability, and faults. Figure 12.1 shows the three core modules provided by Akka for concurrency, scalability, and fault tolerance. Figure 12.1. Akka core modules The concurrency module provides options to solve concurrency-related problems. By now I’m sure you’re comfortable with actors (message-oriented concurrency). But actors aren’t a be-all-end-all solution for concurrency. You need to understand alternative concurrency models available in Akka, and in the next section you’ll explore all of them.

pages: 250 words: 73,574

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers by John MacCormick, Chris Bishop

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

Ada Lovelace, AltaVista, Claude Shannon: information theory, fault tolerance, information retrieval, Menlo Park, PageRank, pattern recognition, Richard Feynman, Silicon Valley, Simon Singh, sorting algorithm, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, traveling salesman, Turing machine, Turing test, Vannevar Bush

At the time of writing, however, many of the systems that claim to be peer-to-peer in fact use central servers for some of their functionality and thus do not need to rely on distributed hash tables. The technique of “Byzantine fault tolerance” falls in the same category: a surprising and beautiful algorithm that can’t yet be classed as great, due to lack of adoption. Byzantine fault tolerance allows certain computer systems to tolerate any type of error whatsoever (as long as there are not too many simultaneous errors). This contrasts with the more usual notion of fault tolerance, in which a system can survive more benign errors, such as the permanent failure of a disk drive or an operating system crash.

CAN GREAT ALGORITHMS FADE AWAY?

In addition to speculating about what algorithms might rise to greatness in the future, we might wonder whether any of our current “great” algorithms—indispensable tools that we use constantly without even thinking about it—might fade in importance.

This immense level of concurrency, together with rapid query responses via the virtual table trick, make large databases efficient. The to-do list trick also guarantees consistency in the face of failures. When combined with the prepare-then-commit trick for replicated databases, we are left with iron-clad consistency and durability for our data. The heroic triumph of databases over unreliable components, known by computer scientists as “fault-tolerance,” is the work of many researchers over many decades. But among the most important contributors was Jim Gray, a superb computer scientist who literally wrote the book on transaction processing. (The book is Transaction Processing: Concepts and Techniques, first published in 1992.) Sadly, Gray's career ended early: one day in 2007, he sailed his yacht out of San Francisco Bay, under the Golden Gate Bridge, and into the open ocean on a planned day trip to some nearby islands.


pages: 305 words: 89,103

Scarcity: The True Cost of Not Having Enough by Sendhil Mullainathan


American Society of Civil Engineers: Report Card, Andrei Shleifer, Cass Sunstein, clean water, computer vision, delayed gratification, double entry bookkeeping, Exxon Valdez, fault tolerance, happiness index / gross national happiness, impulse control, indoor plumbing, inventory management, knowledge worker, late fees, linear programming, mental accounting, microcredit, p-value, payday loans, purchasing power parity, randomized controlled trial, Report Card for America’s Infrastructure, Richard Thaler, Saturday Night Live, Walter Mischel, Yogi Berra

And much of it does not sit so well with being a student. Skipping class in a training program while you’re dealing with scarcity is not the same as playing hooky in middle school. Linear classes that must not be missed can work well for the full-time student; they do not make sense for the juggling poor. It is important to emphasize that fault tolerance is not a substitute for personal responsibility. On the contrary: fault tolerance is a way to ensure that when the poor do take it on themselves, they can improve—as so many do. Fault tolerance allows the opportunities people receive to match the effort they put in and the circumstances they face. It does not take away the need for hard work; rather, it allows hard work to yield better returns for those who are up for the challenge, just as improved levers in the cockpit allow the dedicated pilot to excel.

It has also occasionally led to programs with strong incentives, such as conditional cash transfer programs, where the amount of aid one receives depends on performing assorted “good” behaviors. But why not look at the design of the cockpit rather than the workings of the pilot? Why not look at the structure of the programs rather than the failings of the clients? If we accept that pilots can fail and that cockpits need to be wisely structured so as to inhibit those failures, why can we not do the same with the poor? Why not design programs structured to be more fault tolerant? We could ask the same question of anti-poverty programs. Consider the training programs, where absenteeism is common and dropout rates are high. What happens when, loaded and depleted, a client misses a class? What happens when her mind wanders in class? The next class becomes a lot harder. Miss one or two more classes and dropping out becomes the natural outcome, perhaps even the best option, as she really no longer understands much of what is being discussed in the class.

You’re exhausted and weighed down by things more proximal, and you know that even if you go you won’t absorb a thing. Now roll forward a few more weeks. By now you’ve missed another class. And when you go, you understand less than before. Eventually you decide it’s just too much right now; you’ll drop out and sign up another time, when your financial life is more together. The program you tried was not designed to be fault tolerant. It magnified your mistakes, which were predictable, and essentially pushed you out the door. But it need not be that way. Instead of insisting on no mistakes or for behavior to change, we can redesign the cockpit. Curricula can be altered, for example, so that there are modules, staggered to start at different times and to proceed in parallel. You missed a class and fell behind? Move to a parallel session running a week or two “behind” this one.

pages: 589 words: 147,053

The Age of Em: Work, Love and Life When Robots Rule the Earth by Robin Hanson


8-hour work day, artificial general intelligence, augmented reality, Berlin Wall, bitcoin, blockchain, brain emulation, business process, Clayton Christensen, cloud computing, correlation does not imply causation, creative destruction, demographic transition, Erik Brynjolfsson, ethereum blockchain, experimental subject, fault tolerance, financial intermediation, Flynn Effect, hindsight bias, information asymmetry, job automation, job satisfaction, John Markoff, Just-in-time delivery, lone genius, Machinery of Freedom by David Friedman, market design, meta-analysis, Nash equilibrium, new economy, prediction markets, rent control, rent-seeking, reversible computing, risk tolerance, Silicon Valley, smart contracts, statistical model, stem cell, Thomas Malthus, trade route, Turing test, Vernor Vinge

If emulation hardware is digital, then it could either be deterministic, so that the value and timing of output states are always exactly predictable, or it could be fault-prone and fault-tolerant in the sense of having and tolerating more frequent and larger logic errors and timing fluctuations. Most digital hardware today is deterministic, but large parallel systems are more often fault-tolerant. The design of fault-tolerant hardware and software is an active area of research today (Bogdan et al. 2007). As human brains are large, parallel, and have an intrinsically fault-tolerant design, brain emulation software is likely to need less special adaptation to run on fault-prone hardware. Such hardware is usually cheaper to design and construct, occupies less volume, and takes less energy to run. Thus em hardware is likely to often be fault-prone and fault-tolerant. Cosmic rays are high-energy particles that come from space and disrupt the operation of electronic devices.

BLS 2012. “Employee Tenure in 2012.” United States Bureau of Labor Statistics USDL-12–1887, September 18. http://www.bls.gov/news.release/archives/tenure_09182012.pdf. Boehm, Christopher. 1999. Hierarchy in the Forest: The Evolution of Egalitarian Behavior. Harvard University Press, December 1. Bogdan, Paul, Tudor Dumitras, and Radu Marculescu. 2007. “Stochastic Communication: A New Paradigm for Fault-Tolerant Networks-on-Chip.” VLSI Design 2007: 95348. Boning, Brent, Casey Ichniowski, and Kathryn Shaw. 2007. “Opportunity Counts: Teams and the Effectiveness of Production Incentives.” Journal of Labor Economics 25(4): 613–650. Bonke, Jens. 2012. “Do Morning-Type People Earn More than Evening-Type People? How Chronotypes Influence Income.” Annals of Economics and Statistics 105/106: 55–72. Boserup, Ester. 1981.


pages: 1,758 words: 342,766

Code Complete (Developer Best Practices) by Steve McConnell


Ada Lovelace, Albert Einstein, Buckminster Fuller, call centre, choice architecture, continuous integration, data acquisition, database schema, don't repeat yourself, Donald Knuth, fault tolerance, Grace Hopper, haute cuisine, if you see hoof prints, think horses—not zebras, index card, inventory management, iterative process, Larry Wall, late fees, loose coupling, Menlo Park, Perl 6, place-making, premature optimization, revision control, Sapir-Whorf hypothesis, slashdot, sorting algorithm, statistical model, Tacoma Narrows Bridge, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Turing machine, web application

The fact that an environment has a particular error-handling approach doesn't mean that it's the best approach for your requirements.

Fault Tolerance

The architecture should also indicate the kind of fault tolerance expected. Fault tolerance is a collection of techniques that increase a system's reliability by detecting errors, recovering from them if possible, and containing their bad effects if not.

Further Reading: For a good introduction to fault tolerance, see the July 2001 issue of IEEE Software. In addition to providing a good introduction, the articles cite many key books and key articles on the topic.

For example, a system could make the computation of the square root of a number fault tolerant in any of several ways: The system might back up and try again when it detects a fault. If the first answer is wrong, it would back up to a point at which it knew everything was all right and continue from there.

It might have three square-root classes that each use a different method. Each class computes the square root, and then the system compares the results. Depending on the kind of fault tolerance built into the system, it then uses the mean, the median, or the mode of the three results. The system might replace the erroneous value with a phony value that it knows to have a benign effect on the rest of the system. Other fault-tolerance approaches include having the system change to a state of partial operation or a state of degraded functionality when it detects an error. It can shut itself down or automatically restart itself. These examples are necessarily simplistic. Fault tolerance is a fascinating and complex subject—unfortunately, it's one that's outside the scope of this book. Architectural Feasibility The designers might have concerns about a system's ability to meet its performance targets, work within resource limitations, or be adequately supported by the implementation environments.
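The voting approach described above can be sketched in a few lines. This is an illustrative sketch, not McConnell's code: the three computation methods (Newton's method, exp/log identity, and the standard library) and the median vote are assumptions chosen so that one faulty result is outvoted by the other two.

```python
import math
import statistics

def sqrt_newton(x, iterations=30):
    """Version 1: Newton's method, repeatedly averaging guess with x/guess."""
    guess = x if x > 1 else 1.0
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess

def sqrt_exp(x):
    """Version 2: the identity sqrt(x) = exp(0.5 * ln(x))."""
    return math.exp(0.5 * math.log(x))

def sqrt_library(x):
    """Version 3: delegate to the standard library."""
    return math.sqrt(x)

def fault_tolerant_sqrt(x):
    """Run three independent implementations and vote with the median,
    so a single erroneous result cannot determine the answer."""
    results = [sqrt_newton(x), sqrt_exp(x), sqrt_library(x)]
    return statistics.median(results)
```

Using the median rather than the mean has the property McConnell alludes to: one wildly wrong result shifts the mean but leaves the median untouched.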

- Are the architecture's security requirements described?
- Does the architecture set space and speed budgets for each class, subsystem, or functionality area?
- Does the architecture describe how scalability will be achieved?
- Does the architecture address interoperability?
- Is a strategy for internationalization/localization described?
- Is a coherent error-handling strategy provided?
- Is the approach to fault tolerance defined (if any is needed)?
- Has the technical feasibility of all parts of the system been established?
- Is an approach to overengineering specified?
- Are necessary buy-vs.-build decisions included?
- Does the architecture describe how reused code will be made to conform to other architectural objectives?
- Is the architecture designed to accommodate likely changes?

General Architectural Quality

- Does the architecture account for all the requirements?

pages: 480 words: 99,288

Mastering ElasticSearch by Rafal Kuc, Marek Rogozinski


Amazon Web Services, create, read, update, delete, en.wikipedia.org, fault tolerance, finite state, full text search, information retrieval

Besides the fact that ElasticSearch can automatically discover a field's type by looking at its value, sometimes (in fact, almost always) we will want to configure the mappings ourselves to avoid unpleasant surprises.

Type: Each document in ElasticSearch has its type defined. This allows us to store various document types in one index and have different mappings for different document types.

Node: A single instance of the ElasticSearch server is called a node. A single-node ElasticSearch deployment can be sufficient for many simple use cases, but when you have to think about fault tolerance, or you have more data than can fit on a single server, you should think about a multi-node ElasticSearch cluster.

Cluster: A cluster is a set of ElasticSearch nodes that work together to handle a load bigger than a single instance can handle (both for queries and for indexing documents). A cluster also allows the application to keep working even when several machines (nodes) are unavailable due to an outage or administrative tasks, such as an upgrade.

Finally, we've learned about segment merging, merge policies, and scheduling. In the next chapter, we'll look closely at what ElasticSearch offers us when it comes to shard control. We'll see how to choose the right number of shards and replicas for our index, we'll manipulate shard placement, and we'll see when to create more shards than we actually need. We'll discuss how the shard allocator works. Finally, we'll use all the knowledge we've gathered so far to create fault-tolerant and scalable clusters. Chapter 4. Index Distribution Architecture In the previous chapter, we've learned how to use different scoring formulas and how we can benefit from using them. We've also seen how to use different posting formats to change how the data is indexed. In addition to that, we now know how to handle near real-time searching and real-time get and what searcher reopening means for ElasticSearch.

If we sent many queries, we would end up running the same (or almost the same) number of queries against each of the shards and replicas.

Using our knowledge

As we slowly approach the end of the fourth chapter, we need something closer to what you may encounter during your everyday work. Because of that, we have decided to divide the real-life example into two sections. In this section, you'll see how to combine the knowledge we've gathered so far to build a fault-tolerant and scalable cluster based on some assumptions. Because this chapter is mostly about configuration, we will concentrate on that. The mappings and your data may be different, but with a similar amount of data and a similar query load hitting your cluster, the following sections may be useful for you.

Assumptions

Before we go into the juicy configuration details, let's make some basic assumptions with which we will configure our ElasticSearch cluster.
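As a back-of-the-envelope illustration (not from the book): each shard exists as one primary plus `number_of_replicas` copies, and if every copy lives on a distinct node, the index stays fully available as long as at least one copy of each shard survives. The helper names below are hypothetical, chosen for the sketch:

```python
def tolerated_node_failures(number_of_replicas: int) -> int:
    """With 1 primary + number_of_replicas copies per shard, each on a
    distinct node, losing up to number_of_replicas nodes still leaves
    at least one live copy of every shard."""
    return number_of_replicas

def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Total physical shards the cluster must host and rebalance."""
    return number_of_shards * (1 + number_of_replicas)
```

So an index with 5 shards and 1 replica occupies 10 physical shards and survives the loss of any single node; raising replicas raises fault tolerance at the cost of disk and indexing work.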

pages: 463 words: 118,936

Darwin Among the Machines by George Dyson


Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anti-communist, British Empire, carbon-based life, cellular automata, Claude Shannon: information theory, combinatorial explosion, computer age, Danny Hillis, Donald Davies, fault tolerance, Fellow of the Royal Society, finite state, IFF: identification friend or foe, invention of the telescope, invisible hand, Isaac Newton, Jacquard loom, James Watt: steam engine, John Nash: game theory, John von Neumann, Menlo Park, Nash equilibrium, Norbert Wiener, On the Economy of Machinery and Manufactures, packet switching, pattern recognition, phenotype, RAND corporation, Richard Feynman, spectrum auction, strong AI, the scientific method, The Wealth of Nations by Adam Smith, Turing machine, Von Neumann architecture, zero-sum game

How could a mechanism composed of some ten billion unreliable components function reliably while computers with ten thousand components regularly failed? Von Neumann believed that entirely different logical foundations would be required to arrive at an understanding of even the simplest nervous system, let alone the human brain. His Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components (1956) explored the possibilities of parallel architecture and fault-tolerant neural nets. This approach would soon be superseded by a development that neither nature nor von Neumann had counted on: the integrated circuit, composed of logically intricate yet structurally monolithic microscopic parts. Serial architecture swept the stage. Probabilistic logics, along with vacuum tubes and acoustic delay-line memory, would scarcely be heard from again. If the development of solid-state electronics had been delayed a decade or two we might have advanced sooner rather than later into neural networks, parallel architectures, asynchronous processing, and other mechanisms by which nature, with sloppy hardware, achieves reliable results.

At one level, this language may appear to us to be money, especially the new, polymorphous E-money that circulates without reserve at the speed of light. E-money is, after all, simply a consensual definition of “electrons with meaning,” allowing other levels of meaning to freely evolve. Composed of discrete yet divisible and liquid units, digital currency resembles the pulse-frequency coding that has proved to be such a rugged and fault-tolerant characteristic of the nervous systems evolved by biology. Frequency-modulated signals that travel through the nerves are associated with chemical messages that are broadcast by diffusion through the fluid that bathes the brain. Money has a twofold nature that encompasses both kinds of behavior: it can be transmitted, like an electrical signal, from one place (or time) to another; or it can be diffused in any number of more chemical, hormonelike ways.

From the point of view of an individual packet, not only is there a huge number of physically distinct paths from A to B through the mesh of lunch boxes, but there are 162 alternative channels leading to the nearest lunch box at any given time. The packet chooses a channel that happens to be quiet at that instant and jumps to the next lamppost at the speed of light. The multiplexing of communications across the available network topology is extended to the multiplexing of network topology across the available frequency spectrum. Communication becomes more efficient, fault tolerant, and secure. The way the system works now (in a growing number of metropolitan areas—hence the name) is that you purchase or rent a small Ricochet modem, about the size of a large candy bar and transmitting at about two-thirds of a watt. Your modem establishes contact with the nearest pole-top lunch box or directly with any other modem of its species within range. Your computer sees the system as a standard modem connection or an Internet node, and the network, otherwise transparent to the users, keeps track of where all the users and all the lunch boxes are.

pages: 719 words: 181,090

Site Reliability Engineering by Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy

Air France Flight 447, anti-pattern, barriers to entry, business intelligence, business process, Checklist Manifesto, cloud computing, combinatorial explosion, continuous integration, correlation does not imply causation, crowdsourcing, database schema, defense in depth, DevOps, en.wikipedia.org, fault tolerance, Flash crash, George Santayana, Google Chrome, Google Earth, job automation, job satisfaction, linear programming, load shedding, loose coupling, meta analysis, meta-analysis, minimum viable product, MVC pattern, performance metric, platform as a service, revision control, risk tolerance, side project, six sigma, the scientific method, Toyota Production System, trickle-down economics, web application, zero day

The product developers have more visibility into the time and effort involved in writing and releasing their code, while the SREs have more visibility into the service’s reliability (and the state of production in general). These tensions often reflect themselves in different opinions about the level of effort that should be put into engineering practices. The following list presents some typical tensions:

• Software fault tolerance—How hardened do we make the software to unexpected events? Too little, and we have a brittle, unusable product. Too much, and we have a product no one wants to use (but that runs very stably).

• Testing—Again, not enough testing and you have embarrassing outages, privacy data leaks, or a number of other press-worthy events. Too much testing, and you might lose your market.

• Push frequency—Every push is risky.

This amortizes the fixed costs of the disk logging and network latency over the larger number of operations, increasing throughput. Deploying Distributed Consensus-Based Systems: The most critical decisions system designers must make when deploying a consensus-based system concern the number of replicas to be deployed and the location of those replicas. Number of Replicas: In general, consensus-based systems operate using majority quorums, i.e., a group of 2f + 1 replicas may tolerate f failures (if Byzantine fault tolerance, in which the system is resistant to replicas returning incorrect results, is required, then 3f + 1 replicas may tolerate f failures [Cas99]). For non-Byzantine failures, the minimum number of replicas that can be deployed is three—if two are deployed, then there is no tolerance for failure of any process. Three replicas may tolerate one failure. Most system downtime is a result of planned maintenance [Ken12]: three replicas allow a system to operate normally when one replica is down for maintenance (assuming that the remaining two replicas can handle system load at an acceptable performance).
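The quorum arithmetic above can be sketched in a few lines. This is a minimal illustration, not code from the book; the function name is my own:

```python
def min_replicas(failures: int, byzantine: bool = False) -> int:
    """Minimum replicas for a majority-quorum system to tolerate the
    given number of faulty replicas: 2f + 1 for crash (non-Byzantine)
    faults, 3f + 1 when replicas may return incorrect results."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return (3 if byzantine else 2) * failures + 1

# Three replicas tolerate one crash failure; five tolerate two.
assert min_replicas(1) == 3
assert min_replicas(2) == 5
# Tolerating one Byzantine replica requires four in total.
assert min_replicas(1, byzantine=True) == 4
```

Note how the minimum deployment of three falls directly out of f = 1 in the non-Byzantine case.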

Robbins, Web Operations: Keeping the Data on Time: O’Reilly, 2010. [All12] J. Allspaw, “Blameless PostMortems and a Just Culture”, blog post, 2012. [All15] J. Allspaw, “Trade-Offs Under Pressure: Heuristics and Observations of Teams Resolving Internet Service Outages”, MSc thesis, Lund University, 2015. [Ana07] S. Anantharaju, “Automating web application security testing”, blog post, July 2007. [Ana13] R. Ananatharayan et al., “Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams”, in SIGMOD ’13, 2013. [And05] A. Andrieux, K. Czajkowski, A. Dan, et al., “Web Services Agreement Specification (WS-Agreement)”, September 2005. [Bai13] P. Bailis and A. Ghodsi, “Eventual Consistency Today: Limitations, Extensions, and Beyond”, in ACM Queue, vol. 11, no. 3, 2013. [Bai83] L. Bainbridge, “Ironies of Automation”, in Automatica, vol. 19, no. 6, November 1983.

pages: 540 words: 103,101

Building Microservices by Sam Newman

airport security, Amazon Web Services, anti-pattern, business process, call centre, continuous integration, create, read, update, delete, defense in depth, don't repeat yourself, Edward Snowden, fault tolerance, index card, information retrieval, Infrastructure as a Service, inventory management, job automation, load shedding, loose coupling, platform as a service, premature optimization, pull request, recommendation engine, social graph, software as a service, source of truth, the built environment, web application, WebSocket, x509 certificate

If the in-house service template supports only Java, then people may be discouraged from picking alternative stacks if they have to do lots more work themselves. Netflix, for example, is especially concerned with aspects like fault tolerance, to ensure that the outage of one part of its system cannot take everything down. To handle this, a large amount of work has been done to ensure that there are client libraries on the JVM to provide teams with the tools they need to keep their services well behaved. Introducing a new technology stack would mean having to reproduce all this effort. The main concern for Netflix is less about the duplicated effort, and more about the fact that it is so easy to get this wrong. The risk of a service getting newly implemented fault tolerance wrong is high if it could impact more of the system. Netflix mitigates this by using sidecar services, which communicate locally with a JVM that is using the appropriate libraries.

This means if part of your system uses DNS already and can support SRV records, you can just drop in Consul and start using it without any changes to your existing system. Consul also builds in other capabilities that you might find useful, such as the ability to perform health checks on nodes. This means that Consul could well overlap the capabilities provided by other dedicated monitoring tools, although you would more likely use Consul as a source of this information and then pull it into a more comprehensive dashboard or alerting system. Consul’s highly fault-tolerant design and focus on handling systems that make heavy use of ephemeral nodes does make me wonder, though, if it may end up replacing systems like Nagios and Sensu for some use cases. Consul uses a RESTful HTTP interface for everything from registering a service and querying the key/value store to inserting health checks. This makes integration with different technology stacks very straightforward.

Industry 4.0: The Industrial Internet of Things by Alasdair Gilchrist

3D printing, additive manufacturing, Amazon Web Services, augmented reality, autonomous vehicles, barriers to entry, business intelligence, business process, chief data officer, cloud computing, connected car, cyber-physical system, deindustrialization, fault tolerance, global value chain, Google Glasses, hiring and firing, industrial robot, inflight wifi, Infrastructure as a Service, Internet of things, inventory management, job automation, low skilled workers, millennium bug, pattern recognition, peer-to-peer, platform as a service, pre–internet, race to the bottom, RFID, Skype, smart cities, smart grid, smart meter, smart transportation, software as a service, stealth mode startup, supply-chain management, trade route, web application, WebRTC, WebSocket, Y2K

Therefore, we see the following delivery mechanisms:

• At most once delivery—This is commonly called fire and forget and rides on unreliable protocols such as UDP.

• At least once delivery—This is reliable delivery, such as TCP/IP, where every message is delivered to the recipient.

• Exactly once delivery—This technique is used in batch jobs as a means of delivery that ensures late packets, delayed through excessive latency or jitter, do not corrupt the results.

Additionally, there are many other factors that need to be taken into consideration, such as lifespan, which allows IISs to discard old data packets, much like the time-to-live field on IP packets. There is also fault tolerance, which ensures fault survivability: alternative routes or hardware redundancy are available to guarantee availability and reliability. Similarly, there is the case of security, which we will discuss in detail in a later chapter. Industry 4.0 Key Functions of the Communication Layer: The communication layer functions can deliver the data to the correct address and application.
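The difference between the first two delivery guarantees can be sketched with a toy retry loop. This is an illustrative sketch only; the function and channel here are hypothetical, not part of any protocol mentioned in the text:

```python
import random

def deliver_at_least_once(message, send, max_attempts=10):
    """At-least-once delivery: resend until the recipient acknowledges.
    The recipient may therefore see duplicates, which is why exactly-once
    schemes must add deduplication on top of this."""
    for attempt in range(1, max_attempts + 1):
        if send(message):          # send() returns True once an ack arrives
            return attempt
    raise TimeoutError("no acknowledgment received")

# A lossy channel that drops roughly half its packets (seeded for repeatability).
rng = random.Random(42)
received = []

def lossy_send(msg):
    if rng.random() < 0.5:
        return False               # packet lost: no ack comes back
    received.append(msg)           # retries may append duplicates here
    return True

attempts = deliver_at_least_once("sensor-reading", lossy_send)
assert attempts >= 1 and "sensor-reading" in received
```

At-most-once delivery would simply call `send()` a single time and never retry, trading reliability for simplicity.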

There is also considerable interest in the production of IoT devices capable of harvesting energy from solar, wind, or electromagnetic fields as a power source, as that could be a major technological advance for deploying remote M2M-style mesh networking in rural areas, such as a smart agriculture scenario. Energy-harvesting IoT devices would provide the means, through mesh M2M networks, for highly fault-tolerant, unattended long-term solutions that require only minimal human intervention. However, research is not focused solely on the technology itself; researchers are also keenly studying methods that would make application protocols and data formats far more efficient. For instance, devices running on minimal power levels, or harvesting energy at subsistence levels, must communicate their data in a highly efficient and timely manner, and this has serious implications for protocol design.

One drawback to xDSL is that the advertised bandwidth is shared among subscribers and service providers oversell link capacity, due to the nature of spiky TCP/IP traffic and Internet browsing habits. Therefore, contention ratios—the number of other customers you are sharing the bandwidth with—can be as high as 50:1 for residential use and 10:1 for business use. • SDH/SONET—This optic ring technology is typically deployed as the service provider’s transport core, as it provides high-speed, high-capacity, and highly reliable, fault-tolerant transport for data over sometimes vast geographical regions. However, for customers that require high-speed data links over a large geographical region, typically enterprises or large companies, fiber optic rings are high performance, highly reliable, and high cost. SONET and SDH are transport protocols that encapsulate payload data within fixed synchronous frames.

pages: 201 words: 63,192

Graph Databases by Ian Robinson, Jim Webber, Emil Eifrem

Amazon Web Services, anti-pattern, bioinformatics, commoditize, corporate governance, create, read, update, delete, data acquisition, en.wikipedia.org, fault tolerance, linked data, loose coupling, Network effects, recommendation engine, semantic web, sentiment analysis, social graph, software as a service, SPARQL, web application

Since the queries run slowly, the database can process fewer of them per second, which means the availability of the database to do useful work diminishes from the client’s point of view. Whatever the database, understanding the underlying storage and caching infrastructure will help you construct idiomatic—and hence, mechanically sympathetic—queries that maximise performance. Our final observation on availability is that scaling for cluster-wide replication has a positive impact, not just in terms of fault-tolerance, but also responsiveness. Since there are many machines available for a given workload, query latency is low and availability is maintained. But as we’ll now discuss, scale itself is more nuanced than simply the number of servers we deploy. Scale: The topic of scale has become more important as data volumes have grown. In fact, the problems of data at scale, which have proven difficult to solve with relational databases, have been a substantial motivation for the NOSQL movement.

Though optimistic concurrency control mechanisms are useful, we also rather like transactions, and there are numerous examples of high-throughput transaction processing systems in the literature. Key-Value Stores: Key-value stores are cousins of the document store family, but their lineage comes from Amazon’s Dynamo database. They act like large, distributed hashmap data structures that store and retrieve opaque values by key. As shown in Figure A-3 the key space of the hashmap is spread across numerous buckets on the network. For fault-tolerance reasons each bucket is replicated onto several machines. The formula for the number of replicas required is given by R = 2F + 1, where F is the number of failures we can tolerate. The replication algorithm seeks to ensure that machines aren’t exact copies of each other. This allows the system to load-balance while a machine and its buckets recover; it also helps avoid hotspots, which can cause inadvertent self denial-of-service.
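The bucket-replication scheme described above can be sketched as follows. This is a simplified illustration under my own assumptions (round-robin placement rather than Dynamo's actual preference lists), not the real algorithm:

```python
def place_replicas(num_buckets, machines, f=1):
    """Sketch: spread each bucket's R = 2F + 1 replicas across distinct
    machines, rotating the starting machine so machines are not exact
    copies of each other (a stand-in for Dynamo-style placement)."""
    r = 2 * f + 1
    if len(machines) < r:
        raise ValueError("need at least R machines to place R replicas")
    placement = {}
    for b in range(num_buckets):
        # Each replica of bucket b lands on a different machine.
        placement[b] = [machines[(b + i) % len(machines)] for i in range(r)]
    return placement

p = place_replicas(num_buckets=6, machines=["m0", "m1", "m2", "m3"], f=1)
# Tolerating F = 1 failure means 3 replicas per bucket, all on distinct machines.
assert all(len(set(ms)) == 3 for ms in p.values())
```

Because replica sets are staggered across machines, losing one machine leaves at least two live replicas of every bucket, and recovery load is spread rather than concentrated on a single peer.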

pages: 194 words: 49,310

Clock of the Long Now by Stewart Brand

Albert Einstein, Brewster Kahle, Buckminster Fuller, Colonization of Mars, complexity theory, Danny Hillis, Eratosthenes, Extropian, fault tolerance, George Santayana, Internet Archive, Jaron Lanier, Kevin Kelly, knowledge economy, life extension, Metcalfe’s law, nuclear winter, pensions crisis, phenotype, Ray Kurzweil, Robert Metcalfe, Stephen Hawking, Stewart Brand, technological singularity, Ted Kaczynski, Thomas Malthus, Vernor Vinge, Whole Earth Catalog

Imagine a mountain range of opportunities, where the higher you get the greater the advantage. Hasty opportunists will never get past the foothills because they only pay attention to the slope of the ground under their feet, climb quickly to the immediate hilltop, and get stuck there. Patient opportunists take the longer view to the distant peaks, and toil through many ups and downs on the long trek to the heights. There are two ways to make systems fault-tolerant: One is to make them small, so that correction is local and quick; the other is to make them slow, so that correction has time to permeate the system. When you proceed too rapidly with something mistakes cascade, whereas when you proceed slowly the mistakes instruct. Gradual, incremental projects engage the full power of learning and discovery, and they are able to back out of problems. Gradually emergent processes get steadily better over time, while quickly imposed processes often get worse over time.

Diamond, Jared Digital information and core standards discontinuity of and immortality and megadata and migration preservation of Digital records, passive and active Discounting of value Drexler, Eric Drucker, Peter Dubos, René Dyson, Esther Dyson, Freeman Earth, view of from outer space Earth Day Easterbrook, Gregg Eaton Collection Eberling, Richard Ecological communities systems and change See also Environment Economic forecasting Ecotrust Egyptian civilization and time Ehrlich, Paul Electronic Frontier Foundation Eliade, Mircea Eno, Brian and ancient Egyptian woman and Clock of the Long Now ideas for participation in Clock/Library and tour of Big Ben Environment degradation of and peace, prosperity, and continuity reframing of problems of and technology See also Ecological Environmentalists and long-view Europe-America dialogue Event horizon Evolution of Cooperation, The “Experts Look Ahead, The” Extinction rate Extra-Terrestrial Intelligence programs and time-release services Extropians Family Tree Maker Fashion Fast and bad things Fault-tolerant systems Feedback and tuning of systems Feldman, Marcus Finite and Infinite Games Finite games Florescence Foresight Institute Freefall Free will Fuller, Buckminster Fundamental tracking Future configuration towards continuous of desire versus fate feeling of and nuclear armageddon one hundred years and present moment tree uses of and value Future of Industrial Man, The “Futurismists” Gabriel, Peter Galileo Galvin, Robert Gambling Games, finite and infinite Gender imbalance in Chinese babies Generations Gershenfeld, Neil Gibbon, Edward GI Bill Gibson, William Gilbert, Joseph Henry Global Business Network (GBN) Global collapse Global computer Global perspective Global warming Goebbels, Joseph Goethe, Johann Wolfgang von Goldberg, Avram “Goldberg rule, the” Goldsmith, Oliver Goodall, Jane Governance Governing the Commons Government and the long view Grand Canyon Great Year Greek tragedy Grove, Andy Hale-Bopp comet 
Hampden-Turner, Charles Hardware dependent digital experiences, preservation of Hawking, Stephen Hawthorne, Nathaniel Heinlein, Robert Herman, Arthur Hill climbing Hillis, Daniel definition of technology and design of Clock and digital discontinuity and digital preservation and extra-terrestrial intelligence programs ideas for participation in Clock/Library and Long Now Foundation and long-term responsibility and motivation to build linear Clock and the Singularity and sustained endeavors and types of time History and accessible data as a horror and warning how to apply intelligently Hitler, Adolf Holling, C.

pages: 58 words: 12,386

Big Data Glossary by Pete Warden

business intelligence, crowdsourcing, fault tolerance, information retrieval, linked data, natural language processing, recommendation engine, web application

More recent data processing systems, such as Hadoop and Cassandra, are designed to run on clusters of comparatively low-specification servers, and so the easiest way to handle more data is to add more of those machines to the cluster. This horizontal scaling approach tends to be cheaper as the number of operations and the size of the data increases, and the very largest data processing pipelines are all built on a horizontal model. There is a cost to this approach, though. Writing distributed data handling code is tricky and involves tradeoffs between speed, scalability, fault tolerance, and traditional database goals like atomicity and consistency. MapReduce MapReduce is an algorithm design pattern that originated in the functional programming world. It consists of three steps. First, you write a mapper function or script that goes through your input data and outputs a series of keys and values to use in calculating the results. The keys are used to cluster together bits of data that will be needed to calculate a single output result.
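The three steps described above (map, group by key, reduce) can be sketched in miniature with the canonical word-count example. This is a single-process illustration of the pattern, not a distributed implementation:

```python
from collections import defaultdict

def mapper(line):
    # Step 1: emit (key, value) pairs; here, one (word, 1) per word.
    for word in line.split():
        yield word.lower(), 1

def shuffle(pairs):
    # Step 2: group values by key so each reducer sees one key's values together.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Step 3: collapse each key's values into a single output result.
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog", "the fox"]
pairs = (pair for line in lines for pair in mapper(line))
counts = dict(reducer(k, vs) for k, vs in shuffle(pairs).items())
assert counts["the"] == 3 and counts["fox"] == 2
```

In a real framework such as Hadoop, the shuffle step is what gets distributed: pairs with the same key are routed over the network to the same reducer node.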

pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, combinatorial explosion, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, full text search, Grace Hopper, information retrieval, Internet Archive, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, web application

The storage subsystem deals with blocks, simplifying storage management (since blocks are a fixed size, it is easy to calculate how many can be stored on a given disk) and eliminating metadata concerns (blocks are just a chunk of data to be stored—file metadata such as permissions information does not need to be stored with the blocks, so another system can handle metadata separately). Furthermore, blocks fit well with replication for providing fault tolerance and availability. To insure against corrupted blocks and disk and machine failure, each block is replicated to a small number of physically separate machines (typically three). If a block becomes unavailable, a copy can be read from another location in a way that is transparent to the client. A block that is no longer available due to corruption or machine failure can be replicated from its alternative locations to other live machines to bring the replication factor back to the normal level.

The Command-Line Interface We’re going to have a look at HDFS by interacting with it from the command line. There are many other interfaces to HDFS, but the command line is one of the simplest and, to many developers, the most familiar. We are going to run HDFS on one machine, so first follow the instructions for setting up Hadoop in pseudo-distributed mode in Appendix A. Later you’ll see how to run on a cluster of machines to give us scalability and fault tolerance. There are two properties that we set in the pseudo-distributed configuration that deserve further explanation. The first is fs.default.name, set to hdfs://localhost/, which is used to set a default filesystem for Hadoop. Filesystems are specified by a URI, and here we have used an hdfs URI to configure Hadoop to use HDFS by default. The HDFS daemons will use this property to determine the host and port for the HDFS namenode.
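In a pseudo-distributed setup of this vintage, the property named above conventionally lives in conf/core-site.xml; assuming that layout, a minimal fragment setting the default filesystem might look like:

```xml
<?xml version="1.0"?>
<!-- conf/core-site.xml: point Hadoop at a local HDFS namenode by default -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>
```

With this in place, paths given to the hadoop command without a scheme are resolved against HDFS rather than the local filesystem.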

Reads are OK, but writes are getting slower and slower Drop secondary indexes and triggers (no indexes?). At this point, there are no clear solutions for how to solve your scaling problems. In any case, you’ll need to begin to scale horizontally. You can attempt to build some type of partitioning on your largest tables, or look into some of the commercial solutions that provide multiple master capabilities. Countless applications, businesses, and websites have successfully achieved scalable, fault-tolerant, and distributed data systems built on top of RDBMSs and are likely using many of the previous strategies. But what you end up with is something that is no longer a true RDBMS, sacrificing features and conveniences for compromises and complexities. Any form of slave replication or external caching introduces weak consistency into your now denormalized data. The inefficiency of joins and secondary indexes means almost all queries become primary key lookups.

pages: 757 words: 193,541

The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2 by Thomas A. Limoncelli, Strata R. Chalup, Christina J. Hogan

active measures, Amazon Web Services, anti-pattern, barriers to entry, business process, cloud computing, commoditize, continuous integration, correlation coefficient, database schema, Debian, defense in depth, delayed gratification, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, finite state, Firefox, Google Glasses, information asymmetry, Infrastructure as a Service, intermodal, Internet of things, job automation, job satisfaction, load shedding, loose coupling, Malcom McLean invented shipping containers, Marc Andreessen, place-making, platform as a service, premature optimization, recommendation engine, revision control, risk tolerance, side project, Silicon Valley, software as a service, sorting algorithm, statistical model, Steven Levy, supply-chain management, Toyota Production System, web application, Yogi Berra

Sometimes services were also scaled by deploying servers for the application into several geographic regions, or business units, each of which would then use its local server. For example, when Tom first worked at AT&T, there was a different payroll processing center for each division of the company. High Availability Applications requiring high availability required “fault-tolerant” computers. These computers had multiple CPUs, error-correcting RAM, and other technologies that were extremely expensive at the time. Fault-tolerant systems were niche products. Generally only the military and Wall Street needed such systems. As a result they were usually priced out of the reach of typical companies. Costs During this era the Internet was not business-critical, and outages for internal business-critical systems could be scheduled because the customer base was a limited, known set of people.

Hardware can also fail, with the scope of the failure ranging from the smallest component to the largest network. Failure domains can be any size: a device, a computer, a rack, a datacenter, or even an entire company. The amount of capacity in a system is N + M, where N is the amount of capacity used to provide a service and M is the amount of spare capacity available, which can be used in the event of a failure. A system that is N + 1 fault tolerant can survive one unit of failure and remain operational. The most common way to route around failure is through replication of services. A service may be replicated one or more times per failure domain to provide resilience greater than the domain. Failures can also come from external sources that overload a system, and from human mistakes. There are countermeasures to nearly every failure imaginable.
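The N + M capacity model above reduces to a simple check. A minimal sketch (function names are my own):

```python
def is_operational(n: int, m: int, failed: int) -> bool:
    """With N units of capacity serving load and M spares (N + M total),
    the system remains fully operational as long as the surviving units
    still cover N, i.e., no more than M units have failed."""
    surviving = (n + m) - failed
    return surviving >= n

# An N + 1 configuration survives exactly one unit of failure.
assert is_operational(n=4, m=1, failed=1)
assert not is_operational(n=4, m=1, failed=2)
# N + 2 survives a failure even while one unit is down for maintenance.
assert is_operational(n=4, m=2, failed=2)
```

The same check applies at any failure-domain granularity: the "units" can be devices, machines, racks, or datacenters.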

Originally based on applying Agile methodology to operations, the result is a streamlined set of principles and processes that can create reliable services. Appendix B will make the case that cloud or distributed computing was the inevitable result of the economics of hardware. DevOps is the inevitable result of needing to do efficient operations in such an environment. If hardware and software are sufficiently fault tolerant, the remaining problems are human. The seminal paper “Why Do Internet Services Fail, and What Can Be Done about It?” by Oppenheimer et al. (2003) raised awareness that if web services are to be a success in the future, operational aspects must improve: We find that (1) operator error is the largest single cause of failures in two of the three services, (2) operator errors often take a long time to repair, (3) configuration errors are the largest category of operator errors, (4) failures in custom-written front-end software are significant, and (5) more extensive online testing and more thoroughly exposing and detecting component failures would reduce failure rates in at least one service.

pages: 66 words: 9,247

MongoDB and Python by Niall O’Higgins

cloud computing, Debian, fault tolerance, semantic web, web application

MongoDB ObjectIds have the nice property of being almost-certainly-unique upon generation, hence no central coordination is required. This contrasts sharply with the common RDBMS idiom of using auto-increment primary keys. Guaranteeing that an auto-increment key is not already in use usually requires consulting some centralized system. When the intention is to provide a horizontally scalable, de-centralized and fault-tolerant database—as is the case with MongoDB—auto-increment keys represent an ugly bottleneck. By employing ObjectId as your _id, you leave the door open to horizontal scaling via MongoDB’s sharding capabilities. While you can in fact supply your own value for the _id property if you wish—so long as it is globally unique—this is best avoided unless there is a strong reason to do otherwise. Examples of cases where you may be forced to provide your own _id property value include migration from RDBMS systems which utilized the previously-mentioned auto-increment primary key idiom.
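The decentralized-uniqueness property comes from ObjectId's layout: 12 bytes combining a timestamp, a machine identifier, a process id, and a counter, so no coordinator is needed. Below is a stdlib-only sketch that mimics that layout; it is not the real bson.ObjectId implementation (which you would use in practice via pymongo), and the machine field here is simplified to random bytes:

```python
import itertools
import os
import struct
import time

_machine = os.urandom(3)                        # stand-in for a hash of the hostname
_pid = struct.pack(">H", os.getpid() % 0xFFFF)  # 2-byte process id
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))

def object_id() -> str:
    """12 bytes, hex-encoded: 4-byte timestamp, 3-byte machine id,
    2-byte pid, 3-byte counter. Collisions require the same machine,
    process, second, and counter value, hence no central coordination."""
    ts = struct.pack(">I", int(time.time()))
    count = struct.pack(">I", next(_counter) % 0xFFFFFF)[1:]  # low 3 bytes
    return (ts + _machine + _pid + count).hex()

a, b = object_id(), object_id()
assert len(a) == 24 and a != b
```

An auto-increment key, by contrast, must serialize every allocation through one counter, which is exactly the bottleneck the passage describes.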

pages: 319 words: 72,969

Nginx HTTP Server Second Edition by Clement Nedelcu

Debian, fault tolerance, Firefox, Google Chrome, Ruby on Rails, web application

Features As of the stable version 1.2.9, Nginx offers an impressive variety of features, which, contrary to what the title of this book indicates, are not all related to serving HTTP content. Here is a list of the main features of the web branch, quoted from the official website www.nginx.org:

• Handling of static files, index files, and autoindexing; open file descriptor cache.

• Accelerated reverse proxying with caching; simple load balancing and fault tolerance.

• Accelerated support with caching of remote FastCGI servers; simple load balancing and fault tolerance.

• Modular architecture. Filters include Gzipping, byte ranges, chunked responses, XSLT, SSI, and image resizing filter. Multiple SSI inclusions within a single page can be processed in parallel if they are handled by FastCGI or proxied servers.

• SSL and TLS SNI support (TLS with Server Name Indication (SNI), required for using TLS on a server doing virtual hosting).

pages: 923 words: 516,602

The C++ Programming Language by Bjarne Stroustrup

combinatorial explosion, conceptual framework, database schema, distributed generation, Donald Knuth, fault tolerance, general-purpose programming language, index card, iterative process, job-hopping, locality of reference, Menlo Park, Parkinson's law, premature optimization, sorting algorithm

Concrete and abstract classes (interfaces) are presented here (Chapter 10, Chapter 12), together with operator overloading (Chapter 11), polymorphism, and the use of class hierarchies (Chapter 12, Chapter 15). Chapter 13 presents templates, that is, C++’s facilities for defining families of types and functions. It demonstrates the basic techniques used to provide containers, such as lists, and to support generic programming. Chapter 14 presents exception handling, discusses techniques for error handling, and presents strategies for fault tolerance. I assume that you either aren’t well acquainted with object-oriented programming and generic programming or could benefit from an explanation of how the main abstraction techniques are supported by C++. Thus, I don’t just present the language features supporting the abstraction techniques; I also explain the techniques themselves. Part IV goes further in this direction. Part III presents the C++ standard library.

Many systems offer mechanisms, such as signals, to deal with asynchrony, but because these tend to be system-dependent, they are not described here. The exception-handling mechanism is a nonlocal control structure based on stack unwinding (§14.4) that can be seen as an alternative return mechanism. There are therefore legitimate uses of exceptions that have nothing to do with errors (§14.5). However, the primary aim of the exception-handling mechanism and the focus of this chapter is error handling and the support of fault tolerance. Standard C++ doesn’t have the notion of a thread or a process. Consequently, exceptional circumstances relating to concurrency are not discussed here. The concurrency facilities available on your system are described in its documentation. Here, I’ll just note that the C++ exception-handling mechanism…

For example:

void use_file(const char* fn)
{
    FILE* f = fopen(fn,"r");
    // use f
    fclose(f);
}

This looks plausible until you realize that if something goes wrong after the call of fopen() and before the call of fclose(), an exception may cause use_file() to be exited without fclose() being called. Exactly the same problem can occur in languages that do not support exception handling. For example, the standard C library function longjmp() can cause the same problem. Even an ordinary return-statement could exit use_file() without closing f. A first attempt to make use_file() fault-tolerant looks like this:

void use_file(const char* fn)
{
    FILE* f = fopen(fn,"r");
    try {
        // use f
    }
    catch (...) {
        fclose(f);
        throw;
    }
    fclose(f);
}

The code using the file is enclosed in a try block that catches every exception, closes the file, and re-throws the exception.
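The same "close on every exit path" discipline appears in other languages; as an aside, here is a Python analogue of the pattern (my own sketch, not from the book), where try/finally plays the role of the catch-all handler plus re-throw:

```python
import os
import tempfile

def use_file(fn):
    """Open, use, and close a file so the close happens whether we
    return normally or an exception unwinds through this frame."""
    f = open(fn)
    try:
        return f.read()       # "use f"
    finally:
        f.close()             # runs on normal return and on exceptions alike

# Round-trip through a temporary file.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)
assert use_file(path) == "hello"
os.remove(path)
```

Stroustrup's chapter goes on to argue for a better solution than the explicit handler: tying the resource to an object's lifetime ("resource acquisition is initialization"), which Python mirrors with the `with` statement.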

pages: 1,201 words: 233,519

Coders at Work by Peter Seibel

Ada Lovelace, bioinformatics, cloud computing, Conway's Game of Life, domain-specific language, don't repeat yourself, Donald Knuth, fault tolerance, Fermat's Last Theorem, Firefox, George Gilder, glass ceiling, Guido van Rossum, HyperCard, information retrieval, Larry Wall, loose coupling, Marc Andreessen, Menlo Park, Metcalfe's law, Perl 6, premature optimization, publish or perish, random walk, revision control, Richard Stallman, rolodex, Ruby on Rails, Saturday Night Live, side project, slashdot, speech recognition, the scientific method, Therac-25, Turing complete, Turing machine, Turing test, type inference, Valgrind, web application

It's a lot better than shared memory programming. I think that's the one thing Erlang has done—it has actually demonstrated that. When we first did Erlang and we went to conferences and said, “You should copy all your data.” And I think they accepted the arguments over fault tolerance—the reason you copy all your data is to make the system fault tolerant. They said, “It'll be terribly inefficient if you do that,” and we said, “Yeah, it will but it'll be fault tolerant.” The thing that is surprising is that it's more efficient in certain circumstances. What we did for the reasons of fault tolerance, turned out to be, in many circumstances, just as efficient or even more efficient than sharing. Then we asked the question, “Why is that?” Because it increased the concurrency. When you're sharing, you've got to lock your data when you access it.

pages: 834 words: 180,700

The Architecture of Open Source Applications by Amy Brown, Greg Wilson

8-hour work day, anti-pattern, bioinformatics, c2.com, cloud computing, collaborative editing, combinatorial explosion, computer vision, continuous integration, create, read, update, delete, David Heinemeier Hansson, Debian, domain-specific language, Donald Knuth, en.wikipedia.org, fault tolerance, finite state, Firefox, friendly fire, Guido van Rossum, linked data, load shedding, locality of reference, loose coupling, Mars Rover, MVC pattern, peer-to-peer, Perl 6, premature optimization, recommendation engine, revision control, Ruby on Rails, side project, Skype, slashdot, social web, speech recognition, the scientific method, The Wisdom of Crowds, web application, WebSocket

The coordinator distributes requests to individual CouchDB instances based on the key of the document being requested. Twitter has built the notions of sharding and replication into a coordinating framework called Gizzard. Gizzard takes standalone data stores of any type—you can build wrappers for SQL or NoSQL storage systems—and arranges them in trees of any depth to partition keys by key range. For fault tolerance, Gizzard can be configured to replicate data to multiple physical machines for the same key range.

13.4.3. Consistent Hash Rings

Good hash functions distribute a set of keys in a uniform manner. This makes them a powerful tool for distributing key-value pairs among multiple servers. The academic literature on a technique called consistent hashing is extensive, and the first applications of the technique to data stores were in systems called distributed hash tables (DHTs).
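
The hash-ring idea described in the excerpt can be sketched in a few lines of Python. This is a toy illustration of the general technique, not Gizzard's or any DHT's actual implementation; the server names and the virtual-node count are invented:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: a key is owned by the first server
    point at or after the key's hash, wrapping around; adding or removing
    a server only moves the keys adjacent to its points."""

    def __init__(self, servers, vnodes=3):
        self._points = []   # sorted hash points on the ring
        self._owners = {}   # point -> server name
        for s in servers:
            for v in range(vnodes):       # virtual nodes smooth the load
                p = self._hash(f"{s}#{v}")
                self._points.append(p)
                self._owners[p] = s
        self._points.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        i = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owners[self._points[i]]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("doc-42")   # deterministic for a fixed set of servers
```

Because only the points belonging to a departed server are reassigned, a node failure spreads its key ranges across the remaining servers instead of dumping them all on one neighbor.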

Routing is simple in the hash partitioning scheme: for the most part, the hash function can be executed by clients to find the appropriate server. With more complicated rebalancing schemes, finding the right node for a key becomes more difficult. Range partitioning requires the upfront cost of maintaining routing and configuration nodes, which can see heavy load and become central points of failure in the absence of relatively complex fault tolerance schemes. Done well, however, range-partitioned data can be load-balanced in small chunks which can be reassigned in high-load situations. If a server goes down, its assigned ranges can be distributed to many servers, rather than loading the server's immediate neighbors during downtime.

13.5. Consistency

Having spoken about the virtues of replicating data to multiple machines for durability and spreading load, it's time to let you in on a secret: keeping replicas of your data on multiple machines consistent with one another is hard.

The Architecture of Open Source Applications, Amy Brown and Greg Wilson (eds.), ISBN 978-1-257-63801-7. Chapter 15. Riak and Erlang/OTP, by Francesco Cesarini, Andy Gross, and Justin Sheehy. Riak is a distributed, fault-tolerant, open source database that illustrates how to build large-scale systems using Erlang/OTP. Thanks in large part to Erlang's support for massively scalable distributed systems, Riak offers features that are uncommon in databases, such as high availability and linear scalability of both capacity and throughput. Erlang/OTP provides an ideal platform for developing systems like Riak because it provides inter-node communication, message queues, failure detectors, and client-server abstractions out of the box.

The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise by Martin L. Abbott, Michael T. Fisher

always be closing, anti-pattern, barriers to entry, Bernie Madoff, business climate, business continuity plan, business intelligence, business process, call centre, cloud computing, combinatorial explosion, commoditize, Computer Numeric Control, conceptual framework, database schema, discounted cash flows, en.wikipedia.org, fault tolerance, finite state, friendly fire, hiring and firing, Infrastructure as a Service, inventory management, new economy, packet switching, performance metric, platform as a service, Ponzi scheme, RFC: Request For Comment, risk tolerance, Rubik’s Cube, Search for Extraterrestrial Intelligence, SETI@home, shareholder value, Silicon Valley, six sigma, software as a service, the scientific method, transaction costs, Vilfredo Pareto, web application, Y2K

If we have a technology platform comprised of a number of noncommunicating services, we increase the number of airports or runways for which we are managing traffic; as a result, we can have many more “landings” or changes. If the services communicate asynchronously, we would have a few more concerns, but we are also likely more willing to take risks. On the other hand, if the services all communicate synchronously with each other, there isn’t much more fault tolerance than with a monolithic system (see Chapter 21, Creating Fault Isolative Architectural Structures) and we are back to managing a single runway at a single airport. The expected result of the change is important as we want to be able to verify later that the change was successful. For instance, if a change is being made to a Web server and that change is to allow more threads of execution in the Web server, we should state that as the expected result.

Be careful here, because if you become an early adopter of software or systems, you will also be on the leading edge of finding all the bugs with that software or system. If availability and reliability are important to you and your customers, try to be an early majority or late majority adopter of those systems that are critical to the operations of your service, product, or platform. Asynchronous Design Whenever possible, systems should communicate in an asynchronous fashion. Asynchronous systems tend to be more fault tolerant to extreme load and do not easily fall prey to the multiplicative effects of failure that characterize synchronous systems. We will discuss the reasons for this in greater detail in the next section of this chapter. Stateless Systems Although some systems need state, state has a cost in terms of availability, scalability, and overall cost of your system. When you store state, you do so at a cost of memory or disk space and maybe the cost of databases.

The first factor to use in determining which services should be selected for stress testing is the criticality of each service to the overall system performance. If there is a central service such as a data abstract layer (DAL) or user authorization, this should be included as a candidate for stress testing because the stability of the entire application depends on this service. If you have architected your application into fault tolerant “swim lanes,” which will be discussed in Chapter 21, Creating Fault Isolative Architectural Structures, you still likely have core services that have been replicated across the lanes. The second consideration for determining services to stress test is the likelihood that a service affects performance. This decision will be influenced by knowledgeable engineers but should also be somewhat scientific.

pages: 400 words: 94,847

Reinventing Discovery: The New Era of Networked Science by Michael Nielsen

Albert Einstein, augmented reality, barriers to entry, bioinformatics, Cass Sunstein, Climategate, Climatic Research Unit, conceptual framework, dark matter, discovery of DNA, Donald Knuth, double helix, Douglas Engelbart, Douglas Engelbart, en.wikipedia.org, Erik Brynjolfsson, fault tolerance, Fellow of the Royal Society, Firefox, Freestyle chess, Galaxy Zoo, Internet Archive, invisible hand, Jane Jacobs, Jaron Lanier, Kevin Kelly, Magellanic Cloud, means of production, medical residency, Nicholas Carr, publish or perish, Richard Feynman, Richard Feynman, Richard Stallman, selection bias, semantic web, Silicon Valley, Silicon Valley startup, Simon Singh, Skype, slashdot, social web, statistical model, Stephen Hawking, Stewart Brand, Ted Nelson, The Death and Life of Great American Cities, The Nature of the Firm, The Wisdom of Crowds, University of East Anglia, Vannevar Bush, Vernor Vinge

Lipman, and Nancy J. Cox et al. A global initiative on sharing avian flu data. Nature, 442:981, August 31, 2006. [21] John Bohannon. Gamers unravel the secret life of protein. Wired, 17(5), April 20, 2009. http://www.wired.com/medtech/genetics/magazine/17-05/ff_protein?currentPage=all. [22] Parsa Bonderson, Sankar Das Sarma, Michael Freedman, and Chetan Nayak. A blueprint for a topologically fault-tolerant quantum computer. eprint arXiv:1003.2856, 2010. [23] Christine L. Borgman. Scholarship in the Digital Age. Cambridge, MA: MIT Press, 2007. [24] Kirk D. Borne et al. Astroinformatics: A 21st century approach to astronomy. eprint arXiv:0909.3892, 2009. Position paper for Astro2010 Decadal Survey State, available at http://arxiv.org/abs/0909.3892. [25] Todd A. Boroson and Tod R. Lauer.

Speculations on the future of science. Edge: The Third Culture, 2006. http://www.edge.org/3rd_culture/kelly06/kelly06_index.html. [109] Kevin Kelly. What Technology Wants. New York: Viking, 2010. [110] Richard A. Kerr. Recently discovered habitable world may not exist. Science Now, October 12, 2010. http://news.sciencemag.org/sciencenow/2010/10/recently-discovered-habitable-world.html. [111] A. Yu Kitaev. Fault-tolerant quantum computation by anyons. Annals of Physics, 303(1):2–30, 2003. [112] Helge Kragh. Max Planck: The reluctant revolutionary. Physics World, December 2000. http://physicsworld.com/cws/article/print/373. [113] Greg Kroah-Hartman. The Linux kernel. Online video from Google Tech Talks. http://www.youtube.com/watch?v=L2SED6sewRw. [114] Greg Kroah-Hartman, Jonathan Corbet, and Amanda McPherson.

pages: 554 words: 108,035

Scala in Depth by Tom Kleenex, Joshua Suereth

discrete time, domain-specific language, fault tolerance, MVC pattern, sorting algorithm, type inference

These aren’t discussed in the book, but can be found in Akka’s documentation at http://akka.io/docs/. This technique can be powerful when distributed and clustered. The Akka 2.0 framework is adding the ability to create actors inside a cluster and allow them to be dynamically moved around to machines as needed.

9.6. Summary

Actors provide a simpler parallelization model than traditional locking and threading. A well-behaved actors system can be fault-tolerant and resistant to total system slowdown. Actors provide an excellent abstraction for designing high-performance servers, where throughput and uptime are of the utmost importance. For these systems, designing failure zones and failure handling behaviors can help keep a system running even in the event of critical failures. Splitting actors into scheduling zones can ensure that input overload to any one portion of the system won’t bring the rest of the system down.

So, while the Scala actors library is an excellent resource for creating actors applications, the Akka library provides the features and performance needed to make a production application. Akka also supports common features out of the box. Actors and actor-related system design is a rich subject. This chapter lightly covered a few of the key aspects of actor-related design. These should be enough to create a fault-tolerant, high-performance actors system. Next let’s look into a topic of great interest: Java interoperability with Scala.

Chapter 10. Integrating Scala with Java

In this chapter: the benefits of using interfaces for Scala-Java interaction; the dangers of automatic implicit conversions of Java types; the complications of Java serialization in Scala; how to effectively use annotations in Scala for Java libraries.

One of the biggest advantages of the Scala language is its ability to seamlessly interact with existing Java libraries and applications.

pages: 713 words: 93,944

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement by Eric Redmond, Jim Wilson, Jim R. Wilson

Amazon Web Services, create, read, update, delete, data is the new oil, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, full text search, general-purpose programming language, linked data, MVC pattern, natural language processing, node package manager, random walk, recommendation engine, Ruby on Rails, Skype, social graph, web application

Just like Riak (“Ree-ahck”), you never use only one, but the multiple parts working together make the overall system durable. Each component is cheap and expendable, but when used right, it’s hard to find a simpler or stronger structure upon which to build a foundation. Riak is a distributed key-value database where values can be anything—from plain text, JSON, or XML to images or video clips—all accessible through a simple HTTP interface. Whatever data you have, Riak can store it. Riak is also fault-tolerant. Servers can go up or down at any moment with no single point of failure. Your cluster continues humming along as servers are added, removed, or (ideally not) crash. Riak won’t keep you up nights worrying about your cluster—a failed node is not an emergency, and you can wait to deal with it in the morning. As core developer Justin Sheehy once noted, “[The Riak team] focused so hard on things like write availability…to go back to sleep.”

It is based on BigTable, a high-performance, proprietary database developed by Google and described in the 2006 white paper “Bigtable: A Distributed Storage System for Structured Data.”[26] Initially created for natural-language processing, HBase started life as a contrib package for Apache Hadoop. Since then, it has become a top-level Apache project. On the architecture front, HBase is designed to be fault tolerant. Hardware failures may be uncommon for individual machines, but in a large cluster, node failure is the norm. By using write-ahead logging and distributed configuration, HBase can quickly recover from individual server failures. Additionally, HBase lives in an ecosystem that has its own complementary benefits. HBase is built on Hadoop—a sturdy, scalable computing platform that provides a distributed file system and mapreduce capabilities.
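
The write-ahead logging the excerpt credits for HBase's fast recovery can be illustrated with a toy key-value store. This is a hypothetical Python sketch of the general technique, not HBase's actual log format: every write is appended and flushed to the log before the in-memory table changes, so a crashed process can rebuild its state by replaying the log.

```python
import json
import os
import tempfile

class TinyKV:
    """Toy write-ahead-logged store. Each put() is made durable in the
    log before it is applied in memory; __init__ replays any existing
    log, which is exactly what happens after a crash."""

    def __init__(self, log_path):
        self.table = {}
        if os.path.exists(log_path):          # recovery: replay the log
            with open(log_path) as f:
                for line in f:
                    rec = json.loads(line)
                    self.table[rec["k"]] = rec["v"]
        self.log = open(log_path, "a")

    def put(self, k, v):
        self.log.write(json.dumps({"k": k, "v": v}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())           # durable before acknowledging
        self.table[k] = v

path = os.path.join(tempfile.mkdtemp(), "wal.log")
db = TinyKV(path)
db.put("row1", "hello")
restarted = TinyKV(path)  # simulated restart: state rebuilt from the log
```

A real system compacts the log periodically (HBase flushes to immutable store files) so replay time stays bounded; this sketch omits that.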

pages: 329 words: 95,309

Digital Bank: Strategies for Launching or Becoming a Digital Bank by Chris Skinner

algorithmic trading, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, bank run, Basel III, bitcoin, business intelligence, business process, business process outsourcing, call centre, cashless society, clean water, cloud computing, corporate social responsibility, credit crunch, crowdsourcing, cryptocurrency, demand response, disintermediation, don't be evil, en.wikipedia.org, fault tolerance, fiat currency, financial innovation, Google Glasses, high net worth, informal economy, Infrastructure as a Service, Internet of things, Jeff Bezos, Kevin Kelly, Kickstarter, M-Pesa, margin call, mass affluent, mobile money, Mohammed Bouazizi, new economy, Northern Rock, Occupy movement, Pingit, platform as a service, Ponzi scheme, prediction markets, pre–internet, QR code, quantitative easing, ransomware, reserve currency, RFID, Satoshi Nakamoto, Silicon Valley, smart cities, software as a service, Steve Jobs, strong AI, Stuxnet, trade route, unbanked and underbanked, underbanked, upwardly mobile, We are the 99%, web application, Y2K

The first category is the one that will occur more and more often, as banks have so many legacy systems across their core back office operations. It is far easier to change and add new front office systems – new trading desks, new channels or new customer service operations – than to replace core back office platforms – deposit account processing, post-trade services and payment systems. Why? Because the core processing needs to be highly resilient; 99.9999999999999999999999% and a few more 9’s fault tolerant; and running 24 by 7. In other words these systems are non-stop and would highly expose the bank to failure if they stop working. It is these systems that cause most of the challenges for a bank however. This is because, being a core system, they were often developed in the 1960s and 1970s. Back then, computing technologies were based upon lines of code fed into the machine through packs and packs of punched cards.
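
Those strings of nines translate directly into permitted downtime, which is why core banking platforms are so hard to take offline for replacement. A quick back-of-the-envelope calculation in Python (assuming a 365-day year):

```python
def downtime_per_year(availability_pct):
    """Seconds of permitted downtime per 365-day year at a given
    percentage availability."""
    year_seconds = 365 * 24 * 3600
    return year_seconds * (1 - availability_pct / 100)

for pct in (99.9, 99.99, 99.999):
    print(f"{pct}%: {downtime_per_year(pct) / 60:.1f} minutes/year")
# 99.9%: 525.6 minutes/year
# 99.99%: 52.6 minutes/year
# 99.999%: 5.3 minutes/year
```

Even "five nines" leaves barely five minutes a year, so a planned stop for an upgrade can consume decades' worth of the availability budget in one go.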

Add to this the regulatory regime change, which would force banks to respond more and more rapidly to new requirements, and the old technologies could not keep up. Finally, the technology had to change. This is why banks have been working hard to consolidate and replace their old infrastructures, and why we are seeing more and more glitches and failures. As soon as you upgrade an old, embedded, non-stop fault tolerant machine however, you are open to risk. The 99.9999+% non-stop machine suddenly has to stop. A competent bank derisks the risk of change by testing, testing and testing, whilst an incompetent bank may test but not enough. Luckily, most banks and exchanges are competent enough to test these things properly by planning correctly through roll forward and roll back cycles. The real issue with an upgrade or consolidation though is that it has be done more and more frequently due to the combined forces of regulatory, technology and customer change.

pages: 102 words: 27,769

Rework by Jason Fried, David Heinemeier Hansson

call centre, Clayton Christensen, Dean Kamen, Exxon Valdez, fault tolerance, James Dyson, Jeff Bezos, Ralph Nader, risk tolerance, Ruby on Rails, Steve Jobs, Tony Hsieh, Y Combinator

—Saul Kaplan, chief catalyst, Business Innovation Factory “Appealingly intimate, as if you’re having coffee with the authors. Rework is not just smart and succinct but grounded in the concreteness of doing rather than hard-to-apply philosophizing. This book inspired me to trust myself in defying the status quo.” —Penelope Trunk, author of Brazen Careerist: The New Rules for Success “[This book’s] assumption is that an organization is a piece of software. Editable. Malleable. Sharable. Fault-tolerant. Comfortable in Beta. Reworkable. The authors live by the credo ‘keep it simple, stupid’ and Rework possesses the same intelligence—and irreverence—of that simple adage.” —John Maeda, author of The Laws of Simplicity “Rework is like its authors: fast-moving, iconoclastic, and inspiring. It’s not just for startups. Anyone who works can learn from this.” —Jessica Livingston, partner, Y Combinator; author, Founders at Work INTRODUCTION FIRST The new reality TAKEDOWNS Ignore the real world Learning from mistakes is overrated Planning is guessing Why grow?

HBase: The Definitive Guide by Lars George

Amazon Web Services, bioinformatics, create, read, update, delete, Debian, distributed revision control, domain-specific language, en.wikipedia.org, fault tolerance, Firefox, Google Earth, place-making, revision control, smart grid, web application

You may have a background in relational database theory or you want to start fresh and this “column-oriented thing” is something that seems to fit your bill. You also heard that HBase can scale without much effort, and that alone is reason enough to look at it since you are building the next web-scale system. I was at that point in late 2007 when I was facing the task of storing millions of documents in a system that needed to be fault-tolerant and scalable while still being maintainable by just me. I had decent skills in managing a MySQL database system, and was using the database to store data that would ultimately be served to our website users. This database was running on a single server, with another as a backup. The issue was that it would not be able to hold the amount of data I needed to store for this new project. I would have to either invest in serious RDBMS scalability skills, or find something else instead.

Looking at open source alternatives in the RDBMS space, you will likely have to give up many or all relational features, such as secondary indexes, to gain some level of performance. The question is, wouldn’t it be good to trade relational features permanently for performance? You could denormalize (see the next section) the data model and avoid waits and deadlocks by minimizing necessary locking. How about built-in horizontal scalability without the need to repartition as your data grows? Finally, throw in fault tolerance and data availability, using the same mechanisms that allow scalability, and what you get is a NoSQL solution—more specifically, one that matches what HBase has to offer. Database (De-)Normalization At scale, it is often a requirement that we design schema differently, and a good term to describe this principle is Denormalization, Duplication, and Intelligent Keys (DDI).[20] It is about rethinking how data is stored in Bigtable-like storage systems, and how to make use of it in an appropriate way.

These are abstractions that define higher-level features and APIs, which are then used by Hadoop to store the data. The data is eventually stored on a disk, at which point the OS filesystem is used. HDFS is the most used and tested filesystem in production. Almost all production clusters use it as the underlying storage layer. It is proven stable and reliable, so deviating from it may impose its own risks and subsequent problems. The primary reason HDFS is so popular is its built-in replication, fault tolerance, and scalability. Choosing a different filesystem should provide the same guarantees, as HBase implicitly assumes that data is stored in a reliable manner by the filesystem. It has no added means to replicate data or even maintain copies of its own storage files. This functionality must be provided by the lower-level system. You can select a different filesystem implementation by using a URI[36] pattern, where the scheme (the part before the first “:”, i.e., the colon) part of the URI identifies the driver to be used.

pages: 541 words: 109,698

Mining the Social Web: Finding Needles in the Social Haystack by Matthew A. Russell

Climategate, cloud computing, crowdsourcing, en.wikipedia.org, fault tolerance, Firefox, full text search, Georg Cantor, Google Earth, information retrieval, Mark Zuckerberg, natural language processing, NP-complete, profit motive, Saturday Night Live, semantic web, Silicon Valley, slashdot, social graph, social web, statistical model, Steve Jobs, supply-chain management, text mining, traveling salesman, Turing test, web application

Sorting by date seems like a good idea and opens the door to certain kinds of time-series analysis, so let’s start there and see what happens. But first, we’ll need to make a small configuration change so that we can write our map/reduce functions to perform this task in Python. CouchDB is especially intriguing in that it’s written in Erlang, a language engineered to support super-high concurrency[16] and fault tolerance. The de facto out-of-the-box language you use to query and transform your data via map/reduce functions is JavaScript. Note that we could certainly opt to write map/reduce functions in JavaScript and realize some benefits from built-in JavaScript functions CouchDB offers—such as _sum, _count, and _stats. But the benefit gained from your development environment’s syntax checking/highlighting may prove more useful and easier on the eyes than staring at JavaScript functions wrapped up as triple-quoted string values that exist inside of Python code.
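
The view pattern the excerpt describes (a map function emits a key per document, a reduce function folds the grouped values) can be sketched in plain Python, independent of CouchDB itself. The sample documents and field names here are invented for illustration:

```python
from collections import defaultdict

docs = [
    {"_id": "a1", "created_at": "2010-05-01"},
    {"_id": "a2", "created_at": "2010-05-01"},
    {"_id": "a3", "created_at": "2010-05-02"},
]

def map_fn(doc):
    # CouchDB-view style: emit one (key, value) row per document,
    # keyed on the date so rows sort chronologically
    yield doc["created_at"], 1

def reduce_fn(values):
    return sum(values)

grouped = defaultdict(list)
for doc in docs:
    for key, value in map_fn(doc):
        grouped[key].append(value)

per_day = {k: reduce_fn(vs) for k, vs in sorted(grouped.items())}
# per_day == {"2010-05-01": 2, "2010-05-02": 1}
```

Because keys sort, range queries over dates fall out of the same structure, which is what makes time-series analysis over such views convenient.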

Some steps have been made in this direction: for instance, we discussed how microformats already make this possible for certain domains in Chapter 2, and in Chapter 9 we looked at how Facebook is aggressively bootstrapping an explicit graph construct into the Web with its Open Graph protocol. But before we get too pie-in-the-sky, let’s back up for just a moment and reflect on how we got to where we are right now. The Internet is just a network of networks,[63] and what’s very fascinating about it from a technical standpoint is how layers of increasingly higher-level protocols build on top of lower-level protocols to ultimately produce a fault-tolerant worldwide computing infrastructure. In our online activity, we rely on dozens of protocols every single day, without even thinking about it. However, there is one ubiquitous protocol that is hard not to think about explicitly from time to time: HTTP, the prefix of just about every URL that you type into your browser, the enabling protocol for the extensive universe of hypertext documents (HTML pages), and the links that glue them all together into what we know as the Web.

pages: 470 words: 109,589

Apache Solr 3 Enterprise Search Server by Unknown

bioinformatics, continuous integration, database schema, en.wikipedia.org, fault tolerance, Firefox, full text search, information retrieval, Internet Archive, natural language processing, performance metric, platform as a service, Ruby on Rails, web application

While Solr offers some impressive scaling techniques through replication and sharding of data, it assumes that you know a priori what your scaling needs are. The distributed search of Solr doesn't adapt to real time changes in indexing or query load and doesn't provide any fail-over support. SolrCloud is an ongoing effort to build a fault tolerant, centrally managed support for clusters of Solr instances and is part of the trunk development path (Solr 4.0). SolrCloud introduces the idea that a logical collection of documents (otherwise known as an index) is distributed across a number of slices. Each slice is made up of shards, which are the physical pieces of the collection. In order to support fault tolerance, there may be multiple replicas of a shard distributed across different physical nodes. To keep all this data straight, Solr embeds Apache ZooKeeper as the centralized service for managing all configuration information for the cluster of Solr instances, including mapping which shards are available on which set of nodes of the cluster.

Programming Android by Zigurd Mednieks, Laird Dornin, G. Blake Meike, Masumi Nakamura

anti-pattern, business process, conceptual framework, create, read, update, delete, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, Google Earth, interchangeable parts, iterative process, loose coupling, MVC pattern, revision control, RFID, web application

When someone is related to multiple things (such as multiple addresses), relational databases have ways of handling that too, but we won't go into such detail in this chapter. SQLite Android uses the SQLite database engine, a self-contained, transactional database engine that requires no separate server process. Many applications and environments beyond Android make use of it, and a large open source community actively develops SQLite. In contrast to desktop-oriented or enterprise databases, which provide a plethora of features related to fault tolerance and concurrent access to data, SQLite aggressively strips out features that are not absolutely necessary in order to achieve a small footprint. For example, many database systems use static typing, but SQLite does not store database type information. Instead, it pushes the responsibility of keeping type information into high-level languages, such as Java, that map database structures into high-level types.

You can also explicitly start and end a transaction so that it encompasses multiple statements. For a given transaction, SQLite does not modify the database until all statements in the transaction have completed successfully. Given the volatility of the Android mobile environment, we recommend that in addition to meeting the needs for consistency in your app, you also make liberal use of transactions to support fault tolerance in your application. Example Database Manipulation Using sqlite3 Now that you understand the basics of SQL as it pertains to SQLite, let’s have a look at a simple database for storing video metadata using the sqlite3 command-line tool and the Android debug shell, which you can start by using the adb command. Using the command line will allow us to view database changes right away, and will provide some simple examples of how to work with this useful database debugging tool.
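
The all-or-nothing behavior described above is easy to try directly against SQLite. This sketch uses Python's standard sqlite3 module rather than Android's Java API (the table and values are invented): the connection's context manager commits on success and rolls back on an exception, so the half-finished transaction leaves no trace.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT)")

# Group related statements in one transaction: either all take effect
# or none do. The second INSERT violates the primary key, so the
# rollback also undoes the first.
try:
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO video VALUES (1, 'intro')")
        conn.execute("INSERT INTO video VALUES (1, 'dup')")  # raises
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM video").fetchone()[0]
# count == 0: the failed transaction left the table unchanged

with conn:  # this block completes, so both rows are committed
    conn.execute("INSERT INTO video VALUES (1, 'intro')")
    conn.execute("INSERT INTO video VALUES (2, 'outro')")
```

The same discipline protects an app against a process killed mid-write: SQLite guarantees the database reflects only whole transactions.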

pages: 655 words: 141,257

Programming Android: Java Programming for the New Generation of Mobile Devices by Zigurd Mednieks, Laird Dornin, G. Blake Meike, Masumi Nakamura

anti-pattern, business process, conceptual framework, create, read, update, delete, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, Google Earth, interchangeable parts, iterative process, loose coupling, MVC pattern, revision control, RFID, web application, yellow journalism

When someone is related to multiple things (such as multiple addresses), relational databases have ways of handling that too, but we won't go into such detail in this chapter. SQLite Android uses the SQLite database engine, a self-contained, transactional database engine that requires no separate server process. Many applications and environments beyond Android make use of it, and a large open source community actively develops SQLite. In contrast to desktop-oriented or enterprise databases, which provide a plethora of features related to fault tolerance and concurrent access to data, SQLite aggressively strips out features that are not absolutely necessary in order to achieve a small footprint. For example, many database systems use static typing, but SQLite does not store database type information. Instead, it pushes the responsibility of keeping type information into high-level languages, such as Java, that map database structures into high-level types.

You can also explicitly start and end a transaction so that it encompasses multiple statements. For a given transaction, SQLite does not modify the database until all statements in the transaction have completed successfully. Given the volatility of the Android mobile environment, we recommend that in addition to meeting the needs for consistency in your app, you also make liberal use of transactions to support fault tolerance in your application. Example Database Manipulation Using sqlite3 Now that you understand the basics of SQL as it pertains to SQLite, let’s have a look at a simple database for storing video metadata using the sqlite3 command-line tool and the Android debug shell, which you can start by using the adb command. Using the command line will allow us to view database changes right away, and will provide some simple examples of how to work with this useful database debugging tool.

pages: 377 words: 21,687

Digital Apollo: Human and Machine in Spaceflight by David A. Mindell

Amazon: amazon.com, amazon.co.uk, amazon.de, amazon.fr

1960s counterculture, computer age, deskilling, fault tolerance, interchangeable parts, Mars Rover, more computing power than Apollo, Norbert Wiener, Norman Mailer, Silicon Valley, Stewart Brand, telepresence, telerobotics

"Apollo Experience Report: Guidance and Control Systems: Primary Guidance, Navigation, and Control System Development." NASA TN D-8227. Houston, Tex.: Johnson Space Center, 1976. Holliday, Will L., and Dale P. Hoffman. "Systems Approach to Flight Controls." Astronautics (May 1962): 36–37, 74–80. Hong, Sungook. "Man and Machine in the 1960s." Techne 7, no. 3 (2004): 49–77. Hopkins, Albert L. "A Fault-Tolerant Information Processing Concept for Space Vehicles." Cambridge, Mass.: MIT Instrumentation Laboratory, 1970. Hopkins, Albert L. "A Fault-Tolerant Information Processing System for Advanced Control, Guidance, and Navigation." Cambridge, Mass.: Charles Stark Draper Laboratories, 1970. Hopkins Jr., Albert L., Ramon Alonso, and Hugh Blair-Smith. "Logical Description for the Apollo Guidance Computer (AGC4)." Cambridge, Mass.: MIT Instrumentation Laboratory, 1963. Horner, Richard.

pages: 210 words: 42,271

Programming HTML5 Applications by Zachary Kessin


barriers to entry, continuous integration, fault tolerance, Firefox, Google Chrome, mandelbrot fractal, QWERTY keyboard, web application, WebSocket

Ruby Event Machine web socket handler

    require 'em-websocket'
    EventMachine::WebSocket.start(:host => "", :port => 8080) do |ws|
      ws.onopen { ws.send "Hello Client!" }
      ws.onmessage { |msg| ws.send "Pong: #{msg}" }
      ws.onclose { puts "WebSocket closed" }
    end

Erlang Yaws

Erlang is a pretty rigorously functional language that was developed several decades ago for telephone switches and has found acceptance in many other areas where massive parallelism and strong robustness are desired. The language is concurrent, fault-tolerant, and very scalable. In recent years it has moved into the web space because all of the traits that make it useful in phone switches are very useful in a web server. The Erlang Yaws web server also supports web sockets right out of the box. The documentation can be found at the Web Sockets in Yaws web page, along with code for a simple echo server.

Example 9-5. Erlang Yaws web socket handler

    out(A) ->
        case get_upgrade_header(A#arg.headers) of
            undefined ->
                {content, "text/plain", "You're not a web sockets client!

pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money by Frank J. Ohlhorst


algorithmic trading, bioinformatics, business intelligence, business process, call centre, cloud computing, create, read, update, delete, data acquisition, DevOps, fault tolerance, linked data, natural language processing, Network effects, pattern recognition, performance metric, personalized medicine, RFID, sentiment analysis, six sigma, smart meter, statistical model, supply-chain management, Watson beat the top human players on Jeopardy!, web application

Big Data analytics requires that organizations choose the data to analyze, consolidate them, and then apply aggregation methods before the data can be subjected to the ETL process. This has to occur with large volumes of data, which can be structured, unstructured, or from multiple sources, such as social networks, data logs, web sites, mobile devices, and sensors. Hadoop accomplishes that by incorporating pragmatic processes and considerations, such as a fault-tolerant clustered architecture, the ability to move computing power closer to the data, parallel and/or batch processing of large data sets, and an open ecosystem that supports enterprise architecture layers from data storage to analytics processes. Not all enterprises require what Big Data analytics has to offer; those that do must consider Hadoop’s ability to meet the challenge. However, Hadoop cannot accomplish everything on its own.
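The "move computation to the data, then combine partial results" pattern the paragraph attributes to Hadoop can be illustrated with a toy word count (plain Python, not Hadoop's actual API; the partitions here stand in for data blocks living on different nodes):

```python
from collections import Counter
from functools import reduce

# Each "node" maps over only its local partition of the data...
partitions = [
    ["log data from sensors", "web data"],
    ["sensor log", "mobile data log"],
]

def map_partition(lines):
    return Counter(word for line in lines for word in line.split())

partials = [map_partition(p) for p in partitions]  # runs near each partition
total = reduce(lambda a, b: a + b, partials)       # ...then results are merged

print(total["data"])  # 3
print(total["log"])   # 3
```

Fault tolerance falls out of the same structure: if one partition's map step fails, only that partition needs to be re-run, typically on a replica of the same data block.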

pages: 271 words: 52,814

Blockchain: Blueprint for a New Economy by Melanie Swan


23andMe, Airbnb, altcoin, Amazon Web Services, asset allocation, banking crisis, basic income, bioinformatics, bitcoin, blockchain, capital controls, cellular automata, central bank independence, clean water, cloud computing, collaborative editing, Conway's Game of Life, crowdsourcing, cryptocurrency, disintermediation, Edward Snowden, en.wikipedia.org, ethereum blockchain, fault tolerance, fiat currency, financial innovation, Firefox, friendly AI, Hernando de Soto, intangible asset, Internet Archive, Internet of things, Khan Academy, Kickstarter, lifelogging, litecoin, Lyft, M-Pesa, microbiome, Network effects, new economy, peer-to-peer, peer-to-peer lending, peer-to-peer model, personalized medicine, post scarcity, prediction markets, QR code, ride hailing / ride sharing, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, SETI@home, sharing economy, Skype, smart cities, smart contracts, smart grid, software as a service, technological singularity, Turing complete, unbanked and underbanked, underbanked, web application, WikiLeaks

Consensus without mining is another area being explored, such as in Tendermint’s modified version of DLS (the solution to the Byzantine Generals’ Problem by Dwork, Lynch, and Stockmeyer), with bonded coins belonging to byzantine participants.184 Another idea for consensus without mining or proof of work is through a consensus algorithm such as Hyperledger’s, which is based on the Practical Byzantine Fault Tolerance algorithm. Only focus on the most recent or unspent outputs Many blockchain operations could be based on surface calculations of the most recent or unspent outputs, similar to how credit card transactions operate. “Thin wallets” operate this way, as opposed to querying a full Bitcoind node, and this is how Bitcoin ewallets work on cellular telephones. A related proposal is Cryptonite, which has a “mini-blockchain” abbreviated data scheme.
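For context on the Practical Byzantine Fault Tolerance algorithm mentioned above: PBFT requires n ≥ 3f + 1 replicas to tolerate f Byzantine nodes, and a quorum of 2f + 1 matching replies. A small arithmetic sketch (my illustration, not Hyperledger's code):

```python
def max_byzantine_faults(n: int) -> int:
    """PBFT tolerates f Byzantine replicas among n only if n >= 3f + 1,
    so the largest tolerable f is floor((n - 1) / 3)."""
    return (n - 1) // 3

def quorum_size(n: int) -> int:
    """A PBFT quorum needs 2f + 1 matching replies."""
    return 2 * max_byzantine_faults(n) + 1

for n in (4, 7, 10):
    print(n, max_byzantine_faults(n), quorum_size(n))
# 4 replicas tolerate 1 fault (quorum 3); 7 tolerate 2 (quorum 5);
# 10 tolerate 3 (quorum 7)
```

This is why permissioned PBFT-style networks run small, known validator sets rather than open mining: the message cost grows quickly with n, but no proof of work is needed.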

pages: 237 words: 76,486

Mars Rover Curiosity: An Inside Account From Curiosity's Chief Engineer by Rob Manning, William L. Simon


Elon Musk, fault tolerance, fear of failure, Kuiper Belt, Mars Rover

Once I had my diploma in hand, JPL changed my status from draftsman to engineer, but my role as an engineer was slow going. My first job was as an apprentice electronics tester, helping run tests on what would become the brains of the Galileo spacecraft. I quickly discovered that building spacecraft included many extremely tedious jobs. After Galileo, I worked on Magellan (to Venus) and Cassini (to Saturn), becoming expert in the design of spacecraft computers, computer memory, computer architectures, and fault-tolerant systems. In 1993, after thirteen years at JPL, my career took a sudden leap forward. Brian Muirhead, the most inspiring and level-headed spacecraft leader I have ever met, had recently been named spacecraft manager for a funky little mission to Mars called Pathfinder. We had a conversation in which he explained that he was a master of mechanical systems but had not had much experience with electronics.

pages: 313 words: 75,583

Ansible for DevOps: Server and Configuration Management for Humans by Jeff Geerling


Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, cloud computing, continuous integration, database schema, Debian, defense in depth, DevOps, fault tolerance, Firefox, full text search, Google Chrome, inventory management, loose coupling, Minecraft, Ruby on Rails, web application

Have one master SAN that’s mounted on each of the servers. Use a distributed file system, like Gluster, Lustre, Fraunhofer, or Ceph. Some options are easier to set up than others, and all have benefits—and drawbacks. Rsync, git, or NFS offer simple initial setup, and low impact on filesystem performance (in many scenarios). But if you need more flexibility and scalability, less network overhead, and greater fault tolerance, you will have to consider something that requires more configuration (e.g. a distributed file system) and/or more hardware (e.g. a SAN). GlusterFS is licensed under the AGPL license, has good documentation, and a fairly active support community (especially in the #gluster IRC channel). But to someone new to distributed file systems, it can be daunting to set it up the first time.

Configuring Gluster - Basic Overview

To get Gluster working on a basic two-server setup (so you can have one folder that’s synchronized and replicated across the two servers—allowing one server to go down completely, and the other to still have access to the files), you need to do the following: Install Gluster server and client on each server, and start the server daemon.

The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal by M. Mitchell Waldrop


Ada Lovelace, air freight, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anti-communist, Apple II, battle of ideas, Berlin Wall, Bill Duvall, Bill Gates: Altair 8800, Byte Shop, Claude Shannon: information theory, computer age, conceptual framework, cuban missile crisis, Donald Davies, double helix, Douglas Engelbart, Douglas Engelbart, Dynabook, experimental subject, fault tolerance, Frederick Winslow Taylor, friendly fire, From Mathematics to the Technologies of Life and Death, Haight Ashbury, Howard Rheingold, information retrieval, invisible hand, Isaac Newton, James Watt: steam engine, Jeff Rulifson, John von Neumann, Leonard Kleinrock, Marc Andreessen, Menlo Park, New Journalism, Norbert Wiener, packet switching, pink-collar, popular electronics, RAND corporation, RFC: Request For Comment, Robert Metcalfe, Silicon Valley, Steve Crocker, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Ted Nelson, Turing machine, Turing test, Vannevar Bush, Von Neumann architecture, Wiener process, zero-sum game

He and his colleagues would have to give up every engineer's first instinct, which was to control things so that problems could not happen, and instead design a system that was guaranteed to fail, but that would keep running anyhow. Nowadays this is known as a fault-tolerant system, and designing one is still considered a cutting-edge challenge. It means giving the system some of the same quality possessed by a superbly trained military unit, or a talented football team, or, for that matter, any living organism-namely, an ability to react to the unexpected. But in the early 1960s, with CTSS, Corbato and his colleagues had to pioneer fault-tolerant design even as they were pioneering time-sharing itself: For example, among their early innovations were "firewalls," or software barriers that kept each user's area of computer memory isolated from its neighbors, so that a flameout in one program wouldn't necessarily consume the others.

Index entries (OCR-damaged in this excerpt) include: EDVAC; ENIAC; e-mail; Douglas Engelbart; error-correcting codes; Ethernet; fault-tolerant systems, 234; feedback; firewalls, 234; Fortran; and game theory.

pages: 619 words: 197,256

Apollo by Charles Murray, Catherine Bly Cox


cuban missile crisis, fault tolerance, index card, old-boy network, The Bell Curve by Richard Herrnstein and Charles Murray, V2 rocket, War on Poverty, white flight

He thought that the obsession with the S.P.S. as a one-shot, nonredundant system was a lot of hype. “When they say ‘no redundancy,’ that’s a misnomer,” he said later. “There was only one engine bell, of course, and only one combustion chamber, but all the avionics that fed the signals to that engine and all the mechanical components that had to work, like the little valves that had to be pressurized to open the ball valves, and so forth, were at least single-fault tolerant and usually two-fault tolerant. . . . There were a heck of a lot of ways to start that engine.” And of course they had indeed checked it out carefully before the flight, but nothing they didn’t do for any other mission. All this was still correct as of Christmas Eve, 1968. And yet it ultimately didn’t make any difference to the way many of the people in Apollo felt. Caldwell Johnson, speaking as a designer of the spacecraft, explained it.

pages: 669 words: 210,153

Tools of Titans: The Tactics, Routines, and Habits of Billionaires, Icons, and World-Class Performers by Timothy Ferriss


Airbnb, Alexander Shulgin, artificial general intelligence, asset allocation, Atul Gawande, augmented reality, back-to-the-land, Bernie Madoff, Bertrand Russell: In Praise of Idleness, Black Swan, blue-collar work, Buckminster Fuller, business process, Cal Newport, call centre, Checklist Manifesto, cognitive bias, cognitive dissonance, Colonization of Mars, Columbine, commoditize, correlation does not imply causation, David Brooks, David Graeber, diversification, diversified portfolio, Donald Trump, effective altruism, Elon Musk, fault tolerance, fear of failure, Firefox, follow your passion, future of work, Google X / Alphabet X, Howard Zinn, Hugh Fearnley-Whittingstall, Jeff Bezos, job satisfaction, Johann Wolfgang von Goethe, John Markoff, Kevin Kelly, Kickstarter, Lao Tzu, life extension, lifelogging, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, Mason jar, Menlo Park, Mikhail Gorbachev, Nicholas Carr, optical character recognition, PageRank, passive income, pattern recognition, Paul Graham, peer-to-peer, Peter H. Diamandis: Planetary Resources, Peter Singer: altruism, Peter Thiel, phenotype, PIHKAL and TIHKAL, post scarcity, premature optimization, QWERTY keyboard, Ralph Waldo Emerson, Ray Kurzweil, recommendation engine, rent-seeking, Richard Feynman, Richard Feynman, risk tolerance, Ronald Reagan, selection bias, sharing economy, side project, Silicon Valley, skunkworks, Skype, Snapchat, social graph, software as a service, software is eating the world, stem cell, Stephen Hawking, Steve Jobs, Stewart Brand, superintelligent machines, Tesla Model S, The Wisdom of Crowds, Thomas L Friedman, Wall-E, Washington Consensus, Whole Earth Catalog, Y Combinator, zero-sum game

The most successful computer company of the seventies and eighties, next to IBM, was Digital Equipment Corporation. IBM was first in computers. DEC was first in minicomputers. Many other computer companies (and their entrepreneurial owners) became rich and famous by following a simple principle: If you can’t be first in a category, set up a new category you can be first in. Tandem was first in fault-tolerant computers and built a $1.9 billion business. So Stratus stepped down with the first fault-tolerant minicomputer. Are the laws of marketing difficult? No, they are quite simple. Working things out in practice is another matter, however. Cray Research went over the top with the first supercomputer. So Convex put two and two together and launched the first mini supercomputer. Sometimes you can also turn an also-ran into a winner by inventing a new category.

pages: 722 words: 90,903

Practical Vim: Edit Text at the Speed of Thought by Drew Neil


Bram Moolenaar, don't repeat yourself, en.wikipedia.org, fault tolerance, finite state, place-making, QWERTY keyboard, web application

That means any bulb can go out, and the rest will be unaffected. I’ve borrowed the expressions in series and in parallel from the field of electronics to differentiate between two techniques for executing a macro multiple times. The technique for executing a macro in series is brittle. Like cheap Christmas tree lights, it breaks easily. The technique for executing a macro in parallel is more fault tolerant. Execute the Macro in Series Picture a robotic arm and a conveyor belt containing a series of items for the robot to manipulate (Figure 4, ​Vim's macros make quick work of repetitive tasks​). Recording a macro is like programming the robot to do a single unit of work. As a final step, we instruct the robot to move the conveyor belt and bring the next item within reach. In this manner, we can have a single robot carry out a series of repetitive tasks on similar items
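The two techniques this excerpt contrasts look roughly like this in practice (a sketch in Vim's own command idiom; register a and the count 22 are illustrative, not taken from the excerpt):

```
qa ... q          " record the repetitive edit as a macro in register a
22@a              " in series: one invocation with a count; playback halts
                  " at the first line where a motion fails
VG                " in parallel: visually select the remaining lines, then
:'<,'>normal @a   " run the macro once per line; a failure on one line
                  " leaves the other lines unaffected
```

The parallel form is the fault-tolerant one: each line gets an independent invocation of the macro, so one broken "bulb" doesn't take out the rest of the string.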

pages: 354 words: 26,550

High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems by Irene Aldridge


algorithmic trading, asset allocation, asset-backed security, automated trading system, backtesting, Black Swan, Brownian motion, business process, capital asset pricing model, centralized clearinghouse, collapse of Lehman Brothers, collateralized debt obligation, collective bargaining, computerized trading, diversification, equity premium, fault tolerance, financial intermediation, fixed income, high net worth, implied volatility, index arbitrage, information asymmetry, interest rate swap, inventory management, law of one price, Long Term Capital Management, Louis Bachelier, margin call, market friction, market microstructure, martingale, Myron Scholes, New Journalism, p-value, paper trading, performance metric, profit motive, purchasing power parity, quantitative trading / quantitative finance, random walk, Renaissance Technologies, risk tolerance, risk-adjusted returns, risk/return, Sharpe ratio, short selling, Small Order Execution System, statistical arbitrage, statistical model, stochastic process, stochastic volatility, systematic trading, trade route, transaction costs, value at risk, yield curve, zero-sum game

New York–based MarketFactory provides a suite of software tools to help automated traders get an extra edge in the market, help their models scale, increase their fill ratios, reduce slippage, and thereby improve profitability (P&L). Chapter 18 discusses optimization of execution. Run-time risk management applications ensure that the system stays within prespecified behavioral and P&L bounds. Such applications may also be known as system-monitoring and fault-tolerance software. 26 HIGH-FREQUENCY TRADING r Mobile applications suitable for monitoring performance of highfrequency trading systems alert administration of any issues. r Real-time third-party research can stream advanced information and forecasts. Legal, Accounting, and Other Professional Services Like any business in the financial sector, high-frequency trading needs to make sure that “all i’s are dotted and all t’s are crossed” in the legal and accounting departments.

Scratch Monkey by Stross, Charles


carbon-based life, defense in depth, fault tolerance, gravity well, Kuiper Belt, packet switching, phenotype, telepresence

I realise that I may never hear them again. I'm probably grinning like a corpse but I don't care -- she must know by now that blind people often smile. It's easier to grin than to frown; the facial muscles contract into a smirk more easily. Even when you're about to die. "It takes a lot of stress to unbalance a network processor the size of a small moon," she replies calmly; "it shows a remarkable degree of fault tolerance. As for physical assault, the automatic defences are still armed ... as they always have been. So if we want to take it for ourselves, we must overwhelm it by frontal assault, sending uploaded minds out into the simulation space until it overloads and drops into NP-stasis. They do that if you feed them faster than they can transfer capacity elsewhere, you know. It's happened before, and it's what the Superbrights are most afraid of.

pages: 362 words: 86,195

Fatal System Error: The Hunt for the New Crime Lords Who Are Bringing Down the Internet by Joseph Menn


Brian Krebs, dumpster diving, fault tolerance, Firefox, John Markoff, Menlo Park, offshore financial centre, pirate software, Plutocrats, plutocrats, popular electronics, profit motive, RFID, Silicon Valley, zero day

Cerf, who has a generally upbeat tone about most things, gives the impression that he remains pleasantly surprised that the Internet has continued to function and thrive—even though, as he put it, “We never got to do the production engineering,” the version ready for prime time. Even after his years on the front line, Barrett found such statements amazing. “It’s incredibly disturbing,” he said. “The engine of the world economy is based on this really cool experiment that is not designed for security, it’s designed for fault-tolerance,” which is a system’s ability to withstand some failures. “You can reduce your risks, but the naughty truth is that the Net is just not a secure place for business or society.” Cerf listed a dozen things that could be done to make the Internet safer. Among them: encouraging research into “hardware-assisted security mechanisms,” limiting the enormous damage that Web browsers can wreak on operating systems, and hiring more and better trained federal cybercrime agents while pursuing international legal frameworks.

pages: 352 words: 96,532

Where Wizards Stay Up Late: The Origins of the Internet by Katie Hafner, Matthew Lyon


air freight, Bill Duvall, computer age, conceptual framework, Donald Davies, Douglas Engelbart, Douglas Engelbart, fault tolerance, Hush-A-Phone, information retrieval, John Markoff, Kevin Kelly, Leonard Kleinrock, Marc Andreessen, Menlo Park, natural language processing, packet switching, RAND corporation, RFC: Request For Comment, Robert Metcalfe, Ronald Reagan, Silicon Valley, speech recognition, Steve Crocker, Steven Levy

But imagine a local post office somewhere that decided to go it alone, making up its own rules for addressing, packaging, stamping, and sorting mail. Imagine if that rogue post office decided to invent its own set of ZIP codes. Imagine any number of post offices taking it upon themselves to invent new rules. Imagine widespread confusion. Mail handling begs for a certain amount of conformity, and because computers are less fault-tolerant than human beings, e-mail begs loudly. The early wrangling on the ARPANET over attempts to impose standard message headers was typical of other debates over computer industry standards that came later. But because the struggle over e-mail standards was one of the first sources of real tension in the community, it stood out. In 1973 an ad hoc committee led by MIT’s Bhushan tried bringing some order to the implementation of new e-mail programs.

pages: 275 words: 84,980

Before Babylon, Beyond Bitcoin: From Money That We Understand to Money That Understands Us (Perspectives) by David Birch

agricultural Revolution, Airbnb, bank run, banks create money, bitcoin, blockchain, Bretton Woods, British Empire, Broken windows theory, Burning Man, capital controls, cashless society, Clayton Christensen, clockwork universe, creative destruction, credit crunch, cross-subsidies, crowdsourcing, cryptocurrency, David Graeber, dematerialisation, Diane Coyle, distributed ledger, double entry bookkeeping, ethereum blockchain, facts on the ground, fault tolerance, fiat currency, financial exclusion, financial innovation, financial intermediation, floating exchange rates, Fractional reserve banking, index card, informal economy, Internet of things, invention of the printing press, invention of the telegraph, invention of the telephone, invisible hand, Irish bank strikes, Isaac Newton, Jane Jacobs, Kenneth Rogoff, knowledge economy, Kuwabatake Sanjuro: assassination market, large denomination, M-Pesa, market clearing, market fundamentalism, Marshall McLuhan, Martin Wolf, mobile money, money: store of value / unit of account / medium of exchange, new economy, Northern Rock, Pingit, prediction markets, price stability, QR code, quantitative easing, railway mania, Ralph Waldo Emerson, Real Time Gross Settlement, reserve currency, Satoshi Nakamoto, seigniorage, Silicon Valley, smart contracts, social graph, special drawing rights, technoutopianism, the payments system, The Wealth of Nations by Adam Smith, too big to fail, transaction costs, tulip mania, wage slave, Washington Consensus, wikimedia commons

At the time of writing, the ‘market cap’ of Ethereum is significantly higher than that of Ethereum Classic. Ripple After Bitcoin and Ethereum, the third biggest cryptocurrency is Ripple, which unlike those first two has its roots in local exchange trading systems (Peck 2013). It is a protocol for value exchange that uses a shared ledger but it does not use a Bitcoin-like blockchain, preferring another kind of what is known as a ‘Byzantine fault-tolerant consensus-forming process’. Ripple signs every transaction that parties submit to the network with a digital signature. Each user selects a list, called a ‘unique node list’, comprising other users that it trusts as what are known as ‘validating nodes’. Each validating node independently verifies every proposed transaction within its network to determine if it is valid. A transaction is valid if the correct signature appears on the transaction, i.e. the signature of the funds’ owner, and if the parties have enough funds to make the transaction.
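The validity rule the excerpt states, a correct owner signature plus sufficient funds, can be modeled in a few lines (a toy sketch: real Ripple validators verify public-key signatures against a shared ledger, whereas the HMAC scheme and account names below are invented stand-ins):

```python
import hashlib
import hmac

def sign(secret: bytes, message: bytes) -> bytes:
    # Stand-in for a real digital signature.
    return hmac.new(secret, message, hashlib.sha256).digest()

def is_valid(tx, balances, secrets):
    """A transaction is valid iff it carries the owner's signature
    over its contents AND the owner has enough funds."""
    msg = f"{tx['from']}->{tx['to']}:{tx['amount']}".encode()
    signed_ok = hmac.compare_digest(sign(secrets[tx['from']], msg), tx['sig'])
    funded = balances.get(tx['from'], 0) >= tx['amount']
    return signed_ok and funded

secrets = {"alice": b"alice-secret"}
balances = {"alice": 100}
tx = {"from": "alice", "to": "bob", "amount": 40}
tx["sig"] = sign(secrets["alice"], b"alice->bob:40")
print(is_valid(tx, balances, secrets))  # True
tx["amount"] = 500  # exceeds the balance, and no longer matches the signature
print(is_valid(tx, balances, secrets))  # False
```

Each validating node on a user's unique node list applies this kind of check independently, which is why consensus can form without mining.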

pages: 1,266 words: 278,632

Backup & Recovery by W. Curtis Preston


Berlin Wall, business intelligence, business process, database schema, Debian, dumpster diving, failed state, fault tolerance, full text search, job automation, side project, Silicon Valley, web application

These two steps are Sybase’s “ounce of prevention.” In addition to these dbcc tasks, you need to choose a transaction log archive strategy. If you follow these tasks, you will help maintain the database, keeping it running smoothly and ready for proper backups. dbcc: The Database Consistency Checker Even though Sybase’s dataserver products are very robust and much effort has gone into making them fault-tolerant, there is always the chance that a problem will occur. For very large tables, some of these problems might not show until very specific queries are run. This is one of the reasons for the database consistency checker, dbcc. This set of SQL commands can review all the database page allocations, linkages, and data pointers, finding problems and, in many cases, fixing them before they become insurmountable.

You can achieve ACID compliance in a MySQL database if you use the InnoDB or the NDB storage engines. (As of this writing, the MySQL team is developing other ACID-compliant storage engines.) With PostgreSQL, all data is stored in an ACID-compliant fashion. PostgreSQL also offers sophisticated features such as point-in-time recovery, tablespaces, checkpoints, hot backups, and write-ahead logging for fault tolerance. These are all very good things from a data-protection and data-integrity standpoint. PostgreSQL Architecture From a power-user standpoint, PostgreSQL is like any other database. The following terms mean essentially the same in PostgreSQL as they do in any other relational database: Database Table Index Row Attribute Extent Partition Transaction Clusters A PostgreSQL cluster is analogous to an instance in other RDBMSs, and each cluster works with one or more databases.

pages: 302 words: 82,233

Beautiful security by Andy Oram, John Viega


Albert Einstein, Amazon Web Services, business intelligence, business process, call centre, cloud computing, corporate governance, credit crunch, crowdsourcing, defense in depth, Donald Davies, en.wikipedia.org, fault tolerance, Firefox, loose coupling, Marc Andreessen, market design, Monroe Doctrine, new economy, Nicholas Carr, Nick Leeson, Norbert Wiener, optical character recognition, packet switching, peer-to-peer, performance metric, pirate software, Robert Bork, Search for Extraterrestrial Intelligence, security theater, SETI@home, Silicon Valley, Skype, software as a service, statistical model, Steven Levy, The Wisdom of Crowds, Upton Sinclair, web application, web of trust, x509 certificate, zero day, Zimmermann PGP

After installation, the infected bot machine contacts the bot server to download additional components or obtain the latest commands, such as denial-of-service attacks or spam to send out. With this dynamic control and command infrastructure, the botnet owner can mobilize a massive amount of computing resources from one corner of the Internet to another within a matter of minutes. It should be noted that the control server itself might not be static. Botnets have evolved from a static control infrastructure to a peer-to-peer structure for the purposes of fault tolerance and evading detection. When one server is detected and blocked, other servers can step in and take over. It is also common for the control server to run on a compromised machine or by proxy, so that the botnet’s owner is unlikely to be identified. Botnets commonly communicate through the same method as their creators’ public IRC servers. Recently, however, we have seen botnets branch out to P2P, HTTPS, SMTP, and other protocols.

pages: 648 words: 108,814

Solr 1.4 Enterprise Search Server by David Smiley, Eric Pugh


Amazon Web Services, bioinformatics, cloud computing, continuous integration, database schema, domain-specific language, en.wikipedia.org, fault tolerance, Firefox, information retrieval, Internet Archive, Ruby on Rails, web application, Y Combinator

Obviously, this is a fairly complex setup and requires a fairly sophisticated load balancer to front this whole collection, but it does allow Solr to handle extremely large data sets. Where next for Solr scaling? There has been a fair amount of discussion on the Solr mailing lists about setting up distributed Solr on a robust foundation that adapts to a changing environment. There has been some investigation into using Apache Hadoop, a platform for building reliable distributed computing, as a foundation for Solr that would provide a robust, fault-tolerant filesystem. Another interesting subproject of Hadoop is ZooKeeper, which aims to be a service for centralizing the management required by distributed applications. There has been some development work on integrating ZooKeeper as the management interface for Solr. Keep an eye on the Hadoop homepage at http://hadoop.apache.org/ and the ZooKeeper homepage at http://hadoop.apache.org/zookeeper/ for more information about these efforts.

pages: 559 words: 130,949

Learn You a Haskell for Great Good!: A Beginner's Guide by Miran Lipovaca


digital map, fault tolerance, loose coupling, type inference

With the sum operator, we return a stack that has only one element, which is the sum of the stack so far.

ghci> solveRPN "2.7 ln"
0.9932517730102834
ghci> solveRPN "10 10 10 10 sum 4 /"
10.0
ghci> solveRPN "10 10 10 10 10 sum 4 /"
12.5
ghci> solveRPN "10 2 ^"
100.0

I think that making a function that can calculate arbitrary floating-point RPN expressions, and that can easily be extended, in 10 lines is pretty awesome. Note: this RPN calculation solution is not really fault tolerant. When given input that doesn’t make sense, it might result in a runtime error. But don’t worry, you’ll learn how to make this function more robust in Chapter 14. Heathrow to London: Suppose that we’re on a business trip. Our plane has just landed in England, and we rent a car. We have a meeting really soon, and we need to get from Heathrow Airport to London as fast as we can (but safely!).
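The same stack-folding idea behind the book's Haskell solveRPN can be sketched in Python; solve_rpn and solve_rpn_safe are hypothetical names here, and the try/except wrapper only gestures at the more robust variant the book develops in Chapter 14.

```python
import math

def solve_rpn(expr):
    # Fold over the tokens with a stack, as the Haskell version does.
    stack = []
    for token in expr.split():
        if token in ("+", "-", "*", "/", "^"):
            b, a = stack.pop(), stack.pop()
            ops = {"+": a + b, "-": a - b, "*": a * b,
                   "/": a / b, "^": a ** b}
            stack.append(ops[token])
        elif token == "ln":
            stack.append(math.log(stack.pop()))
        elif token == "sum":
            stack = [sum(stack)]  # collapse the stack to its sum
        else:
            stack.append(float(token))
    [result] = stack  # fails if the expression was malformed
    return result

def solve_rpn_safe(expr):
    # Fault-tolerant wrapper: bad input yields None instead of an exception.
    try:
        return solve_rpn(expr)
    except (ValueError, IndexError, ZeroDivisionError):
        return None

print(solve_rpn("10 10 10 10 sum 4 /"))  # 10.0
print(solve_rpn_safe("1 2 bogus"))       # None
```

Note that the dict-based version eagerly evaluates all five operations per operator token, so `/` and `^` will still raise on a zero divisor; the safe wrapper catches that too.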

pages: 470 words: 144,455

Secrets and Lies: Digital Security in a Networked World by Bruce Schneier


Ayatollah Khomeini, barriers to entry, business process, butterfly effect, cashless society, Columbine, defense in depth, double entry bookkeeping, fault tolerance, game design, IFF: identification friend or foe, John von Neumann, knapsack problem, moral panic, mutually assured destruction, pez dispenser, pirate software, profit motive, Richard Feynman, Richard Feynman, risk tolerance, Silicon Valley, Simon Singh, slashdot, statistical model, Steve Ballmer, Steven Levy, the payments system, Y2K, Yogi Berra

Availability has been defined by various security standards as “the property that a product’s services are accessible when needed and without undue delay,” or “the property of being accessible and usable upon demand by an authorized entity.” These definitions have always struck me as being somewhat circular. We know intuitively what we mean by availability with respect to computers: We want the computer to work when we expect it to as we expect it to. Lots of software doesn’t work when and as we expect it to, and there are entire areas of computer science research in reliability and fault-tolerant computing and software quality ... none of which has anything to do with security. In the context of security, availability is about ensuring that an attacker can’t prevent legitimate users from having reasonable access to their systems. For example, availability is about ensuring that denial-of-service attacks are not possible. Access Control: Confidentiality, availability, and integrity all boil down to access control.

pages: 448 words: 71,301

Programming Scala by Unknown


domain-specific language, en.wikipedia.org, fault tolerance, general-purpose programming language, loose coupling, type inference, web application

Miscellaneous Scala libraries:
- Kestrel — a tiny, very fast queue system (http://github.com/robey/kestrel/tree/master)
- ScalaModules — a Scala DSL to ease OSGi development (http://code.google.com/p/scalamodules/)
- Configgy — configuration-file management and logging for “daemons” written in Scala (http://www.lag.net/configgy/)
- scouchdb — a Scala interface to CouchDB (http://code.google.com/p/scouchdb/)
- Akka — a project to implement a platform for building fault-tolerant, distributed applications based on REST, Actors, etc. (http://akkasource.org/)
- scala-query — a type-safe database query API for Scala (http://github.com/szeiger/scala-query/tree/master)

We’ll discuss using Scala with several well-known Java libraries after we discuss Java interoperability, next. Java Interoperability: Of all the alternative JVM languages, Scala’s interoperability with Java source code is among the most seamless.

pages: 598 words: 134,339

Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World by Bruce Schneier


23andMe, Airbnb, airport security, AltaVista, Anne Wojcicki, augmented reality, Benjamin Mako Hill, Black Swan, Brewster Kahle, Brian Krebs, call centre, Cass Sunstein, Chelsea Manning, citizen journalism, cloud computing, congestion charging, disintermediation, drone strike, Edward Snowden, experimental subject, failed state, fault tolerance, Ferguson, Missouri, Filter Bubble, Firefox, friendly fire, Google Chrome, Google Glasses, hindsight bias, informal economy, Internet Archive, Internet of things, Jacob Appelbaum, Jaron Lanier, John Markoff, Julian Assange, Kevin Kelly, license plate recognition, lifelogging, linked data, Lyft, Mark Zuckerberg, moral panic, Nash equilibrium, Nate Silver, national security letter, Network effects, Occupy movement, payday loans, pre–internet, price discrimination, profit motive, race to the bottom, RAND corporation, recommendation engine, RFID, self-driving car, Shoshana Zuboff, Silicon Valley, Skype, smart cities, smart grid, Snapchat, social graph, software as a service, South China Sea, stealth mode startup, Steven Levy, Stuxnet, TaskRabbit, telemarketer, Tim Cook: Apple, transaction costs, Uber and Lyft, urban planning, WikiLeaks, zero day

Advancing technology adds new perturbations into existing systems, creating instabilities. If systemic imperfections are inevitable, we have to accept them—in laws, in government institutions, in corporations, in individuals, in society. We have to design systems that expect them and can work despite them. If something is going to fail or break, we need it to fail in a predictable way. That’s resilience. In systems design, resilience comes from a combination of elements: fault-tolerance, mitigation, redundancy, adaptability, recoverability, and survivability. It’s what we need in the complex and ever-changing threat landscape I’ve described in this book. I am advocating for several flavors of resilience for both our systems of surveillance and our systems that control surveillance: resilience to hardware and software failure, resilience to technological innovation, resilience to political change, and resilience to coercion.

pages: 1,025 words: 150,187

ZeroMQ by Pieter Hintjens


anti-pattern, carbon footprint, cloud computing, Debian, distributed revision control, domain-specific language, factory automation, fault tolerance, fear of failure, finite state, Internet of things, iterative process, premature optimization, profit motive, pull request, revision control, RFC: Request For Comment, Richard Stallman, Skype, smart transportation, software patent, Steve Jobs, Valgrind, WebSocket

Postface: Tales from Out There. I asked some of the contributors to this book to tell us what they were doing with ØMQ. Here are their stories. Rob Gagnon’s Story: “We use ØMQ to assist in aggregating thousands of events occurring every minute across our global network of telecommunications servers so that we can accurately report and monitor for situations that require our attention. ØMQ made the system not only easier but also faster to develop, and more robust and fault-tolerant than we had planned in our original design. “We’re able to easily add and remove clients from the network without the loss of any message. If we need to enhance the server portion of our system, we can stop and restart it as well, without having to worry about stopping all of the clients first. The built-in buffering of ØMQ makes this all possible.” Tom van Leeuwen’s Story: “I was looking at creating some kind of service bus connecting all kinds of services together.

pages: 458 words: 137,960

Ready Player One by Ernest Cline


Albert Einstein, call centre, dematerialisation, fault tolerance, financial independence, game design, late fees, pre–internet, Rubik’s Cube, side project, telemarketer, walking around money

It managed to overcome limitations that had plagued previous simulated realities. In addition to restricting the overall size of their virtual environments, earlier MMOs had been forced to limit their virtual populations, usually to a few thousand users per server. If too many people were logged in at the same time, the simulation would slow to a crawl and avatars would freeze in midstride as the system struggled to keep up. But the OASIS utilized a new kind of fault-tolerant server array that could draw additional processing power from every computer connected to it. At the time of its initial launch, the OASIS could handle up to five million simultaneous users, with no discernible latency and no chance of a system crash. A massive marketing campaign promoted the launch of the OASIS. The pervasive television, billboard, and Internet ads featured a lush green oasis, complete with palm trees and a pool of crystal blue water, surrounded on all sides by a vast barren desert.

pages: 496 words: 154,363

I'm Feeling Lucky: The Confessions of Google Employee Number 59 by Douglas Edwards


Albert Einstein, AltaVista, Any sufficiently advanced technology is indistinguishable from magic, barriers to entry, book scanning, Build a better mousetrap, Burning Man, business intelligence, call centre, commoditize, crowdsourcing, don't be evil, Elon Musk, fault tolerance, Googley, gravity well, invisible hand, Jeff Bezos, job-hopping, John Markoff, Marc Andreessen, Menlo Park, microcredit, music of the spheres, Network effects, P = NP, PageRank, performance metric, pets.com, Ralph Nader, risk tolerance, second-price auction, side project, Silicon Valley, Silicon Valley startup, slashdot, stem cell, Superbowl ad, Y2K

A global shortage of RAM (memory) made it worse, and Google's system, which had never been all that robust, started wheezing asthmatically. Part of the problem was that Google had built its system to fail. "Build machines so cheap that we don't care if they fail. And if they fail, just ignore them until we get around to fixing them." That was Google's strategy, according to hardware designer Will Whitted, who joined the company in 2001. "That concept of using commodity parts and of being extremely fault tolerant, of writing the software in a way that the hardware didn't have to be very good, was just brilliant." But only if you could get the parts to fix the broken computers and keep adding new machines. Or if you could improve the machines' efficiency so you didn't need so many of them. The first batch of Google servers had been so hastily assembled that the solder points on the motherboards touched the metal of the trays beneath them, so the engineers added corkboard liners as insulation.

pages: 587 words: 117,894

Cybersecurity: What Everyone Needs to Know by P. W. Singer, Allan Friedman


4chan, A Declaration of the Independence of Cyberspace, Apple's 1984 Super Bowl advert, barriers to entry, Berlin Wall, bitcoin, blood diamonds, borderless world, Brian Krebs, business continuity plan, Chelsea Manning, cloud computing, crowdsourcing, cuban missile crisis, data acquisition, drone strike, Edward Snowden, energy security, failed state, Fall of the Berlin Wall, fault tolerance, global supply chain, Google Earth, Internet of things, invention of the telegraph, John Markoff, Julian Assange, Khan Academy, M-Pesa, mutually assured destruction, Network effects, packet switching, Peace of Westphalia, pre–internet, profit motive, RAND corporation, ransomware, RFC: Request For Comment, risk tolerance, rolodex, Silicon Valley, Skype, smart grid, Steve Jobs, Stuxnet, uranium enrichment, We are Anonymous. We are Legion, web application, WikiLeaks, zero day, zero-sum game

There are three elements behind the concept. One is the importance of building in “the intentional capacity to work under degraded conditions.” Beyond that, resilient systems must also recover quickly, and, finally, learn lessons to deal better with future threats. For decades, most major corporations have had business continuity plans for fires or natural disasters, while the electronics industry has measured what it thinks of as fault tolerance, and the communications industry has talked about reliability and redundancy in its operations. All of these fit into the idea of resilience, but most assume some natural disaster, accident, failure, or crisis rather than deliberate attack. This is where cybersecurity must go in a very different direction: if you are only thinking in terms of reliability, a network can be made resilient merely by creating redundancies.

pages: 528 words: 146,459

Computer: A History of the Information Machine by Martin Campbell-Kelly, William Aspray, Nathan L. Ensmenger, Jeffrey R. Yost


Ada Lovelace, air freight, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Apple's 1984 Super Bowl advert, barriers to entry, Bill Gates: Altair 8800, borderless world, Buckminster Fuller, Build a better mousetrap, Byte Shop, card file, cashless society, cloud computing, combinatorial explosion, computer age, deskilling, don't be evil, Donald Davies, Douglas Engelbart, Douglas Engelbart, Dynabook, fault tolerance, Fellow of the Royal Society, financial independence, Frederick Winslow Taylor, game design, garden city movement, Grace Hopper, informal economy, interchangeable parts, invention of the wheel, Jacquard loom, Jacquard loom, Jeff Bezos, jimmy wales, John Markoff, John von Neumann, light touch regulation, linked data, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Menlo Park, natural language processing, Network effects, New Journalism, Norbert Wiener, Occupy movement, optical character recognition, packet switching, PageRank, pattern recognition, Pierre-Simon Laplace, pirate software, popular electronics, prediction markets, pre–internet, QWERTY keyboard, RAND corporation, Robert X Cringely, Silicon Valley, Silicon Valley startup, Steve Jobs, Steven Levy, Stewart Brand, Ted Nelson, the market place, Turing machine, Vannevar Bush, Von Neumann architecture, Whole Earth Catalog, William Shockley: the traitorous eight, women in the workforce, young professional

Although computer technology is at the heart of the Internet, its importance is economic and social: the Internet gives computer users the ability to communicate, to gain access to information sources, and to conduct business. I. From the World Brain to the World Wide Web The Internet sprang from a confluence of three desires, two that emerged in the 1960s and one that originated much further back in time. First, there was the rather utilitarian desire for an efficient, fault-tolerant networking technology, suitable for military communications, that would never break down. Second, there was a wish to unite the world’s computer networks into a single system. Just as the telephone would never have become the dominant person-to-person communications medium if users had been restricted to the network of their particular provider, so the world’s isolated computer networks would be far more useful if they were joined together.

pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths


4chan, Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, algorithmic trading, anthropic principle, asset allocation, autonomous vehicles, Bayesian statistics, Berlin Wall, Bill Duvall, bitcoin, Community Supported Agriculture, complexity theory, constrained optimization, cosmological principle, cryptocurrency, Danny Hillis, David Heinemeier Hansson, delayed gratification, dematerialisation, diversification, Donald Knuth, double helix, Elon Musk, fault tolerance, Fellow of the Royal Society, Firefox, first-price auction, Flash crash, Frederick Winslow Taylor, George Akerlof, global supply chain, Google Chrome, Henri Poincaré, information retrieval, Internet Archive, Jeff Bezos, John Nash: game theory, John von Neumann, knapsack problem, Lao Tzu, Leonard Kleinrock, linear programming, martingale, Nash equilibrium, natural language processing, NP-complete, P = NP, packet switching, Pierre-Simon Laplace, prediction markets, race to the bottom, RAND corporation, RFC: Request For Comment, Robert X Cringely, sealed-bid auction, second-price auction, self-driving car, Silicon Valley, Skype, sorting algorithm, spectrum auction, Steve Jobs, stochastic process, Thomas Bayes, Thomas Malthus, traveling salesman, Turing machine, urban planning, Vickrey auction, Vilfredo Pareto, Walter Mischel, Y Combinator, zero-sum game

The winner of that particular honor is an algorithm called Comparison Counting Sort. In this algorithm, each item is compared to all the others, generating a tally of how many items it is bigger than. This number can then be used directly as the item’s rank. Since it compares all pairs, Comparison Counting Sort is a quadratic-time algorithm, like Bubble Sort. Thus it’s not a popular choice in traditional computer science applications, but it’s exceptionally fault-tolerant. This algorithm’s workings should sound familiar. Comparison Counting Sort operates exactly like a Round-Robin tournament. In other words, it strongly resembles a sports team’s regular season—playing every other team in the division and building up a win-loss record by which they are ranked. That Comparison Counting Sort is the single most robust sorting algorithm known, quadratic or better, should offer something very specific to sports fans: if your team doesn’t make the playoffs, don’t whine.
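The tally-of-wins idea described here can be sketched in a few lines of Python (comparison_counting_sort is a hypothetical name; the sketch assumes distinct items, since ties would need a tiebreak, just as they do in a season's standings):

```python
def comparison_counting_sort(items):
    # Each item's "win count" is the number of items it beats, as in a
    # round-robin tournament; that count is directly its final rank.
    n = len(items)
    wins = [sum(1 for other in items if other < item) for item in items]
    ranked = [None] * n
    for item, rank in zip(items, wins):
        ranked[rank] = item  # with distinct items, ranks are 0..n-1
    return ranked

print(comparison_counting_sort([3, 1, 4, 5, 2]))  # [1, 2, 3, 4, 5]
```

Every item is compared against every other, which is where the quadratic running time, and the robustness to individual mistaken comparisons, comes from.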

pages: 461 words: 125,845

This Machine Kills Secrets: Julian Assange, the Cypherpunks, and Their Fight to Empower Whistleblowers by Andy Greenberg


Apple II, Ayatollah Khomeini, Berlin Wall, Bill Gates: Altair 8800, Burning Man, Chelsea Manning, computerized markets, crowdsourcing, cryptocurrency, domain-specific language, drone strike, en.wikipedia.org, fault tolerance, hive mind, Jacob Appelbaum, Julian Assange, Mahatma Gandhi, Mohammed Bouazizi, nuclear winter, offshore financial centre, pattern recognition, profit motive, Ralph Nader, Richard Stallman, Robert Hanssen: Double agent, Silicon Valley, Silicon Valley ideology, Skype, social graph, statistical model, stem cell, Steve Jobs, Steve Wozniak, Steven Levy, Vernor Vinge, We are Anonymous. We are Legion, We are the 99%, WikiLeaks, X Prize, Zimmermann PGP

And Nick Mathewson, Tor’s grinning, round-faced, ponytailed chief architect and codirector, had kicked off the day by dropping the room into the deep end of the cryptographic swimming pool. The geekery had gotten so thick that even some of Tor’s modern-day cypherpunks and volunteer coders, loath as they might have been to admit it, might just have gotten lost. Within minutes, Mathewson, wearing a sport jacket over a Tor T-shirt over a dwarfish potbelly, was delving into security issues like “epistemic attacks” and “Byzantine fault tolerances.” By the time he sat down, still grinning, a growing fraction of the room seemed baffled or possibly bored. Appelbaum’s presence, on the other hand, is as much guerrilla as geek. He’s Tor’s field researcher, unofficial revolutionary, and man on the ground in countries from Qatar to Brazil. And he knows the appeal of a sexy piece of hardware. After instantly acquiring the room’s attention, Appelbaum explains that the device his small audience is ogling is a satellite modem, one that he’s just rented with the aim of figuring out how to make Tor accessible to those in the Middle East who need to use satellite connections to access the Internet.

pages: 666 words: 181,495

In the Plex: How Google Thinks, Works, and Shapes Our Lives by Steven Levy


23andMe, AltaVista, Anne Wojcicki, Apple's 1984 Super Bowl advert, autonomous vehicles, book scanning, Brewster Kahle, Burning Man, business process, clean water, cloud computing, crowdsourcing, Dean Kamen, discounted cash flows, don't be evil, Donald Knuth, Douglas Engelbart, Douglas Engelbart, El Camino Real, fault tolerance, Firefox, Gerard Salton, Gerard Salton, Google bus, Google Chrome, Google Earth, Googley, HyperCard, hypertext link, IBM and the Holocaust, informal economy, information retrieval, Internet Archive, Jeff Bezos, John Markoff, Kevin Kelly, Mark Zuckerberg, Menlo Park, one-China policy, optical character recognition, PageRank, Paul Buchheit, Potemkin village, prediction markets, recommendation engine, risk tolerance, Rubik’s Cube, Sand Hill Road, Saturday Night Live, search inside the book, second-price auction, selection bias, Silicon Valley, skunkworks, Skype, slashdot, social graph, social software, social web, spectrum auction, speech recognition, statistical model, Steve Ballmer, Steve Jobs, Steven Levy, Ted Nelson, telemarketer, trade route, traveling salesman, turn-by-turn navigation, Vannevar Bush, web application, WikiLeaks, Y Combinator

“We’re going to build hundreds and thousands of cheap servers knowing from the get-go that a certain percentage, maybe 10 percent, are going to fail,” says Reese. Google’s first CIO, Douglas Merrill, once noted that the disk drives Google purchased were “poorer quality than you would put into your kid’s computer at home.” But Google designed around the flaws. “We built capabilities into the software, the hardware, and the network—the way we hook them up, the load balancing, and so on—to build in redundancy, to make the system fault-tolerant,” says Reese. The Google File System, written by Jeff Dean and Sanjay Ghemawat, was invaluable in this process: it was designed to manage failure by “sharding” data, distributing it to multiple servers. If Google search called for certain information at one server and didn’t get a reply after a couple of milliseconds, there were two other Google servers that could fulfill the request. “The Google business model was constrained by cost, especially at the very beginning,” says Erik Teetzel, who worked with Google’s data centers.
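The failover pattern described here, ask one replica and fall back to another copy of the data if it doesn't answer quickly, can be sketched generically in Python. The function names and the millisecond deadline are illustrative assumptions, not Google's actual design.

```python
def fetch_with_failover(replicas, key, timeout_ms=2.0):
    # Try each replica in order; a slow or dead replica is skipped
    # rather than stalling the whole request.
    for replica in replicas:
        try:
            return replica(key, timeout_ms)
        except TimeoutError:
            continue  # move on to the next copy of the shard
    raise TimeoutError("all replicas timed out for key %r" % key)

# Hypothetical replicas: one dead, one healthy.
def dead(key, timeout_ms):
    raise TimeoutError

def healthy(key, timeout_ms):
    return "doc-for-%s" % key

print(fetch_with_failover([dead, healthy], "query"))  # doc-for-query
```

In a real system the replicas would be network calls with an enforced deadline; the control flow, however, is just this: redundancy turns a slow server into a retry instead of a failure.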

pages: 552 words: 168,518

MacroWikinomics: Rebooting Business and the World by Don Tapscott, Anthony D. Williams


accounting loophole / creative accounting, airport security, Andrew Keen, augmented reality, Ayatollah Khomeini, barriers to entry, bioinformatics, Bretton Woods, business climate, business process, car-free, carbon footprint, citizen journalism, Clayton Christensen, clean water, Climategate, Climatic Research Unit, cloud computing, collaborative editing, collapse of Lehman Brothers, collateralized debt obligation, colonial rule, commoditize, corporate governance, corporate social responsibility, creative destruction, crowdsourcing, death of newspapers, demographic transition, distributed generation, don't be evil, en.wikipedia.org, energy security, energy transition, Exxon Valdez, failed state, fault tolerance, financial innovation, Galaxy Zoo, game design, global village, Google Earth, Hans Rosling, hive mind, Home mortgage interest deduction, interchangeable parts, Internet of things, invention of movable type, Isaac Newton, James Watt: steam engine, Jaron Lanier, jimmy wales, Joseph Schumpeter, Julian Assange, Kevin Kelly, knowledge economy, knowledge worker, Marc Andreessen, Marshall McLuhan, mass immigration, medical bankruptcy, megacity, mortgage tax deduction, Netflix Prize, new economy, Nicholas Carr, oil shock, old-boy network, online collectivism, open borders, open economy, pattern recognition, peer-to-peer lending, personalized medicine, Ray Kurzweil, RFID, ride hailing / ride sharing, Ronald Reagan, Rubik’s Cube, scientific mainstream, shareholder value, Silicon Valley, Skype, smart grid, smart meter, social graph, social web, software patent, Steve Jobs, text mining, the scientific method, The Wisdom of Crowds, transaction costs, transfer pricing, University of East Anglia, urban sprawl, value at risk, WikiLeaks, X Prize, young professional, Zipcar

To make it work, you’ll need to reveal your IP in an appropriate network, socializing it with participants and letting it spawn new knowledge and invention. You’ll need to stay plugged into the community so that you can leverage new contributions as they come in. You’ll also need to dedicate some resources to filtering and aggregating contributions. It can be a lot of work, but these types of collaborations can produce more robust, user-defined, fault-tolerant products in less time and for less expense than the conventional closed approach. 3. LET GO Leaders in business and society who are attempting to transform their organizations have many understandable concerns about moving forward. One of the biggest is a fear of losing control. I can’t open up, it’s too risky. Our lawyers would go berserk. There are too many obstacles. I can’t empower others to make decisions because I’ll get all the blame if they get it wrong.

pages: 798 words: 240,182

The Transhumanist Reader by Max More, Natasha Vita-More


23andMe, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, augmented reality, Bill Joy: nanobots, bioinformatics, brain emulation, Buckminster Fuller, cellular automata, clean water, cloud computing, cognitive bias, cognitive dissonance, combinatorial explosion, conceptual framework, Conway's Game of Life, cosmological principle, data acquisition, discovery of DNA, Douglas Engelbart, Drosophila, en.wikipedia.org, endogenous growth, experimental subject, Extropian, fault tolerance, Flynn Effect, Francis Fukuyama: the end of history, Frank Gehry, friendly AI, game design, germ theory of disease, hypertext link, impulse control, index fund, John von Neumann, joint-stock company, Kevin Kelly, Law of Accelerating Returns, life extension, lifelogging, Louis Pasteur, Menlo Park, meta analysis, meta-analysis, moral hazard, Network effects, Norbert Wiener, P = NP, pattern recognition, phenotype, positional goods, prediction markets, presumed consent, Ray Kurzweil, reversible computing, RFID, Richard Feynman, Ronald Reagan, silicon-based life, Singularitarianism, stem cell, stochastic process, superintelligent machines, supply-chain management, supply-chain management software, technological singularity, Ted Nelson, telepresence, telepresence robot, telerobotics, the built environment, The Coming Technological Singularity, the scientific method, The Wisdom of Crowds, transaction costs, Turing machine, Turing test, Upton Sinclair, Vernor Vinge, Von Neumann architecture, Whole Earth Review, women in the workforce, zero-sum game

The mind continues to depend on a substrate to exist and to operate, of course, but there are substrate choices. The goal of substrate-independence is to continue personality, individual characteristics, a manner of experiencing, and a personal way of processing those experiences (Koene 2011a, 2011b). Your identity, your memories can then be embodied physically in many ways. They can also be backed up and operate robustly on fault-tolerant hardware with redundancy schemes. Achieving substrate-independence will allow us to optimize the operational framework, the hardware, to challenges posed by novel circumstances and different environments. Think, instead of sending extremophile bacteria to slowly terraform another world into a habitat, we ourselves can be extremophiles. Substrate-independent minds is a well-described objective.

pages: 945 words: 292,893

Seveneves by Neal Stephenson


clean water, Colonization of Mars, Danny Hillis, digital map, double helix, epigenetics, fault tolerance, Fellow of the Royal Society, Filipino sailors, gravity well, Isaac Newton, Jeff Bezos, kremlinology, Kuiper Belt, microbiome, phenotype, Potemkin village, pre–internet, random walk, remote working, selection bias, side project, Silicon Valley, Skype, statistical model, Stewart Brand, supervolcano, the scientific method, Tunguska event, zero day, éminence grise

Ammonia worked better, but it was dangerous, and you couldn’t easily get more of it in space. If the Cloud Ark survived, it would survive on a water-based economy. A hundred years from now everything in space would be cooled by circulating water systems. But for now they had to keep the ammonia-based equipment running as well. Further complications, as if any were wanted, came from the fact that the systems had to be fault tolerant. If one of them got bashed by a hurtling piece of moon shrapnel and began to leak, it needed to be isolated from the rest of the system before too much of the precious water, or ammonia, leaked into space. So, the system as a whole possessed vast hierarchies of check valves, crossover switches, and redundancies that had saturated even Ivy’s brain, normally an infinite sink for detail. She’d had to delegate all cooling-related matters to a working group that was about three-quarters Russian and one-quarter American.