WebSocket

34 results


Realtime Web Apps: HTML5 WebSocket, Pusher, and the Web’s Next Big Thing by Jason Lengstorf, Phil Leggetter

Amazon Web Services, barriers to entry, don't repeat yourself, en.wikipedia.org, Firefox, Google Chrome, MVC pattern, Ruby on Rails, Skype, software as a service, web application, WebSocket

This is accomplished by opening an HTTP request and then asking the server to “upgrade” the connection to the WebSocket protocol by sending the following headers:[17]
    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13
If the request is successful, the server will return headers that look like these:
    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    Sec-WebSocket-Protocol: chat
This exchange is called a handshake, and it’s required to establish a WebSocket connection. Once a successful handshake occurs between the server and the client, a two-way communication channel is established, and both the client and server can send data to each other independently.
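The Sec-WebSocket-Accept value in the response headers above is derived from the client's Sec-WebSocket-Key. As a minimal sketch, assuming Node.js and the derivation defined in RFC 6455 (append a fixed GUID to the key, hash with SHA-1, Base64-encode), the server side of that step could look like this. The function name acceptKey is illustrative and does not come from the book.

```javascript
const nodeCrypto = require("node:crypto");

// Fixed GUID that RFC 6455 defines for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Illustrative helper (not from the book): derive the Sec-WebSocket-Accept
// header value from the client's Sec-WebSocket-Key header value.
function acceptKey(secWebSocketKey) {
  return nodeCrypto
    .createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

console.log(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The printed value matches the Sec-WebSocket-Accept header in the sample response, which is how a client can confirm that the server actually understood the WebSocket request.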

The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with servers that does not rely on opening multiple HTTP connections (e.g. using XMLHttpRequest or <iframe>s and long polling).[16] One of the most beneficial implications of widespread WebSocket support is in scalability: because WebSockets use a single TCP connection for communication between the server and client instead of multiple, separate HTTP requests, the overhead is dramatically reduced.
The WebSocket Protocol
Because full-duplex communication cannot be achieved using HTTP, WebSocket actually defines a whole new protocol, or method of connecting to a server from a client. This is accomplished by opening an HTTP request and then asking the server to “upgrade” the connection to the WebSocket protocol by sending the following headers:[17]
    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13
If the request is successful, the server will return headers that look like these:
    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    Sec-WebSocket-Protocol: chat
This exchange is called a handshake, and it’s required to establish a WebSocket connection.
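To put a rough shape on the overhead claim in this excerpt: after the handshake, each WebSocket message carries only a small binary frame header instead of a full set of HTTP headers. The sketch below counts that header according to the RFC 6455 frame layout. The name frameOverhead is an assumption made for illustration, not something from the book.

```javascript
// Illustrative helper: number of framing bytes RFC 6455 adds to one message
// sent as a single frame.
function frameOverhead(payloadLength, masked) {
  let bytes = 2; // first byte: FIN + opcode; second byte: mask bit + 7-bit length
  if (payloadLength > 65535) {
    bytes += 8; // 64-bit extended payload length
  } else if (payloadLength > 125) {
    bytes += 2; // 16-bit extended payload length
  }
  if (masked) {
    bytes += 4; // masking key, required on client-to-server frames
  }
  return bytes;
}

console.log(frameOverhead(5, false)); // prints 2 (short server-to-client message)
console.log(frameOverhead(5, true)); // prints 6 (same message sent by a browser)
console.log(frameOverhead(1000, true)); // prints 8
```

By comparison, the sample handshake request in this excerpt already runs to a couple of hundred bytes of headers, and with plain HTTP polling a comparable set of headers is sent again on every request.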

2 rows in set (0.00 sec) Now that you know what’s awesome, you may want to destroy the evidence by dropping the database altogether: DROP DATABASE awesome_test_db; This removes the database altogether so that your MySQL server isn’t cluttered with test data.
HTML5 WebSocket Technology and Pusher
We already talked a bit about WebSocket and realtime, but let’s recap: HTML5 WebSocket allows applications to push data to the client rather than requiring the client to constantly ask for new data. Let’s have a look at the native WebSocket API to get an idea of how it can be used. Create an HTML file with the following content. This file contains JavaScript that connects to a WebSocket echo test service. This means that you can test connecting, sending, and receiving messages. <!


pages: 136 words: 20,501

Introduction to Tornado by Michael Dory, Adam Parrish, Brendan Berg

don't repeat yourself, Firefox, social web, web application, WebSocket

The HTML 5 spec not only describes the communication protocol itself, but also the browser APIs that are required to write client-side code that uses WebSockets. Since WebSockets are already supported in some of the latest browsers and since Tornado helpfully provides a module for them, it’s worth seeing how to implement applications that use WebSockets.
Tornado’s WebSocket Module
Tornado provides a WebSocketHandler class as part of the websocket module. The class provides hooks for WebSocket events and methods to communicate with the connected client. The open method is called when a new WebSocket connection is opened, and the on_message and on_close methods are called when the connection receives a new message or is closed by the client.

Example 5-8. Web Sockets: The new requestInventory function from inventory.js
    function requestInventory() {
        var host = 'ws://localhost:8000/cart/status';
        var websocket = new WebSocket(host);
        websocket.onopen = function (evt) { };
        websocket.onmessage = function(evt) {
            $('#count').html($.parseJSON(evt.data)['inventoryCount']);
        };
        websocket.onerror = function (evt) { };
    }
After creating a new WebSocket connection to the URL ws://localhost:8000/cart/status, we add handler functions for each of the events we want to respond to. The only event we care about in this example is onmessage, which updates the contents of the same count span that the previous requestInventory function modified.

The difference here is that one persistent WebSocket connection is used instead of re-opening HTTP requests with each long polling update. The Future of WebSockets The WebSocket protocol is still in draft form, and may change as it is finalized. However, since the specification has just been submitted to the IETF for final review, it is relatively unlikely to face significant changes. As mentioned in the beginning of this section, the major downside to using the WebSocket protocol right now is that only the very latest browsers support it. Despite those caveats, WebSockets are a promising new way to implement bidirectional communication between a browser and server.


pages: 210 words: 42,271

Programming HTML5 Applications by Zachary Kessin

barriers to entry, continuous integration, fault tolerance, Firefox, functional programming, Google Chrome, mandelbrot fractal, QWERTY keyboard, web application, WebSocket

Go away!"}; "WebSocket" -> WebSocketOwner = spawn(fun() -> websocket_owner() end), {websocket, WebSocketOwner, passive} end. websocket_owner() -> receive {ok, WebSocket} -> %% This is how we read messages (plural!!) from websockets on passive mode case yaws_api:websocket_receive(WebSocket) of {error,closed} -> io:format("The websocket got disconnected right from the start. " "This wasn't supposed to happen!!~n"); {ok, Messages} -> case Messages of [<<"client-connected">>] -> yaws_api:websocket_setopts(WebSocket, [{active, true}]), echo_server(WebSocket); Other -> io:format("websocket_owner got: ~p.

~n");
            {ok, Messages} ->
                case Messages of
                    [<<"client-connected">>] ->
                        yaws_api:websocket_setopts(WebSocket, [{active, true}]),
                        echo_server(WebSocket);
                    Other ->
                        io:format("websocket_owner got: ~p. Terminating~n", [Other])
                end
        end;
        _ -> ok
    end.
    echo_server(WebSocket) ->
        receive
            {tcp, WebSocket, DataFrame} ->
                Data = yaws_api:websocket_unframe_data(DataFrame),
                io:format("Got data from Websocket: ~p~n", [Data]),
                yaws_api:websocket_send(WebSocket, Data),
                echo_server(WebSocket);
            {tcp_closed, WebSocket} ->
                io:format("Websocket closed. Terminating echo_server...~n");
            Any ->
                io:format("echo_server received msg:~p~n", [Any]),
                echo_server(WebSocket)
        end.
    get_upgrade_header(#headers{other=L}) ->
        lists:foldl(fun({http_header,_,K0,_,V}, undefined) ->
                            K = case is_atom(K0) of
                                    true -> atom_to_list(K0);
                                    false -> K0
                                end,
                            case string:to_lower(K) of
                                "upgrade" -> V;
                                _ -> undefined
                            end;
                       (_, Acc) -> Acc
                    end, undefined, L).

The EventMachine::WebSocket interface closely matches the interface in JavaScript. As in the client, the EventMachine interface has standard event handlers for onopen, onmessage, and onclose, as well as a ws.send method to send data back to the client. Example 9-4 shows a very trivial “hello world” type of web socket interface in Ruby.
Example 9-4. Ruby Event Machine web socket handler
    require 'em-websocket'
    EventMachine::WebSocket.start(:host => "0.0.0.0", :port => 8080) do |ws|
      ws.onopen { ws.send "Hello Client!" }
      ws.onmessage { |msg| ws.send "Pong: #{msg}" }
      ws.onclose { puts "WebSocket closed" }
    end
Erlang Yaws
Erlang is a pretty rigorously functional language that was developed several decades ago for telephone switches and has found acceptance in many other areas where massive parallelism and strong robustness are desired.


Designing Web APIs: Building APIs That Developers Love by Brenda Jin, Saurabh Sahni, Amir Shevat

active measures, Amazon Web Services, augmented reality, blockchain, business process, continuous integration, create, read, update, delete, Google Hangouts, if you build it, they will come, Lyft, MITM: man-in-the-middle, premature optimization, pull request, Silicon Valley, Snapchat, software as a service, the market place, uber lyft, web application, WebSocket

When there are thousands of events happening in a short time that need to be sent via a single WebHook, it can be noisy.
Figure 2-4. Configuring a GitHub WebHook
WebSockets
WebSocket is a protocol used to establish a two-way streaming communication channel over a single Transmission Control Protocol (TCP) connection. Although the protocol is generally used between a web client (e.g., a browser) and a server, it’s sometimes used for server-to-server communication as well. The WebSocket protocol is supported by major browsers and is often used by real-time applications. Slack uses WebSockets to send all kinds of events happening in a workspace to Slack’s clients, including new messages, emoji reactions added to items, and channel creations.

Slack uses WebSockets to send all kinds of events happening in a workspace to Slack’s clients, including new messages, emoji reactions added to items, and channel creations. Slack also provides a WebSocket-based Real Time Messaging API to developers so that they can receive events from Slack in real time and send messages as users. Similarly, Trello uses WebSockets to push changes made by other people down from servers to browsers listening on the appropriate channels, and Blockchain uses its WebSocket API to send real-time notifications about new transactions and blocks. WebSockets can enable full-duplex communication (server and client can communicate with each other simultaneously) at a low overhead.

For example, some enterprise developers using Slack APIs prefer to use the WebSocket API over WebHooks because they are able to receive events from the Slack API securely without having to open up an HTTP WebHook endpoint to the internet where Slack can post messages. WebSockets are great for fast, live streaming data and long-lived connections. However, be wary if you plan to make these available on mobile devices or in regions where connectivity can be spotty. Clients are supposed to keep the connection alive. If the connection dies, the client needs to reinitiate it. There are also issues related to scalability. Developers using Slack’s WebSocket API must establish a connection for each team that uses their app (Figure 2-5).
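The point about clients having to reinitiate a dead connection can be sketched as a reconnect loop with exponential backoff. This is only an illustration under stated assumptions: backoffDelay and connectWithRetry are hypothetical names, createSocket is an assumed factory (in a browser it could be () => new WebSocket(url)), and the delays are arbitrary defaults rather than values recommended by the book or by Slack.

```javascript
// Illustrative helper: delay before reconnect attempt `attempt` (0-based),
// doubling from baseMs and capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Sketch of a client that re-opens the connection whenever it closes.
// `createSocket` is an assumed factory returning a WebSocket-like object;
// `schedule` defaults to setTimeout and can be swapped out in tests.
function connectWithRetry(createSocket, onMessage, schedule = setTimeout) {
  let attempt = 0;
  function open() {
    const socket = createSocket();
    socket.onopen = () => {
      attempt = 0; // a successful connection resets the backoff
    };
    socket.onmessage = (event) => onMessage(event.data);
    socket.onclose = () => {
      schedule(open, backoffDelay(attempt));
      attempt += 1;
    };
  }
  open();
}

console.log(backoffDelay(0)); // prints 500
console.log(backoffDelay(3)); // prints 4000
console.log(backoffDelay(10)); // prints 30000 (capped)
```

On spotty mobile networks it is common to add random jitter to the delay as well, so that many clients dropped at the same moment do not all reconnect at once.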


pages: 325 words: 85,599

Professional Node.js: Building Javascript Based Scalable Software by Pedro Teixeira

en.wikipedia.org, Firefox, Google Chrome, node package manager, platform as a service, web application, WebSocket

To do this, the browser sends a special HTTP/1.1 request to the server, asking it to turn the connection of this request into a WebSockets connection:
    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13
Although this starts out as a regular HTTP connection, the client asks to “upgrade” this connection to a WebSocket connection. If the server supports the WebSocket protocol, it answers like this:
    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    Sec-WebSocket-Protocol: chat
This marks the end of the handshake, and the connection switches to data transfer mode.
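Before answering with 101 Switching Protocols, a server has to decide whether an incoming request really is an upgrade request like the one shown. A minimal sketch of that check follows. It assumes headers arrive as an object with lower-cased names (as Node's http module exposes them on req.headers), and isWebSocketUpgrade is an illustrative name, not part of any library.

```javascript
// Illustrative helper: does this request look like a WebSocket upgrade?
// `headers` is assumed to be an object keyed by lower-cased header names.
function isWebSocketUpgrade(method, headers) {
  const upgrade = (headers["upgrade"] || "").toLowerCase();
  // Connection may list several tokens, e.g. "keep-alive, Upgrade".
  const connection = (headers["connection"] || "")
    .toLowerCase()
    .split(",")
    .map((token) => token.trim());
  const key = headers["sec-websocket-key"];
  return (
    method === "GET" &&
    upgrade === "websocket" &&
    connection.includes("upgrade") &&
    headers["sec-websocket-version"] === "13" &&
    typeof key === "string" &&
    key.length > 0
  );
}

const sampleHeaders = {
  host: "server.example.com",
  upgrade: "websocket",
  connection: "Upgrade",
  "sec-websocket-key": "dGhlIHNhbXBsZSBub25jZQ==",
  "sec-websocket-version": "13",
};
console.log(isWebSocketUpgrade("GET", sampleHeaders)); // prints true
console.log(isWebSocketUpgrade("GET", { host: "server.example.com" })); // prints false
```

A real server would also check the Origin header against an allow-list and negotiate Sec-WebSocket-Protocol; both are left out here to keep the sketch short.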

USING SOCKET.IO TO BUILD WEBSOCKET APPLICATIONS
Although implementing your own WebSocket server for Node.js is possible, it’s not necessary. Many low-level details need to be taken care of before you can implement an actual application on top of it, which makes using a library a lot more practical. The de facto standard library for building WebSocket Node.js applications is Socket.IO. Not only is it a wrapper library that makes building WebSocket servers very convenient, it also provides transparent fallback mechanisms like long polling for clients that don’t support the WebSocket protocol. Furthermore, it ships with a client-side library that provides a convenient API for developing the browser part of the application.

But because HTTP was never designed for this kind of use, implementations remained hacky: browsers showed different behaviors when faced with long-running request responses, and keeping the connection open usually resulted in inefficient load behavior on the server. Today, the situation is much better. The WebSocket protocol was developed (and recently standardized) to overcome the shortcomings of HTTP in relation to real-time applications, which now enables HTTP servers and browsers to talk to each other in an efficient and full-duplex way, without the need for workarounds. For the first time, real-time web applications can use only web technologies, without the need for external technologies like Java Applets, Flash, or ActiveX.
UNDERSTANDING HOW WEBSOCKETS WORK
At its core, a WebSocket connection is just a conventional TCP connection between an HTTP server and an HTTP client.


pages: 628 words: 107,927

Node.js in Action by Mike Cantelon, Marc Harter, Tj Holowaychuk, Nathan Rajlich

Amazon Web Services, Chris Wanstrath, create, read, update, delete, Debian, en.wikipedia.org, Firefox, Google Chrome, MITM: man-in-the-middle, MVC pattern, node package manager, p-value, pull request, Ruby on Rails, web application, WebSocket

This limitation prompted the standardization of the WebSocket protocol, which specifies a way for browsers to maintain a full-duplex connection to the server, allowing both ends to send and receive data simultaneously. WebSocket APIs allow for a whole new breed of web applications utilizing real-time communication between the client and server. The problem with the WebSocket protocol is that it’s not yet finalized, and although some browsers have begun shipping with WebSocket, there are still a lot of older versions out there, especially of Internet Explorer. Socket.IO solves this problem by utilizing WebSocket when it’s available in the browser, and falling back to other browser-specific tricks to simulate the behavior that WebSocket provides, even in older browsers. In this section, you’ll build two sample applications using Socket.IO: a minimal Socket.IO application that pushes the server’s time to connected clients, and a Socket.IO application that triggers page refreshes when CSS files are edited. After you build the example apps, we’ll show you a few more ways you can use Socket.IO by briefly revisiting the upload-progress example from chapter 4.

Opening and closing connections takes time, and the size of the data transfer is larger because HTTP headers are sent on every request. Instead of employing a solution reliant on HTTP, this application will prefer WebSocket (http://en.wikipedia.org/wiki/WebSocket), which was designed as a bidirectional lightweight communications protocol to support real-time communication. Since only HTML5-compliant browsers, for the most part, support WebSocket, the application will leverage the popular Socket.IO library (http://socket.io/), which provides a number of fallbacks, including the use of Flash, should using WebSocket not be possible. Socket.IO handles fallback functionality transparently, requiring no additional code or configuration.

Handling chat-related messaging using Socket.IO
Of the three things we said the app had to do, we’ve already covered the first one, serving static files, and now we’ll tackle the second: handling communication between the browser and server. Modern browsers are capable of using WebSocket to handle communication between the browser and the server. (See the Socket.IO browser support page for details on supported browsers: http://socket.io/#browser-support.) Socket.IO provides a layer of abstraction over WebSocket and other transports for both Node and client-side JavaScript. Socket.IO will fall back transparently to other WebSocket alternatives if WebSocket isn’t implemented in a web browser while keeping the same API. In this section, we’ll briefly introduce you to Socket.IO and define the Socket.IO functionality you’ll need on the server side, add code that sets up a Socket.IO server, and add code to handle various chat application events. Socket.IO, out of the box, provides virtual channels, so instead of broadcasting every message to every connected user, you can broadcast only to those who have subscribed to a specific channel.
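The idea of virtual channels can be illustrated without any library. The sketch below is plain JavaScript and is explicitly not Socket.IO's API: ChannelRegistry and its method names are made up for illustration, and a "client" is assumed to be any object with a send method.

```javascript
// Plain-JavaScript sketch of channel-scoped broadcasting.
// NOT Socket.IO's API; class and method names are illustrative only.
class ChannelRegistry {
  constructor() {
    this.channels = new Map(); // channel name -> Set of subscribed clients
  }

  subscribe(channel, client) {
    if (!this.channels.has(channel)) {
      this.channels.set(channel, new Set());
    }
    this.channels.get(channel).add(client);
  }

  unsubscribe(channel, client) {
    const members = this.channels.get(channel);
    if (members) {
      members.delete(client);
    }
  }

  // Deliver `message` only to subscribers of `channel`;
  // returns the number of clients that received it.
  broadcast(channel, message) {
    const members = this.channels.get(channel) || new Set();
    for (const client of members) {
      client.send(message);
    }
    return members.size;
  }
}

// Usage with stand-in clients that record what they receive.
const makeClient = () => ({ inbox: [], send(msg) { this.inbox.push(msg); } });
const alice = makeClient();
const bob = makeClient();
const registry = new ChannelRegistry();
registry.subscribe("lobby", alice);
registry.subscribe("dev", bob);
console.log(registry.broadcast("lobby", "hello")); // prints 1
console.log(bob.inbox.length); // prints 0
```

Only alice receives the message, because bob never subscribed to the lobby channel. That is the behavior the excerpt contrasts with broadcasting every message to every connected user.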


pages: 550 words: 84,515

Vue.js 2 Cookbook by Andrea Passaglia

bitcoin, functional programming, Kickstarter, loose coupling, MVC pattern, node package manager, Silicon Valley, single page application, web application, WebSocket

Using WebSockets in Vue WebSockets are a new technology that enables two-way communication between the user and the server where the app is hosted. Before this technology, only the browser could initiate a request and, thus, a connection. If some update on the page was expected, the browser had to continuously poll the server. With WebSockets, this is no longer necessary; after the connection is established, the server can send updates only when there is a need. Getting ready You don't need any preparation for this recipe, just the basics of Vue. If you don't know what WebSockets are, you don't really need to, just think about them as a channel of continuous two-way communication between a server and browser.

We will not build a server; instead, we'll use an already existing server that just echoes whatever you send to it via WebSockets. So, if we were to send the Hello message, the server would respond with Hello. You will build a chat app that will talk to this server. Write the following HTML code:
    <div id="app">
      <h1>Welcome</h1>
      <pre>{{chat}}</pre>
      <input v-model="message" @keyup.enter="send">
    </div>
The <pre> tag will help us render a chat. As we don't need the <br/> element to break a line, we can just use the \n special character that means a new line. For our chat to work, we first have to declare our WebSocket in the JavaScript:
    const ws = new WebSocket('ws://echo.websocket.org')
After that, we declare our Vue instance that will contain a chat string (to contain the chat so far) and a message string (to contain the message we are currently writing):
    new Vue({
      el: '#app',
      data: {
        chat: '',
        message: ''
      }
    })
We still need to define the send method, which is called upon pressing Enter in the textbox:
    new Vue({
      el: '#app',
      data: {
        chat: '',
        message: ''
      },
      methods: {
        send () {
          this.appendToChat(this.message)
          ws.send(this.message)
          this.message = ''
        },
        appendToChat (text) {
          this.chat += text + '\n'
        }
      }
    })
We factored out the appendToChat method because we will use it to append all the messages we'll receive.

Seeing the cats added in real time is clearly the way to go for modern applications. Feathers lets you create them in a snap and with a fraction of the code, thanks to the underlying Socket.io, which in turn uses WebSockets. WebSockets are really not that complex and what Feathers does in this case is just listen for messages in the channel and associate them with actions like adding something to the database. The power of Feathers is visible when you can just swap database and WebSocket provider, or switch to REST, without even touching your Vue code. Creating a reactive app with Horizon Horizon is a platform to build reactive, real-time scalable apps.


pages: 296 words: 41,381

Vue.js by Callum Macrae

Airbnb, single page application, source of truth, web application, WebSocket

Let’s take a simple component written without vuex that displays the number of messages a user has on the page: const NotificationCount = { template: `<p>Messages: {{ messageCount }}</p>`, data: () => ({ messageCount: 'loading' }), mounted() { const ws = new WebSocket('/api/messages'); ws.addEventListener('message', (e) => { const data = JSON.parse(e.data); this.messageCount = data.messages.length; }); } }; It’s pretty simple. It opens a websocket to /api/messages, and then when the server sends data to the client—in this case, when the socket is opened (initial message count) and when the count is updated (on new messages)—the messages sent over the socket are counted and displayed on the page. Note In practice, this code would be much more complicated: there’s no authentication on the websocket in this example, and it is always assumed that the response over the websocket is valid JSON with a messages property that is an array, when realistically it probably wouldn’t be.
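The note above points out that the component trusts every message to be valid JSON with a messages array. One way to relax that assumption is sketched below. parseMessageCount is a hypothetical helper, not part of the book's example, and returning null for bad input is just one possible policy.

```javascript
// Illustrative helper: return the number of messages in a raw WebSocket
// payload, or null if the payload is not JSON shaped like { "messages": [...] }.
function parseMessageCount(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return null; // not valid JSON
  }
  if (data === null || typeof data !== "object" || !Array.isArray(data.messages)) {
    return null; // valid JSON, but not the expected shape
  }
  return data.messages.length;
}

console.log(parseMessageCount('{"messages":["a","b","c"]}')); // prints 3
console.log(parseMessageCount("not json")); // prints null
console.log(parseMessageCount('{"messages":"oops"}')); // prints null
```

Inside the component's message listener, the handler could then update messageCount only when the helper returns a number and leave the previous value in place otherwise.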

Note In practice, this code would be much more complicated: there’s no authentication on the websocket in this example, and it is always assumed that the response over the websocket is valid JSON with a messages property that is an array, when realistically it probably wouldn’t be. For this example, this simplistic code will do the job. We run into problems when we want to use more than one of the NotificationCount components on the same page. As each component opens a websocket, it opens unnecessary duplicate connections, and because of network latency, the components might update at slightly different times. To fix this, we can move the websocket logic into vuex. Let’s dive right in with an example. Our component will become this: const NotificationCount = { template: `<p>Messages: {{ messageCount }}</p>`, computed: { messageCount() { return this.

$store.dispatch('getMessages'); } }; And the following will become our vuex store: let ws; export default new Vuex.Store({ state: { messages: [], }, mutations: { setMessages(state, messages) { state.messages = messages; } }, actions: { getMessages({ commit }) { if (ws) { return; } ws = new WebSocket('/api/messages'); ws.addEventListener('message', (e) => { const data = JSON.parse(e.data); commit('setMessages', data.messages); }); } } }); Now, every notification count component that is mounted will trigger getMessages, but the action checks whether the websocket exists and opens a connection only if there isn’t one already open. Then it listens to the socket, committing changes to the state, which will then be updated in the notification count component as the store is reactive—just like most other things in Vue.


pages: 434 words: 77,974

Mastering Blockchain: Unlocking the Power of Cryptocurrencies and Smart Contracts by Lorne Lantz, Daniel Cawrey

altcoin, Amazon Web Services, barriers to entry, bitcoin, blockchain, business process, call centre, capital controls, cloud computing, corporate governance, creative destruction, cryptocurrency, currency peg, disinformation, disintermediation, distributed ledger, Dogecoin, Ethereum, ethereum blockchain, fault tolerance, fiat currency, Firefox, global reserve currency, Internet of things, Kubernetes, litecoin, Lyft, margin call, MITM: man-in-the-middle, Network effects, offshore financial centre, packet switching, peer-to-peer, Ponzi scheme, prediction markets, QR code, ransomware, regulatory arbitrage, rent-seeking, reserve currency, Ross Ulbricht, Satoshi Nakamoto, Silicon Valley, Skype, smart contracts, software as a service, Steve Wozniak, tulip mania, uber lyft, unbanked and underbanked, underbanked, web application, WebSocket, WikiLeaks

These gaps increase significantly when the trading bot is rate-limited and has to wait a split second until it can make a valid request. A faster way for a trading bot to view the current state of the market is by subscribing to a WebSocket. With this setup, as soon as a change occurs in the market, the exchange’s server pushes a notification to all subscribers of the WebSocket. Then all trading bots subscribing to the WebSocket will receive the same information at the same time, and they do not need to make additional API requests that will fill their rate limit quotas. Tip It can be helpful for traders to find out the exact location where the exchange hosts its API server, and host their trading bot in the same location.
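The benefit described here, reading market state without spending rate-limited requests, can be sketched as a local view that is updated by pushed messages. Everything specific in this example is an assumption: MarketView is a made-up class, and the payload shape {"symbol": ..., "price": ...} is hypothetical and does not reflect any particular exchange's WebSocket API.

```javascript
// Illustrative sketch: keep the latest pushed price per symbol in memory,
// so reading the current state costs no additional API requests.
class MarketView {
  constructor() {
    this.prices = new Map(); // symbol -> last pushed price
  }

  // Intended to be called from a WebSocket message handler.
  // Assumed (hypothetical) payload: {"symbol": "BTC-USD", "price": 100}
  applyUpdate(raw) {
    const { symbol, price } = JSON.parse(raw);
    this.prices.set(symbol, price);
  }

  // Local read: no network round trip and no rate-limit cost.
  latest(symbol) {
    return this.prices.get(symbol);
  }
}

const view = new MarketView();
view.applyUpdate('{"symbol":"BTC-USD","price":100}');
view.applyUpdate('{"symbol":"BTC-USD","price":101}');
console.log(view.latest("BTC-USD")); // prints 101
```

With a REST-only bot, each of those reads would be a separate request counted against the rate limit. Here the only traffic is whatever the exchange chooses to push.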



pages: 435 words: 62,013

HTML5 Cookbook by Christopher Schmitt, Kyle Simpson

Firefox, Internet Archive, security theater, web application, WebSocket

Solution Most browsers now have the native ability to establish a bidirectional socket connection between themselves and the server, using the WebSocket API. This means that both sides (browser and server) can send and receive data. Common use cases for Web Sockets are live online games, stock tickers, chat clients, etc. To test if the browser supports Web Sockets, use the following feature-detect for the WebSocket API: var websockets_support = !!window.WebSocket; Now, let’s build a simple application with chat room–type functionality, where a user may read the current list of messages and add her own message to the room.

While things are beginning to stabilize, Web Sockets are still quite volatile, and you have to make sure that your server is speaking the most up-to-date version of the protocol so that the browser can communicate properly with it. The WebSocket object instance has, similar to XHR, a readyState property that lets you examine the state of the connection. It can have the following constant values:
WebSocket.CONNECTING (numeric value 0): Connection has not yet been established
WebSocket.OPEN (numeric value 1): Connection is open and communication is possible
WebSocket.CLOSING (numeric value 2): Connection is being closed
WebSocket.CLOSED (numeric value 3): Connection is closed (or was never opened successfully)
The events that a WebSocket object instance fires are:
open: Called when the connection has been opened
message: Called when a message has been received from the server
error: Called when an error occurs with the socket (sending or receiving)
close: Called when the connection is closed
For each of these events, you can add an event listener using addEventListener(...), or you can set a corresponding handler directly on the WebSocket object instance, including onopen, onmessage, onerror, and onclose.
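The readyState values listed above can be put to work in a small guard. The following is a minimal sketch, not code from the book: describeReadyState and safeSend are hypothetical helper names, and they accept any object that has a readyState property and a send method, so they can be tried without a live connection.

```javascript
// Names for the numeric readyState values 0..3 listed above.
const READY_STATES = ["CONNECTING", "OPEN", "CLOSING", "CLOSED"];

// Hypothetical helper: turn a socket-like object's readyState into text.
function describeReadyState(socket) {
  const name = READY_STATES[socket.readyState];
  return name === undefined ? "UNKNOWN" : name;
}

// Hypothetical helper: send only when the connection is open.
// Returns true if the text was handed to socket.send, false otherwise.
function safeSend(socket, text) {
  if (socket.readyState !== 1) { // 1 is the OPEN state
    return false;
  }
  socket.send(text);
  return true;
}
```

In a browser, the same helpers could be called with a real WebSocket instance inside an onopen or onmessage handler.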

<!DOCTYPE html> <html> <head> <title>Our Chatroom</title> <script src="chatroom.js"></script> </head> <body> <h1>Our Chatroom</h1> <div id="chatlog"></div> <input id="newmsg" /><br /> <input type="button" value="Send Message" id="sendmsg" /> </body> </html> Now, let’s examine the JavaScript in chatroom.js: var chatcomm = new WebSocket("ws://something.com/server/chat"); chatcomm.onmessage = function(msg) { msg = JSON.parse(msg.data); // decode the JSON payload into an object var chatlog = document.getElementById("chatlog"); var docfrag = document.createDocumentFragment(); var msgdiv; for (var i=0; i<msg.messages.length; i++) { msgdiv = document.createElement("div"); msgdiv.appendChild(document.createTextNode(msg.messages[i])); docfrag.appendChild(msgdiv); } chatlog.appendChild(docfrag); }; chatcomm.onclose = function() { alert("The chatroom connection was lost.
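One detail of the message handler is worth isolating: the handler receives a MessageEvent, and the text sent by the server is on its data property. The sketch below assumes the same payload shape as the chatroom example, an object with a messages array. The function name extractMessages is hypothetical.

```javascript
// Hypothetical helper: pull the list of chat messages out of a message event.
// Assumes the server sends JSON text shaped like { "messages": ["...", "..."] }.
function extractMessages(event) {
  let payload;
  try {
    payload = JSON.parse(event.data); // the frame's text lives on event.data
  } catch (err) {
    return []; // not valid JSON: ignore this frame
  }
  if (payload === null || !Array.isArray(payload.messages)) {
    return []; // valid JSON, but not the shape we expect
  }
  return payload.messages.map(String);
}
```

Returning an empty list for malformed frames keeps the DOM-building loop simple, at the cost of silently dropping bad input. A real application might log those frames instead.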


pages: 1,038 words: 137,468

JavaScript Cookbook by Shelley Powers

Firefox, Google Chrome, hypertext link, semantic web, web application, WebSocket

Eventually, the concept led to work in the W3C on a new JavaScript API called WebSockets. [Figure 18-1. Demonstration of updates from polled Ajax calls] Currently only implemented in Chrome, WebSockets enables bidirectional communication between server and client by using the send method on the WebSocket object for communicating to the server, and then attaching a function to WebSocket’s onmessage event handler to get messages back from the server, as demonstrated in the following code from the Chromium Blog: if ("WebSocket" in window) { var ws = new WebSocket("ws://example.com/service"); ws.onopen = function() { // Web Socket is connected.

ws.send("message to send"); .... }; ws.onmessage = function (evt) { var received_msg = evt.data; ... }; ws.onclose = function() { // websocket is closed. }; } else { // the browser doesn't support WebSocket. } Another approach is a concept known as long polling. In long polling, we initiate an Ajax request as we do now, but the server doesn’t respond right away. Instead, it holds the connection open and does not respond until it has the requested data, or until a waiting time is exceeded. See Also See Recipe 14.8 for a demonstration of using this same functionality with an ARIA live region to ensure the application is accessible for those using screen readers. The W3C WebSockets API specification is located at http://dev.w3.org/html5/websockets/, and the Chrome introduction of support for WebSockets is at http://blog.chromium.org/2009/12/web-sockets-now-available-in-google.html. 18.10 Communicating Across Windows with PostMessage Problem Your application needs to communicate with a widget that’s located in an iFrame.
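The long-polling idea described above (the server holds each request until it has data or a waiting time is exceeded, and the client then asks again) can be sketched as a simple loop. This is an illustration under stated assumptions rather than the book's code: request is a hypothetical function returning a Promise that resolves to an object of the form { timedOut, data }, and maxRounds exists only so the sketch terminates.

```javascript
// Minimal long-polling loop.
// request:   hypothetical function returning a Promise of { timedOut, data }
// onData:    callback invoked with each piece of data the server delivers
// maxRounds: upper bound on the number of requests issued by this sketch
async function longPoll(request, onData, maxRounds) {
  for (let round = 0; round < maxRounds; round++) {
    let reply;
    try {
      reply = await request(); // resolves when the server answers or gives up waiting
    } catch (err) {
      break; // this sketch stops on errors; real code would likely back off and retry
    }
    if (!reply.timedOut) {
      onData(reply.data);
    }
    // Either way, the loop immediately issues the next request.
  }
}
```

In a browser, request would typically wrap an Ajax call to an endpoint that the server is willing to hold open.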

The W3C WebSockets API specification is located at http://dev.w3.org/html5/websockets/, and the Chrome introduction of support for WebSockets is at http://blog.chromium.org/2009/12/web-sockets-now-available-in-google.html. 18.10 Communicating Across Windows with PostMessage Problem Your application needs to communicate with a widget that’s located in an iFrame. However, you don’t want to have to send the communication through the network. Solution Use the new HTML5 postMessage to enable back-and-forth communication with the iFrame widget, bypassing network communication altogether.


pages: 514 words: 111,012

The Art of Monitoring by James Turnbull

Amazon Web Services, anti-pattern, cloud computing, continuous integration, correlation does not imply causation, Debian, DevOps, domain-specific language, failed state, functional programming, Kickstarter, Kubernetes, microservices, performance metric, pull request, Ruby on Rails, software as a service, source of truth, web application, WebSocket

$ sudo riemann /etc/riemann/riemann.config loading bin INFO [2014-12-21 18:13:21,841] main - riemann.bin - PID 18754 INFO [2014-12-21 18:13:22,056] clojure-agent-send-off-pool-2 - riemann.transport.websockets - Websockets server 127.0.0.1 5556 online INFO [2014-12-21 18:13:22,091] clojure-agent-send-off-pool-4 - riemann.transport.tcp - TCP server 127.0.0.1 5555 online INFO [2014-12-21 18:13:22,099] clojure-agent-send-off-pool-3 - riemann.transport.udp - UDP server 127.0.0.1 5555 16384 online INFO [2014-12-21 18:13:22,102] main - riemann.core - Hyperspace core online We see that Riemann has been started and a couple of servers have also been started: a WebSockets server on port 5556, and TCP and UDP servers on port 5555. By default Riemann binds to localhost.

In summary we're calling the logging/init function and passing it a map, in this case containing only one option: the name of the file in which to write our logs. The third stanza controls Riemann's interfaces. Riemann generally listens on TCP, UDP, and WebSockets interfaces. By default, all three are bound to 127.0.0.1 (localhost). TCP is on port 5555. UDP is on port 5555. WebSockets is on port 5556. We see that the definition of our interface configuration is inside a stanza starting with let. We're going to see let quite a bit in our configuration. The let expression creates lexically scoped immutable aliases for values.

We use these bindings in the subsequent expressions. In our interface example we're saying: "Let the symbol host be 127.0.0.1 and then call the tcp-server, udp-server, and ws-server functions with that symbol as the value of the :host option." This sets the host interface of the TCP, UDP, and WebSockets servers to 127.0.0.1. A let binding is lexically scoped, i.e., limited in scope to the expression itself. Outside of this expression the host symbol would be undefined. The host symbol is also immutable inside the expression in which it is defined. You cannot change the value of host inside this expression.
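For readers more at home in JavaScript than Clojure, the scoping behavior described above has a rough analogy in a block-scoped const. This is only an analogy and not Riemann configuration: startServers is a hypothetical function, and the objects it returns are plain data standing in for the TCP, UDP, and WebSockets servers on the ports mentioned earlier.

```javascript
// Rough JavaScript analogy for the let stanza: host is visible only inside
// the function body and cannot be reassigned there.
function startServers() {
  const host = "127.0.0.1"; // block-scoped; reassigning it would throw a TypeError
  return [
    { kind: "tcp", host, port: 5555 },
    { kind: "udp", host, port: 5555 },
    { kind: "ws", host, port: 5556 },
  ];
}

const servers = startServers();
// Outside startServers, the name host is not defined at all.
```

The three descriptors share one value because they were built while host was in scope, which mirrors passing the same symbol as the :host option to each server function.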


pages: 214 words: 14,382

Monadic Design Patterns for the Web by L.G. Meredith

barriers to entry, domain-specific language, don't repeat yourself, finite state, functional programming, Georg Cantor, ghettoisation, John von Neumann, Kickstarter, semantic web, social graph, type inference, web application, WebSocket

In terms of our MonadicDispatcher API, this is equivalent to not having the acceptConnections method. In this case, there is a simple transformation of the basic “inversion of control” setup provided in APIs like servlet containers or the Jetty implementation of websockets; we introduce a buffer: case class QueuingWebSocket( requestQueue : RequestQueue, openCB : SocketConnectionPair => Unit, closeCB : () => Unit ) extends WebSocket with WebSocket.OnTextMessage { override def onOpen( wsConnection: WebSocket.Connection ) : Unit = { println( "in onOpen with " + wsConnection ) openCB( SCP( requestQueue, wsConnection ) ) println( "onOpen complete" ) } override def onClose( closeCode: Int, message: String ) : Unit = { println( "in onClose with " + closeCode + " and " + message ) closeCB()

To underscore the title choice for this book, I want to stress that this API is a design pattern. You could use it for a wide variety of event streams – from messaging applications, such as those built over an AMQP provider (with the messages themselves playing the role of events) to TCP/IP packets, websockets, and so on. I chose HTTP for many reasons, not the least of which is that it’s ubiquitous in modern applications and as such receives a distinguished level of attention in the practical programmer’s mind and day-to-day practice. I also chose this protocol for its limitations. One of the most widely recognized and adopted uses of the HTTP protocol is the RESTful approach to accessing resources in a web-based application; this approach stresses stateless access to resources.

In a richer discipline, such as one that might line up with nesting of requests and responses, we can imagine that in response to a client’s request, the server needs to turn around and issue a subsequent request back to the client. This is common both in everyday human dialogue (“Will you serve me a coffee? Yes, what kind would you like?”) and in applications (which in the web context is sufficient motivation for developments like COMET and websockets), because it is a useful idiom! So, unlike the Lisp s-expression, which only needs to represent opening and closing of parentheses, when we move to the world of protocols and interpret these as requests and responses, we have the extra dimension of client and server, necessitating an enrichment of our notation scheme.


pages: 82 words: 17,229

Redis Cookbook by Tiago Macedo, Fred Oliveira

Debian, full text search, loose coupling, Ruby on Rails, Silicon Valley, WebSocket

Installing the necessary software Let’s start off by installing the necessary node libraries using npm: npm install socket.io npm install redis Implementing the server side code On the server side, we’ll be running Redis and creating a JavaScript file that we’ll run with Node.js. This piece of code will take care of setting up a connection to Redis and listening on a given port for connecting clients (either using websockets or Flash; this choice will be handled transparently by Socket.IO). Let’s go through our necessary JavaScript code. Create a chat.js file containing the following code: var http = require('http'), io = require('socket.io'), redis = require('redis'), rc = redis.createClient(); These lines require the libraries we installed and create the variables we’ll use to access Redis and Socket.IO.

Create a chat.js file containing the following code: var http = require('http'), io = require('socket.io'), redis = require('redis'), rc = redis.createClient(); These lines require the libraries we installed and create the variables we’ll use to access Redis and Socket.IO. We’ll access Redis with the “redis” variable, and “io” will let us access all the sockets that are connected to our server (web clients, who visit our chat page). The next thing we must do in our code is to set up an HTTP server system on top of which Socket.io will do its websocket magic. Here are the lines to do that: server = http.createServer(function(req, res){ // we may want to redirect a client that hits this page // to the chat URL instead res.writeHead(200, {'Content-Type': 'text/html'}); res.end('<h1>Hello world</h1>'); }); // Set up our server to listen on 8000 and serve socket.io server.listen(8000); var socketio = io.listen(server); If you have some experience with Node.js or Socket.IO, this code is pretty straightforward.

Here are the lines to do that: server = http.createServer(function(req, res){ // we may want to redirect a client that hits this page // to the chat URL instead res.writeHead(200, {'Content-Type': 'text/html'}); res.end('<h1>Hello world</h1>'); }); // Set up our server to listen on 8000 and serve socket.io server.listen(8000); var socketio = io.listen(server); If you have some experience with Node.js or Socket.IO, this code is pretty straightforward. What we’re basically doing is setting up an HTTP server, specifying how it will reply to requests, making it listen on a port (in this case, we’re going to listen on port 8000), and attaching Socket.IO to it so that it can automatically serve the Socket.IO JavaScript files and set up the websocket functionality. Now we set up the small bits of Redis code to support our functionality. The Redis client we set up with Node.js must subscribe to a specific chat channel, and deal with messages on that channel when they arrive. So that’s what we do next: // if the Redis server emits a connect event, it means we're ready to work, // which in turn means we should subscribe to our channels.


Exploring ES6 - Upgrade to the next version of JavaScript by Axel Rauschmayer

anti-pattern, domain-specific language, en.wikipedia.org, Firefox, functional programming, Google Chrome, MVC pattern, web application, WebSocket

The 2D Context of canvas¹³ lets you retrieve the bitmap data as an instance of Uint8ClampedArray: let canvas = document.getElementById('my_canvas'); let context = canvas.getContext('2d'); let imageData = context.getImageData(0, 0, canvas.width, canvas.height); let uint8ClampedArray = imageData.data; 20.6.5 WebSockets WebSockets¹⁴ let you send and receive binary data via ArrayBuffers: let socket = new WebSocket('ws://127.0.0.1:8081'); socket.binaryType = 'arraybuffer'; // Wait until socket is open socket.addEventListener('open', function (event) { // Send binary data let typedArray = new Uint8Array(4); socket.send(typedArray.buffer); }); // Receive binary data socket.addEventListener('message', function (event) { let arrayBuffer = event.data; ··· }); 20.6.6 Other APIs • WebGL¹⁵ uses the Typed Array API for: accessing buffer data, specifying pixels for texture mapping, reading pixel data, and more. • The Web Audio API¹⁶ lets you decode audio data¹⁷ submitted via an ArrayBuffer. ¹³http://www.w3.org/TR/2dcontext/ ¹⁴http://www.w3.org/TR/websockets/ ¹⁵https://www.khronos.org/registry/webgl/specs/latest/2.0/ ¹⁶http://www.w3.org/TR/webaudio/ ¹⁷http://www.w3.org/TR/webaudio/#dfn-decodeAudioData • Media Source Extensions¹⁸: The HTML media elements are currently <audio> and <video>.
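To make the WebSockets case above more concrete, the following sketch shows one way a small binary message could be packed into an ArrayBuffer before socket.send and unpacked again in a message listener. The message layout (a one-byte id followed by a four-byte float) and the names encodeReading and decodeReading are invented for illustration; they are not from the book.

```javascript
// Pack a hypothetical "reading" into 5 bytes: 1-byte id + 4-byte float.
function encodeReading(id, value) {
  const buffer = new ArrayBuffer(5);
  const view = new DataView(buffer);
  view.setUint8(0, id);
  view.setFloat32(1, value, false); // false selects big-endian byte order
  return buffer; // suitable for passing to socket.send(...)
}

// Unpack the same layout, e.g. from event.data when binaryType is 'arraybuffer'.
function decodeReading(buffer) {
  const view = new DataView(buffer);
  return { id: view.getUint8(0), value: view.getFloat32(1, false) };
}
```

A DataView is used instead of a Typed Array here because the float sits at byte offset 1, which a Float32Array could not address: its offset must be a multiple of 4.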

Typed Arrays
20.1 Overview
20.2 Introduction
20.2.1 Element types
20.2.2 Handling overflow and underflow
20.2.3 Endianness
20.2.4 Negative indices
20.3 ArrayBuffers
20.3.1 ArrayBuffer constructor
20.3.2 Static ArrayBuffer methods
20.3.3 ArrayBuffer.prototype properties
20.4 Typed Arrays
20.4.1 Typed Arrays versus normal Arrays
20.4.2 Typed Arrays are iterable
20.4.3 Converting Typed Arrays to and from normal Arrays
20.4.4 The Species pattern for Typed Arrays
20.4.5 The inheritance hierarchy of Typed Arrays
20.4.6 Static TypedArray methods
20.4.7 TypedArray.prototype properties
20.4.8 «ElementType»Array constructor
20.4.9 Static «ElementType»Array properties
20.4.10 «ElementType»Array.prototype properties
20.4.11 Concatenating Typed Arrays
20.5 DataViews
20.5.1 DataView constructor
20.5.2 DataView.prototype properties
20.6 Browser APIs that support Typed Arrays
20.6.1 File API
20.6.2 XMLHttpRequest
20.6.3 Fetch API
20.6.4 Canvas
20.6.5 WebSockets
20.6.6 Other APIs
20.7 Extended example: JPEG SOF0 decoder
20.7.1 The JPEG file format
20.7.2 The JavaScript code
20.8 Availability

Two kinds of views are used to access the data:
• Typed Arrays (Uint8Array, Int16Array, Float32Array, etc.) interpret the ArrayBuffer as an indexed sequence of elements of a single type.
• Instances of DataView let you access data as elements of several types (Uint8, Int16, Float32, etc.), at any byte offset inside an ArrayBuffer.
The following browser APIs support Typed Arrays (details are mentioned later):
• File API
• XMLHttpRequest
• Fetch API
• Canvas
• WebSockets
• And more
20.2 Introduction Much data one encounters on the web is text: JSON files, HTML files, CSS files, JavaScript code, etc. For handling such data, JavaScript’s built-in string data type works well. However, until a few years ago, JavaScript was not well equipped to handle binary data.


pages: 570 words: 115,722

The Tangled Web: A Guide to Securing Modern Web Applications by Michal Zalewski

barriers to entry, business process, defense in depth, easy for humans, difficult for computers, fault tolerance, finite state, Firefox, Google Chrome, information retrieval, RFC: Request For Comment, semantic web, Steve Jobs, telemarketer, Tragedy of the Commons, Turing test, Vannevar Bush, web application, WebRTC, WebSocket

In particular, it is not clear if the added complexity of preflight requests is worth the peripheral benefit of being able to issue cross-domain requests with unorthodox methods or random headers. The last of the weak complaints hinges on the fact that CORS is susceptible to header injection. Unlike some other recently proposed browser features, such as WebSockets (Chapter 17), CORS does not require the server to echo back an unpredictable challenge string to complete the handshake. Particularly in conjunction with preflight caching, this may worsen the impact of certain header-splitting vulnerabilities in the server-side code. XDomainRequest Microsoft’s objection to CORS appears to stem from the aforementioned concerns over the use of ambient authority, but it also bears subtle overtones of their dissatisfaction with interactions with W3C.

At the same time, it minimizes the overhead associated with delivering concurrent requests or with the parsing of text-based requests and response data. The protocol is currently supported only in Chrome, and other than select Google services, it is not commonly encountered on the Web. It may be coming to Firefox soon, too, however. HTTP-less networking WebSocket[259] is a still-evolving API designed for negotiating largely unconstrained, bidirectional TCP streams for when the transactional nature of HTTP gets in the way (e.g., in the case of a low-latency chat application). The protocol is bootstrapped using a keyed challenge-response handshake, which looks sort of like HTTP and which is (quite remarkably) impossible to spoof by merely exploiting a header-splitting flaw in the destination site.
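The challenge-response step mentioned above can be illustrated with the form the handshake took when it was standardized in RFC 6455, which may differ in detail from the draft the text cites. The server appends a fixed GUID to the client's Sec-WebSocket-Key value, hashes the result with SHA-1, and returns the Base64 digest as Sec-WebSocket-Accept. The function name computeAccept is a hypothetical choice; the GUID and the sample key and accept values are the ones given in the RFC.

```javascript
const crypto = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function computeAccept(secWebSocketKey) {
  return crypto
    .createHash("sha1")
    .update(secWebSocketKey + WS_GUID)
    .digest("base64");
}

// Sample key from the RFC's handshake example.
const accept = computeAccept("dGhlIHNhbXBsZSBub25jZQ==");
// accept is "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

Because the client checks that the response carries a value derived from its own unpredictable key, a header injected by a header-splitting flaw cannot simply be replayed as a valid handshake response.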

[257] “Manipulating the Browser History,” Mozilla Developer Network, https://developer.mozilla.org/en/DOM/Manipulating_the_browser_history/. [258] A. Langley and M. Belsche, “SPDY: An Experimental Protocol for a Faster Web,” The Chromium Projects, http://www.chromium.org/spdy/spdy-whitepaper/. [259] I. Fette and A. Melnikov, “The WebSocket Protocol,” IETF Request for Comments draft (2011), http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-10/. [260] J. Rosenberg, M. Kaufman, M. Hiie, and F. Audet, “An Architectural Framework for Browser Based Real-Time Communications,” IETF Request for Comments draft (2011), http://tools.ietf.org/html/draft-rosenberg-rtcweb-framework-00/


pages: 1,881 words: 178,824

HTML5 Canvas by Steve Fulton, Jeff Fulton

barriers to entry, Firefox, game design, Google Chrome, web application, WebSocket

At the same time, ElectroServer can be used with technologies other than Canvas (such as Flash, iOS, and so on), so Canvas will be able to communicate with other socket servers via JavaScript and WebSockets. We chose to base this example on ElectroServer because it allowed us to create a full application for you to test and work through. Other libraries and tools are bound to appear very soon that can work with Canvas—for example, the SmartFox server, which now supports WebSockets and JavaScript without add-ons. Creating a Simple Object Framework for the Canvas As you have seen throughout this book, you can easily create a lot of code when working with the HTML5 Canvas.

ElectroServer from Electrotank was one of the first reliable socket-server applications built to communicate with Flash clients. Over the past couple years, ElectroServer has been updated with APIs for iOS, C#, C++, and now JavaScript. This first iteration of the ElectroServer JavaScript API does not use WebSockets but instead implements JavaScript polling. However, with the availability of ElectroServer’s simplified JavaScript API, you can still start to write multiplayer applications using HTML5 Canvas. Note While this portion of the chapter is specific to ElectroServer, many of the multiplayer/multiuser concepts are applicable to other technologies as well.

The ElectroServer admin tool Because ElectroServer is a socket server, it listens on a specified port for communication from the JavaScript client using one of the supported protocols. ElectroServer supports multiple protocols, but we need to make sure we are using the BinaryHTTP protocol for the JavaScript API. The default port for BinaryHTTP in ElectroServer is 8989. Note When the ElectroServer JavaScript API is updated to support WebSockets, the port and protocol will likely be different. There is a nifty admin tool for ElectroServer that allows you to view and modify all the supported protocols and ports, as well as many other cool features of the socket server. In the /admin directory of the install folder, you should find both an installer for an Adobe AIR admin tool (named something like es5-airadmin-5.0.0.air), and a /webadmin directory with an HTML file named webadmin.html.


pages: 834 words: 180,700

The Architecture of Open Source Applications by Amy Brown, Greg Wilson

8-hour work day, anti-pattern, bioinformatics, c2.com, cloud computing, collaborative editing, combinatorial explosion, computer vision, continuous integration, create, read, update, delete, David Heinemeier Hansson, Debian, domain-specific language, Donald Knuth, en.wikipedia.org, fault tolerance, finite state, Firefox, friendly fire, functional programming, Guido van Rossum, linked data, load shedding, locality of reference, loose coupling, Mars Rover, MITM: man-in-the-middle, MVC pattern, peer-to-peer, Perl 6, premature optimization, recommendation engine, revision control, Ruby on Rails, side project, Skype, slashdot, social web, speech recognition, the scientific method, The Wisdom of Crowds, web application, WebSocket

Cross-browser Transport To make this work across browsers and operating systems, we use the Web::Hippie4 framework, a high-level abstraction of JSON-over-WebSocket with convenient jQuery bindings, with MXHR (Multipart XML HTTP Request5) as the fallback transport mechanism if WebSocket is not available. For browsers with Adobe Flash plugin installed but without native WebSocket support, we use the web_socket.js6 project's Flash emulation of WebSocket, which is often faster and more reliable than MXHR. The operation flow is shown in Figure 19.17. Figure 19.17: Cross-Browser Flow The client-side SocialCalc.Callbacks.broadcast function is defined as: var hpipe = new Hippie.Pipe(); SocialCalc.Callbacks.broadcast = function(type, data) { hpipe.send({ type: type, data: data }); }; $(hpipe).bind("message.execute", function (e, d) { var sheet = SocialCalc.CurrentSpreadsheetControlObject.context.sheetobj; sheet.ScheduleSheetCommands( d.data.cmdstr, d.data.saveundo, true // isRemote = true ); }); Although this works quite well, there are still two remaining issues to resolve. 19.7.2.
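The fallback order described above (native WebSocket first, then Flash emulation of WebSocket, then MXHR) amounts to a small decision function. The sketch below is illustrative only and is not part of Web::Hippie or SocialCalc: pickTransport, the env flags, and the returned labels are all hypothetical names.

```javascript
// Choose a transport following the fallback order described above.
// env.hasNativeWebSocket and env.hasFlash are hypothetical capability flags.
function pickTransport(env) {
  if (env.hasNativeWebSocket) {
    return "websocket"; // native support: use it directly
  }
  if (env.hasFlash) {
    return "flash-websocket"; // Flash emulation of WebSocket
  }
  return "mxhr"; // last resort: multipart XHR
}
```

In a browser, hasNativeWebSocket could be derived from a check such as "WebSocket" in window. Detecting Flash is more involved and is left out of this sketch.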

The server needs to be configured as a proxy so that it can intercept any requests that are made to it without causing the calling JavaScript to fall foul of the "Same Origin" policy, which states that only resources from the same server that the script was served from can be requested via JavaScript. This is in place as a security measure, but from the point of view of a browser automation framework developer, it's pretty frustrating and requires a hack such as this. The reason for making an XMLHttpRequest call to the server is two-fold. Firstly, and most importantly, until WebSockets, a part of HTML5, become available in the majority of browsers there is no way to start up a server process reliably within a browser. That means that the server had to live elsewhere. Secondly, an XMLHttpRequest calls the response callback asynchronously, which means that while we're waiting for the next command the normal execution of the browser is unaffected.


pages: 196 words: 58,122

AngularJS by Brad Green, Shyam Seshadri

combinatorial explosion, continuous integration, Firefox, Google Chrome, Kickstarter, MVC pattern, node package manager, single page application, web application, WebSocket

For example, if in our shopping website a controller needs to get a list of items for sale from the server, we’d want some object—let’s call it Items—to take care of getting the items from the server. The Items object, in turn, needs some way to communicate with the database on the server over XHR or WebSockets. Doing this without modules looks something like this: function ItemsViewController($scope) { // make request to server … // parse response into Item objects … // set Items array on $scope so the view can display it ... } While this would certainly work, it has a number of potential problems.
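The separation the text argues for can be sketched in plain JavaScript, without AngularJS's module API: the controller receives an Items object instead of talking to the server itself, and Items in turn receives whatever transport it should use. Everything here is a hypothetical illustration, including createItems, the transport.get signature, and the "/items" path.

```javascript
// Hypothetical Items service: owns server communication and response parsing.
// transport is any object with get(path, callback), e.g. an XHR or WebSocket wrapper.
function createItems(transport) {
  return {
    query(callback) {
      transport.get("/items", (rawItems) => {
        // Parse the raw response into plain item objects.
        callback(rawItems.map((raw) => ({ title: raw.title, price: raw.price })));
      });
    },
  };
}

// The controller only wires items onto the scope; it no longer knows
// how they are fetched.
function ItemsViewController($scope, Items) {
  Items.query((items) => {
    $scope.items = items;
  });
}
```

Because both dependencies are passed in, a test can hand the controller a fake Items object or hand Items a fake transport, with no server involved.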

throw operator, Expressions time zones, Common Gotchas tokens, XSRF transclude property, API Overview, Transclusion transformations, Transformations on Requests and Responses transitive changes, Performance Considerations in watch() U UIs (User Interfaces) creating dynamic, Data Binding separating responsibilities in, Separating UI Responsibilities with Controllers unauthorized transfers, XSRF unit tests, The Tests–Unit Tests for $http service, Unit Testing for app logic, A Few Words on Unobtrusive JavaScript for ngResource, Unit Test the ngResource in Karma, Integrating AngularJS with RequireJS Jasmine style, Unit Tests Jasmine-style, Project Organization with monkey patches, Organizing Dependencies with Modules uppercase filter, Formatting Data with Filters user input, validation of, Validating User Input, The Templates username requirement, enforcing, The Templates V validation tools, Karma, Directives and HTML Validation (see also form validation controls) variables, in data binding, An Example: Shopping Cart vendor folder, Project Organization View Controller, Controllers views adding with Yeoman, Adding New Routes, Views, and Controllers basics of, Model View Controller, Relationship Between Model, Controller, and Template changing with routes and $location, Changing Views with Routes and $location–controllers.js creation of, Model View Controller exposing model data to, Publishing Model Data with Scopes working example of, The Templates W watchAction, Observing Model Changes with $watch watchFn, Observing Model Changes with $watch, Performance Considerations in watch() web development platforms, IDEs web servers starting with ExpressJS, Without Yeoman starting with Yeoman, With Yeoman WebSockets, Organizing Dependencies with Modules WebStorm development platform, IDEs while loop, Expressions window.location vs. 
$location, $location Windows OS, and Yeoman, Project Organization workflow optimization, Yeoman: Optimizing Your Workflow X XHR, Organizing Dependencies with Modules xHTML naming format, Directives and HTML Validation XML naming format, Directives and HTML Validation XSRF, Talking to Servers XSRF (Cross-Site Request Forgery) attacks, XSRF Y Yeoman, Yeoman: Optimizing Your Workflow overview of, Project Organization starting web servers in, With Yeoman About the Authors Brad Green works at Google as an engineering manager.


pages: 779 words: 116,439

Test-Driven Development With Python by Harry J. W. Percival

continuous integration, database schema, Debian, DevOps, don't repeat yourself, Firefox, loose coupling, MVC pattern, platform as a service, pull request, web application, WebSocket

Pick a framework, perhaps Backbone.js or Angular.js, and spike in an implementation. Each framework has its own preferences for how to write unit tests, so learn the one that goes along with it, and see how you like it. Async and Websockets Suppose two users are working on the same list at the same time. Wouldn’t it be nice to see real-time updates, so if the other person adds an item to the list, you see it immediately? A persistent connection between client and server using websockets is the way to get this to work. Check out one of the Python async web servers (Tornado, gevent, Twisted) and see if you can use it to implement dynamic notifications.
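The real-time update idea in this excerpt can be sketched independently of any server library: keep the open connections for a list, and push each new item to everyone except its author. The sketch below is JavaScript rather than one of the Python servers named in the excerpt. ListHub, its methods, and the message format are hypothetical.

```javascript
// Hypothetical in-memory hub: each "connection" is any object with a send()
// method, standing in for an open WebSocket. Not tied to a real server library.
function ListHub() {
  this.items = [];
  this.connections = [];
}

// Register a connection that should receive future updates.
ListHub.prototype.join = function (connection) {
  this.connections.push(connection);
};

// Store the item, then notify every connection except the sender.
ListHub.prototype.addItem = function (text, sender) {
  this.items.push(text);
  var message = JSON.stringify({ type: "item_added", text: text });
  this.connections.forEach(function (connection) {
    if (connection !== sender) {
      connection.send(message);
    }
  });
};

// Usage with two stand-in connections that record what they receive.
var alice = { inbox: [], send: function (m) { this.inbox.push(m); } };
var bob = { inbox: [], send: function (m) { this.inbox.push(m); } };
var hub = new ListHub();
hub.join(alice);
hub.join(bob);
hub.addItem("Buy milk", alice);
console.log(bob.inbox);
```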

Inside-Out, 323 model layer, 331–333 pitfalls, 335 presentation layer, 325 template hierarchy, 327–329 views layer, 326–331, 333 P PaaS (Platform-as-a-Service), 136 Page pattern, 390–393, 396 patch decorator, 278, 301 patching, 287 payment systems, testing for, 252 performance testing, 435 Persona, 242, 252, 308–310, 435 PhantomJS, 381–384, 434 Platform-as-a-Service (PaaS), 136 POST requests, 203 processing, 54, 183–187 redirect after, 68 saving to database, 65–67 sending, 51–54, 92 Postgres, 433 private key authentication, 137 programming by wishful thinking, 328, 335 (see also Outside-In TDD) property Decorator, 334 provisioning, 136–140 with Ansible, 423–426 automation in, 166 functional tests (FT) in, 139 overview, 152 vs. deployment, 140 pure unit tests (see isolated tests) py.test, 436 Python adding to Jenkins, 369 PythonAnywhere, 136, 409 R race conditions, 374, 389 Red, Green, Refactor, 58, 89, 170 redirects, 68, 188 refactoring, 40–45 at application level, 183–186 Red, Green, Refactor, 58, 89, 170 removing hard-coded URLs, 187 and test isolation, 341, 362 tips, 190 unit tests, 175 Refactoring Cat, 44, 112 relative import, 161, 173 render to string, 56 REST (Representational Site Transfer), 82 S screenshots, 411 scripts, automated, 132 secret key, 160 Security Engineering (Anderson), 53 security tests, 435 sed (stream editor), 165 Index www.it-ebooks.info | 447 Selenium, 4 and JavaScript, 235 best practices, 385 in continuous integration, 378–381 in continuous integration, 372 race conditions, 389 race conditions in, 378–381 upgrading, 86 for user interaction testing, 37–40 wait patterns, 18, 253, 387, 389 waits in, 379–381, 385 server configuration, 155 server options, 137 servers, 136–140 (see also staging server) session key, 304 sessions, 282 Shining Panda, 369 sinon.js, 265, 268, 272 skips, 170 spiking, 242–255, 275 browser-ID protocol, 244 de-spiking, 251 frontend and JavaScript code, 243 logging, 250 server-side authentication, 245–248 with 
JavaScript, 242 SQLite, 433 staging server creating sessions, 311 debugging in, 306–310 managing database on, 311–306 test automation with CI, 384 staging sites, 132, 133, 135 static files, 116, 122, 132, 149 static folder, site-wide, 256 static live server case, 124 string representation, 215 string substitutions, 103 style (see layout and style) superlists, 8 superusers, 73 system boundaries, 403 system tests, 398 T table styling, 126 template inheritance, 120–121 template inheritance hierarchy, 327 448 | template tag, 53 templates, 40, 55 rendering items in, 69–71 separate, 90 test fixtures, 304, 320 test isolation, 112, 337–363 cleanup after, 359–362 collaborators, 343–345 complexity in, 363 forms layer, 347–350 full isolation, 342 interactions between layers, 355 isolated vs. integrated tests, 362 mocks/mocking for, 338–341 models layer, 351–353 ORM code, 347–351, 364 refactoring in, 341, 362 views layer, 337, 338–347, 353 test methods, 17 test organisation, 190 test skips, 170 test types, 364, 397 test-driven development (TDD) advanced considerations in, 397–404 and developer stupidity, 213 double-loop, 47, 323 further reading on, 404 Inside-Out, 323 iterating towards new design, 86 Java testing in, 234 justifications for, 35–37 new design implementation with, 83–86 Outside-In, 323–335 (see also Outside-In TDD) process flowchart, 83 process recap, 47–50 trivial tests, 36–37 Working state to working state, 86, 110, 112 testing best practices, 397 Testing Goat, 3, 110, 112, cdvii tests, as documentation, 296 thin views, 210 time.sleep, 52 tracebacks, 26, 56 triangulation, 58 U Ubuntu, 137 Index www.it-ebooks.info unit tests architectural solutions for, 402 context manager, 177 desired features of, 401 in Django, 23 for simple home page, 21–33 vs. functional tests, 303 vs. functional tests (FT), 22 vs. 
integrated tests, 61 pros and cons of, 398–401 refactoring, 175 unit-test/code cycle, 31–33 unittest, 134 Unix sockets, 150 Upstart, 151 URLs capturing parameters in, 103 distinct, 102 in Django, 24–30, 88, 94, 96, 102, 106, 108 pointing forms to, 96 urls.py, 27–30 user authentication (see authentication) user creation, 291 user input, saving, 51–75 user interaction testing, 37–40 user stories, 19, 170 validation, 169 (see also functional tests/testing (FT)) model-layer, 175–187 (see also model-layer validation) VCS (version control system), 8–11 view functions, in Django, 24, 89, 94, 105–108 views layer, 337, 338–347, 353 model validation errors in, 178–182 views, what to test in, 223 virtual displays, 372 Virtualbox, 426 virtualenvs, 132, 142–144 W waits, 18, 253, 379–381, 385, 387, 389 warnings, 17 watch function, 265 websockets, 435 widgets, 194, 196 X Xvfb, 369, 373, 410 Y YAGNI, 82 V Vagrant, 426 Index www.it-ebooks.info | 449 About the Author After an idyllic childhood spent playing with BASIC on French 8-bit computers like the Thomson T-07 whose keys go “boop” when you press them, Harry spent a few years being deeply unhappy with economics and management consultancy.


pages: 141 words: 9,896

Pragmatic Guide to JavaScript by Christophe Porteneuve

barriers to entry, commoditize, domain-specific language, en.wikipedia.org, Firefox, web application, WebSocket

26 Using Dynamic Multiple File Uploads The file upload feature currently built into HTML (as in, pre-HTML5) basically blows. It’s single-file, it has no upload progress feedback, it cannot filter on size or file type constraints, and so on. And it uses Base64 encoding, which means every file sent is blown up by 33 percent. Unless we use stuff like WebSockets or SWFUpload, we are stuck with most of these limitations. However, we can improve the user experience a bit by letting users pick multiple files in a nice way. When I say “nice” here, I basically mean “without as many visible file controls as there are files.” I like how 37signals presents lists of files-to-be-uploaded in their products: a flat, icon-decorated list of filenames with the option to remove them from the upload “queue.”


pages: 266 words: 38,397

Mastering Ember.js by Mitchel Kelonye

Firefox, MVC pattern, Ruby on Rails, single page application, web application, WebRTC, WebSocket

For example, the first sample defines its adapter as follows: App.ApplicationAdapter = DS.FixtureAdapter; All adapters need to implement the following methods: find, findAll, findQuery, createRecord, updateRecord, and deleteRecord. These adapters enable applications to stay in sync with various data stores, such as local caches, a browser's local storage or IndexedDB, remote databases through REST, remote databases through RPC, and remote databases through WebSockets. These adapters are, therefore, swappable in case applications need to use different data providers. Ember-data comes with two built-in adapters: the fixtures-adapter and the rest-adapter. The fixtures adapter uses an in-browser cache to store the application's records. This adapter is especially useful when the backend service of the project is either inaccessible for testing or is still being developed.
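To make the idea of a swappable adapter concrete, here is a rough in-memory object exposing the six method names from this excerpt. It is a sketch under stated assumptions: MemoryAdapter is hypothetical, and its argument lists are simplified for illustration and do not match Ember Data's actual signatures.

```javascript
// Hypothetical in-memory adapter exposing the six method names listed above.
// The argument lists are simplified and do not match Ember Data's signatures.
function MemoryAdapter() {
  this.records = {};
  this.nextId = 1;
}

MemoryAdapter.prototype.find = function (id) {
  return this.records[id] || null;
};

MemoryAdapter.prototype.findAll = function () {
  var self = this;
  return Object.keys(this.records).map(function (id) { return self.records[id]; });
};

// Return records whose fields equal every key/value pair in the query.
MemoryAdapter.prototype.findQuery = function (query) {
  return this.findAll().filter(function (record) {
    return Object.keys(query).every(function (key) { return record[key] === query[key]; });
  });
};

MemoryAdapter.prototype.createRecord = function (data) {
  var record = Object.assign({}, data, { id: this.nextId++ });
  this.records[record.id] = record;
  return record;
};

MemoryAdapter.prototype.updateRecord = function (id, changes) {
  if (!this.records[id]) return null;
  Object.assign(this.records[id], changes);
  return this.records[id];
};

MemoryAdapter.prototype.deleteRecord = function (id) {
  var existed = id in this.records;
  delete this.records[id];
  return existed;
};
```

An adapter backed by REST or WebSockets would keep the same six entry points and change only what happens inside them, which is what makes the data provider replaceable.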


pages: 193 words: 46,550

Twisted Network Programming Essentials by Jessica McKellar, Abe Fettig

continuous integration, WebSocket

161 Index A adbapi switching from blocking API to, 77–79 using with SQLite, 78 addBoth method, 36, 56 addCallback method, 26, 31–35, 36 addCallbacks method, 27–28, 33–35, 36 addErrback method, 27–27, 31–35, 36 administrative Python shell, SSH providing, 153–155 Agent API, 53, 55–60 agent.request, 56 AlreadyCalledError, 35 ampoule, 101 API Agent, 53, 55–60 blocking, 77–79, 93 Deferred, 36, 50, 50 (see also Deferreds) platform-independent, 96 producer/consumer, 58 threading, 101 API documentation, using Twisted, 8–8 applications, deploying Twisted, 63–69 Applications, in Twisted application infrastruc‐ ture, 64 Ascher, David, Learning Python, xvi asynchronous code about using Deferreds in, 25 addCallback method vs. addErrback meth‐ od, 33–35 keyfacts about Deferreds, 35 managing callbacks not registered, 25 structure of Deferreds, 26–28 structuring, 25 using callback chains inside of reactor, 28–29 using callback chains outside of reactor, 26– 28 asynchronous headline retriever, 28 asynchronous responses, web server, 49–51 authentication in Twisted applications, 89–90 using public keys for, 151–153, 158 authentication, using Cred about, 81 chat-specific, 121–124 components of, 81–82 examples of, 82–86 process in, 84 AuthOptionMixin class, 89–91 AutobahnPython, Web-Sockets implementa‐ tion, 22 avatar ID, definition of, 82 avatar, definition of, 81 B blocking API, 77–79, 93 blockingApiCall, 95 We’d like to hear your suggestions for improving our indexes. Send email to index@oreilly.com. 
163 blockingCallFromThread method, 96 blogs, for Twisted, 9 browsers GET request, 42 serializing requests to same resource, 51 buildProtocol method, 16, 85, 130 C C compiler, installing, 5 Calderone, JP, “Twisted Conch in 60 Seconds” series, 158 callback chains in Deferreds, 26–28 using inside of reactor, 28–29 using outside of reactor, 26–28 callbacks attaching to non-blocking database queries, 78, 79 attaching to writeSuccessResponse, 97 Deferreds using outside of reactor, 26–28 failing to register, 25 practice using, 30–31 registering multiple, 27–28 callFromThread method, 96 callInThread method, 93 callLater method, 29, 108, 112 callMultipleInThread method, 96 channelOpen method, 158 ChatFactory, 21, 106 ChatProtocol states, 21 chatserver, testing, 106–108 client, 142 (see also web client) communication in Twisted, 19 IRC, 119–121 POP3, 142 simultaneous connections to server, 19 SMTP, 127–129, 143 SSH, 156–158 TCP echo, 11–16 ClientCommandTransport class, 158 ClientConnection class, 158 clients IMAP, 137–139 closed method, 158 ColorizedLogObserver, 74 commands standard library module, 96–100 conchFactory, manhole_ssh, 155 ConchUser class, 149 164 | Index connection.SSHConnection class, 156 connectionLost method, 15, 56 connectionMade method, 15, 98, 99 connectionSecure method, 158 connectTCP method, 14 Cred authentication system about, 81 chat-specific, 121–124 components of, 81–82 examples of, 82–86 process in, 84 SSH server, 145–146 credentialInterfaces class variable, 87 credentialInterfaces, authenticating, 85 credentials checkers database-backed, 87–88 DBCredentialsChecker, 87–88, 110–112 definition of, 82 FilePasswordDB, 86 IMAP, 133 in UNIX systems, 91 POP3, 139 returning Deferred to Portal, 85 SSH server, 145–146, 153 credentials, definition of, 81 curses library, 150 D DailyLogFile class, 73 data, streaming large amounts of, 58 databases, non-blocking queries, 77–79 dataReceived method, 15, 56 dataReceived methods, IProtocol interface, 20 
DBCredentialsChecker, 87–88, 110–112 decoupling, transports and protocols, 16 deferLater method, 94 Deferreds about Deferred API, 36 agent.request returning, 56 asynchronous responses on web server us‐ ing, 50 credentials checker to Portal, 85 in non-blocking database queries, 78, 79 keyfacts about, 35 POP3 client returning, 142 practice using, 30–35 shutting down reactor before firing, 95 testing, 109–112 using callback chains inside of reactor, 28–29 using callback chains outside of reactor, 26– 28 using in asynchronous code, 25 deferToThread method, 93 DirtyReactorAggregateError, 110 Dive Into Python (Pilgrim), xvi downloading Python, xvi TortoiseSVN, 6 Twisted, 3 web resources, 54–55 downloadPage helper, 54–55 dynamic content, serving, 45 dynamic URL dispatch, 46–48 E echo application, turning echo server into, 64 echo bot IRC, 119–121 talking in #twisted-bots with, 122 Echo protocol, testing, 104–105 echo TCP servers and clients, 11–16 EchoFactory class, 16, 64, 72, 85, 90, 104 emails IMAP client for, 137–139 POP3 servers for, 139–143 sending using SMTP, 127–128 serving messages using IMAP, 133–137 storing using SMTP servers, 130–132 emit method, 74 errbacks attaching to non-blocking database queries, 78 Deferreds using outside of reactor, 26–27 practice using, 30–33 errReceived method, 99 event-driven programming, 12–14 F fakeRunqueryMatchingPassword, 111–112 FileLogObserver, 73–74 FilePasswordDB credential checker, 86 Free Software (Open Source movements), x, xiii G GET requests handling, 43–48 making HTTP, 40–42 getHost method, ITransport interface, 14 getManholeFactory function, 155–155 getPage helper, 53–54 getPassword method, 158 getPeer method, ITransport interface, 14 getPrivateKey method, 158 getPrivateKeyString functions, 150, 150 getProcessOutput method, 96, 97, 98 getProcessValue method, 96 getPu blicKey method, 158 getPublicKeyString functions, 150, 150 H headline retriever, asynchronous, 28 HistoricRecvLine, 145, 149 HTTP client, 55 (see also web 
client) Agent API, 55–60 HTTP GET request, 40–42 HTTP HEAD request, 57–57 HTTP servers, 39 (see also web servers) about, 39 parsing requests, 42 responding to requests, 39–42 tutorials related to, 51 HTTPEchoFactory, 40, 66 I IAccount, imap4, 133 IAvatar, 150 IBodyProducer interface, 58 IChatService interface, InMemoryWordsRealm implementing, 121 ICredentialsChecker interface, 87 IMailbox, imap4, 133 IMAP (Internet Message Access Protocol) about, 125, 132 clients, 137–139 servers, 133–137 imap4 IAccount, 133 IMailbox, 133 IMessage, 133 IMessage, imap4, 133 IMessageDelivery interface, 130 in-application logging, 71–73 Index | 165 inConnectionLost method, 99, 99 infrastructure, Twisted application, 63–67 InMemoryWordsRealm, implementing IChat‐ Service interface, 121 installing Twisted, 3–6 insults library, 150 integration-friendly platform, xv IPlugin class, 67 IProcessProtocol, 98 IProtocol interface methods, 15 IProtocolAvatar interface, 85 IRC channels, for Twisted, 9 IRC clients, 119–121 IRC servers, 121–124 IRCFactory, 122 IRCUser protocol, 121 irc_* handler, implementing, 122 IResource interface, 44 irssi, connecting to twisted IRC server using, 122 IService interface, implementing, 64 IServiceMaker class, 67 ISession, 149 ISSH PrivateKey, 153 ITransport interface methods, 14 IUsernameHashedPassword, 88–88 K key-based authentication, supporting both user‐ name/password and, 153 Klein micro-web framework, 51 L Learning Python (Lutz and Ascher), xvi Lefkowitz, Matthew “the Glyph”, ix–xi lineReceived method, 40, 106, 145, 149 lineReceived methods, 20 LineReceiver, 97, 145 Linux installing PyCrypto for, 4 installing pyOpenSSL for, 4 installing Twisted on, 3–4 Linux distributions, OpenSSH SSH implemen‐ tation on, 146 listenTCP method, 14, 40 listSize method, 142 log.addObserver, 74 logging systems, 71–75 166 | Index LogObserver, 74 LoopingCall, 94 loseConnection method, ITransport interface, 14 Lutz, Mark, Learning Python, xvi M Mac OS X, 146 (see also OS X) OpenSSH 
SSH implementation on, 146 mail (see emails) Maildir IMAP server, 133–137 storage format, 130–132 using POP3, 139 mailing lists, for Twisted, 8–7 makeConnection method, 15 manhole_ssh, 155–155 manhole_ssh.ConchFactory class, 156 myCallback function, 26 myErrback function, 27 MyHTTP protocol, 43 MySQL, non-blocking interface for, 77 N namespace argument, 155–155 non-blocking code, using Deferreds in, 25 NOT_DONE_YET method, 50 nslookup command, 126 O Open Source movements (Free Software), x, xiii openShell method, 145–146, 150–150 OpenSSH SSH implementation, 146 optParameters instance variable, 67 OS X installing PyCrypto for, 5 installing pyOpenSSL for, 5 installing Twisted on, 5 OpenSSH SSH implementation on, 146 outConnectionLost method, 99 outReceived method, 99 P parsing HTTP requests, 42–43 passage of time, testing, 112–114 PasswordAuth class, 158 pauseProducing method, 58 persistent protocol state, stored in protocol fac‐ tory, 19 Pilgrim, Mark, Dive Into Python, xvi Planet Twisted blogs, 9 platform-independent API, 96 Plugins, in Twisted application infrastructure, 66–67, 69 POP3 (Post Office Protocol version 3) about, 125 servers, 139–143 Portal definition of, 82 IMAP, 133 in Cred authentication process, 85 POP3, 139 SSH server, 145–146 POST HTTP data, with Agent, 58 POST requests, handling, 48–49 Postgres, non-blocking interface for, 77 printing to stderr if headline is too long, 28 web resource, 53–54 printResource method, 56 private keys generating for SSH server, 150 RSA, 150 processEnded method, 99 processExited method, 99 ProcessProtocol, 98, 99 producer/consumer API, streaming large amounts of data using, 58 protocol code, mixing application-specific logic with, 22 protocol factories about, 16 IMAP server, 133 in Cred authentication process, 85–85 in HTTP GET request, 40, 43 persistent protocol state stored in, 19 POP3, 139 SMTP server, 130 protocol state machines, 19–21 protocols about, 15–16 creating subclass ResourcePrinter, 56 custom process, 
98–100 decoupling, 16 HistoricRecvLine vs. regular, 149 IMAP server, 133 in Twisted Mail, 125 IRCUser, 121 POP3, 139 retrieving reason for terminated connection, 19 service implementations, 64 SMTP, 126–127 SSH server, 145–146, 149 testing, 104–108 Twisted Words, 119–124 proto_helpers, 104–105 public keys generating for SSH server, 150 using for authentication, 151, 158 PublicKeyCre dentialsChecker, 153 putChild method, 44 PyCrypto, installing for Linux, 4, 4 for OS X, 5, 5 Python about, xiii checking version of, 7 resources for learning and downloading, xvi Python shell, SSH providing administrative, 153–155 python-crypto,packages, for Windows, 4 python-openssl packages, for Windows, 4 python-twisted packages, 3 Q queries, non-blocking database, 77–79 quote, TCP servers and clients, 16–19 R reactor in serving static content, 44–44 shutting down before events complete, 95 testing and, 108–114 using callback chains inside of, 28–29 reactor event loop, 14 Realm IMAP, 133 POP3, 139 SSH server, 145–146, 150 realm, definition of, 82 receivedHeader method, 130 RecvLine class, 145 Index | 167 redirects, dynamic URL dispatch, 48 release tarball, installing Twisted from, 6 remote server using SSH, running commands on, 156–158 render_GET method, 46, 48, 50 render_POST method, 46 request blocks, rendering on web servers, 49–51 requestAvatar method, 86, 150, 153 requestAvatarId method, 87, 88, 153 requestAvatarID method, 110 Resource hierarchies, extending by registering child resources, 45 Resource subclass, defining dynamic resource by, 45 ResourcePrinter subclass, 56 resources, for answering questions about Twist‐ ed, 8–7 Response body, handling through agent.request, 56 Response metadata, retrieving, 57–57 resumeProducing method, 58 retrieve method, 142 rotateLength, 72, 73 RSA private keys, for SSH server, 150–150 RSA.generate, as blocking function, 150 RunCommand, 97 RunCommandFactory, 97 S Safari Books Online, xvii Scripts directory, adding to PATH in Windows, 4–5 
sendData method, IProtocol interface, 20 sendLine methods, 20 sendRequest, 158 server, 51 (see also web server) client simultaneous connections to, 19 communication in Twisted, 19 examples at Twisted Web examples directo‐ ry, 51 IMAP, 133–137 IRC, 121–124 POP3, 139–143 SMTP, 128–132 SSH creating, 145–150 supporting both username/password and key-based authentication on, 153 168 | Index twisted.conch communicationg with, 156–158 TCP echo, 11–16 service plugin, components of, 67 Services, in twisted application infrastructure, 64 serviceStarted method, 158 serving dynamic content, 45 static content, 43–45 setResponseCode, 43 slowFunction, 109 SMTP (Simple Mail Transfer Protocol) about, 125 protocol, 126–127 sending emails using, 127–128 servers, 128–132 tutorial for building client, 143 source, installing Twisted from, 6 spawnProcess method, 98, 99 SQLite non-blocking interface for, 77 using adbapi with, 78 SSH (Secure SHell) about, 145 clients, 156–158 getting error on local machine, 149 providing administrative Python shell, 153– 155 running commands on remote server, 156– 158 server creating, 145–150 supporting both username/password and key-based authentication on, 153 using public keys for authentication, 151–153 ssh-keygen, using in Windows, 146 SSHDemoAvatar class, 149 SSHDemoProtocol class, 149 Stack Overflow programming Q & A site, for Twisted, 9 startLogging, 74 startProducing method, 58 startService method, 64 static content, serving, 43–45 static URL dispatch, 44 stderr, printing if headline is too long to, 28 stdout, logging to, 71–72 StdoutMessageDelivery, 130 StdoutSMTPFactory, 130 stopProducing method, 58 stopService method, 64 storing mail, 130 streaming, large amounts of data, 58 StringProducer, constructing, 58–60 StringTransport class, 104–105 subprocesses, running, 96–100 subproject documentation, using Twisted, 8 svn (subversion) repository, Twisted, 6 T TAC (Twisted Application Configuration) files, in Twisted application infrastructure, 64–65, 
69 task module method, 94 TCP servers and clients echo, 11–16 quote, 16–19 TCP, HTTP using as transport-layer protocol, 40 telnet connections, terminating, 21 telnet utility, 40 TerminalRealm, manhole_ssh, 155 testing about, 103 Deferreds, 109–112 passage of time, 112–114 protocols, 104–108 reactor and, 108 writing and running unit tests with trial, 103–104 test_slowFunction, 109 threaded calls, making, 93–96, 101 threading API, 101 TortoiseSVN, downloading, 6 transport.SSHClientTransport class, 156 transports about, 14 decoupling, 16 twistd examples of, 68–68 in Twisted application infrastructure, 65–66 logging, 73 Twisted about, ix–xi, xiii–xv downloading and installing, 3–6 resources for answering questions about, 8– 7 svn repository, 6 testing installation of, 7–7 using API documentation, 8 Twisted Application Configuration (TAC) files, in Twisted application infrastructure, 64–65 Twisted applications authentication in, 89–91 deploying, 63–69 Twisted Conch examples, 158 Twisted Conch HOWTO, walking through im‐ plementing SSH client, 158 “Twisted Conch in 60 Seconds” series (Calder‐ one), 158 Twisted Core examples directory, 22 networking libraries, 8 Twisted Core HOWTO documents on Deferreds, 36 plugin discussion at, 69–69 TAC discussion at, 69–69 threads discussion at, 101 “Twisted From Scratch” tutorial, 22 Twisted Cred about, 81 authentication process in, 84 chat-specific authentication using, 121–124 components of, 81–82 examples of, 82–86 using on SSH server to support authentica‐ tion, 151–146 #twisted IRC channel, 9 Twisted Mail about, 125 examples directory, 143 Twisted Mail HOWTOtutorial, for building SMTP client, 143 Twisted Web Client HOWTO, discussing Agent API at, 60 Twisted Web HOWTO, tutorials related to HTTP servers, 51 Twisted Words, 119–124 #twisted-bots, talking with echo bot in, 122 twisted-python, mailing list, 8–9 twisted.application.service.Application, creating instance, 64–65 twisted.conch about, 145 Index | 169 communicationg with 
server using SSH, 156–158 writing SSH server and, 145 twisted.conch.avatar.ConchUser class, 149 twisted.conch.common.NS function, 158 twisted.conch.interfaces.IAvatar, 150 twisted.conch.interfaces.ISession, 149 twisted.conch.manhole_ssh module, 153 twisted.conch.recvline, 145, 149 twisted.conch.ssh.keys module, 150 twisted.enterprise.adbapi, as non-blocking in‐ terface, 77 twisted.internet.protocol.ProcessProtocol, 98 twisted.internet.task Clock class, 112 LoopingCall, 94 twisted.trial.unittest, 103–104 twisted.web implementations for common resources contained on, 44 mailing list, 9 parsing http requests from, 42–43 server, handling GET requests, 43–48 twisted.web.client downloadPage, 54–55 getPage, 53–54 initializing Agent, 55–56 U Ubuntu PPA, packages for Twisted, 4 unit tests, writing and running with trial, 103– 104 unittest framework, 103 unittest.tearDown test method, 108 UNIX systems curses library in, 150 using credentials checker in, 91 URL dispatch dynamic, 46–48 static, 44 userauth.SSHUserAuthClient class, 156, 158 170 | Index username/password, supporting both key-based authentication and, 153 V validateFrom method, 130 validateTo method, 130 verifyHostKey method, 158 verifySignature, 153 W wantReply, keyword argument, 158 web browsers GET request, 42 serializing requests to same resource, 51 web clients, Agent API, 55–60 web resources, downloading, 54–55 web servers about, 39 asynchronous responses on, 49–51 handling GET requests, 43–48 handling POST requests, 48–49 parsing requests, 42–43 responding to requests, 39–42 Windows adding the Scripts directory to PATH in, 4–5 installing PyCrypto for, 4 installing pyOpenSSL for, 4 installing Twisted on, 4–5 using ssh-keygen, 146 Wokkel library, 122 write method, ITransport interface, 14 writeSequence method, ITransport interface, 14 writeSuccessResponse, attaching callback to, 97 Z zope.interface import implements, 58 installing, 6 About the Authors Jessica McKellar is a software engineer from Cambridge, 
Massachusetts.


pages: 190 words: 52,865

Full Stack Web Development With Backbone.js by Patrick Mulder

Airbnb, create, read, update, delete, Debian, functional programming, Kickstarter, MVC pattern, node package manager, Ruby on Rails, side project, single page application, web application, WebSocket

Parsing raw data also applies to situations where you are working with non-RESTful APIs, such as data from sockets. For those cases, you can override parts of (or all of) the syncing behavior. The documentation is a good place to start if you need to override the default Backbone sync behavior (e.g., when you want to connect application state to websockets). The annotated source code of Backbone.js has a nice list of use cases for overriding Backbone.sync, which might be important. To start working with an API, we first explore the mapping of Backbone.sync to “read” movies. The data is provided by canned, with the setup described in “Mocking an API” on page 85.
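As a rough illustration of replacing the default sync behavior, the sketch below builds a function with the same (method, model, options) shape as Backbone's sync and forwards each operation as a JSON message over a socket. The makeSocketSync name and the message format are assumptions made for this example, and response handling through options is deliberately left out.

```javascript
// Hypothetical factory returning a function with the same (method, model, options)
// shape as Backbone's sync. The socket only needs a send(string) method; the
// message format is an assumption made for this example.
function makeSocketSync(socket) {
  return function (method, model, options) {
    // method is one of "create", "read", "update", "patch", or "delete".
    var message = { method: method, id: model.id };
    if (method !== "read" && method !== "delete") {
      message.data = model.toJSON(); // only writes carry the attributes
    }
    socket.send(JSON.stringify(message));
  };
}

// Usage with stand-ins for the socket and the model.
var sentMessages = [];
var demoSocket = { send: function (text) { sentMessages.push(text); } };
var demoModel = { id: 7, toJSON: function () { return { id: 7, title: "Example" }; } };
var socketSync = makeSocketSync(demoSocket);
socketSync("update", demoModel, {});
console.log(sentMessages[0]);
```

In a Backbone application, a function like this could be assigned to Backbone.sync globally or defined as the sync method of a single model or collection. A fuller version would also call the success or error callbacks in options when the server replies.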


Elixir in Action by Saša Jurić

demand response, en.wikipedia.org, fault tolerance, finite state, functional programming, general-purpose programming language, place-making, Ruby on Rails, WebSocket

For example, GenStage (https://github.com/elixir-lang/gen_stage) can be used for back-pressure and load control. The Phoenix.Channel module, which is part of the Phoenix web framework (http://phoenixframework.org/), is used to facilitate bidirectional communication between a client and a web server over protocols such as WebSocket or HTTP. There isn’t enough space in this book to treat every possible OTP-compliant abstraction, so you’ll need to do some research of your own. But it’s worth pointing out that most such abstractions follow the ideas of GenServer. Except for the Task module, all of the OTP abstractions mentioned in this section are internally implemented on top of GenServer.

For example, let’s say that when handling a web request you start a longer-running task that communicates with the payment gateway. You could start the task and immediately respond to the user that the request has been accepted. Once the task is done, the server would issue a notification about the outcome, perhaps via WebSocket or an email. Or suppose a task needs to produce a side effect, such as a database update, without notifying the starter process. In either scenario, the starter process doesn’t need to be notified about the task’s outcome. Furthermore, in some cases you won’t want to link the task process to the starter process.


pages: 560 words: 135,629

Eloquent JavaScript: A Modern Introduction to Programming by Marijn Haverbeke

always be closing, domain-specific language, Donald Knuth, en.wikipedia.org, Firefox, functional programming, hypertext link, job satisfaction, MITM: man-in-the-middle, premature optimization, slashdot, web application, WebSocket

We can arrange for the client to open the connection and keep it around so that the server can use it to send information when it needs to do so. But an HTTP request allows only a simple flow of information: the client sends a request, the server comes back with a single response, and that is it. There is a technology called WebSockets, supported by modern browsers, that makes it possible to open connections for arbitrary data exchange. But using them properly is somewhat tricky. In this chapter, we use a simpler technique—long polling—where clients continuously ask the server for new information using regular HTTP requests, and the server stalls its answer when it has nothing new to report.
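The stalling behavior that defines long polling can be modeled without any networking: a poll is answered at once when there is news, and otherwise the response is held until the next change. The following is a minimal in-memory sketch. UpdateBoard and its method names are hypothetical and are not taken from the book's project code.

```javascript
// Hypothetical in-memory model of long polling; no real HTTP is involved.
// poll() plays the role of a request handler, and respond() the role of
// sending the HTTP response.
class UpdateBoard {
  constructor() {
    this.version = 0;
    this.entries = [];
    this.waiting = []; // responses the server is currently stalling
  }

  snapshot() {
    return { version: this.version, entries: this.entries.slice() };
  }

  // A client reports the last version it saw. Answer at once if there is
  // news; otherwise hold the response until something changes.
  poll(sinceVersion, respond) {
    if (sinceVersion < this.version) {
      respond(this.snapshot());
    } else {
      this.waiting.push(respond);
    }
  }

  // A change bumps the version and releases every stalled response.
  add(entry) {
    this.entries.push(entry);
    this.version += 1;
    const stalled = this.waiting;
    this.waiting = [];
    stalled.forEach((respond) => respond(this.snapshot()));
  }
}

const board = new UpdateBoard();
board.poll(0, (update) => console.log("client got", update)); // stalls: nothing new yet
board.add("first entry");                                      // releases the stalled poll
```

A client built on this pattern would typically issue its next poll as soon as a response arrives, passing along the version it just received. A real server would presumably also time out stalled requests so that connections are not held open indefinitely.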

Eloquent JavaScript, 3rd Edition is set in New Baskerville, Futura, Dogma, and TheSansMono Condensed.


pages: 1,331 words: 183,137

Programming Rust: Fast, Safe Systems Development by Jim Blandy, Jason Orendorff

bioinformatics, bitcoin, Donald Knuth, Elon Musk, Firefox, functional programming, mandelbrot fractal, MVC pattern, natural language processing, side project, sorting algorithm, speech recognition, Turing test, type inference, WebSocket

(io::stderr(), "usage: http-get URL").unwrap(); return; } if let Err(err) = http_get_main(&args[1]) { writeln!(io::stderr(), "error: {}", err).unwrap(); } } The iron framework for HTTP servers offers high-level touches such as the BeforeMiddleware and AfterMiddleware traits, which help you compose an app from pluggable parts. The websocket crate implements the WebSocket protocol. And so on. Rust is a young language with a busy open source ecosystem. Support for networking is rapidly expanding. Chapter 19. Concurrency In the long run it is not advisable to write large concurrent programs in machine-oriented languages that permit unrestricted use of store locations and their addresses.


The Manager’s Path by Camille Fournier

failed state, fear of failure, hiring and firing, hive mind, interchangeable parts, job automation, Larry Wall, microservices, pull request, risk tolerance, Schrödinger's Cat, side project, Steve Jobs, WebSocket

Getting a sense of where the product roadmap is going helps you guide the technical roadmap. Many technical projects are supported on the strength of their ability to enable new features more easily—for example, rewriting the checkout system to plug in payment types like Apple Pay, or moving to a new JavaScript framework model that supports streaming data changes via WebSockets, in order to build a more interactive experience. Start asking the product team questions about what the future might look like, and spend some time keeping up with technological developments that might change the way you think about the software you’re writing or the way you’re operating it. REVIEW THE OUTCOME OF YOUR DECISIONS AND PROJECTS Talk about whether the hypotheses you used to motivate projects actually turned out to be true.


Learn Algorithmic Trading by Sebastien Donadio

active measures, algorithmic trading, automated trading system, backtesting, Bayesian statistics, buy and hold, buy low sell high, cryptocurrency, DevOps, en.wikipedia.org, fixed income, Flash crash, Guido van Rossum, latency arbitrage, locking in a profit, market fundamentalism, market microstructure, martingale, natural language processing, p-value, paper trading, performance metric, prediction markets, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, sorting algorithm, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, type inference, WebSocket, zero-sum game

These protocols identify each field by a fixed offset rather than by a tag. For instance, instead of sending 39=2, the OUCH protocol places the value 2 at an offset of 20. The New York Stock Exchange (NYSE) uses UTP Direct, which is similar to the NASDAQ protocols. Cryptocurrency exchanges typically expose RESTful APIs over HTTP or WebSocket connections. All of these protocols provide different ways to represent financial exchange information, and they share the same goal: price updates and order handling. Summary In this chapter, we learned that trading system communication is key to trading. The trading system is in charge of collecting the required prices to make an informed decision.
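The difference between tag=value encoding and fixed-offset encoding can be sketched as follows. This is a minimal illustration, not a real protocol implementation: the helper names `fix_field` and `fixed_offset_field` are hypothetical, and the binary layout (20 filler bytes followed by a one-byte status) is invented purely to place a value at offset 20 as in the example above. It does not reproduce an actual OUCH message.

```python
SOH = "\x01"  # FIX separates tag=value fields with the SOH control character


def fix_field(message, tag):
    """Look up a field by its tag in a tag=value (FIX-style) message."""
    for field in message.split(SOH):
        if not field:
            continue
        key, _, value = field.partition("=")
        if key == tag:
            return value
    return None


def fixed_offset_field(message, offset, length=1):
    """Read a field from a fixed position in a binary message."""
    return message[offset:offset + length]


# FIX style: each field announces itself with a tag (39 is the order status).
fix_message = SOH.join(["35=8", "39=2", "55=MSFT"]) + SOH
fix_status = fix_field(fix_message, "39")

# Fixed-offset style: no tags; position alone identifies the field.
# HYPOTHETICAL layout: 20 filler bytes, then one status byte at offset 20.
binary_message = b"\x00" * 20 + b"2" + b"\x00" * 4
binary_status = fixed_offset_field(binary_message, offset=20)
```

The fixed-offset form avoids scanning and parsing tags, at the cost of every party having to agree on the exact message layout in advance.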


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, Ethereum, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, functional programming, general-purpose programming language, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, Kubernetes, loose coupling, Marc Andreessen, microservices, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, surveillance capitalism, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

Thus, the state on the device is a stale cache that is not updated unless you explicitly poll for changes. (HTTP-based feed subscription protocols like RSS are really just a basic form of polling.) More recent protocols have moved beyond the basic request/response pattern of HTTP: server-sent events (the EventSource API) and WebSockets provide communication channels by which a web browser can keep an open TCP connection to a server, and the server can actively push messages to the browser as long as it remains connected. This provides an opportunity for the server to actively inform the end-user client about any changes to the state it has stored locally, reducing the staleness of the client-side state.
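To make the server-push idea more concrete, here is a rough sketch of how a client might interpret the text format used by server-sent events. It is a simplified parser written for illustration: the function name `parse_event_stream` and the sample payload are assumptions, and it handles only the `event` and `data` fields, comment lines, and blank-line dispatch. A browser's EventSource implementation also handles `id`, `retry`, reconnection, and other line endings.

```python
def parse_event_stream(text):
    """Parse a simplified text/event-stream body into (event, data) pairs."""
    events = []
    event_type = "message"  # default type when no "event:" field is given
    data_lines = []
    for line in text.split("\n"):
        if line == "":
            # A blank line ends the current event and dispatches it.
            if data_lines:
                events.append((event_type, "\n".join(data_lines)))
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # Other fields (id, retry) are ignored in this sketch.
    return events


# HYPOTHETICAL stream: a keep-alive comment followed by two pushed events.
stream = (
    ": keep-alive\n"
    "data: first\n"
    "\n"
    "event: state-changed\n"
    "data: {\"balance\": 42}\n"
    "\n"
)
received = parse_event_stream(stream)
```

Each dispatched event is something the server chose to send without a new request from the client, which is what lets the client-side state stay fresher than it would with periodic polling.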

The opposite of bounded. 558 | Glossary Index A aborts (transactions), 222, 224 in two-phase commit, 356 performance of optimistic concurrency con‐ trol, 266 retrying aborted transactions, 231 abstraction, 21, 27, 222, 266, 321 access path (in network model), 37, 60 accidental complexity, removing, 21 accountability, 535 ACID properties (transactions), 90, 223 atomicity, 223, 228 consistency, 224, 529 durability, 226 isolation, 225, 228 acknowledgements (messaging), 445 active/active replication (see multi-leader repli‐ cation) active/passive replication (see leader-based rep‐ lication) ActiveMQ (messaging), 137, 444 distributed transaction support, 361 ActiveRecord (object-relational mapper), 30, 232 actor model, 138 (see also message-passing) comparison to Pregel model, 425 comparison to stream processing, 468 Advanced Message Queuing Protocol (see AMQP) aerospace systems, 6, 10, 305, 372 aggregation data cubes and materialized views, 101 in batch processes, 406 in stream processes, 466 aggregation pipeline query language, 48 Agile, 22 minimizing irreversibility, 414, 497 moving faster with confidence, 532 Unix philosophy, 394 agreement, 365 (see also consensus) Airflow (workflow scheduler), 402 Ajax, 131 Akka (actor framework), 139 algorithms algorithm correctness, 308 B-trees, 79-83 for distributed systems, 306 hash indexes, 72-75 mergesort, 76, 402, 405 red-black trees, 78 SSTables and LSM-trees, 76-79 all-to-all replication topologies, 175 AllegroGraph (database), 50 ALTER TABLE statement (SQL), 40, 111 Amazon Dynamo (database), 177 Amazon Web Services (AWS), 8 Kinesis Streams (messaging), 448 network reliability, 279 postmortems, 9 RedShift (database), 93 S3 (object storage), 398 checking data integrity, 530 amplification of bias, 534 of failures, 364, 495 Index | 559 of tail latency, 16, 207 write amplification, 84 AMQP (Advanced Message Queuing Protocol), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 message ordering, 446 
analytics, 90 comparison to transaction processing, 91 data warehousing (see data warehousing) parallel query execution in MPP databases, 415 predictive (see predictive analytics) relation to batch processing, 411 schemas for, 93-95 snapshot isolation for queries, 238 stream analytics, 466 using MapReduce, analysis of user activity events (example), 404 anti-caching (in-memory databases), 89 anti-entropy, 178 Apache ActiveMQ (see ActiveMQ) Apache Avro (see Avro) Apache Beam (see Beam) Apache BookKeeper (see BookKeeper) Apache Cassandra (see Cassandra) Apache CouchDB (see CouchDB) Apache Curator (see Curator) Apache Drill (see Drill) Apache Flink (see Flink) Apache Giraph (see Giraph) Apache Hadoop (see Hadoop) Apache HAWQ (see HAWQ) Apache HBase (see HBase) Apache Helix (see Helix) Apache Hive (see Hive) Apache Impala (see Impala) Apache Jena (see Jena) Apache Kafka (see Kafka) Apache Lucene (see Lucene) Apache MADlib (see MADlib) Apache Mahout (see Mahout) Apache Oozie (see Oozie) Apache Parquet (see Parquet) Apache Qpid (see Qpid) Apache Samza (see Samza) Apache Solr (see Solr) Apache Spark (see Spark) 560 | Index Apache Storm (see Storm) Apache Tajo (see Tajo) Apache Tez (see Tez) Apache Thrift (see Thrift) Apache ZooKeeper (see ZooKeeper) Apama (stream analytics), 466 append-only B-trees, 82, 242 append-only files (see logs) Application Programming Interfaces (APIs), 5, 27 for batch processing, 403 for change streams, 456 for distributed transactions, 361 for graph processing, 425 for services, 131-136 (see also services) evolvability, 136 RESTful, 133 SOAP, 133 application state (see state) approximate search (see similarity search) archival storage, data from databases, 131 arcs (see edges) arithmetic mean, 14 ASCII text, 119, 395 ASN.1 (schema language), 127 asynchronous networks, 278, 553 comparison to synchronous networks, 284 formal model, 307 asynchronous replication, 154, 553 conflict detection, 172 data loss on failover, 157 reads from asynchronous 
follower, 162 Asynchronous Transfer Mode (ATM), 285 atomic broadcast (see total order broadcast) atomic clocks (caesium clocks), 294, 295 (see also clocks) atomicity (concurrency), 553 atomic increment-and-get, 351 compare-and-set, 245, 327 (see also compare-and-set operations) replicated operations, 246 write operations, 243 atomicity (transactions), 223, 228, 553 atomic commit, 353 avoiding, 523, 528 blocking and nonblocking, 359 in stream processing, 360, 477 maintaining derived data, 453 for multi-object transactions, 229 for single-object writes, 230 auditability, 528-533 designing for, 531 self-auditing systems, 530 through immutability, 460 tools for auditable data systems, 532 availability, 8 (see also fault tolerance) in CAP theorem, 337 in service level agreements (SLAs), 15 Avro (data format), 122-127 code generation, 127 dynamically generated schemas, 126 object container files, 125, 131, 414 reader determining writer’s schema, 125 schema evolution, 123 use in Hadoop, 414 awk (Unix tool), 391 AWS (see Amazon Web Services) Azure (see Microsoft) B B-trees (indexes), 79-83 append-only/copy-on-write variants, 82, 242 branching factor, 81 comparison to LSM-trees, 83-85 crash recovery, 82 growing by splitting a page, 81 optimizations, 82 similarity to dynamic partitioning, 212 backpressure, 441, 553 in TCP, 282 backups database snapshot for replication, 156 integrity of, 530 snapshot isolation for, 238 use for ETL processes, 405 backward compatibility, 112 BASE, contrast to ACID, 223 bash shell (Unix), 70, 395, 503 batch processing, 28, 389-431, 553 combining with stream processing lambda architecture, 497 unifying technologies, 498 comparison to MPP databases, 414-418 comparison to stream processing, 464 comparison to Unix, 413-414 dataflow engines, 421-423 fault tolerance, 406, 414, 422, 442 for data integration, 494-498 graphs and iterative processing, 424-426 high-level APIs and languages, 403, 426-429 log-based messaging and, 451 maintaining derived 
state, 495 MapReduce and distributed filesystems, 397-413 (see also MapReduce) measuring performance, 13, 390 outputs, 411-413 key-value stores, 412 search indexes, 411 using Unix tools (example), 391-394 Bayou (database), 522 Beam (dataflow library), 498 bias, 534 big ball of mud, 20 Bigtable data model, 41, 99 binary data encodings, 115-128 Avro, 122-127 MessagePack, 116-117 Thrift and Protocol Buffers, 117-121 binary encoding based on schemas, 127 by network drivers, 128 binary strings, lack of support in JSON and XML, 114 BinaryProtocol encoding (Thrift), 118 Bitcask (storage engine), 72 crash recovery, 74 Bitcoin (cryptocurrency), 532 Byzantine fault tolerance, 305 concurrency bugs in exchanges, 233 bitmap indexes, 97 blockchains, 532 Byzantine fault tolerance, 305 blocking atomic commit, 359 Bloom (programming language), 504 Bloom filter (algorithm), 79, 466 BookKeeper (replicated log), 372 Bottled Water (change data capture), 455 bounded datasets, 430, 439, 553 (see also batch processing) bounded delays, 553 in networks, 285 process pauses, 298 broadcast hash joins, 409 Index | 561 brokerless messaging, 442 Brubeck (metrics aggregator), 442 BTM (transaction coordinator), 356 bulk synchronous parallel (BSP) model, 425 bursty network traffic patterns, 285 business data processing, 28, 90, 390 byte sequence, encoding data in, 112 Byzantine faults, 304-306, 307, 553 Byzantine fault-tolerant systems, 305, 532 Byzantine Generals Problem, 304 consensus algorithms and, 366 C caches, 89, 553 and materialized views, 101 as derived data, 386, 499-504 database as cache of transaction log, 460 in CPUs, 99, 338, 428 invalidation and maintenance, 452, 467 linearizability, 324 CAP theorem, 336-338, 554 Cascading (batch processing), 419, 427 hash joins, 409 workflows, 403 cascading failures, 9, 214, 281 Cascalog (batch processing), 60 Cassandra (database) column-family data model, 41, 99 compaction strategy, 79 compound primary key, 204 gossip protocol, 216 hash 
partitioning, 203-205 last-write-wins conflict resolution, 186, 292 leaderless replication, 177 linearizability, lack of, 335 log-structured storage, 78 multi-datacenter support, 184 partitioning scheme, 213 secondary indexes, 207 sloppy quorums, 184 cat (Unix tool), 391 causal context, 191 (see also causal dependencies) causal dependencies, 186-191 capturing, 191, 342, 494, 514 by total ordering, 493 causal ordering, 339 in transactions, 262 sending message to friends (example), 494 562 | Index causality, 554 causal ordering, 339-343 linearizability and, 342 total order consistent with, 344, 345 consistency with, 344-347 consistent snapshots, 340 happens-before relationship, 186 in serializable transactions, 262-265 mismatch with clocks, 292 ordering events to capture, 493 violations of, 165, 176, 292, 340 with synchronized clocks, 294 CEP (see complex event processing) certificate transparency, 532 chain replication, 155 linearizable reads, 351 change data capture, 160, 454 API support for change streams, 456 comparison to event sourcing, 457 implementing, 454 initial snapshot, 455 log compaction, 456 changelogs, 460 change data capture, 454 for operator state, 479 generating with triggers, 455 in stream joins, 474 log compaction, 456 maintaining derived state, 452 Chaos Monkey, 7, 280 checkpointing in batch processors, 422, 426 in high-performance computing, 275 in stream processors, 477, 523 chronicle data model, 458 circuit-switched networks, 284 circular buffers, 450 circular replication topologies, 175 clickstream data, analysis of, 404 clients calling services, 131 pushing state changes to, 512 request routing, 214 stateful and offline-capable, 170, 511 clocks, 287-299 atomic (caesium) clocks, 294, 295 confidence interval, 293-295 for global snapshots, 294 logical (see logical clocks) skew, 291-294, 334 slewing, 289 synchronization and accuracy, 289-291 synchronization using GPS, 287, 290, 294, 295 time-of-day versus monotonic clocks, 288 timestamping 
events, 471 cloud computing, 146, 275 need for service discovery, 372 network glitches, 279 shared resources, 284 single-machine reliability, 8 Cloudera Impala (see Impala) clustered indexes, 86 CODASYL model, 36 (see also network model) code generation with Avro, 127 with Thrift and Protocol Buffers, 118 with WSDL, 133 collaborative editing multi-leader replication and, 170 column families (Bigtable), 41, 99 column-oriented storage, 95-101 column compression, 97 distinction between column families and, 99 in batch processors, 428 Parquet, 96, 131, 414 sort order in, 99-100 vectorized processing, 99, 428 writing to, 101 comma-separated values (see CSV) command query responsibility segregation (CQRS), 462 commands (event sourcing), 459 commits (transactions), 222 atomic commit, 354-355 (see also atomicity; transactions) read committed isolation, 234 three-phase commit (3PC), 359 two-phase commit (2PC), 355-359 commutative operations, 246 compaction of changelogs, 456 (see also log compaction) for stream operator state, 479 of log-structured storage, 73 issues with, 84 size-tiered and leveled approaches, 79 CompactProtocol encoding (Thrift), 119 compare-and-set operations, 245, 327 implementing locks, 370 implementing uniqueness constraints, 331 implementing with total order broadcast, 350 relation to consensus, 335, 350, 352, 374 relation to transactions, 230 compatibility, 112, 128 calling services, 136 properties of encoding formats, 139 using databases, 129-131 using message-passing, 138 compensating transactions, 355, 461, 526 complex event processing (CEP), 465 complexity distilling in theoretical models, 310 hiding using abstraction, 27 of software systems, managing, 20 composing data systems (see unbundling data‐ bases) compute-intensive applications, 3, 275 concatenated indexes, 87 in Cassandra, 204 Concord (stream processor), 466 concurrency actor programming model, 138, 468 (see also message-passing) bugs from weak transaction isolation, 233 conflict 
resolution, 171, 174 detecting concurrent writes, 184-191 dual writes, problems with, 453 happens-before relationship, 186 in replicated systems, 161-191, 324-338 lost updates, 243 multi-version concurrency control (MVCC), 239 optimistic concurrency control, 261 ordering of operations, 326, 341 reducing, through event logs, 351, 462, 507 time and relativity, 187 transaction isolation, 225 write skew (transaction isolation), 246-251 conflict-free replicated datatypes (CRDTs), 174 conflicts conflict detection, 172 causal dependencies, 186, 342 in consensus algorithms, 368 in leaderless replication, 184 Index | 563 in log-based systems, 351, 521 in nonlinearizable systems, 343 in serializable snapshot isolation (SSI), 264 in two-phase commit, 357, 364 conflict resolution automatic conflict resolution, 174 by aborting transactions, 261 by apologizing, 527 convergence, 172-174 in leaderless systems, 190 last write wins (LWW), 186, 292 using atomic operations, 246 using custom logic, 173 determining what is a conflict, 174, 522 in multi-leader replication, 171-175 avoiding conflicts, 172 lost updates, 242-246 materializing, 251 relation to operation ordering, 339 write skew (transaction isolation), 246-251 congestion (networks) avoidance, 282 limiting accuracy of clocks, 293 queueing delays, 282 consensus, 321, 364-375, 554 algorithms, 366-368 preventing split brain, 367 safety and liveness properties, 365 using linearizable operations, 351 cost of, 369 distributed transactions, 352-375 in practice, 360-364 two-phase commit, 354-359 XA transactions, 361-364 impossibility of, 353 membership and coordination services, 370-373 relation to compare-and-set, 335, 350, 352, 374 relation to replication, 155, 349 relation to uniqueness constraints, 521 consistency, 224, 524 across different databases, 157, 452, 462, 492 causal, 339-348, 493 consistent prefix reads, 165-167 consistent snapshots, 156, 237-242, 294, 455, 500 (see also snapshots) 564 | Index crash recovery, 82 
enforcing constraints (see constraints) eventual, 162, 322 (see also eventual consistency) in ACID transactions, 224, 529 in CAP theorem, 337 linearizability, 324-338 meanings of, 224 monotonic reads, 164-165 of secondary indexes, 231, 241, 354, 491, 500 ordering guarantees, 339-352 read-after-write, 162-164 sequential, 351 strong (see linearizability) timeliness and integrity, 524 using quorums, 181, 334 consistent hashing, 204 consistent prefix reads, 165 constraints (databases), 225, 248 asynchronously checked, 526 coordination avoidance, 527 ensuring idempotence, 519 in log-based systems, 521-524 across multiple partitions, 522 in two-phase commit, 355, 357 relation to consensus, 374, 521 relation to event ordering, 347 requiring linearizability, 330 Consul (service discovery), 372 consumers (message streams), 137, 440 backpressure, 441 consumer offsets in logs, 449 failures, 445, 449 fan-out, 11, 445, 448 load balancing, 444, 448 not keeping up with producers, 441, 450, 502 context switches, 14, 297 convergence (conflict resolution), 172-174, 322 coordination avoidance, 527 cross-datacenter, 168, 493 cross-partition ordering, 256, 294, 348, 523 services, 330, 370-373 coordinator (in 2PC), 356 failure, 358 in XA transactions, 361-364 recovery, 363 copy-on-write (B-trees), 82, 242 CORBA (Common Object Request Broker Architecture), 134 correctness, 6 auditability, 528-533 Byzantine fault tolerance, 305, 532 dealing with partial failures, 274 in log-based systems, 521-524 of algorithm within system model, 308 of compensating transactions, 355 of consensus, 368 of derived data, 497, 531 of immutable data, 461 of personal data, 535, 540 of time, 176, 289-295 of transactions, 225, 515, 529 timeliness and integrity, 524-528 corruption of data detecting, 519, 530-533 due to pathological memory access, 529 due to radiation, 305 due to split brain, 158, 302 due to weak transaction isolation, 233 formalization in consensus, 366 integrity as absence of, 524 network 
packets, 306 on disks, 227 preventing using write-ahead logs, 82 recovering from, 414, 460 Couchbase (database) durability, 89 hash partitioning, 203-204, 211 rebalancing, 213 request routing, 216 CouchDB (database) B-tree storage, 242 change feed, 456 document data model, 31 join support, 34 MapReduce support, 46, 400 replication, 170, 173 covering indexes, 86 CPUs cache coherence and memory barriers, 338 caching and pipelining, 99, 428 increasing parallelism, 43 CRDTs (see conflict-free replicated datatypes) CREATE INDEX statement (SQL), 85, 500 credit rating agencies, 535 Crunch (batch processing), 419, 427 hash joins, 409 sharded joins, 408 workflows, 403 cryptography defense against attackers, 306 end-to-end encryption and authentication, 519, 543 proving integrity of data, 532 CSS (Cascading Style Sheets), 44 CSV (comma-separated values), 70, 114, 396 Curator (ZooKeeper recipes), 330, 371 curl (Unix tool), 135, 397 cursor stability, 243 Cypher (query language), 52 comparison to SPARQL, 59 D data corruption (see corruption of data) data cubes, 102 data formats (see encoding) data integration, 490-498, 543 batch and stream processing, 494-498 lambda architecture, 497 maintaining derived state, 495 reprocessing data, 496 unifying, 498 by unbundling databases, 499-515 comparison to federated databases, 501 combining tools by deriving data, 490-494 derived data versus distributed transactions, 492 limits of total ordering, 493 ordering events to capture causality, 493 reasoning about dataflows, 491 need for, 385 data lakes, 415 data locality (see locality) data models, 27-64 graph-like models, 49-63 Datalog language, 60-63 property graphs, 50 RDF and triple-stores, 55-59 query languages, 42-48 relational model versus document model, 28-42 data protection regulations, 542 data systems, 3 about, 4 concerns when designing, 5 future of, 489-544 correctness, constraints, and integrity, 515-533 data integration, 490-498 unbundling databases, 499-515
heterogeneous, keeping in sync, 452 maintainability, 18-22 possible faults in, 221 reliability, 6-10 hardware faults, 7 human errors, 9 importance of, 10 software errors, 8 scalability, 10-18 unreliable clocks, 287-299 data warehousing, 91-95, 554 comparison to data lakes, 415 ETL (extract-transform-load), 92, 416, 452 keeping data systems in sync, 452 schema design, 93 slowly changing dimension (SCD), 476 data-intensive applications, 3 database triggers (see triggers) database-internal distributed transactions, 360, 364, 477 databases archival storage, 131 comparison of message brokers to, 443 dataflow through, 129 end-to-end argument for, 519-520 checking integrity, 531 inside-out, 504 (see also unbundling databases) output from batch workflows, 412 relation to event streams, 451-464 (see also changelogs) API support for change streams, 456, 506 change data capture, 454-457 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 unbundling, 499-515 composing data storage technologies, 499-504 designing applications around dataflow, 504-509 observing derived state, 509-515 datacenters geographically distributed, 145, 164, 278, 493 multi-tenancy and shared resources, 284 network architecture, 276 network faults, 279 replication across multiple, 169 leaderless replication, 184 multi-leader replication, 168, 335 dataflow, 128-139, 504-509 correctness of dataflow systems, 525 differential, 504 message-passing, 136-139 reasoning about, 491 through databases, 129 through services, 131-136 dataflow engines, 421-423 comparison to stream processing, 464 directed acyclic graphs (DAG), 424 partitioning, approach to, 429 support for declarative queries, 427 Datalog (query language), 60-63 datatypes binary strings in XML and JSON, 114 conflict-free, 174 in Avro encodings, 122 in Thrift and Protocol Buffers, 121 numbers in XML and JSON, 114 Datomic (database) B-tree storage, 242 data model, 50, 57 Datalog query language, 60
excision (deleting data), 463 languages for transactions, 255 serial execution of transactions, 253 deadlocks detection, in two-phase commit (2PC), 364 in two-phase locking (2PL), 258 Debezium (change data capture), 455 declarative languages, 42, 554 Bloom, 504 CSS and XSL, 44 Cypher, 52 Datalog, 60 for batch processing, 427 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 delays bounded network delays, 285 bounded process pauses, 298 unbounded network delays, 282 unbounded process pauses, 296 deleting data, 463 denormalization (data representation), 34, 554 costs, 39 in derived data systems, 386 materialized views, 101 updating derived data, 228, 231, 490 versus normalization, 462 derived data, 386, 439, 554 from change data capture, 454 in event sourcing, 458-458 maintaining derived state through logs, 452-457, 459-463 observing, by subscribing to streams, 512 outputs of batch and stream processing, 495 through application code, 505 versus distributed transactions, 492 deterministic operations, 255, 274, 554 accidental nondeterminism, 423 and fault tolerance, 423, 426 and idempotence, 478, 492 computing derived data, 495, 526, 531 in state machine replication, 349, 452, 458 joins, 476 DevOps, 394 differential dataflow, 504 dimension tables, 94 dimensional modeling (see star schemas) directed acyclic graphs (DAGs), 424 dirty reads (transaction isolation), 234 dirty writes (transaction isolation), 235 discrimination, 534 disks (see hard disks) distributed actor frameworks, 138 distributed filesystems, 398-399 decoupling from query engines, 417 indiscriminately dumping data into, 415 use by MapReduce, 402 distributed systems, 273-312, 554 Byzantine faults, 304-306 cloud versus supercomputing, 275 detecting network faults, 280 faults and partial failures, 274-277 formalization of consensus, 365 impossibility results, 338, 353 issues with failover, 157 limitations of distributed transactions, 363 multi-datacenter, 169, 335 network problems, 277-286 
quorums, relying on, 301 reasons for using, 145, 151 synchronized clocks, relying on, 291-295 system models, 306-310 use of clocks and time, 287 distributed transactions (see transactions) Django (web framework), 232 DNS (Domain Name System), 216, 372 Docker (container manager), 506 document data model, 30-42 comparison to relational model, 38-42 document references, 38, 403 document-oriented databases, 31 many-to-many relationships and joins, 36 multi-object transactions, need for, 231 versus relational model convergence of models, 41 data locality, 41 document-partitioned indexes, 206, 217, 411 domain-driven design (DDD), 457 DRBD (Distributed Replicated Block Device), 153 drift (clocks), 289 Drill (query engine), 93 Druid (database), 461 Dryad (dataflow engine), 421 dual writes, problems with, 452, 507 duplicates, suppression of, 517 (see also idempotence) using a unique ID, 518, 522 durability (transactions), 226, 554 duration (time), 287 measurement with monotonic clocks, 288 dynamic partitioning, 212 dynamically typed languages analogy to schema-on-read, 40 code generation and, 127 Dynamo-style databases (see leaderless replication) E edges (in graphs), 49, 403 property graph model, 50 edit distance (full-text search), 88 effectively-once semantics, 476, 516 (see also exactly-once semantics) preservation of integrity, 525 elastic systems, 17 Elasticsearch (search server) document-partitioned indexes, 207 partition rebalancing, 211 percolator (stream search), 467 usage example, 4 use of Lucene, 79 ElephantDB (database), 413 Elm (programming language), 504, 512 encodings (data formats), 111-128 Avro, 122-127 binary variants of JSON and XML, 115 compatibility, 112 calling services, 136 using databases, 129-131 using message-passing, 138 defined, 113 JSON, XML, and CSV, 114 language-specific formats, 113 merits of schemas, 127 representations of data, 112 Thrift and Protocol Buffers, 117-121 end-to-end argument, 277, 519-520 checking integrity, 531
publish/subscribe streams, 512 enrichment (stream), 473 Enterprise JavaBeans (EJB), 134 entities (see vertices) epoch (consensus algorithms), 368 epoch (Unix timestamps), 288 equi-joins, 403 erasure coding (error correction), 398 Erlang OTP (actor framework), 139 error handling for network faults, 280 in transactions, 231 error-correcting codes, 277, 398 Esper (CEP engine), 466 etcd (coordination service), 370-373 linearizable operations, 333 locks and leader election, 330 quorum reads, 351 service discovery, 372 use of Raft algorithm, 349, 353 Ethereum (blockchain), 532 Ethernet (networks), 276, 278, 285 packet checksums, 306, 519 Etherpad (collaborative editor), 170 ethics, 533-543 code of ethics and professional practice, 533 legislation and self-regulation, 542 predictive analytics, 533-536 amplifying bias, 534 feedback loops, 536 privacy and tracking, 536-543 consent and freedom of choice, 538 data as assets and power, 540 meaning of privacy, 539 surveillance, 537 respect, dignity, and agency, 543, 544 unintended consequences, 533, 536 ETL (extract-transform-load), 92, 405, 452, 554 use of Hadoop for, 416 event sourcing, 457-459 commands and events, 459 comparison to change data capture, 457 comparison to lambda architecture, 497 deriving current state from event log, 458 immutability and auditability, 459, 531 large, reliable data systems, 519, 526 Event Store (database), 458 event streams (see streams) events, 440 deciding on total order of, 493 deriving views from event log, 461 difference to commands, 459 event time versus processing time, 469, 477, 498 immutable, advantages of, 460, 531 ordering to capture causality, 493 reads as, 513 stragglers, 470, 498 timestamp of, in stream processing, 471 EventSource (browser API), 512 eventual consistency, 152, 162, 308, 322 (see also conflicts) and perpetual inconsistency, 525 evolvability, 21, 111 calling services, 136 graph-structured data, 52 of databases, 40, 129-131, 461, 497 of message-passing,
138 reprocessing data, 496, 498 schema evolution in Avro, 123 schema evolution in Thrift and Protocol Buffers, 120 schema-on-read, 39, 111, 128 exactly-once semantics, 360, 476, 516 parity with batch processors, 498 preservation of integrity, 525 exclusive mode (locks), 258 eXtended Architecture transactions (see XA transactions) extract-transform-load (see ETL) F Facebook Presto (query engine), 93 React, Flux, and Redux (user interface libraries), 512 social graphs, 49 Wormhole (change data capture), 455 fact tables, 93 failover, 157, 554 (see also leader-based replication) in leaderless replication, absence of, 178 leader election, 301, 348, 352 potential problems, 157 failures amplification by distributed transactions, 364, 495 failure detection, 280 automatic rebalancing causing cascading failures, 214 perfect failure detectors, 359 timeouts and unbounded delays, 282, 284 using ZooKeeper, 371 faults versus, 7 partial failures in distributed systems, 275-277, 310 fan-out (messaging systems), 11, 445 fault tolerance, 6-10, 555 abstractions for, 321 formalization in consensus, 365-369 use of replication, 367 human fault tolerance, 414 in batch processing, 406, 414, 422, 425 in log-based systems, 520, 524-526 in stream processing, 476-479 atomic commit, 477 idempotence, 478 maintaining derived state, 495 microbatching and checkpointing, 477 rebuilding state after a failure, 478 of distributed transactions, 362-364 transaction atomicity, 223, 354-361 faults, 6 Byzantine faults, 304-306 failures versus, 7 handled by transactions, 221 handling in supercomputers and cloud computing, 275 hardware, 7 in batch processing versus distributed databases, 417 in distributed systems, 274-277 introducing deliberately, 7, 280 network faults, 279-281 asymmetric faults, 300 detecting, 280 tolerance of, in multi-leader replication, 169 software errors, 8 tolerating (see fault tolerance) federated databases, 501 fence (CPU instruction), 338 fencing (preventing split brain), 158,
302-304 generating fencing tokens, 349, 370 properties of fencing tokens, 308 stream processors writing to databases, 478, 517 Fibre Channel (networks), 398 field tags (Thrift and Protocol Buffers), 119-121 file descriptors (Unix), 395 financial data, 460 Firebase (database), 456 Flink (processing framework), 421-423 dataflow APIs, 427 fault tolerance, 422, 477, 479 Gelly API (graph processing), 425 integration of batch and stream processing, 495, 498 machine learning, 428 query optimizer, 427 stream processing, 466 flow control, 282, 441, 555 FLP result (on consensus), 353 FlumeJava (dataflow library), 403, 427 followers, 152, 555 (see also leader-based replication) foreign keys, 38, 403 forward compatibility, 112 forward decay (algorithm), 16 Fossil (version control system), 463 shunning (deleting data), 463 FoundationDB (database) serializable transactions, 261, 265, 364 fractal trees, 83 full table scans, 403 full-text search, 555 and fuzzy indexes, 88 building search indexes, 411 Lucene storage engine, 79 functional reactive programming (FRP), 504 functional requirements, 22 futures (asynchronous operations), 135 fuzzy search (see similarity search) G garbage collection immutability and, 463 process pauses for, 14, 296-299, 301 (see also process pauses) genome analysis, 63, 429 geographically distributed datacenters, 145, 164, 278, 493 geospatial indexes, 87 Giraph (graph processing), 425 Git (version control system), 174, 342, 463 GitHub, postmortems, 157, 158, 309 global indexes (see term-partitioned indexes) GlusterFS (distributed filesystem), 398 GNU Coreutils (Linux), 394 GoldenGate (change data capture), 161, 170, 455 (see also Oracle) Google Bigtable (database) data model (see Bigtable data model) partitioning scheme, 199, 202 storage layout, 78 Chubby (lock service), 370 Cloud Dataflow (stream processor), 466, 477, 498 (see also Beam) Cloud Pub/Sub (messaging), 444, 448 Docs (collaborative editor), 170 Dremel (query engine), 93, 96
FlumeJava (dataflow library), 403, 427 GFS (distributed file system), 398 gRPC (RPC framework), 135 MapReduce (batch processing), 390 (see also MapReduce) building search indexes, 411 task preemption, 418 Pregel (graph processing), 425 Spanner (see Spanner) TrueTime (clock API), 294 gossip protocol, 216 government use of data, 541 GPS (Global Positioning System) use for clock synchronization, 287, 290, 294, 295 GraphChi (graph processing), 426 graphs, 555 as data models, 49-63 example of graph-structured data, 49 property graphs, 50 RDF and triple-stores, 55-59 versus the network model, 60 processing and analysis, 424-426 fault tolerance, 425 Pregel processing model, 425 query languages Cypher, 52 Datalog, 60-63 recursive SQL queries, 53 SPARQL, 59-59 Gremlin (graph query language), 50 grep (Unix tool), 392 GROUP BY clause (SQL), 406 grouping records in MapReduce, 406 handling skew, 407 H Hadoop (data infrastructure) comparison to distributed databases, 390 comparison to MPP databases, 414-418 comparison to Unix, 413-414, 499 diverse processing models in ecosystem, 417 HDFS distributed filesystem (see HDFS) higher-level tools, 403 join algorithms, 403-410 (see also MapReduce) MapReduce (see MapReduce) YARN (see YARN) happens-before relationship, 340 capturing, 187 concurrency and, 186 hard disks access patterns, 84 detecting corruption, 519, 530 faults in, 7, 227 sequential write throughput, 75, 450 hardware faults, 7 hash indexes, 72-75 broadcast hash joins, 409 partitioned hash joins, 409 hash partitioning, 203-205, 217 consistent hashing, 204 problems with hash mod N, 210 range queries, 204 suitable hash functions, 203 with fixed number of partitions, 210 HAWQ (database), 428 HBase (database) bug due to lack of fencing, 302 bulk loading, 413 column-family data model, 41, 99 dynamic partitioning, 212 key-range partitioning, 202 log-structured storage, 78 request routing, 216 size-tiered compaction, 79 use of HDFS, 417 use of ZooKeeper, 370 HDFS
(Hadoop Distributed File System), 398-399 (see also distributed filesystems) checking data integrity, 530 decoupling from query engines, 417 indiscriminately dumping data into, 415 metadata about datasets, 410 NameNode, 398 use by Flink, 479 use by HBase, 212 use by MapReduce, 402 HdrHistogram (numerical library), 16 head (Unix tool), 392 head vertex (property graphs), 51 head-of-line blocking, 15 heap files (databases), 86 Helix (cluster manager), 216 heterogeneous distributed transactions, 360, 364 heuristic decisions (in 2PC), 363 Hibernate (object-relational mapper), 30 hierarchical model, 36 high availability (see fault tolerance) high-frequency trading, 290, 299 high-performance computing (HPC), 275 hinted handoff, 183 histograms, 16 Hive (query engine), 419, 427 for data warehouses, 93 HCatalog and metastore, 410 map-side joins, 409 query optimizer, 427 skewed joins, 408 workflows, 403 Hollerith machines, 390 hopping windows (stream processing), 472 (see also windows) horizontal scaling (see scaling out) HornetQ (messaging), 137, 444 distributed transaction support, 361 hot spots, 201 due to celebrities, 205 for time-series data, 203 in batch processing, 407 relieving, 205 hot standbys (see leader-based replication) HTTP, use in APIs (see services) human errors, 9, 279, 414 HyperDex (database), 88 HyperLogLog (algorithm), 466 I I/O operations, waiting for, 297 IBM DB2 (database) distributed transaction support, 361 recursive query support, 54 serializable isolation, 242, 257 XML and JSON support, 30, 42 electromechanical card-sorting machines, 390 IMS (database), 36 imperative query APIs, 46 InfoSphere Streams (CEP engine), 466 MQ (messaging), 444 distributed transaction support, 361 System R (database), 222 WebSphere (messaging), 137 idempotence, 134, 478, 555 by giving operations unique IDs, 518, 522 idempotent operations, 517 immutability advantages of, 460, 531 deriving state from event log, 459-464 for crash recovery, 75 in B-trees, 82, 242
in event sourcing, 457 inputs to Unix commands, 397 limitations of, 463 Impala (query engine) for data warehouses, 93 hash joins, 409 native code generation, 428 use of HDFS, 417 impedance mismatch, 29 imperative languages, 42 setting element styles (example), 45 in doubt (transaction status), 358 holding locks, 362 orphaned transactions, 363 in-memory databases, 88 durability, 227 serial transaction execution, 253 incidents cascading failures, 9 crashes due to leap seconds, 290 data corruption and financial losses due to concurrency bugs, 233 data corruption on hard disks, 227 data loss due to last-write-wins, 173, 292 data on disks unreadable, 309 deleted items reappearing, 174 disclosure of sensitive data due to primary key reuse, 157 errors in transaction serializability, 529 gigabit network interface with 1 Kb/s throughput, 311 network faults, 279 network interface dropping only inbound packets, 279 network partitions and whole-datacenter failures, 275 poor handling of network faults, 280 sending message to ex-partner, 494 sharks biting undersea cables, 279 split brain due to 1-minute packet delay, 158, 279 vibrations in server rack, 14 violation of uniqueness constraint, 529 indexes, 71, 555 and snapshot isolation, 241 as derived data, 386, 499-504 B-trees, 79-83 building in batch processes, 411 clustered, 86 comparison of B-trees and LSM-trees, 83-85 concatenated, 87 covering (with included columns), 86 creating, 500 full-text search, 88 geospatial, 87 hash, 72-75 index-range locking, 260 multi-column, 87 partitioning and secondary indexes, 206-209, 217 secondary, 85 (see also secondary indexes) problems with dual writes, 452, 491 SSTables and LSM-trees, 76-79 updating when data changes, 452, 467 Industrial Revolution, 541 InfiniBand (networks), 285 InfiniteGraph (database), 50 InnoDB (storage engine) clustered index on primary key, 86 not preventing lost updates, 245 preventing write skew, 248, 257 serializable isolation, 257 snapshot isolation
support, 239 inside-out databases, 504 (see also unbundling databases) integrating different data systems (see data integration) integrity, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 in consensus formalization, 365 integrity checks, 530 (see also auditing) end-to-end, 519, 531 use of snapshot isolation, 238 maintaining despite software bugs, 529 Interface Definition Language (IDL), 117, 122 intermediate state, materialization of, 420-423 internet services, systems for implementing, 275 invariants, 225 (see also constraints) inversion of control, 396 IP (Internet Protocol) unreliability of, 277 ISDN (Integrated Services Digital Network), 284 isolation (in transactions), 225, 228, 555 correctness and, 515 for single-object writes, 230 serializability, 251-266 actual serial execution, 252-256 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 violating, 228 weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-237 snapshot isolation, 237-242 iterative processing, 424-426 J Java Database Connectivity (JDBC) distributed transaction support, 361 network drivers, 128 Java Enterprise Edition (EE), 134, 356, 361 Java Message Service (JMS), 444 (see also messaging systems) comparison to log-based messaging, 448, 451 distributed transaction support, 361 message ordering, 446 Java Transaction API (JTA), 355, 361 Java Virtual Machine (JVM) bytecode generation, 428 garbage collection pauses, 296 process reuse in batch processors, 422 JavaScript in MapReduce querying, 46 setting element styles (example), 45 use in advanced queries, 48 Jena (RDF framework), 57 Jepsen (fault tolerance testing), 515 jitter (network delay), 284 joins, 555 by index lookup, 403 expressing as relational operators, 427 in relational and document databases, 34 MapReduce map-side joins, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 MapReduce reduce-side joins, 403-408 handling 
skew, 407 sort-merge joins, 405 parallel execution of, 415 secondary indexes and, 85 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 support in document databases, 42 JOTM (transaction coordinator), 356 JSON Avro schema representation, 122 binary variants, 115 for application data, issues with, 114 in relational databases, 30, 42 representing a résumé (example), 31 Juttle (query language), 504 K k-nearest neighbors, 429 Kafka (messaging), 137, 448 Kafka Connect (database integration), 457, 461 Kafka Streams (stream processor), 466, 467 fault tolerance, 479 leader-based replication, 153 log compaction, 456, 467 message offsets, 447, 478 request routing, 216 transaction support, 477 usage example, 4 Ketama (partitioning library), 213 key-value stores, 70 as batch process output, 412 hash indexes, 72-75 in-memory, 89 partitioning, 201-205 by hash of key, 203, 217 by key range, 202, 217 dynamic partitioning, 212 skew and hot spots, 205 Kryo (Java), 113 Kubernetes (cluster manager), 418, 506 L lambda architecture, 497 Lamport timestamps, 345 Large Hadron Collider (LHC), 64 last write wins (LWW), 173, 334 discarding concurrent writes, 186 problems with, 292 prone to lost updates, 246 late binding, 396 latency instability under two-phase locking, 259 network latency and resource utilization, 286 response time versus, 14 tail latency, 15, 207 leader-based replication, 152-161 (see also replication) failover, 157, 301 handling node outages, 156 implementation of replication logs change data capture, 454-457 (see also changelogs) statement-based, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 linearizability of operations, 333 locking and leader election, 330 log sequence number, 156, 449 read-scaling architecture, 161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 leaderless replication, 177-191 (see also replication)
detecting concurrent writes, 184-191 capturing happens-before relationship, 187 happens-before relationship and concurrency, 186 last write wins, 186 merging concurrently written values, 190 version vectors, 191 multi-datacenter, 184 quorums, 179-182 consistency limitations, 181-183, 334 sloppy quorums and hinted handoff, 183 read repair and anti-entropy, 178 leap seconds, 8, 290 in time-of-day clocks, 288 leases, 295 implementation with ZooKeeper, 370 need for fencing, 302 ledgers, 460 distributed ledger technologies, 532 legacy systems, maintenance of, 18 less (Unix tool), 397 LevelDB (storage engine), 78 leveled compaction, 79 Levenshtein automata, 88 limping (partial failure), 311 linearizability, 324-338, 555 cost of, 335-338 CAP theorem, 336 memory on multi-core CPUs, 338 definition, 325-329 implementing with total order broadcast, 350 in ZooKeeper, 370 of derived data systems, 492, 524 avoiding coordination, 527 of different replication methods, 332-335 using quorums, 334 relying on, 330-332 constraints and uniqueness, 330 cross-channel timing dependencies, 331 locking and leader election, 330 stronger than causal consistency, 342 using to implement total order broadcast, 351 versus serializability, 329 LinkedIn Azkaban (workflow scheduler), 402 Databus (change data capture), 161, 455 Espresso (database), 31, 126, 130, 153, 216 Helix (cluster manager) (see Helix) profile (example), 30 reference to company entity (example), 34 Rest.li (RPC framework), 135 Voldemort (database) (see Voldemort) Linux, leap second bug, 8, 290 liveness properties, 308 LMDB (storage engine), 82, 242 load approaches to coping with, 17 describing, 11 load testing, 16 load balancing (messaging), 444 local indexes (see document-partitioned indexes) locality (data access), 32, 41, 555 in batch processing, 400, 405, 421 in stateful clients, 170, 511 in stream processing, 474, 478, 508, 522 location transparency, 134 in the actor model, 138 locks, 556 deadlock, 258
distributed locking, 301-304, 330 fencing tokens, 303 implementation with ZooKeeper, 370 relation to consensus, 374 for transaction isolation in snapshot isolation, 239 in two-phase locking (2PL), 257-261 making operations atomic, 243 performance, 258 preventing dirty writes, 236 preventing phantoms with index-range locks, 260, 265 read locks (shared mode), 236, 258 shared mode and exclusive mode, 258 in two-phase commit (2PC) deadlock detection, 364 in-doubt transactions holding locks, 362 materializing conflicts with, 251 preventing lost updates by explicit locking, 244 log sequence number, 156, 449 logic programming languages, 504 logical clocks, 293, 343, 494 for read-after-write consistency, 164 logical logs, 160 logs (data structure), 71, 556 advantages of immutability, 460 compaction, 73, 79, 456, 460 for stream operator state, 479 creating using total order broadcast, 349 implementing uniqueness constraints, 522 log-based messaging, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 disk space usage, 450 replaying old messages, 451, 496, 498 slow consumers, 450 using logs for message storage, 447 log-structured storage, 71-79 log-structured merge tree (see LSM-trees) replication, 152, 158-161 change data capture, 454-457 (see also changelogs) coordination with snapshot, 156 logical (row-based) replication, 160 statement-based replication, 158 trigger-based replication, 161 write-ahead log (WAL) shipping, 159 scalability limits, 493 loose coupling, 396, 419, 502 lost updates (see updates) LSM-trees (indexes), 78-79 comparison to B-trees, 83-85 Lucene (storage engine), 79 building indexes in batch processes, 411 similarity search, 88 Luigi (workflow scheduler), 402 LWW (see last write wins) M machine learning ethical considerations, 534 (see also ethics) iterative processing, 424 models derived from training data, 505 statistical and numerical algorithms, 428 MADlib (machine learning toolkit), 428 magic scaling sauce, 18 Mahout
(machine learning toolkit), 428 maintainability, 18-22, 489 defined, 23 design principles for software systems, 19 evolvability (see evolvability) operability, 19 simplicity and managing complexity, 20 many-to-many relationships in document model versus relational model, 39 modeling as graphs, 49 many-to-one and many-to-many relationships, 33-36 many-to-one relationships, 34 MapReduce (batch processing), 390, 399-400 accessing external services within job, 404, 412 comparison to distributed databases designing for frequent faults, 417 diversity of processing models, 416 diversity of storage, 415 Index | 575 comparison to stream processing, 464 comparison to Unix, 413-414 disadvantages and limitations of, 419 fault tolerance, 406, 414, 422 higher-level tools, 403, 426 implementation in Hadoop, 400-403 the shuffle, 402 implementation in MongoDB, 46-48 machine learning, 428 map-side processing, 408-410 broadcast hash joins, 409 merge joins, 410 partitioned hash joins, 409 mapper and reducer functions, 399 materialization of intermediate state, 419-423 output of batch workflows, 411-413 building search indexes, 411 key-value stores, 412 reduce-side processing, 403-408 analysis of user activity events (exam‐ ple), 404 grouping records by same key, 406 handling skew, 407 sort-merge joins, 405 workflows, 402 marshalling (see encoding) massively parallel processing (MPP), 216 comparison to composing storage technolo‐ gies, 502 comparison to Hadoop, 414-418, 428 master-master replication (see multi-leader replication) master-slave replication (see leader-based repli‐ cation) materialization, 556 aggregate values, 101 conflicts, 251 intermediate state (batch processing), 420-423 materialized views, 101 as derived data, 386, 499-504 maintaining, using stream processing, 467, 475 Maven (Java build tool), 428 Maxwell (change data capture), 455 mean, 14 media monitoring, 467 median, 14 576 | Index meeting room booking (example), 249, 259, 521 membership services, 372 Memcached 
(caching server), 4, 89 memory in-memory databases, 88 durability, 227 serial transaction execution, 253 in-memory representation of data, 112 random bit-flips in, 529 use by indexes, 72, 77 memory barrier (CPU instruction), 338 MemSQL (database) in-memory storage, 89 read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping mes‐ sages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 
MapReduce support, 46, 400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replication), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol
(see NTP) networks congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replication, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream
processing, 464 optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel processing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14,
556 calculating efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leader-based replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539
surveillance, 537 tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud
(in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relationships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus
document model convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389
guarantees on, 298 latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 
122-127 reader determining writer’s schema, 125 schema evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose
coupling, 502 relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based replication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8
maintaining integrity, 529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch processing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabilities, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data),
162 cross-channel timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus processing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and application code, 507 maintaining derived state, 495 maintenance by stream processor in stream-stream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job,
474, 477, 478, 517 combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460
systems thinking, 536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write 
consistency, 163 for transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transactions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concurrency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming
model), 507 Turtle (RDF data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit 
locking, 244 preventing write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algorithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 windows (stream processing), 466, 468-472 infinite windows for changelogs, 467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH
RECURSIVE syntax (SQL), 54 workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read committed, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!


pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, Ethereum, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, functional programming, general-purpose programming language, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Kubernetes, loose coupling, Marc Andreessen, microservices, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, surveillance capitalism, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

Thus, the state on the device is a stale cache that is not updated unless you explicitly poll for changes. (HTTP-based feed subscription protocols like RSS are really just a basic form of polling.) More recent protocols have moved beyond the basic request/response pattern of HTTP: server-sent events (the EventSource API) and WebSockets provide communication channels by which a web browser can keep an open TCP connection to a server, and the server can actively push messages to the browser as long as it remains connected. This provides an opportunity for the server to actively inform the end-user client about any changes to the state it has stored locally, reducing the staleness of the client-side state.
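As a rough illustration of the server-push channel described above, the following Python sketch serializes one message in the text/event-stream format that the EventSource API consumes. The function name `format_sse` and the event name `state-changed` are hypothetical, chosen only for this example; the field names (`event`, `id`, `data`) and the blank-line terminator come from the server-sent events format itself.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format.

    format_sse is a hypothetical helper name used for illustration.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Each line of the payload is sent as its own "data:" field.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical state-change notification pushed to a connected browser.
    print(format_sse('{"key": "cart", "version": 7}',
                     event="state-changed", event_id="7"), end="")
```

A server would write strings like this to a long-lived HTTP response each time the stored state changes, so the client no longer has to poll to refresh its cached copy.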

designing applications around dataflow, Designing Applications Around Dataflow-Stream processors and services observing derived state, Observing Derived State-Multi-partition data processing materialized views and caching, Materialized views and caching multi-partition data processing, Multi-partition data processing pushing state changes to clients, Pushing state changes to clients uncertain (transaction status) (see in doubt) uniform consensus, Fault-Tolerant Consensus (see also consensus) uniform interfaces, A uniform interface union type (in Avro), Schema evolution rules uniq (Unix tool), Simple Log Analysis uniqueness constraints asynchronously checked, Loosely interpreted constraints requiring consensus, Uniqueness constraints require consensus requiring linearizability, Constraints and uniqueness guarantees uniqueness in log-based messaging, Uniqueness in log-based messaging Unix philosophy, The Unix Philosophy-Transparency and experimentation command-line batch processing, Batch Processing with Unix Tools-Sorting versus in-memory aggregation Unix pipes versus dataflow engines, Discussion of materialization comparison to Hadoop, Philosophy of batch process outputs-Philosophy of batch process outputs comparison to relational databases, Unbundling Databases, The meta-database of everything comparison to stream processing, Processing Streams composability and uniform interfaces, The Unix Philosophy loose coupling, Separation of logic and wiring pipes, The Unix Philosophy relation to Hadoop, Unbundling Databases UPDATE statement (SQL), Schema flexibility in the document model updates preventing lost updates, Preventing Lost Updates-Conflict resolution and replication atomic write operations, Atomic write operations automatically detecting lost updates, Automatically detecting lost updates compare-and-set operations, Compare-and-set conflict resolution and replication, Conflict resolution and replication using explicit locking, Explicit locking preventing write skew,
Write Skew and Phantoms-Materializing conflicts V validity (consensus), Fault-Tolerant Consensus vBuckets (partitioning), Partitioning vector clocks, Version vectors (see also version vectors) vectorized processing, Memory bandwidth and vectorized processing, The move toward declarative query languages verification, Trust, but Verify-Tools for auditable data systems avoiding blind trust, Don’t just blindly trust what they promise culture of, A culture of verification designing for auditability, Designing for auditability end-to-end integrity checks, The end-to-end argument again tools for auditable data systems, Tools for auditable data systems version control systems, reliance on immutable data, Limitations of immutability version vectors, Multi-Leader Replication Topologies, Version vectors capturing causal dependencies, Capturing causal dependencies versus vector clocks, Version vectors Vertica (database), The divergence between OLTP databases and data warehouses handling writes, Writing to Column-Oriented Storage replicas using different sort orders, Several different sort orders vertical scaling (see scaling up) vertices (in graphs), Graph-Like Data Models property graph model, Property Graphs Viewstamped Replication (consensus algorithm), Consensus algorithms and total order broadcast view number, Epoch numbering and quorums virtual machines, Distributed Data (see also cloud computing) context switches, Process Pauses network performance, Network congestion and queueing noisy neighbors, Network congestion and queueing reliability in cloud services, Hardware Faults virtualized clocks in, Clock Synchronization and Accuracy virtual memory process pauses due to page faults, Describing Performance, Process Pauses versus memory management by databases, Keeping everything in memory VisiCalc (spreadsheets), Designing Applications Around Dataflow vnodes (partitioning), Partitioning Voice over IP (VoIP), Network congestion and queueing Voldemort (database) building read-only
stores in batch processes, Key-value stores as batch process output hash partitioning, Partitioning by Hash of Key-Partitioning by Hash of Key, Fixed number of partitions leaderless replication, Leaderless Replication multi-datacenter support, Multi-datacenter operation rebalancing, Operations: Automatic or Manual Rebalancing reliance on read repair, Read repair and anti-entropy sloppy quorums, Sloppy Quorums and Hinted Handoff VoltDB (database) cross-partition serializability, Partitioning deterministic stored procedures, Pros and cons of stored procedures in-memory storage, Keeping everything in memory output streams, API support for change streams secondary indexes, Partitioning Secondary Indexes by Document serial execution of transactions, Actual Serial Execution statement-based replication, Statement-based replication, Rebuilding state after a failure transactions in stream processing, Atomic commit revisited W WAL (write-ahead log), Making B-trees reliable web services (see services) Web Services Description Language (WSDL), Web services webhooks, Direct messaging from producers to consumers webMethods (messaging), Message brokers WebSocket (protocol), Pushing state changes to clients windows (stream processing), Stream analytics, Reasoning About Time-Types of windows infinite windows for changelogs, Maintaining materialized views, Stream-table join (stream enrichment) knowing when all events have arrived, Knowing when you’re ready stream joins within a window, Stream-stream join (window join) types of windows, Types of windows winners (conflict resolution), Converging toward a consistent state WITH RECURSIVE syntax (SQL), Graph Queries in SQL workflows (MapReduce), MapReduce workflows outputs, The Output of Batch Workflows-Philosophy of batch process outputs key-value stores, Key-value stores as batch process output search indexes, Building search indexes with map-side joins, MapReduce workflows with map-side joins working set, Sorting versus in-memory
aggregation write amplification, Advantages of LSM-trees write path (derived data), Observing Derived State write skew (transaction isolation), Write Skew and Phantoms-Materializing conflicts characterizing, Write Skew and Phantoms-Phantoms causing write skew, Decisions based on an outdated premise examples of, Write Skew and Phantoms, More examples of write skew materializing conflicts, Materializing conflicts occurrence in practice, Maintaining integrity in the face of software bugs phantoms, Phantoms causing write skew preventing in snapshot isolation, Decisions based on an outdated premise-Detecting writes that affect prior reads in two-phase locking, Predicate locks-Index-range locks options for, Characterizing write skew write-ahead log (WAL), Making B-trees reliable, Write-ahead log (WAL) shipping writes (database) atomic write operations, Atomic write operations detecting writes affecting prior reads, Detecting writes that affect prior reads preventing dirty writes with read committed, No dirty writes WS-* framework, Web services (see also services) WS-AtomicTransaction (2PC), Introduction to two-phase commit X XA transactions, Introduction to two-phase commit, XA transactions-Limitations of distributed transactions heuristic decisions, Recovering from coordinator failure limitations of, Limitations of distributed transactions xargs (Unix tool), Simple Log Analysis, A uniform interface XML binary variants, Binary encoding encoding RDF data, The RDF data model for application data, issues with, JSON, XML, and Binary Variants in relational databases, The Object-Relational Mismatch, Convergence of document and relational databases XSL/XPath, Declarative Queries on the Web Y Yahoo!


pages: 419 words: 102,488

Chaos Engineering: System Resiliency in Practice by Casey Rosenthal, Nora Jones

Amazon Web Services, Asilomar, autonomous vehicles, barriers to entry, blockchain, business continuity plan, business intelligence, business process, cloud computing, complexity theory, continuous integration, cyber-physical system, database schema, DevOps, fault tolerance, hindsight bias, Kubernetes, linear programming, loose coupling, microservices, MITM: man-in-the-middle, node package manager, pull request, ransomware, risk tolerance, Silicon Valley, six sigma, Skype, software as a service, statistical model, the scientific method, WebSocket

Try, Try Again (for Safety) In early 2019 we planned a series of ten exercises to demonstrate Slack’s tolerance of zonal failures and network partitions in AWS. One of these exercises concerned Channel Server, a system responsible for broadcasting newly sent messages and metadata to all connected Slack client WebSockets. The goal was simply to partition 25% of the Channel Servers from the network to observe that the failures were detected and the instances were replaced by spares. The first attempt to create this network partition failed to fully account for the overlay network that provides transparent transit encryption.


pages: 540 words: 103,101

Building Microservices by Sam Newman

airport security, Amazon Web Services, anti-pattern, business process, call centre, continuous integration, create, read, update, delete, defense in depth, don't repeat yourself, Edward Snowden, fault tolerance, index card, information retrieval, Infrastructure as a Service, inventory management, job automation, Kubernetes, load shedding, loose coupling, microservices, MITM: man-in-the-middle, platform as a service, premature optimization, pull request, recommendation engine, social graph, software as a service, source of truth, sunk-cost fallacy, the built environment, web application, WebSocket

The overhead of HTTP for each request may also be a concern for low-latency requirements. HTTP, while it can be suited well to large volumes of traffic, isn’t great for low-latency communications when compared to alternative protocols that are built on top of Transmission Control Protocol (TCP) or other networking technology. Despite the name, WebSockets, for example, has very little to do with the Web. After the initial HTTP handshake, it’s just a TCP connection between client and server, but it can be a much more efficient way for you to stream data for a browser. If this is something you’re interested in, note that you aren’t really using much of HTTP, let alone anything to do with REST.
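The "initial HTTP handshake" mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a server implementation: it only derives the Sec-WebSocket-Accept value a server returns for a client's Sec-WebSocket-Key, following RFC 6455. The function name `compute_accept` is hypothetical; the GUID is the fixed constant from the RFC, and the sample key and expected value are the ones from the RFC's own example handshake.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept header value for a client key.

    compute_accept is a hypothetical helper name used for illustration.
    """
    # Concatenate the client's key with the GUID, hash with SHA-1,
    # then base64-encode the raw 20-byte digest.
    digest = hashlib.sha1(
        (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    ).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Sample key from the RFC 6455 example handshake.
    print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
    # prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

Once the server has replied with this value in a 101 Switching Protocols response, the rest of the exchange uses WebSocket frames over the same TCP connection rather than HTTP requests.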


pages: 960 words: 125,049

Mastering Ethereum: Building Smart Contracts and DApps by Andreas M. Antonopoulos, Gavin Wood Ph. D.

Amazon Web Services, bitcoin, blockchain, continuous integration, cryptocurrency, Debian, Dogecoin, domain-specific language, don't repeat yourself, Edward Snowden, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, Firefox, functional programming, Google Chrome, intangible asset, Internet of things, litecoin, move fast and break things, node package manager, peer-to-peer, Ponzi scheme, prediction markets, pull request, QR code, Ruby on Rails, Satoshi Nakamoto, sealed-bid auction, sharing economy, side project, smart contracts, transaction costs, Turing complete, Turing machine, Vickrey auction, web application, WebSocket

The directory has the following structure and contents:

    frontend/
    |-- build
    |   |-- build.js
    |   |-- check-versions.js
    |   |-- logo.png
    |   |-- utils.js
    |   |-- vue-loader.conf.js
    |   |-- webpack.base.conf.js
    |   |-- webpack.dev.conf.js
    |   `-- webpack.prod.conf.js
    |-- config
    |   |-- dev.env.js
    |   |-- index.js
    |   `-- prod.env.js
    |-- index.html
    |-- package.json
    |-- package-lock.json
    |-- README.md
    |-- src
    |   |-- App.vue
    |   |-- components
    |   |   |-- Auction.vue
    |   |   `-- Home.vue
    |   |-- config.js
    |   |-- contracts
    |   |   |-- AuctionRepository.json
    |   |   `-- DeedRepository.json
    |   |-- main.js
    |   |-- models
    |   |   |-- AuctionRepository.js
    |   |   |-- ChatRoom.js
    |   |   `-- DeedRepository.js
    |   `-- router
    |       `-- index.js

Once you have deployed the contracts, edit the frontend configuration in frontend/src/config.js and enter the addresses of the DeedRepository and AuctionRepository contracts, as deployed. The frontend application also needs access to an Ethereum node offering a JSON-RPC and WebSockets interface. Once you’ve configured the frontend, launch it with a web server on your local machine:

    $ npm install
    $ npm run dev

The Auction DApp frontend will launch and will be accessible via any web browser at http://localhost:8080. If all goes well you should see the screen shown in Figure 12-3, which illustrates the Auction DApp running in a web browser.
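To give a sense of what the frontend sends to the node's JSON-RPC interface, here is a minimal Python sketch that builds a JSON-RPC 2.0 request body and decodes a hex-encoded result. It is not taken from the Auction DApp: the helper names `make_jsonrpc_request` and `parse_quantity` are hypothetical, the sample result value is made up for illustration, and the transport (HTTP or WebSocket) and the node's endpoint are deliberately left out. `eth_blockNumber` is a standard Ethereum JSON-RPC method that takes no parameters.

```python
import json


def make_jsonrpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as a string (hypothetical helper)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params if params is not None else [],
    })


def parse_quantity(hex_value):
    """Decode an Ethereum quantity, which is a 0x-prefixed hex string."""
    return int(hex_value, 16)


if __name__ == "__main__":
    # Request body that would ask the node for its latest block number.
    print(make_jsonrpc_request("eth_blockNumber"))
    # Decoding a made-up sample result field from a response.
    print(parse_quantity("0x4b7"))  # prints 1207
```

The same request body can be sent over either interface; with WebSockets the connection stays open, which is what lets a node push subscription updates back to the frontend.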


pages: 1,025 words: 150,187

ZeroMQ by Pieter Hintjens

AGPL, anti-pattern, carbon footprint, cloud computing, Debian, distributed revision control, domain-specific language, eat what you kill, factory automation, fault tolerance, fear of failure, finite state, Internet of things, iterative process, premature optimization, profit motive, pull request, revision control, RFC: Request For Comment, Richard Stallman, Skype, smart transportation, software patent, Steve Jobs, Valgrind, WebSocket

For application developers, HTTP is perhaps the one solution to have been simple enough to work, but it arguably makes the problem worse by encouraging developers and architects to think in terms of big servers and thin, stupid clients. So today people are still connecting applications using raw UDP and TCP, proprietary protocols, HTTP, and WebSockets. It remains painful, slow, hard to scale, and essentially centralized. Distributed peer-to-peer architectures are mostly for play, not work. How many applications use Skype or BitTorrent to exchange data? Which brings us back to the science of programming. To fix the world, we needed to do two things.


Seeking SRE: Conversations About Running Production Systems at Scale by David N. Blank-Edelman

Affordable Care Act / Obamacare, algorithmic trading, Amazon Web Services, backpropagation, bounce rate, business continuity plan, business process, cloud computing, cognitive bias, cognitive dissonance, commoditize, continuous integration, crowdsourcing, dark matter, database schema, Debian, defense in depth, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, fear of failure, friendly fire, game design, Grace Hopper, information retrieval, Infrastructure as a Service, Internet of things, invisible hand, iterative process, Kubernetes, loose coupling, Lyft, Marc Andreessen, microaggression, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, pull request, RAND corporation, remote working, Richard Feynman, risk tolerance, Ruby on Rails, search engine result page, self-driving car, sentiment analysis, Silicon Valley, single page application, Snapchat, software as a service, software is eating the world, source of truth, the scientific method, Toyota Production System, web application, WebSocket, zero day

Some providers offer basic ESI support based on Specification 1.0, while others offer extensions and custom solutions. 4 SPA frameworks dynamically rewrite the current page rather than reloading the page from the server; that is, SPAs use client-side rendering instead of server-side rendering. They do so by loading HTML, JavaScript, and stylesheets in a single page load and caching those objects locally in the browser. Content is then loaded dynamically via WebSockets, polling over AJAX, server-sent events, or event-triggered AJAX calls. For a cached SPA, traditional page load timing using Navigation Timing API is incredibly low; however, the page is likely unusable without the dynamic content. Triggering a Resource Timing API in modern browsers well after expected page load times can help with the ability to diagnose and triage user experience issues. 5 Use of monitoring-based automation is growing among large companies.