performance metric

155 results


pages: 204 words: 53,261

The Tyranny of Metrics by Jerry Z. Muller

Affordable Care Act / Obamacare, Atul Gawande, Cass Sunstein, Checklist Manifesto, Chelsea Manning, collapse of Lehman Brothers, corporate governance, Credit Default Swap, crowdsourcing, delayed gratification, deskilling, Edward Snowden, Erik Brynjolfsson, Frederick Winslow Taylor, George Akerlof, Hyman Minsky, intangible asset, Jean Tirole, job satisfaction, joint-stock company, joint-stock limited liability company, Moneyball by Michael Lewis explains big data, performance metric, price mechanism, RAND corporation, school choice, Second Machine Age, selection bias, Steven Levy, total factor productivity, transaction costs, WikiLeaks

Yet when we dig more deeply, we find that the metrics matter because of the way they are embedded into a larger institutional culture. Is the success of the Cleveland Clinic a function of the fact that the Clinic publishes its outcomes? Or is the Clinic eager to publicize its outcomes precisely because they are so impressive? In fact, the Cleveland Clinic was one of the world’s great medical institutions before the rise of performance metrics, and it maintains that standing in the age of performance metrics. But to conclude that there is a causal relationship between the clinic’s quality and the publication of its performance metrics is to fall prey to the fallacy of post hoc ergo propter hoc. The success may have far more to do with local conditions—the ways in which the organizational culture of the Cleveland Clinic makes use of metrics—than with quality measurement per se.10 Metrics at Geisinger are effective because of the way in which they are embedded in a larger system.

omission of data, 23–24
organizational complexity and leadership, 44–47
over-measurement, 41–42
overregulation, 41–42
Patten, Simon, 32
pay for performance, 19; in business and finance, 137–45; extrinsic and intrinsic rewards and, 53–57; in medicine, 114–16; in New Public Management, 52; origins of, 29–31; in schools, 95–96; situations for successful use of, 179–80; Taylorism and, 31–32
Pentagon, the, 35–36
performance measurement, 8, 63–64, 74, 177, 180; college, 73–75; medicine, 2–5, 107, 123, 176; and transparency as enemy of performance, 159–65
Peters, Tom, 17
pharmaceutical industry, 140–42
Phelps, Edmund, 172
philanthropy and foreign aid, 153–56
philosophical critiques of metrics, 59–64
Pisano, Gary, 150–51
Polanyi, Michael, 59
policing, 125–29, 175
politics and government, 160–62; Bush’s use of performance metrics in, 11, 64, 89, 90; diplomacy and intelligence in, 162–65; higher education and (see higher education); Obama’s use of performance metrics and, 33, 81–82, 85, 94; Obsessive Measurement Disorder in, 155–56; public policy related to accountability and, 12, 41, 73; schools and (see schools); Thatcher’s use of performance metrics in, 56–57, 62–63, 73
Porter, Michael E., 107–8
practical tacit knowledge, 59–60
pretense of knowledge, 60
Princeton Review, 76
principal-agent theory, 49–51
productivity: increased numbers of college graduates and, 68; measuring academic, 78–80; metric fixation and costs of, 173
Pronovost, Peter, 109–10, 111–12, 176
ProPublica, 115, 116
public policy, 12, 41, 73
Public School Administration, 33
Race to the Top, 94–95, 100
Rand Corporation, 116, 131, 135
rankings, college, 75–78, 81
Rappaport, Alfred, 148
rationalism, 59–60
Ravitch, Diane, 89
remedial college courses, 70–71
Repenning, Nelson, 150
resistance to change, 46
rewarding of luck, 171
rewards, extrinsic and intrinsic, 53–57, 119–20, 137–38, 144
Rigas, John, 144
risk adjustment, 122
risk-taking, discouragement of, 62, 117–18, 171
rule cascades, 171
Sarbanes-Oxley Act of 2002, 144–45
SAT and ACT tests, 70
schools, 11, 24, 89, 175–76; achievement gap in, 20, 91, 96–99; costs of attempted gap-closing in, 99–101; paying for performance in, 95–96; problems and purported solution of NCLB for, 89–91; Race to the Top and, 94–95, 100; unintended consequences of NCLB for, 92–94

Studies that demonstrate its lack of effectiveness are either ignored or met with the assertion that what is needed is more data and better measurement. Metric fixation, which aspires to imitate science, too often resembles faith. None of this is meant to claim that measurement is useless or intrinsically pernicious. One of the purposes of this book is to specify when performance metrics are genuinely useful—how to use metrics without the characteristic dysfunctions of metric fixation. The next chapter, “Recurring Flaws,” provides a taxonomy of the most frequent types of flaws in the use of performance metrics. Defining and labeling them will make it easier to refer back to them later. Then, in part II, we examine the origins of metric fixation and account for its spread and tenacity in spite of its frequent failures, in addition to exploring some of the deeper philosophical sources of its shortcomings.


Mastering Machine Learning With Scikit-Learn by Gavin Hackeling

computer vision, constrained optimization, correlation coefficient, Debian, distributed generation, iterative process, natural language processing, Occam's razor, optical character recognition, performance metric, recommendation engine

Table of Contents

Preface
Chapter 1: The Fundamentals of Machine Learning
  Learning from experience; Machine learning tasks; Training data and test data; Performance measures, bias, and variance; An introduction to scikit-learn; Installing scikit-learn (Windows, Linux, OS X); Verifying the installation; Installing pandas and matplotlib; Summary
Chapter 2: Linear Regression
  Simple linear regression; Evaluating the fitness of a model with a cost function; Solving ordinary least squares for simple linear regression; Evaluating the model; Multiple linear regression; Polynomial regression; Regularization; Applying linear regression; Exploring the data; Fitting and evaluating the model; Fitting models with gradient descent; Summary
Chapter 3: Feature Extraction and Preprocessing
  Extracting features from categorical variables; Extracting features from text; The bag-of-words representation; Stop-word filtering; Stemming and lemmatization; Extending bag-of-words with TF-IDF weights; Space-efficient feature vectorizing with the hashing trick; Extracting features from images (pixel intensities, points of interest, SIFT and SURF); Data standardization; Summary
Chapter 4: From Linear Regression to Logistic Regression
  Binary classification with logistic regression; Spam filtering; Binary classification performance metrics (accuracy, precision and recall, calculating the F1 measure, ROC AUC); Tuning models with grid search; Multi-class classification; Multi-class classification performance metrics; Multi-label classification and problem transformation; Multi-label classification performance metrics; Summary
Chapter 5: Nonlinear Classification and Regression with Decision Trees
  Decision trees; Training decision trees; Selecting the questions; Information gain; Gini impurity; Decision trees with scikit-learn; Tree ensembles; The advantages and disadvantages of decision trees; Summary
Chapter 6: Clustering with K-Means
  Clustering with the K-Means algorithm; Local optima; The elbow method; Evaluating clusters; Image quantization; Clustering to learn features; Summary
Chapter 7: Dimensionality Reduction with PCA
  An overview of PCA; Performing Principal Component Analysis (variance, covariance, and covariance matrices; eigenvectors and eigenvalues); Dimensionality reduction with Principal Component Analysis; Using PCA to visualize high-dimensional data; Face recognition with PCA; Summary
Chapter 8: The Perceptron
  Activation functions; The perceptron learning algorithm; Binary classification with the perceptron; Document classification with the perceptron; Limitations of the perceptron; Summary
Chapter 9: From the Perceptron to Support Vector Machines
  Kernels and the kernel trick; Maximum margin classification and support vectors; Classifying characters in scikit-learn; Classifying handwritten digits; Classifying characters in natural images; Summary
Chapter 10: From the Perceptron to Artificial Neural Networks
  Nonlinear decision boundaries; Feedforward and feedback artificial neural networks; Multilayer perceptrons; Minimizing the cost function; Forward propagation; Backpropagation; Approximating XOR with multilayer perceptrons; Classifying handwritten digits; Summary
Index

Preface: Recent years have seen the rise of machine learning, the study of software that learns from experience.

...I can call up to book
Prediction: ham. Message: Hi, can i please get a <#> dollar loan from you. I.ll pay you back by mid february. Pls.
Prediction: ham. Message: Where do you need to go to get it?

How well does our classifier perform? The performance metrics we used for linear regression are inappropriate for this task. We are only interested in whether the predicted class was correct, not how far it was from the decision boundary. In the next section, we will discuss some performance metrics that can be used to evaluate binary classifiers.

Binary classification performance metrics

A variety of metrics exist to evaluate the performance of binary classifiers against trusted labels. The most common metrics are accuracy, precision, recall, F1 measure, and ROC AUC score. All of these measures depend on the concepts of true positives, true negatives, false positives, and false negatives.
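All five of the metrics named in this excerpt are available in scikit-learn. A minimal sketch, not the book's code, with invented toy labels (note that ROC AUC is computed from scores or probabilities, not hard class predictions):

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]                    # trusted labels (1 = spam)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]                    # predicted classes
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.7, 0.6, 0.3]   # predicted probabilities

print('Accuracy: ', accuracy_score(y_true, y_pred))
print('Precision:', precision_score(y_true, y_pred))
print('Recall:   ', recall_score(y_true, y_pred))
print('F1:       ', f1_score(y_true, y_pred))
print('ROC AUC:  ', roc_auc_score(y_true, y_score))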

The final prediction is the union of the predictions from all of the binary classifiers. The transformed training data is shown in the previous figure. This problem transformation ensures that the single-label problems will have the same number of training examples as the multilabel problem, but ignores relationships between the labels. Multi-label classification performance metrics Multi-label classification problems must be assessed using different performance measures than single-label classification problems. Two of the most common performance metrics are Hamming loss and Jaccard similarity. Hamming loss is the average fraction of incorrect labels. Note that Hamming loss is a loss function, and that the perfect score is zero. Jaccard similarity, or the Jaccard index, is the size of the intersection of the predicted labels and the true labels divided by the size of the union of the predicted and true labels.
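Both measures are one-liners in scikit-learn. A minimal sketch with invented multi-label data (the scikit-learn release this book targets exposed the Jaccard index as jaccard_similarity_score; current releases call it jaccard_score, so treat the exact name as version-dependent):

import numpy as np
from sklearn.metrics import hamming_loss, jaccard_score

y_true = np.array([[1, 0, 1],     # two samples, three possible labels
                   [0, 1, 0]])
y_pred = np.array([[1, 1, 1],
                   [0, 1, 0]])

# average fraction of incorrect labels: 1 wrong label out of 6 -> ~0.167
print(hamming_loss(y_true, y_pred))
# per-sample |intersection| / |union|, averaged: mean of 2/3 and 1 -> ~0.833
print(jaccard_score(y_true, y_pred, average='samples'))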


The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling by Ralph Kimball, Margy Ross

active measures, Albert Einstein, business intelligence, business process, call centre, cloud computing, data acquisition, discrete time, inventory management, iterative process, job automation, knowledge worker, performance metric, platform as a service, side project, zero-sum game

Multiple study groups can be defined and derivative study groups can be created with intersections, unions, and set differences. Chapter 8 Customer Relationship Management Aggregated Facts as Dimension Attributes Business users are often interested in constraining the customer dimension based on aggregated performance metrics, such as filtering on all customers who spent over a certain dollar amount during last year or perhaps over the customer's lifetime. Selected aggregated facts can be placed in a dimension as targets for constraining and as row labels for reporting. The metrics are often presented as banded ranges in the dimension table. Dimension attributes representing aggregated performance metrics add burden to the ETL processing, but ease the analytic burden in the BI layer. Chapter 8 Customer Relationship Management Dynamic Value Bands A dynamic value banding report is organized as a series of report row headers that define a progressive set of varying-sized ranges of a target numeric fact.
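As a rough illustration of the banded-range pattern described here (a sketch of mine, not Kimball and Ross's; pandas stands in for the ETL tool, and the table layout, names, and dollar bands are invented):

import pandas as pd

# hypothetical fact rows: one row per sale
sales = pd.DataFrame({'customer_id': [1, 1, 2, 3, 3, 3],
                      'amount': [120.0, 80.0, 15.0, 400.0, 250.0, 90.0]})
customer_dim = pd.DataFrame({'customer_id': [1, 2, 3],
                             'name': ['Ann', 'Bob', 'Cho']})

# aggregate the fact to one value per customer...
lifetime = (sales.groupby('customer_id')['amount'].sum()
                 .rename('lifetime_spend').reset_index())
customer_dim = customer_dim.merge(lifetime, on='customer_id')

# ...and store it as a banded-range attribute for constraining and row labels
customer_dim['spend_band'] = pd.cut(customer_dim['lifetime_spend'],
                                    bins=[0, 100, 500, float('inf')],
                                    labels=['Under $100', '$100-$500', 'Over $500'])
print(customer_dim)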

These en masse changes are prime candidates because business users often want the ability to analyze performance metrics using either the pre- or post-reorganization hierarchy for a period of time. With type 3 changes, the prior column is labeled to distinctly represent the prechanged grouping, such as 2012 department or pre-merger department. These column names provide clarity, but there may be unwanted ripples in the BI layer. Finally, if the type 3 attribute represents a hierarchical rollup level within the dimension, then as discussed with type 1, the type 3 update and additional column would likely cause OLAP cubes to be reprocessed. Multiple Type 3 Attributes If a dimension attribute changes with a predictable rhythm, sometimes the business wants to summarize performance metrics based on any of the historic attribute values. Imagine the product line is recategorized at the start of every year and the business wants to look at multiple years of historic facts based on the department assignment for the current year or any prior year.
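A small sketch of the multiple type 3 idea (mine, with invented product and department values): each yearly recategorization gets its own column in the dimension, so the same facts can be rolled up under the current year's hierarchy or any prior year's:

import pandas as pd

product_dim = pd.DataFrame({
    'product_key': [1, 2],
    'department_2013': ['Outdoor', 'Kitchen'],  # current-year assignment
    'department_2012': ['Garden', 'Kitchen'],   # prior-year assignment
})
sales = pd.DataFrame({'product_key': [1, 1, 2],
                      'amount': [10.0, 20.0, 5.0]})

joined = sales.merge(product_dim, on='product_key')
print(joined.groupby('department_2013')['amount'].sum())  # current rollup
print(joined.groupby('department_2012')['amount'].sum())  # as-was rollup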

Bus Matrix for HR Processes

Although an employee dimension with precise type 2 slowly changing dimension tracking, coupled with a monthly periodic snapshot of core HR performance metrics, is a good start, these structures just scratch the surface when it comes to tracking HR data. Figure 9.4 illustrates other processes that HR professionals and functional managers are likely keen to analyze. We've embellished this preliminary bus matrix with the type of fact table that might be used for each process; however, your source data realities and business requirements may warrant a different or complementary treatment. Figure 9.4: Bus matrix rows for HR processes. Some of these business processes capture performance metrics, but many result in factless fact tables, such as benefit eligibility or participation.

Packaged Analytic Solutions and Data Models

Many organizations purchase a vendor solution to address their operational HR application needs.


Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data by Dipanjan Sarkar

bioinformatics, business intelligence, computer vision, continuous integration, en.wikipedia.org, general-purpose programming language, Guido van Rossum, information retrieval, Internet of things, invention of the printing press, iterative process, natural language processing, out of africa, performance metric, premature optimization, recommendation engine, self-driving car, semantic web, sentiment analysis, speech recognition, statistical model, text mining, Turing test, web application

# predict sentiment for test movie reviews dataset
sentiwordnet_predictions = [analyze_sentiment_sentiwordnet_lexicon(review)
                            for review in test_reviews]

from utils import display_evaluation_metrics, display_confusion_matrix, display_classification_report

# get model performance statistics
In [295]: print 'Performance metrics:'
     ...: display_evaluation_metrics(true_labels=test_sentiments,
     ...:                            predicted_labels=sentiwordnet_predictions,
     ...:                            positive_class='positive')
     ...: print '\nConfusion Matrix:'
     ...: display_confusion_matrix(true_labels=test_sentiments,
     ...:                          predicted_labels=sentiwordnet_predictions,
     ...:                          classes=['positive', 'negative'])
     ...: print '\nClassification report:'
     ...: display_classification_report(true_labels=test_sentiments,
     ...:                               predicted_labels=sentiwordnet_predictions,
     ...:                               classes=['positive', 'negative'])

Performance metrics:
Accuracy: 0.59
Precision: 0.56
Recall: 0.92
F1 Score: 0.7

Confusion Matrix:
                  Predicted:
                  positive  negative
Actual: positive      6941       569
        negative      5510      1980

Classification report:
             precision    recall  f1-score   support
   positive       0.56      0.92      0.70      7510
   negative       0.78      0.26      0.39      7490
avg / total       0.67      0.59      0.55     15000

Our model has a sentiment prediction accuracy of around 60% and an F1-score of approximately 70%.

The following snippet shows the model sentiment prediction performance on the entire test movie reviews dataset:

# predict sentiment for test movie reviews dataset
vader_predictions = [analyze_sentiment_vader_lexicon(review, threshold=0.1)
                     for review in test_reviews]

# get model performance statistics
In [302]: print 'Performance metrics:'
     ...: display_evaluation_metrics(true_labels=test_sentiments,
     ...:                            predicted_labels=vader_predictions,
     ...:                            positive_class='positive')
     ...: print '\nConfusion Matrix:'
     ...: display_confusion_matrix(true_labels=test_sentiments,
     ...:                          predicted_labels=vader_predictions,
     ...:                          classes=['positive', 'negative'])
     ...: print '\nClassification report:'
     ...: display_classification_report(true_labels=test_sentiments,
     ...:                               predicted_labels=vader_predictions,
     ...:                               classes=['positive', 'negative'])

Performance metrics:
Accuracy: 0.7
Precision: 0.65
Recall: 0.86
F1 Score: 0.74

Confusion Matrix:
                  Predicted:
                  positive  negative
Actual: positive      6434      1076
        negative      3410      4080

Classification report:
             precision    recall  f1-score   support
   positive       0.65      0.86      0.74      7510
   negative       0.79      0.54      0.65      7490
avg / total       0.72      0.70      0.69     15000

The preceding metrics depict that our model has a sentiment prediction accuracy of around 70 percent and an F1-score close to 75 percent, which is definitely better than our previous model.

The following snippet achieves the same:

# predict sentiment for test movie reviews dataset
pattern_predictions = [analyze_sentiment_pattern_lexicon(review, threshold=0.1)
                       for review in test_reviews]

# get model performance statistics
In [307]: print 'Performance metrics:'
     ...: display_evaluation_metrics(true_labels=test_sentiments,
     ...:                            predicted_labels=pattern_predictions,
     ...:                            positive_class='positive')
     ...: print '\nConfusion Matrix:'
     ...: display_confusion_matrix(true_labels=test_sentiments,
     ...:                          predicted_labels=pattern_predictions,
     ...:                          classes=['positive', 'negative'])
     ...: print '\nClassification report:'
     ...: display_classification_report(true_labels=test_sentiments,
     ...:                               predicted_labels=pattern_predictions,
     ...:                               classes=['positive', 'negative'])

Performance metrics:
Accuracy: 0.77
Precision: 0.76
Recall: 0.79
F1 Score: 0.77

Confusion Matrix:
                  Predicted:
                  positive  negative
Actual: positive      5958      1552
        negative      1924      5566

Classification report:
             precision    recall  f1-score   support
   positive       0.76      0.79      0.77      7510
   negative       0.78      0.74      0.76      7490
avg / total       0.77      0.77      0.77     15000

This model gives a better and more balanced performance toward predicting the sentiment of both positive and negative classes.


pages: 263 words: 75,455

Quantitative Value: A Practitioner's Guide to Automating Intelligent Investment and Eliminating Behavioral Errors by Wesley R. Gray, Tobias E. Carlisle

activist fund / activist shareholder / activist investor, Albert Einstein, Andrei Shleifer, asset allocation, Atul Gawande, backtesting, beat the dealer, Black Swan, business cycle, butter production in bangladesh, buy and hold, capital asset pricing model, Checklist Manifesto, cognitive bias, compound rate of return, corporate governance, correlation coefficient, credit crunch, Daniel Kahneman / Amos Tversky, discounted cash flows, Edward Thorp, Eugene Fama: efficient market hypothesis, forensic accounting, hindsight bias, intangible asset, Louis Bachelier, p-value, passive investing, performance metric, quantitative hedge fund, random walk, Richard Thaler, risk-adjusted returns, Robert Shiller, shareholder value, Sharpe ratio, short selling, statistical model, survivorship bias, systematic trading, The Myth of the Rational Market, time value of money, transaction costs

When we examine the price ratios on a factor-adjusted basis using CAPM alpha, we again find that the EBIT enterprise multiple is a top-performing metric, showing statistically and economically significant alpha of 5.23 percent for the top-decile stocks. Here, the alternative EBITDA enterprise yield, earnings yield, and gross profits yield also perform well. BM and the free cash flow yield show smaller alphas than the other metrics. The EBIT enterprise multiple also shines on a risk-adjusted basis using the Sharpe and Sortino ratios. The EBIT enterprise multiple shows a Sharpe ratio, which calculates risk-to-reward by examining excess return against volatility, of 0.58. When we examine the metric's risk/reward using the Sortino ratio, which ignores upside volatility and measures only excess return against downside volatility, we again find the augmented enterprise multiple to be the best-performing metric, with a Sortino ratio of 0.89.
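Since both ratios are defined here only in words, this is a minimal sketch in Python (mine, not the authors' backtest code) of how they are typically computed from a series of periodic returns; annualization is omitted for brevity:

import numpy as np

def sharpe_ratio(returns, rf=0.0):
    # excess return against total volatility
    excess = np.asarray(returns) - rf
    return excess.mean() / excess.std()

def sortino_ratio(returns, rf=0.0):
    # excess return against downside volatility only
    excess = np.asarray(returns) - rf
    downside = excess[excess < 0]
    return excess.mean() / downside.std()

# toy monthly returns, invented for illustration
r = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]
print(sharpe_ratio(r), sortino_ratio(r))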

Figure 1.1 sets out a brief graphical overview of the performance of the cheapest stocks according to common fundamental price ratios, such as the price-to-earnings (P/E) ratio, the price-to-book (P/B) ratio, and the EBITDA enterprise multiple (total enterprise value divided by earnings before interest, taxes, depreciation, and amortization, or TEV/EBITDA).

Figure 1.1: Cumulative Returns to Common Price Ratios

As Figure 1.1 illustrates, value investing according to simple fundamental price ratios has cumulatively beaten the S&P 500 over almost 50 years. Table 1.1 shows some additional performance metrics for the price ratios. The numbers illustrate that value strategies have been very successful (Chapter 7 has a detailed discussion of our investment simulation procedures).

Table 1.1: Long-Term Performance of Common Price Ratios (1964 to 2011)

The counterargument to the empirical outperformance of value stocks is that these stocks are inherently more risky. In this instance, risk is defined as the additional volatility of the value stocks.

To this end, we focus our quantitative metrics on long-term averages for a set of simple measures. We have chosen eight years as our “long term” for two reasons: First, eight years likely captures a boom-and-bust cycle for the typical stock, and, second, there are sufficient stocks with eight years of historical data that we can identify a sufficiently large universe of stocks.9 We analyze three long-term, high-return operating performance metrics and rank these variables against the entire universe of stocks: long-term free cash flow on assets, long-term geometric return on assets, and long-term geometric return on capital, discussed next. The first measure is long-term free cash flow on assets (CFOA), defined as the sum of eight years of free cash flow divided by total assets. The measure can be expressed more formally as follows: CFOA = Sum (Eight Years Free Cash Flow) / Total Assets We define free cash flow as net income + depreciation and amortization − changes in working capital − capital expenditures.
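The CFOA definition translates directly into code. A minimal sketch (mine, not the authors'), following the formulas given above:

def free_cash_flow(net_income, dep_amort, change_in_working_capital, capex):
    # FCF = net income + D&A - changes in working capital - capital expenditures
    return net_income + dep_amort - change_in_working_capital - capex

def cfoa(eight_years_fcf, total_assets):
    # CFOA = Sum(eight years of free cash flow) / total assets
    assert len(eight_years_fcf) == 8, 'measure is defined over eight years'
    return sum(eight_years_fcf) / total_assets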


pages: 321

Finding Alphas: A Quantitative Approach to Building Trading Strategies by Igor Tulchinsky

algorithmic trading, asset allocation, automated trading system, backtesting, barriers to entry, business cycle, buy and hold, capital asset pricing model, constrained optimization, corporate governance, correlation coefficient, credit crunch, Credit Default Swap, discounted cash flows, discrete time, diversification, diversified portfolio, Eugene Fama: efficient market hypothesis, financial intermediation, Flash crash, implied volatility, index arbitrage, index fund, intangible asset, iterative process, Long Term Capital Management, loss aversion, market design, market microstructure, merger arbitrage, natural language processing, passive investing, pattern recognition, performance metric, popular capitalism, prediction markets, price discovery process, profit motive, quantitative trading / quantitative finance, random walk, Renaissance Technologies, risk tolerance, risk-adjusted returns, risk/return, selection bias, sentiment analysis, shareholder value, Sharpe ratio, short selling, Silicon Valley, speech recognition, statistical arbitrage, statistical model, stochastic process, survivorship bias, systematic trading, text mining, transaction costs, Vanguard fund, yield curve

The graph in Figure 31.2 is an example of a good alpha.

Figure 31.2: PnL graph for a sample WebSim alpha, 2013–2018 (Alpha = rank(sales/assets)).

Table 31.1: Performance metrics for the sample WebSim alpha in Figure 31.2.

In addition, numerous metrics are displayed, giving the user an opportunity to evaluate the aggregate performance of the alpha, as shown in Table 31.1. These performance metrics reflect the distribution of capital across the stocks and the alpha's performance, including the annual and aggregate PnL, Sharpe ratio, turnover, and other parameters. The first thing to consider is whether the alpha is profitable. Were the PnL and returns adequate? The Sharpe ratio is a measure of the risk-adjusted returns (returns/volatility).

Turnover is a measure of the volume of trading required to reach the alpha's desired positions over the simulation period. Each trade in or out of a position carries transaction costs (fees and spread costs). If the turnover number is high – for example, over 40% – the transaction costs may eradicate some or all of the PnL that the alpha generated during simulation. The other performance metrics and their uses in evaluating alpha performance are discussed in more detail in the WebSim user guides and in videos in the educational section of the website. In addition to the aggregate performance metrics, WebSim data visualization charts and graphs help to confirm that an alpha has an acceptable distribution of positions and returns across equities grouped by capitalization, industry, or sector. If established thresholds are met, alphas can be processed in out-of-sample testing using more current data to confirm the validity of the idea.

We can use the same mean-reversion idea mentioned above and express it in terms of a mathematical expression as follows:

Alpha1 = (close_today - close_5_days_ago) / close_5_days_ago

To find out if this idea works, we need a simulator to do backtesting. We can use WebSim for this purpose. Using WebSim, we get the sample results for this alpha, as shown in Figure 5.1. Table 5.1 shows several performance metrics used to evaluate an alpha. We focus on the most important metrics. The backtesting is done from 2010 through 2015, so each row of the output lists the annual performance of that year. The total simulation book size is always fixed at $20 million; the PnL is the annual PnL.

Figure 5.1: Sample simulation result of Alpha1 by WebSim (cumulative profit, 2010–2015).

Annual return is defined as:

Ann_return = ann_pnl / (booksize / 2)

The annual return measures the profitability of the alpha.
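As a sketch of what the alpha expression and the annual-return formula above look like in code (mine, not WebSim's API; pandas is assumed, with prices as a date-indexed DataFrame of daily closes, one column per stock):

import pandas as pd

def alpha1(prices: pd.DataFrame) -> pd.DataFrame:
    # (close_today - close_5_days_ago) / close_5_days_ago for every stock
    return prices.pct_change(periods=5)

def annual_return(ann_pnl: float, booksize: float = 20_000_000) -> float:
    # Ann_return = ann_pnl / (booksize / 2), per the definition above
    return ann_pnl / (booksize / 2)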


pages: 597 words: 119,204

Website Optimization by Andrew B. King

AltaVista, bounce rate, don't be evil, en.wikipedia.org, Firefox, In Cold Blood by Truman Capote, information retrieval, iterative process, Kickstarter, medical malpractice, Network effects, performance metric, search engine result page, second-price auction, second-price sealed-bid, semantic web, Silicon Valley, slashdot, social graph, Steve Jobs, web application

You can then retrieve revenue information about your conversions by running a report in the AdWords interface that opts to include value information for conversion columns.

Tracking and Metrics

You should track the success of all PPC elements through website analytics and conversion tracking. Google offers a free analytics program called Google Analytics. With it you can track multiple campaigns and get separate data for organic and paid listings. Whatever tracking program you use, you have to be careful to keep track of performance metrics correctly. The first step in optimizing a PPC campaign is to use appropriate metrics. Profitable campaigns with equally valued conversions might be optimized to:

Reduce the CPC given the same (or greater) click volume and conversion rates.
Increase the CTR given the same (or a greater) number of impressions and the same (or better) conversion rates.
Increase conversion rates given the same (or a greater) number of clicks.
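The metrics behind these goals are simple ratios of campaign counts. A minimal sketch (mine, not the book's code; CPA is included for completeness, though the passage itself discusses CPC, CTR, and conversion rate):

def ppc_metrics(impressions, clicks, conversions, cost):
    # core paid-search ratios used when optimizing a campaign
    return {
        'CTR': clicks / impressions,             # click-through rate
        'CPC': cost / clicks,                    # cost per click
        'conversion rate': conversions / clicks,
        'CPA': cost / conversions,               # cost per acquisition
    }

print(ppc_metrics(impressions=10_000, clicks=250, conversions=10, cost=125.0))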

ComScore, http://www.comscore.com/request/cookie_deletion_white_paper.pdf (accessed February 5, 2008). According to the study, "Approximately 31 percent of U.S. computer users clear their first-party cookies in a month." Under these conditions, a server-centric measurement would overestimate unique visitors by 150%.

[166] PathLoss is a metric developed by Paul Holstein of CableOrganizer.com.

Web Performance Metrics

At first glance, measuring the speed of a web page seems straightforward. Start a timer. Load up the page. Click Stop when the web page is "ready." Write down the time. For users, however, "ready" varies across different browsers on different connection speeds (dial-up, DSL, cable, LAN) at different locations (Washington, DC, versus Mountain View, California, versus Bangalore, India) at different times of the day (peak versus off-peak times) and from different browse paths (fresh from search results or accessed from a home page).

Tip: If you have a machine dedicated to performance analysis, use about:blank as your home page.

IBM Page Detailer

IBM Page Detailer is a Windows tool that sits quietly in the background as you browse. It captures snapshots of how objects are loading on the page behind the scenes. Download it from http://www.alphaworks.ibm.com/tech/pagedetailer/download. IBM Page Detailer captures three basic performance metrics: load time, bytes, and items. These correlate to the Document Complete, kilobytes received, and number of requests metrics we are tracking. We recommend capturing three to five page loads and averaging the metrics to ensure that no anomalies in the data, such as a larger ad, impacted performance. It is important, however, to note the occurrence of such anomalies and work to mitigate them. Table 10-2 shows our averaged results.


pages: 372 words: 67,140

Jenkins Continuous Integration Cookbook by Alan Berg

anti-pattern, continuous integration, Debian, don't repeat yourself, en.wikipedia.org, Firefox, job automation, performance metric, revision control, web application

Consider storing User-Agents and other browser headers in a text file, and then picking the values up for HTTP requests through the CSV Data Set Config element. This is useful if resources returned to your web browser, such as JavaScript or images, depend on the User-Agents. JMeter can then loop through the User-Agents, asserting that the resources exist.

See also: Reporting JMeter performance metrics; Functional testing using JMeter assertions

Reporting JMeter performance metrics

In this recipe, you will be shown how to configure Jenkins to run a JMeter test plan, and then collect and report the results. The passing of variables from an Ant script to JMeter will also be explained.

Getting ready

It is assumed that you have run through the last recipe, Creating JMeter test plans. You will also need to install the Jenkins performance plugin (https://wiki.jenkins-ci.org/display/JENKINS/Performance+Plugin).

See also: Looking for "smelly" code through code coverage; Activating more PMD rulesets; Interpreting JavaNCSS

Chapter 6. Testing Remotely

In this chapter, we will cover the following recipes: Deploying a WAR file from Jenkins to Tomcat; Creating multiple Jenkins nodes; Testing with Fitnesse; Activating Fitnesse HtmlUnit Fixtures; Running Selenium IDE tests; Triggering failsafe integration tests with Selenium Webdriver; Creating JMeter test plans; Reporting JMeter performance metrics; Functional testing using JMeter assertions; Enabling Sakai web services; Writing test plans with SoapUI; Reporting SoapUI test results

Introduction

By the end of this chapter, you will have run performance and functional tests against web applications and web services. Two typical setup recipes are included. The first is the deployment of a WAR file through Jenkins to an application server.

This allows JMeter to fail Jenkins builds based on a range of JMeter tests. This approach is especially important when starting from an HTML mockup of a web application, whose underlying code is changing rapidly. The test plan logs in and out of your local instance of Jenkins, checking size, duration, and text found in the login response.

Getting ready

We assume that you have already performed the Creating JMeter test plans and Reporting JMeter performance metrics recipes. The recipe requires the creation of a user tester1 in Jenkins. Feel free to change the username and password. Remember to delete the test user once it is no longer needed.

How to do it...

1. Create a user in Jenkins named tester1 with password testtest.
2. Run JMeter.
3. In the Test Plan element, change Name to LoginLogoutPlan, and add the following details for User Defined Variables:
   Name: USER; Value: tester1
   Name: PASS; Value: testtest
4. Right-click on Test Plan, then select Add | Config Element | HTTP Cookie Manager.


pages: 502 words: 107,510

Natural Language Annotation for Machine Learning by James Pustejovsky, Amber Stubbs

Amazon Mechanical Turk, bioinformatics, cloud computing, computer vision, crowdsourcing, easy for humans, difficult for computers, finite state, game design, information retrieval, iterative process, natural language processing, pattern recognition, performance metric, sentiment analysis, social web, speech recognition, statistical model, text mining

Again, using some of the features that are identified in Natural Language Processing with Python, we have:[2]

F1: last_letter = "a"
F2: last_letter = "k"
F3: last_letter = "f"
F4: last_letter = "r"
F5: last_letter = "y"
F6: last_2_letters = "yn"

Choose a learning algorithm to infer the target function from the experience you provide it with. We will start with the decision tree method. Evaluate the results according to the performance metric you have chosen. We will use accuracy over the resultant classifications as a performance metric. But now, where do we start? That is, which feature do we use to start building our tree? When using a decision tree to partition your data, this is one of the most difficult questions to answer. Fortunately, there is a very nice way to assess the impact of choosing one feature over another. It is called information gain and is based on the notion of entropy from information theory.
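A minimal sketch of entropy and information gain as described here (mine, not the book's code; the toy labels and feature below are invented):

import math
from collections import Counter

def entropy(labels):
    # H = -sum(p * log2(p)) over the label distribution
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(labels, feature_values):
    # expected drop in entropy from splitting the data on one feature
    total = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for l, f in zip(labels, feature_values) if f == v]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder

# toy gender-classification data: how well does last_letter == 'a' split it?
labels = ['f', 'f', 'f', 'm', 'm', 'm']
last_letter_is_a = [True, True, False, False, False, False]
print(information_gain(labels, last_letter_is_a))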

Choose how to represent the target function. We will assume that the target function is represented as the MAP of the Bayesian classifier over the features. Choose a learning algorithm to infer the target function from the experience you provide it with. This is tied to the way we chose to represent the function, namely:

c_MAP = argmax_c P(c | f1, ..., fn) = argmax_c P(f1, ..., fn | c) P(c)

Evaluate the results according to the performance metric you have chosen. We will use accuracy over the resultant classifications as a performance metric.

Sentiment classification

Now let's look at some classification tasks where different feature sets resulting from richer annotation have proved to be helpful for improving results. We begin with sentiment or opinion classification of texts. This is really two classification tasks: first, distinguishing fact from opinion in language; and second, if a text is an opinion, determining the sentiment conveyed by the opinion holder, and what object it is directed toward.
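Returning to the MAP representation above: in code, the decision rule amounts to maximizing the (log) prior-times-likelihood score for each class. A minimal sketch (mine, with invented toy probabilities and a naive independence assumption over features):

import math

def map_classify(features, priors, likelihoods):
    # priors: {class: P(class)}
    # likelihoods: {class: {feature: P(feature | class)}}
    # work in log space to avoid floating-point underflow
    def score(c):
        return math.log(priors[c]) + sum(
            math.log(likelihoods[c].get(f, 1e-9)) for f in features)
    return max(priors, key=score)

priors = {'fact': 0.5, 'opinion': 0.5}
likelihoods = {'fact': {'reported': 0.4, 'great': 0.05},
               'opinion': {'reported': 0.1, 'great': 0.3}}
print(map_classify(['great', 'reported'], priors, likelihoods))  # 'opinion'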

We will learn when to use each of these classes, as well as which algorithms are most appropriate for each feature type. In particular, we will answer the following question: when does annotation actually help in a learning algorithm? Defining Our Learning Task To develop an algorithm, we need to have a precise representation of what we are trying to learn. We’ll start with Tom Mitchell’s [1] definition of a learning task: Learning involves improving on a task, T, with respect to a performance metric, P, based on experience, E. Given this statement of the problem (inspired by Simon’s concise phrasing shown earlier), Mitchell then discusses the five steps involved in the design of a learning system. Consider what the role of a specification and the associated annotated data will be for each of the following steps for designing a learning system: Choose the “training experience.” For our purposes, this is the corpus that you just built.


pages: 233 words: 67,596

Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris

always be closing, big data - Walmart - Pop Tarts, business intelligence, business process, call centre, commoditize, data acquisition, digital map, en.wikipedia.org, global supply chain, high net worth, if you build it, they will come, intangible asset, inventory management, iterative process, Jeff Bezos, job satisfaction, knapsack problem, late fees, linear programming, Moneyball by Michael Lewis explains big data, Netflix Prize, new economy, performance metric, personalized medicine, quantitative hedge fund, quantitative trading / quantitative finance, recommendation engine, RFID, search inside the book, shareholder value, six sigma, statistical model, supply-chain management, text mining, the scientific method, traveling salesman, yield management

... unintegrated systems.

Stage 2, Localized analytics: autonomous activity builds experience and confidence using analytics; creates new analytically based insights. Disconnected, very narrow focus. Pockets of isolated analysts (may be in finance, SCM, or marketing/CRM); functional and tactical. Desire for more objective data; successes from point use of analytics start to get attention. Recent transaction data unintegrated, missing important information; isolated BI/analytic efforts.

Stage 3, Analytical aspirations: coordinated; establish enterprise performance metrics; build analytically based insights. Mostly separate analytic processes; building enterprise-level plan. Analysts in multiple areas of business but with limited interaction. Executive sponsorship in early stages of awareness of competitive possibilities. Executive support for fact-based culture (may meet considerable resistance). Proliferation of BI tools; data marts/data warehouse established/expands.

Stage 4, Analytical companies: change program to develop integrated analytical processes and applications and build analytical capabilities. Some embedded analytics processes. Skills exist, but often not aligned to right level/right role. Broad C-suite support. Change management to build a fact-based culture. High-quality data.

Bolstered by a series of smaller successes, management should set its sights on using analytics in the company’s distinctive capability and addressing strategic business problems. For the first time, program benefits should be defined in terms of improved business performance and care should be taken to measure progress against broad business objectives. A critical element of stage 3 is defining a set of achievable performance metrics and putting the processes in place to monitor progress. To focus scarce resources appropriately, the organization may create a centralized “business intelligence competency center” to foster and support analytical activities. In stage 3, companies will launch their first major project to use analytics in their distinctive capability. The application of more sophisticated analytics may require specialized analytical expertise and adding new analytical technology.

After analyzing these issues, the team concluded that an enterprise-wide focus on analytics would not only eliminate the majority of these problems but also uncover cross-selling opportunities. The team realized that a major obstacle to building an enterprise-level analytical capability would be resistance from department heads. Their performance measures were based on the assets of their departments, not on enterprise-wide metrics. The bank’s senior management team responded by introducing new performance metrics that would assess overall enterprise performance (including measures related to asset size and profitability) and cross-departmental cooperation. These changes cleared the path for an enterprise-wide initiative to improve BankCo’s analytical orientation, beginning with the creation of an integrated and consistent customer database (to the extent permitted by law) as well as coordinated retail, trust, and brokerage marketing campaigns.


pages: 294 words: 77,356

Automating Inequality by Virginia Eubanks

autonomous vehicles, basic income, business process, call centre, cognitive dissonance, collective bargaining, correlation does not imply causation, deindustrialization, disruptive innovation, Donald Trump, Elon Musk, ending welfare as we know it, experimental subject, housing crisis, IBM and the Holocaust, income inequality, job automation, mandatory minimum, Mark Zuckerberg, mass incarceration, minimum wage unemployment, mortgage tax deduction, new economy, New Urbanism, payday loans, performance metric, Ronald Reagan, self-driving car, statistical model, strikebreaker, underbanked, universal basic income, urban renewal, War on Poverty, working poor, Works Progress Administration, young professional, zero-sum game

Each month the number of verification documents that vanished—were not attached properly to digital case files in a process called “indexing”—rose exponentially. According to court documents, in December 2007 just over 11,000 documents were unindexed. By February 2009, nearly 283,000 documents had disappeared, an increase of 2,473 percent. The rise in technical errors far outpaced increased system use. The consequences are staggering if you consider that any single missing document could cause an applicant to be denied benefits. Performance metrics designed to speed eligibility determinations created perverse incentives for call center workers to close cases prematurely. Timeliness could be improved by denying applications and then advising applicants to reapply, which required that they wait an additional 30 or 60 days for a new determination. Some administrative snafus were simple mistakes, integration problems, and technical glitches.

By depriving judges of the ultimate authority to impose just sentences, mandatory sentencing laws and guidelines put sentencing on auto-pilot.”15 Automated decision-making can change government for the better, and tracking program data may, in fact, help identify patterns of biased decision-making. But justice sometimes requires an ability to bend the rules. By removing human discretion from frontline social servants and moving it instead to engineers and private contractors, the Indiana experiment supercharged discrimination. The “social specs” for the automation were based on time-worn, race- and class-motivated assumptions about welfare recipients that were encoded into performance metrics and programmed into business processes: they are lazy and must be “prodded” into contributing to their own support, they are sneaky and prone to fraudulent claims, and their burdensome use of public resources must be repeatedly discouraged. Each of these assumptions relies on, and is bolstered by, race- and class-based stereotypes. Poor Black women like Omega Young paid the price. * * * New high-tech tools allow for more precise measuring and tracking, better sharing of information, and increased visibility of targeted populations.

The digital poorhouse raises barriers for poor and working-class people attempting to access shared resources. In Indiana, the combination of eligibility automation and privatization achieved striking reductions in the welfare rolls. Cumbersome administrative processes and unreasonable expectations kept people from accessing the benefits they were entitled to and deserved. Brittle rules and poorly designed performance metrics meant that when mistakes were made, they were always interpreted as the fault of the applicant, not the state or the contractor. The assumption that automated decision-making tools were infallible meant that computerized decisions trumped procedures intended to provide applicants with procedural fairness. The result was a million benefit denials. But unequivocal diversion can only ever have limited success.


pages: 280 words: 82,355

Extreme Teams: Why Pixar, Netflix, AirBnB, and Other Cutting-Edge Companies Succeed Where Most Fail by Robert Bruce Shaw, James Foster, Brilliance Audio

Airbnb, augmented reality, call centre, cloud computing, deliberate practice, Elon Musk, future of work, inventory management, Jeff Bezos, job satisfaction, Jony Ive, loose coupling, meta analysis, meta-analysis, nuclear winter, Paul Graham, peer-to-peer, peer-to-peer model, performance metric, Peter Thiel, sharing economy, Silicon Valley, social intelligence, Steve Jobs, Tony Hsieh

The vetting of new members is treated seriously because teams are rewarded in Whole Foods based on team performance in areas such as overall sales and profit per labor hour. A team bonus is paid monthly, which can result in thousands of extra dollars each year for the members of a successful group.4 Whole Foods then goes one step further. It posts each team’s monthly results for everyone to see. A produce team, for example, will see how it stacks up on key performance metrics compared to the meat or seafood teams within its own store. Team leaders can also compare their team’s performance against other teams across a region. New team members who do not pull their weight pose two risks. First, poor performers can reduce the bonus pay of all team members if the team’s results suffer. That gets everyone’s attention. Second, weak members can damage a team’s reputation, as each team’s results are posted within each store.

For instance, research shows that some people will work less diligently when part of a team, allowing others in their group to compensate for their lack of effort. Social scientists call this the “freeloader” or “social loafing” problem.16 In these situations, a few team members contribute less than others and yet benefit from being part of a team where others make up for their shortcomings. Whole Foods deals with this problem by having clear performance metrics and team-level rewards. These practices, along with other informal methods such as peer feedback, increase the likelihood that everyone will contribute to the success of his or her team. New hires at Whole Foods quickly learn that they are not simply employees of the company or accountable only to their managers—they are, above all else, working for each other with financial and reputational consequences if they don’t perform.

TAKEAWAYS

Cutting-edge firms actively communicate the broader context to their members (market opportunities and threats, financial realities . . . ). They then clarify their vital few strategic priorities—the three or four goals that must be achieved to move the firm or team forward. These priorities are defined in a manner that ensures that everyone knows what success looks like, including performance metrics and accountabilities. Cutting-edge firms, however, also understand that too much focus can be self-defeating—thus, they foster ongoing experimentation in an attempt to identify innovative customer and revenue opportunities.

CHAPTER 5
PUSH HARDER, PUSH SOFTER
Every Great Culture Embraces a Great Contradiction

Most firms operate with either a hard or soft edge.1 Those with a hard edge emphasize the need for clear performance targets, disciplined practices, and absolute accountability for results.


pages: 514 words: 111,012

The Art of Monitoring by James Turnbull

Amazon Web Services, anti-pattern, cloud computing, continuous integration, correlation does not imply causation, Debian, DevOps, domain-specific language, failed state, Kickstarter, Kubernetes, microservices, performance metric, pull request, Ruby on Rails, software as a service, source of truth, web application, WebSocket

We're going to configure each plugin in a separate file and store them in the /etc/collectd.d/ directory we've specified as the value of the Include option. This separation allows us to manage each plugin individually and lends itself to management with a configuration management tool like Puppet, Chef, or Ansible. Now let's configure each plugin.

The cpu plugin

The first plugin we're going to configure is the cpu plugin. The cpu plugin collects CPU performance metrics on our hosts. By default, the cpu plugin emits CPU metrics in Jiffies: the number of ticks since the host booted. We're also going to send something a bit more useful: percentages. First, we're going to create a file to hold our plugin configuration. We'll put it into the /etc/collectd.d directory. As a result of the Include option in the collectd.conf configuration file, collectd will load this file automatically.
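As a sketch of what such a file might contain (this is my illustration, not the book's listing; the option names match the collectd cpu plugin documentation, but verify them against your collectd version):

# /etc/collectd.d/cpu.conf (sketch)
LoadPlugin cpu
<Plugin cpu>
  # report percentages rather than raw Jiffies
  ValuesPercentage true
  ReportByState true
</Plugin>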

We'll see more of StatsD in Chapter 9 of the book.

Diamond — An open-source metrics collector originally written by Brightcove but now maintained by a wider community.
Fullerite — An open-source metrics collector written by the Yelp Engineering team. It's written in Go and designed for large-scale metrics collection.
PCP and Vector — Used by Netflix, this combination provides high-resolution host performance metrics suitable for diagnostics.
sumd — A lightweight Python collector that allows you to run processes, for example Nagios plugins, locally and send the results to Riemann.

Note: There's also some overlap between these tools and the collection and graphing tools we looked at in Chapters 3 and 4.

Summary

In our Riemann configuration we saw how we can make use of this data to monitor our hosts and their components, and how we can notify on specific events or thresholds.

The application architecture can require understanding the interconnection between multiple containers, instances, and hosts. Added to this, the lifespan of a container might be in seconds or minutes. This makes the traditional monitoring techniques used for a single host or instance problematic. From a monitoring perspective there are three major issues with this new host model:

Convergence and dynamism
Performance
Metric volume

Let's first talk about convergence and dynamism. The speed and limited lifespan mean a lot of churn in your monitoring configuration: hosts appearing and disappearing quickly. Sometimes a host will even appear and disappear before your monitoring environment is aware of it. In many monitoring environments your configuration is applied after the installation of the host or service, either manually or via a configuration management tool like Puppet or Chef.


Digital Transformation at Scale: Why the Strategy Is Delivery by Andrew Greenway,Ben Terrett,Mike Bracken,Tom Loosemore

Airbnb, bitcoin, blockchain, butterfly effect, call centre, chief data officer, choice architecture, cognitive dissonance, cryptocurrency, Diane Coyle, en.wikipedia.org, G4S, Internet of things, Kevin Kelly, Kickstarter, loose coupling, M-Pesa, minimum viable product, nudge unit, performance metric, ransomware, Silicon Valley, social web, the market place, The Wisdom of Crowds

There are as many perspectives on the ‘right’ things to measure as there are ‘right’ ways to measure them. Some businesses measure hundreds of different variables in their quest for profitability. Most governments tend to be similarly thorough, with the added complication of managing multiple desired outcomes at the same time, where the operational measures often fail to match up with lofty political goals. In the UK, to keep things simple, we selected four performance metrics: digital take-up, completion rate, cost per transaction and user satisfaction. We could have picked more. Four was a manageable number, and effectively covered the bases for the GDS’s primary strategic aims: getting more people to use online government services, building services that worked first time, saving money and meeting user needs. As soon as you set performance indicators and determine a baseline for how things look before you’ve tried to improve the picture, you will be strongly encouraged to set a target number: a goal that you will strive to hit by a certain point in time.

In government, measuring user satisfaction picks up false signals: about how happy people are about paying tax, even about how happy they are with the government’s political performance in general. These are not things that any digital service team can do anything about. In the end, the most reliable way to measure user satisfaction was in the research lab, watching real people use the service. This was difficult to scale, but always worth the effort. The GDS’s choice of four performance metrics acted as useful pointers for stories to celebrate or worries to address. They weren’t designed to provide the people managing the services day to day with all the detailed insight needed to make incremental improvements to services; more detailed web analytics packages delivered that. What they offered was an indication of relative progress, and a measure of momentum. Money While putting an accurate figure on user satisfaction can prove almost impossible, one metric can not be ignored entirely.

The digital institution could keep a focus on meeting user needs at the same time as saving government money. If these priorities had been reversed – saving money before meeting needs – it is unlikely that users would get much of a look-in.

Summary

Write a list of all the services your organisation provides and use it to gauge where digital change can have the biggest impact for users. Choose performance metrics that give clues as to how well you are meeting user needs; these may differ from organisational objectives. Use metrics to judge velocity of change, rather than setting hard targets. Make an economic case for applying digital transformation to your organisation. Move away from spreadsheet data requests to automated real-time data collection as fast as you can.

* * *

50. https://designnotes.blog.GOV.UK/2015/06/22/good-services-are-verbs-2/
51. https://www.mckinsey.com/industries/public-sector/our-insights/deliverology-from-idea-to-implementation
52. https://www.GOV.UK/government/publications/digital-efficiency-report


pages: 351 words: 123,876

Beautiful Testing: Leading Professionals Reveal How They Improve Software (Theory in Practice) by Adam Goucher, Tim Riley

Albert Einstein, barriers to entry, Black Swan, call centre, continuous integration, Debian, Donald Knuth, en.wikipedia.org, Firefox, Grace Hopper, index card, Isaac Newton, natural language processing, p-value, performance metric, revision control, six sigma, software as a service, software patent, the scientific method, Therac-25, Valgrind, web application

The performance test cases, however, were renamed "Performance Testing Checkpoints" and included the following (abbreviated here):

• Collect baseline system performance metrics and verify that each functional task included in the system usage model achieves performance requirements under a user load of 1 for each performance testing build in which the functional task has been implemented.
  — [Functional tasks listed, one per line]
• Collect system performance metrics and verify that each functional task included in the system usage model achieves performance requirements under a user load of 10 for each performance testing build in which the functional task has been implemented.
  — [Functional tasks listed, one per line]
• Collect system performance metrics and verify that the system usage model achieves performance requirements under the following loads to the degree that the usage model has been implemented in each performance testing build.
  — [Increasing loads from 100 users to 3,000 users, listed one per line]
• Collect system performance metrics and verify that the system usage model achieves performance requirements for the duration of a 9-hour, 1,000-user stress test on performance testing builds that the lead developer, performance tester, and project manager deem appropriate.

Clearly frustrated, but calm, Harold told me that he'd been asked to establish the performance requirements that were going to appear in our contract to the client. Now understanding the intent, I suggested that Harold schedule a conference room for a few hours for us to discuss his task further. He agreed. As it turned out, it took more than one meeting for Harold to explain to me the client's expectations, the story behind his task, and for me to explain to Harold why we didn't want to be contractually obligated to performance metrics that were inherently ambiguous, what those ambiguities were, and what we could realistically measure that would be valuable. Finally, Harold and I took what were now several sheets of paper with the following bullets to Sandra, our project manager, to review:

"System Performance Testing Requirements:
• Performance testing will be conducted under a variety of loads and usage models, to be determined when system features and workflows are established

. — [Functional tasks listed, one per line]
• Collect system performance metrics and verify that the system usage model achieves performance requirements under the following loads to the degree that the usage model has been implemented in each performance testing build.
— [Increasing loads from 100 users to 3,000 users, listed one per line]
• Collect system performance metrics and verify that the system usage model achieves performance requirements for the duration of a 9-hour, 1,000-user stress test on performance testing builds that the lead developer, performance tester, and project manager deem appropriate.

The beauty here was that what we created was clear, easy to build a strategy around, and mapped directly to information that the client eventually requested in the final report. An added bonus was that from that point forward in the project, whenever someone challenged our approach to performance testing, one or more of the folks who were involved in the creation of the checkpoints always came to my defense—frequently before I even found out about the challenge!


pages: 217 words: 63,287

The Participation Revolution: How to Ride the Waves of Change in a Terrifyingly Turbulent World by Neil Gibb

Airbnb, Albert Einstein, blockchain, Buckminster Fuller, call centre, carbon footprint, Clayton Christensen, collapse of Lehman Brothers, corporate social responsibility, creative destruction, crowdsourcing, disruptive innovation, Donald Trump, gig economy, iterative process, job automation, Joseph Schumpeter, Khan Academy, Kibera, Kodak vs Instagram, Mark Zuckerberg, Menlo Park, Minecraft, Network effects, new economy, performance metric, ride hailing / ride sharing, shareholder value, side project, Silicon Valley, Silicon Valley startup, Skype, Snapchat, Steve Jobs, the scientific method, Thomas Kuhn: the structure of scientific revolutions, trade route, urban renewal

A higher calling
12. It ain’t what you do, it’s the why that you do it
13. The pursuit of happiness
14. Together
15. Home

III. How it works
Framework
1. Create a cause
   A new kind of leadership
   Bank to the future
   The non-linear business model
2. Mobilise a movement
   Weapons of mass participation
   The Art of transformation
   Analytics and performance metrics
3. Build a community
   Together
   That thing we most seek
   Social economics

IV. Into action
A call to action
Manifesto
An open-source tool kit

“Tomorrow belongs to those who can hear it coming”
David Bowie

I. Introduction

When things fall apart

“You can’t stop the waves, but you can learn to surf”
Jon Kabat-Zinn

Galileo Galilei was a clever lad.

We become attached to things, attached to what we do. Letting it go can be hard and painful, which is why we often hang on. At times, the process of transformation can feel a lot like grief. It is difficult and disruptive. But context changes everything. When we have a big why – something that strikes us as worthwhile, meaningful and important to us – our experience is transformed.

3. Analytics and performance metrics

“Out of this crisis, there could be a rebirth of economics. I’m not someone who would say that all that’s been done in the past is terrible. It’s just that the models we had were rather narrow and fragile. The problem came when the world was tipped upside down and those models were ill-equipped to making sense of behaviours”
Andrew Haldane, chief economist, Bank of England, 2017

When Andy Haldane addressed the Institute of Government in London in early 2017, he described his profession’s inability to foresee the collapse of Lehman Brothers or the ensuing global financial crisis as its “Michael Fish moment” – referring to an infamous incident in 1987 when a BBC weather forecaster confidently predicted that a hurricane was going to miss the UK, only for it to hit the country with full force the next day, causing devastation and mayhem: the worst storm in a century.

This measure ceases to be useful when content and products become devalued, and, in many cases, free. You can’t assess the value of open-source software, intellectual crowd-sourcing, peer-based wellbeing, or businesses like Facebook on the value of what they produce. The key metric of corporations – productivity – is a measure that has been flatlining for the best part of a decade. The reason for this is that it isn’t measuring what is driving the new economy. Productivity as a performance metric emerged out of the factory system, where value was calculated by assessing the price of the end product in relation to the cost of the materials, labour, and processes that made it. It was a linear calculation. Enterprises in the emerging paradigm don’t work like that. They are not linear, they are networked. They are complex webs, matrices, and dynamic meshes of interconnections that contribute to the output value in a non-cause-and-effect way.


pages: 571 words: 105,054

Advances in Financial Machine Learning by Marcos Lopez de Prado

algorithmic trading, Amazon Web Services, asset allocation, backtesting, bioinformatics, Brownian motion, business process, Claude Shannon: information theory, cloud computing, complexity theory, correlation coefficient, correlation does not imply causation, diversification, diversified portfolio, en.wikipedia.org, fixed income, Flash crash, G4S, implied volatility, information asymmetry, latency arbitrage, margin call, market fragmentation, market microstructure, martingale, NP-complete, P = NP, p-value, paper trading, pattern recognition, performance metric, profit maximization, quantitative trading / quantitative finance, RAND corporation, random walk, risk-adjusted returns, risk/return, selection bias, Sharpe ratio, short selling, Silicon Valley, smart cities, smart meter, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, traveling salesman

There are many alternative CV schemes, of which one of the most popular is k-fold CV. Figure 7.1 illustrates the k train/test splits carried out by a k-fold CV, where k = 5. In this scheme:

1. The dataset is partitioned into k subsets.
2. For i = 1, …, k:
   a. The ML algorithm is trained on all subsets excluding i.
   b. The fitted ML algorithm is tested on i.

Figure 7.1: Train/test splits in a 5-fold CV scheme

The outcome from k-fold CV is a k × 1 array of cross-validated performance metrics. For example, in a binary classifier, the model is deemed to have learned something if the cross-validated accuracy is over 1/2, since that is the accuracy we would achieve by tossing a fair coin. In finance, CV is typically used in two settings: model development (like hyper-parameter tuning) and backtesting. Backtesting is a complex subject that we will discuss thoroughly in Chapters 10–16.
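A minimal sketch of the same 5-fold procedure using scikit-learn on invented data (not from the book):

import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                          # invented features
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)   # invented labels

# k = 5 train/test splits; the result is a k x 1 array of CV accuracies.
scores = cross_val_score(LogisticRegression(), X, y,
                         cv=KFold(n_splits=5), scoring="accuracy")
print(scores, scores.mean())   # "learned something" if the mean is over 1/2
# Caveat: plain k-fold leaks information across time on financial series,
# a problem the book addresses with purged variants.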

As you may recall from Chapter 4, observation weights were determined as a function of the observation's absolute return. The implication is that sample weighted cross-entropy loss estimates the classifier's performance in terms of variables involved in a PnL (mark-to-market profit and losses) calculation: It uses the correct label for the side, probability for the position size, and sample weight for the observation's return/outcome. That is the right ML performance metric for hyper-parameter tuning of financial applications, not accuracy. When we use log loss as a scoring statistic, we often prefer to change its sign, hence referring to “neg log loss.” The reason for this change is cosmetic, driven by intuition: A high neg log loss value is preferred to a low neg log loss value, just as with accuracy. Keep in mind this sklearn bug when you use neg_log_loss: https://github.com/scikit-learn/scikit-learn/issues/9144.
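A hedged sketch of that scoring choice (invented arrays; in the book's setup the weights would come from each observation's absolute return), computing the weighted loss directly with sklearn's log_loss:

import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([1, 0, 1, 1, 0])             # label = side of the bet
p_hat = np.array([0.7, 0.4, 0.9, 0.6, 0.2])    # predicted P(label = 1)
w = np.array([0.5, 2.0, 1.0, 0.3, 1.2])        # e.g. |return| per observation

# Sample-weighted cross-entropy; negate so that higher is better,
# matching the "neg log loss" convention described above.
neg_ll = -log_loss(y_true, p_hat, sample_weight=w)
print(neg_ll)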

Do not research under the influence of a backtest. Most backtests published in journals are flawed, as the result of selection bias on multiple tests (Bailey, Borwein, López de Prado, and Zhu [2014]; Harvey et al. [2016]). A full book could be written listing all the different errors people make while backtesting. I may be the academic author with the largest number of journal articles on backtesting¹ and investment performance metrics, and still I do not feel I would have the stamina to compile all the different errors I have seen over the past 20 years. This chapter is not a crash course on backtesting, but a short list of some of the common errors that even seasoned professionals make.

11.2 Mission Impossible: The Flawless Backtest

In its narrowest definition, a backtest is a historical simulation of how a strategy would have performed had it been run over a past period of time.
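Under that narrow definition, a backtest can be as small as the following sketch (synthetic prices and a toy moving-average rule, invented purely for illustration; nothing here is the author's method):

import numpy as np

rng = np.random.default_rng(1)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))  # synthetic series
rets = np.diff(prices) / prices[:-1]

# Toy rule: long when price sits above its 50-day moving average.
ma = np.convolve(prices, np.ones(50) / 50, mode="valid")
signal = (prices[49:-1] > ma[:-1]).astype(float)   # decided on prior data only
strategy_rets = signal * rets[49:]

print("historical-simulation total return:",
      float(np.prod(1 + strategy_rets) - 1))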


pages: 354 words: 26,550

High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems by Irene Aldridge

algorithmic trading, asset allocation, asset-backed security, automated trading system, backtesting, Black Swan, Brownian motion, business cycle, business process, buy and hold, capital asset pricing model, centralized clearinghouse, collapse of Lehman Brothers, collateralized debt obligation, collective bargaining, computerized trading, diversification, equity premium, fault tolerance, financial intermediation, fixed income, high net worth, implied volatility, index arbitrage, information asymmetry, interest rate swap, inventory management, law of one price, Long Term Capital Management, Louis Bachelier, margin call, market friction, market microstructure, martingale, Myron Scholes, New Journalism, p-value, paper trading, performance metric, profit motive, purchasing power parity, quantitative trading / quantitative finance, random walk, Renaissance Technologies, risk tolerance, risk-adjusted returns, risk/return, Sharpe ratio, short selling, Small Order Execution System, statistical arbitrage, statistical model, stochastic process, stochastic volatility, systematic trading, trade route, transaction costs, value at risk, yield curve, zero-sum game

Kurtosis indicates whether the tails of the distribution are normal; high kurtosis signifies “fat tails,” a higher-than-normal probability of extreme positive or negative events.

COMPARATIVE RATIOS

While average return, standard deviation, and maximum drawdown present a picture of the performance of a particular trading strategy, these measures do not lend themselves to easy point comparison between two or more strategies. Several comparative performance metrics have been developed in an attempt to summarize mean, variance, and tail risk in a single number that can be used to compare different trading strategies. Table 5.1 summarizes the most popular point measures. The first generation of point performance measures was developed in the 1960s and includes the Sharpe ratio, Jensen’s alpha, and the Treynor ratio. The Sharpe ratio is probably the most widely used measure in comparative performance evaluation; it incorporates three desirable metrics—average return, standard deviation, and the cost of capital.
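For instance, a minimal sketch of the Sharpe ratio on invented daily returns (annualized with the common square-root-of-252 convention; the risk-free rate is assumed):

import numpy as np

rng = np.random.default_rng(2)
daily_rets = rng.normal(0.0005, 0.01, 252)   # invented daily strategy returns
risk_free_daily = 0.02 / 252                 # assumed 2% annual risk-free rate

excess = daily_rets - risk_free_daily
sharpe = excess.mean() / excess.std(ddof=1) * np.sqrt(252)
print(f"annualized Sharpe ratio: {sharpe:.2f}")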

A VaR companion measure, the conditional VaR (CVaR), also known as expected loss (EL), measures the average value of return within the cut-off tail. Of course, the original VaR assumes normal distributions of returns, whereas the returns are known to be fat-tailed. To address this issue, a modified VaR (MVaR) measure was proposed by Gregoriou and Gueyie (2003) that takes deviations from normality into account. Gregoriou and Gueyie (2003) also suggest using MVaR in place of standard deviation in Sharpe ratio calculations. How do these performance metrics stack up against each other? It turns out that all of the metrics deliver comparable rankings of trading strategies. Eling and Schuhmacher (2007) compare the hedge fund ranking performance of the 13 measures listed and conclude that the Sharpe ratio is an adequate measure for hedge fund performance.

PERFORMANCE ATTRIBUTION

Performance attribution analysis, often referred to as “benchmarking,” goes back to the arbitrage pricing theory of Ross (1977) and has been applied to trading strategy performance by Sharpe (1992) and Fung and Hsieh (1997), among others.
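A minimal sketch of historical VaR and its CVaR companion at a 95 percent cut-off (invented fat-tailed returns; the parametric, normality-assuming variants are not shown):

import numpy as np

rng = np.random.default_rng(3)
rets = rng.standard_t(df=4, size=5000) * 0.01   # fat-tailed invented returns

alpha = 0.05
var_95 = np.quantile(rets, alpha)       # 5th-percentile return (a loss level)
cvar_95 = rets[rets <= var_95].mean()   # average return inside the cut-off tail

print(f"95% VaR: {var_95:.4f}   95% CVaR: {cvar_95:.4f}")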

Methods for forecast comparisons include:

• Mean squared error (MSE)
• Mean absolute deviation (MAD)
• Mean absolute percentage error (MAPE)
• Distributional performance
• Cumulative accuracy profiling

If the value of a financial security is forecasted to be $x_{F,t}$ at some future time t and the realized value of the same security at time t is $x_{R,t}$, the forecast error for the given forecast, $\varepsilon_{F,t}$, is computed as follows:

$\varepsilon_{F,t} = x_{F,t} - x_{R,t}$  (15.2)

The mean squared error (MSE) is then computed as the average of squared forecast errors over T estimation periods, analogously to volatility computation:

$\mathrm{MSE} = \frac{1}{T}\sum_{\tau=1}^{T} \varepsilon_{F,\tau}^{2}$  (15.3)

The mean absolute deviation (MAD) and the mean absolute percentage error (MAPE) also summarize properties of forecast errors:

$\mathrm{MAD} = \frac{1}{T}\sum_{\tau=1}^{T} \left|\varepsilon_{F,\tau}\right|$  (15.4)

$\mathrm{MAPE} = \frac{1}{T}\sum_{\tau=1}^{T} \left|\frac{\varepsilon_{F,\tau}}{x_{R,\tau}}\right|$  (15.5)

Naturally, the lower each of the three metrics (MSE, MAD, and MAPE), the better the forecasting performance of the trading system. The distributional evaluation of forecast performance also examines the forecast errors $\varepsilon_{F,t}$ normalized by the realized value, $x_{R,t}$. Unlike the MSE, MAD, and MAPE metrics, however, the distributional performance metric seeks to establish whether the forecast errors are random. If the errors are indeed random, there exists no consistent bias in the direction of price movement, and the distribution of the normalized errors $\varepsilon_{F,t}/x_{R,t}$ should fall on the uniform [0, 1] distribution. If the errors are nonrandom, the forecast can be improved. One test that can be used to determine whether the errors are random is a comparison of errors with the uniform distribution using the Kolmogorov-Smirnov statistic.
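A hedged sketch of the three error metrics plus a distributional check (invented forecasts; a KS test against a fitted normal stands in for the book's uniform-distribution comparison):

import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(4)
x_real = rng.uniform(90, 110, 250)            # realized values (invented)
x_fcst = x_real + rng.normal(0, 1.0, 250)     # toy forecasts

err = x_fcst - x_real                          # eq. (15.2)
mse = np.mean(err ** 2)                        # eq. (15.3)
mad = np.mean(np.abs(err))                     # eq. (15.4)
mape = np.mean(np.abs(err / x_real))           # eq. (15.5)

# Distributional check: standardize the normalized errors and KS-test
# them against a normal, a stand-in for the uniform comparison above.
norm_err = err / x_real
z = (norm_err - norm_err.mean()) / norm_err.std(ddof=1)
stat, p_value = kstest(z, "norm")
print(mse, mad, mape, stat, p_value)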


pages: 719 words: 181,090

Site Reliability Engineering: How Google Runs Production Systems by Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy

Air France Flight 447, anti-pattern, barriers to entry, business intelligence, business process, Checklist Manifesto, cloud computing, combinatorial explosion, continuous integration, correlation does not imply causation, crowdsourcing, database schema, defense in depth, DevOps, en.wikipedia.org, fault tolerance, Flash crash, George Santayana, Google Chrome, Google Earth, information asymmetry, job automation, job satisfaction, Kubernetes, linear programming, load shedding, loose coupling, meta analysis, meta-analysis, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, revision control, risk tolerance, side project, six sigma, the scientific method, Toyota Production System, trickle-down economics, web application, zero day

A given set of production dependencies can be shared, possibly with different stipulations around intent.

Performance metrics

Demand for one service trickles down to result in demand for one or more other services. Understanding the chain of dependencies helps formulate the general scope of the bin packing problem, but we still need more information about expected resource usage. How many compute resources does service Foo need to serve N user queries? For every N queries of service Foo, how many Mbps of data do we expect for service Bar? Performance metrics are the glue between dependencies. They convert from one or more higher-level resource type(s) to one or more lower-level resource type(s). Deriving appropriate performance metrics for a service can involve load testing and resource usage monitoring.

Prioritization

Inevitably, resource constraints result in trade-offs and hard decisions: of the many requirements that all services have, which requirements should be sacrificed in the face of insufficient capacity?
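As a sketch of metrics acting as that glue, the toy conversion below (Python; the coefficient values are invented for illustration, not Google's actual figures) walks top-level demand down a Foo-to-Bar chain:

# Invented performance metrics: resources per unit of demand.
FOO_CPU_PER_QUERY = 0.002       # CPU cores per user query to Foo
BAR_MBPS_PER_FOO_QUERY = 0.01   # Mbps of Bar traffic per Foo query
BAR_CPU_PER_MBPS = 0.5          # CPU cores Bar needs per Mbps served

def capacity_for(user_queries_per_s: float) -> dict:
    """Convert top-level demand into lower-level resource needs."""
    foo_cpu = user_queries_per_s * FOO_CPU_PER_QUERY
    bar_mbps = user_queries_per_s * BAR_MBPS_PER_FOO_QUERY
    bar_cpu = bar_mbps * BAR_CPU_PER_MBPS
    return {"foo_cpu": foo_cpu, "bar_mbps": bar_mbps, "bar_cpu": bar_cpu}

print(capacity_for(50_000))   # demand on Foo trickles down to Bar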

Doing so resolves the maximum amount of dependent uncertainty for the minimum number of iterations. Of course, when an area of uncertainty resolves into a fault, you need to select additional branch points.

Testing Scalable Tools

As pieces of software, SRE tools also need testing.10 SRE-developed tools might perform tasks such as the following:

• Retrieving and propagating database performance metrics
• Predicting usage metrics to plan for capacity risks
• Refactoring data within a service replica that isn’t user accessible
• Changing files on a server

SRE tools share two characteristics:

• Their side effects remain within the tested mainstream API
• They’re isolated from user-facing production by an existing validation and release barrier

Barrier Defenses Against Risky Software

Software that bypasses the usual heavily tested API (even if it does so for a good cause) could wreak havoc on a live service.

Ideally, all levels of intent should be supported together, with services benefiting the more they shift to specifying intent versus implementation. In Google’s experience, services tend to achieve the best wins as they cross to step 3: good degrees of flexibility are available, and the ramifications of this request are in higher-level and understandable terms. Particularly sophisticated services may aim for step 4.

Precursors to Intent

What information do we need in order to capture a service’s intent? Enter dependencies, performance metrics, and prioritization.

Dependencies

Services at Google depend on many other infrastructure and user-facing services, and these dependencies heavily influence where a service can be placed. For example, imagine user-facing service Foo, which depends upon Bar, an infrastructure storage service. Foo expresses a requirement that Bar must be located within 30 milliseconds of network latency of Foo.


pages: 161 words: 39,526

Applied Artificial Intelligence: A Handbook for Business Leaders by Mariya Yao, Adelyn Zhou, Marlene Jia

Airbnb, Amazon Web Services, artificial general intelligence, autonomous vehicles, business intelligence, business process, call centre, chief data officer, computer vision, conceptual framework, en.wikipedia.org, future of work, industrial robot, Internet of things, iterative process, Jeff Bezos, job automation, Marc Andreessen, natural language processing, new economy, pattern recognition, performance metric, price discrimination, randomized controlled trial, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, skunkworks, software is eating the world, source of truth, speech recognition, statistical model, strong AI, technological singularity

They can also provide support to public company leadership who are especially subject to the whims of quarterly earnings reports. Keeping your board educated and updated is essential if you aspire to larger projects.

Build An Enterprise-Wide Case For AI

Your case for investing in AI and in automation will depend on your champions and stakeholders since they possess different business priorities, performance metrics, technical aptitude, propensity for risk, and political relationships. Presenting a clear ROI on AI initiatives is the best way to persuade executive stakeholders, but this can be challenging when enterprise AI adoption is early and still being proven in many sectors. Many corporations are still completing their big data investments and have yet to broach analytics. We emphasized earlier the importance of being honest about whether or not your organization is ready for AI.

The opportunity may not be large enough for the multiplier effect to be worth it. If you’re operating off of a minuscule customer base, then even a 200 percent increase in a key metric may not lead to meaningful boosts to revenue. If the problem is worth pursuing, then how much can better conversion rates and longer lifetime values improve sales volume? If you’re partnering with a vendor, ask them for performance metrics. What results have other clients seen? What are the upper and lower limits of improvements? When did they begin to see results?

Decreasing Costs

Measuring the ability to reduce costs is another popular way to assess returns on AI investments. AI promises greater operational efficiencies, predominantly in middle and back office functions, such as in legal, finance and accounting, operations, and human resources.
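Returning to the sizing question above, a hedged back-of-the-envelope sketch (all figures invented) of why a large relative lift can still be immaterial on a small base:

# Invented baseline: does a conversion-rate lift move revenue meaningfully?
visitors = 5_000           # small customer base
conv_rate = 0.01           # 1% convert today
lifetime_value = 120.0     # revenue per converted customer

baseline = visitors * conv_rate * lifetime_value
uplifted = visitors * (conv_rate * 3) * lifetime_value   # a 200% increase

print(f"baseline ${baseline:,.0f} -> uplifted ${uplifted:,.0f}")
# $6,000 -> $18,000: a 200% improvement that may still be immaterial.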


pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt

Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bioinformatics, computer vision, correlation does not imply causation, crowdsourcing, distributed generation, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize

# (Excerpt begins midway through the getmae helper, which bins predictions
# and computes a frequency-weighted mean absolute error.)
(bin), summarise, mean_p = mean(p), mean_y = mean(y))
  fin <- data.frame(bin = summ$bin, mean_p = summ$mean_p, mean_y = summ$mean_y, t)
  # Get wMAE
  num <- 0
  den <- 0
  for (i in c(1:nrow(fin))) {
    num <- num + fin$Freq[i] * abs(fin$mean_p[i] - fin$mean_y[i])
    den <- den + fin$Freq[i]
  }
  wmae <- num / den
  if (doplot == 1) {
    plot(summ$bin, summ$mean_p, type = "p",
         main = paste(title, " MAE =", wmae),
         col = "blue", ylab = "P(C | AD, X)", xlab = "P(C | AD, X)")
    points(summ$bin, summ$mean_y, type = "p", col = "red")
    rug(p)
  }
  return(wmae)
}

library(ROCR)
get_auc <- function(ind, y) {
  pred <- prediction(ind, y)
  perf <- performance(pred, 'auc', fpr.stop = 1)
  auc <- as.numeric(substr(slot(perf, "y.values"), 1, 8))
  return(auc)
}

# Get X-Validated performance metrics for a given feature set
getxval <- function(vars, data, folds, mae_bins) {
  # assign each observation to a fold
  data["fold"] <- floor(runif(nrow(data)) * folds) + 1
  auc <- c()
  wmae <- c()
  fold <- c()
  # make a formula object
  f <- as.formula(paste("Y", "~", paste(vars, collapse = "+")))
  for (i in c(1:folds)) {
    train <- data[(data$fold != i), ]
    test <- data[(data$fold == i), ]
    mod_x <- glm(f, data = train, family = binomial(logit))
    p <- predict(mod_x, newdata = test, type = "response")
    # Get wMAE
    wmae <- c(wmae, getmae(p, test$Y, mae_bins, "dummy", 0))
    fold <- c(fold, i)
    auc <- c(auc, get_auc(p, test$Y))
  }
  return(data.frame(fold, wmae, auc))
}

###############################################################
##########          MAIN: MODELS AND PLOTS          ##########
###############################################################

# Now build a model on all variables and look at coefficients and model fit
vlist <- c("AT_BUY_BOOLEAN", "AT_FREQ_BUY", "AT_FREQ_LAST24_BUY",
           "AT_FREQ_LAST24_SV", "AT_FREQ_SV", "EXPECTED_TIME_BUY",
           "EXPECTED_TIME_SV", "LAST_BUY", "LAST_SV", "num_checkins")
f <- as.formula(paste("Y_BUY", "~", paste(vlist, collapse = "+")))
fit <- glm(f, data = train, family = binomial(logit))
summary(fit)

# Get performance metrics on each variable
vlist <- c("AT_BUY_BOOLEAN", "AT_FREQ_BUY", "AT_FREQ_LAST24_BUY",
           "AT_FREQ_LAST24_SV", "AT_FREQ_SV", "EXPECTED_TIME_BUY",
           "EXPECTED_TIME_SV", "LAST_BUY", "LAST_SV", "num_checkins")

# Create empty vectors to store the performance/evaluation metrics
auc_mu <- c()
auc_sig <- c()
mae_mu <- c()
mae_sig <- c()
for (i in c(1:length(vlist))) {
  a <- getxval(c(vlist[i]), set, 10, 100)
  auc_mu <- c(auc_mu, mean(a$auc))
  auc_sig <- c(auc_sig, sd(a$auc))
  mae_mu <- c(mae_mu, mean(a$wmae))
  mae_sig <- c(mae_sig, sd(a$wmae))
}
univar <- data.frame(vlist, auc_mu, auc_sig, mae_mu, mae_sig)

# Get MAE plot on single variable - use holdout group for evaluation
set <- read.table(file, header = TRUE, sep = "\t", row.names = "client_id")
names(set)
split <- .65
set["rand"] <- runif(nrow(set))
train <- set[(set$rand <= split), ]
test <- set[(set$rand > split), ]
set$Y <- set$Y_BUY
fit <- glm(Y_BUY ~ num_checkins, data = train, family = binomial(logit))
y <- test$Y_BUY
p <- predict(fit, newdata = test, type = "response")
getmae(p, y, 50, "num_checkins", 1)

# Greedy Forward Selection
rvars <- c("LAST_SV", "AT_FREQ_SV", "AT_FREQ_BUY", "AT_BUY_BOOLEAN",
           "LAST_BUY", "AT_FREQ_LAST24_SV", "EXPECTED_TIME_SV",
           "num_checkins", "EXPECTED_TIME_BUY", "AT_FREQ_LAST24_BUY")

# Create empty vectors
auc_mu <- c()
auc_sig <- c()
mae_mu <- c()
mae_sig <- c()
for (i in c(1:length(rvars))) {
  vars <- rvars[1:i]
  a <- getxval(vars, set, 10, 100)
  auc_mu <- c(auc_mu, mean(a$auc))
  auc_sig <- c(auc_sig, sd(a$auc))
  mae_mu <- c(mae_mu, mean(a$wmae))
  mae_sig <- c(mae_sig, sd(a$wmae))
}
kvar <- data.frame(auc_mu, auc_sig, mae_mu, mae_sig)

# Plot 3 AUC Curves
y <- test$Y_BUY
fit <- glm(Y_BUY ~ LAST_SV, data = train, family = binomial(logit))
p1 <- predict(fit, newdata = test, type = "response")
fit <- glm(Y_BUY ~ LAST_BUY, data = train, family = binomial(logit))
p2 <- predict(fit, newdata = test, type = "response")
fit <- glm(Y_BUY ~ num_checkins, data = train, family = binomial(logit))
p3 <- predict(fit, newdata = test, type = "response")
pred <- prediction(p1, y)
perf1 <- performance(pred, 'tpr', 'fpr')
pred <- prediction(p2, y)
perf2 <- performance(pred, 'tpr', 'fpr')
pred <- prediction(p3, y)
perf3 <- performance(pred, 'tpr', 'fpr')
plot(perf1, col = "blue",
     main = "LAST_SV (blue), LAST_BUY (red), num_checkins (green)")
plot(perf2, col = "red", add = TRUE)
plot(perf3, col = "green", add = TRUE)

Chapter 6.


Defining the error metric How do we measure whether our learning problem is being modeled well? Let’s remind ourselves of the various possibilities using the truth table in Table 9-1.

Table 9-1. Actual versus predicted table, also called the Confusion Matrix

                    Actual = True      Actual = False
Predicted = True    (true positive)    (false positive)
Predicted = False   (false negative)   (true negative)

The most straightforward performance metric is Accuracy, which is defined using the preceding notation as the ratio:

accuracy = (true positives + true negatives) / (total number of predictions)

Another way of thinking about accuracy is that it’s the probability that your model gets the right answer. Given that there are very few positive examples of fraud—at least compared with the overall number of transactions—accuracy is not a good metric of success, because the “everything looks good” model, or equivalently the “nothing looks fraudulent” model, is dumb but has good accuracy.
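A minimal sketch of that failure mode on invented, heavily imbalanced labels:

import numpy as np

rng = np.random.default_rng(5)
y = (rng.random(10_000) < 0.002).astype(int)   # ~0.2% fraud, invented
pred_all_good = np.zeros_like(y)               # "nothing looks fraudulent"

accuracy = (pred_all_good == y).mean()
fraud_caught = pred_all_good[y == 1].mean()    # share of fraud flagged

print(f"accuracy: {accuracy:.4f}, fraud caught: {fraud_caught:.0%}")
# ~99.8% accurate while catching 0% of the fraud.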


pages: 304 words: 80,965

What They Do With Your Money: How the Financial System Fails Us, and How to Fix It by Stephen Davis, Jon Lukomnik, David Pitt-Watson

activist fund / activist shareholder / activist investor, Admiral Zheng, banking crisis, Basel III, Bernie Madoff, Black Swan, buy and hold, centralized clearinghouse, clean water, computerized trading, corporate governance, correlation does not imply causation, credit crunch, Credit Default Swap, crowdsourcing, David Brooks, Dissolution of the Soviet Union, diversification, diversified portfolio, en.wikipedia.org, financial innovation, financial intermediation, fixed income, Flash crash, income inequality, index fund, information asymmetry, invisible hand, Kenneth Arrow, Kickstarter, light touch regulation, London Whale, Long Term Capital Management, moral hazard, Myron Scholes, Northern Rock, passive investing, performance metric, Ponzi scheme, post-work, principal–agent problem, rent-seeking, Ronald Coase, shareholder value, Silicon Valley, South Sea Bubble, sovereign wealth fund, statistical model, Steve Jobs, the market place, The Wealth of Nations by Adam Smith, transaction costs, Upton Sinclair, value at risk, WikiLeaks

Elson commented, “Even the best corporate boards will fail to address executive compensation concerns unless they tackle the structural bias created by external peer group benchmarking metrics. … Boards should measure performance and determine compensation by focusing on internal metrics. For example, if customer satisfaction is deemed important to the company, then results of customer surveys should play into the compensation equation. Other internal performance metrics can include revenue growth, cash flow, and other measures of return.”58 In other words, boards should focus, as owners do, on what makes the business flourish. USE THE RIGHT METRICS As discussed earlier, 90 percent of large American companies measure the performance of their executive teams over a three-year period or less. About a quarter don’t have any long-term performance–based awards at all.59 Fewer than 25 percent incorporate the cost of capital into their executive compensation formulas, and only 13 percent consider innovation—such as new products, markets, or services, research and development, or intellectual property development—in determining compensation.60 You couldn’t design an incentive scheme better suited to keeping a CEO focused strictly on the short term if you tried.

., 254n2 BrightScope, 122 Brokers, fiduciary duty and, 256n23 Brooks, David, 167 Buffett, Warren, 45, 63, 64, 80, 150, 221 Business judgment rule, 78–79 Business school curriculum, 190–92 Buy and Hold Is Dead (Again) (Solow), 65 Buy and Hold Is Dead (Kee), 65 Buycott, 118 Cadbury, Adrian, 227 Call option, 93 CalPERS, 91, 110, 111–12, 208, 221, 241n37 CalSTRS, 208 Canada, pension funds in, 59, 111, 209 Capital Aberto (magazine), 117 Capital gains, taxation of, 92 Capital Institute, 59, 87 Capital losses, 92 Capitalism: agency, 33, 74–80 defined, 243n2 Eastern European countries’ transition to, 167 financial system and, 9 injecting ownership back into, 83–93 private ownership and, 62 reforming, 11–12 Carbon Disclosure Project, 89 Career paths, new economic thinking and, 189–90 CDC. See Collective pension plans CDFIs. See Community Development Financial Institutions (CDFIs) CDSs. See Credit default swaps (CDSs) CEM Benchmarking, 54 Central banks, 20, 213 Centre for Policy Studies, 105 CEOs: performance metrics, 68, 86–87 short-term mindset among, 67–68. See also Executive compensation Ceres, 120 CFA Institute, 121 Chabris, Christopher, 174 Charles Schwab, 29, 31 Cheating, regulations and, 144–45 Chinese Academy of Social Sciences, 167 Citadel, 29 Citicorp, 76 Citizen investors/savers, 19 charter for, 227–31 communication between funds and, 110–11 dependence on others to manage money, 5–6, 19, 20 goals of, 48, 49 government regulation to safeguard, 107–9 lack of accountability to, 5–7, 96, 99–106 technology and, 90–92 trading platforms that protect, 88–89 City Bank of Glasgow, 257n34 Civil society organizations (CSOs), 153 corporate accountability and, 119–23 scrutiny of funds by, 224 “Civil Stewardship Leagues,” 122 Clark, Gordon L., 101, 106 Classical economics, 159–61 Clegg, Nick, 9 Clinton, Bill, 68–69 Clinton, Hillary Rodham, 119 Coase, Ronald, 169–70, 243n2, 261n31 Cohen, Lauren, 102 Coles Myer, 82 Collective Defined Contribution (CDC), 266n28 Collective pension plans, 263n1, 266n28 duration of liabilities, 264n3 in Netherlands, 197, 199, 209, 264n6.

See also Retirement savings Pension Trustee Code of Conduct, 121 Pension trustees, 105–6, 108–9, 137–38, 140, 205, 207, 224–25, 229 People’s Pension, 202–11 cost of, 217 enrollment into, 208–9 feedback mechanisms, 207 fees, 204 governance and, 202–3, 205–6 investment interests of beneficiaries, 206–7 models for, 266n28 reform of financial institutions and, 226 transparency and, 203–4, 207–8 Performance: asset managers and, 48–50 defined, 149 encouraging through collective action, 57–58 executive compensation and, 68, 148–49 fees, 239n16 governance and, 100–104 institutional investors and incentives for, 112–13 investment management, 35–38 Performance metrics for executives, 68, 86–87 Perry Capital, 81 PFZW. See Stichting Pensioenfonds Zorg en Welzijn (PFZW) PGGM, 77, 111 Philippon, Thomas, 26–28, 220 Philosophy, Politics and Economics (PPE), 190 Pitman, Brian, 213 Pitt, William, 158 Pitt-Watson, David, 263n1, 264n4, 264–65n11, 266n28 Plender, John, 259n5 Political economy, 142, 152 Political institutions, 183–84 Portfolio management: ownership and, 246n36 pension fund, 208–9 PPE (Philosophy, Politics and Economics), 190 Premium, 22 Price of goods, 160 Principles for Responsible Investment.


Mastering Private Equity by Zeisberger, Claudia,Prahl, Michael,White, Bowen, Michael Prahl, Bowen White

asset allocation, backtesting, barriers to entry, Basel III, business process, buy low sell high, capital controls, carried interest, commoditize, corporate governance, corporate raider, correlation coefficient, creative destruction, discounted cash flows, disintermediation, disruptive innovation, distributed generation, diversification, diversified portfolio, family office, fixed income, high net worth, information asymmetry, intangible asset, Lean Startup, market clearing, passive investing, pattern recognition, performance metric, price mechanism, profit maximization, risk tolerance, risk-adjusted returns, risk/return, shareholder value, Sharpe ratio, Silicon Valley, sovereign wealth fund, statistical arbitrage, time value of money, transaction costs

Let’s assume a management team own 10% of the equity and a business is valued at 10× EBITDA. Every €1m of EBITDA increase is valued at €10m enterprise value and results in €1m of cashflow. The Management Equity Plan accrues 10% of the €10m and of the cashflows—so every €1m of EBITDA increase delivers €1.1m directly to the management pot. This is highly motivating to management. Consequently, they embrace the performance metrics and scrutiny of their private equity investors. They thrive on seeing the EBITDA increase and the net debt go down. A great private equity CEO recognises that to get the best exit value for their business they have to create a business that has a sustainable growth strategy, a strong management team and a consistent track record of financial performance. If they get this formula right they will probably hit the jackpot!
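As a quick check on that arithmetic, a worked version under the passage's own assumptions (a 10% management stake and a 10× EBITDA valuation):

\[
\text{payoff per €1m of EBITDA} = 0.10 \times \Big(\underbrace{10 \times \text{€1m}}_{\text{enterprise value uplift}} + \underbrace{\text{€1m}}_{\text{cashflow}}\Big) = \text{€1.1m}
\]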

The chapter concludes with a closer look at methods used to compare PE performance with public market performance.

Interim Fund Performance

The performance of a PE fund is reported to its LPs on a quarterly basis. These quarterly reports offer insight into the value of a fund’s portfolio companies and the overall performance of the fund to date. Exhibit 19.1 shows the basic steps taken to translate the value of a fund’s portfolio companies to its gross and net performance metrics.

Exhibit 19.1: Evaluating PE Fund Performance

Most limited partnership agreements require that a GP report the fair market value of a fund’s investments, its net asset value (NAV), its gross multiple of money invested (MoM), and its internal rate of return (IRR) as of the reporting date. (See the section on gross performance below for additional details on how to make these calculations.)

Some may prefer to hold investments longer—be it to maximize money multiples or to take advantage of scarce investment opportunities in a specific sector—while others may prefer to exit earlier as a way of maximizing an investment’s IRR and avoid paying PE fees on a position in a public security. A GP may therefore decide to distribute the shares of publicly held companies in the fund “in-kind,” allowing each LP to act according to its preference.3 Gross Performance With realized and unrealized valuations of its portfolio companies in hand, a GP will calculate a range of fund-level performance metrics, including the fund’s MoM, NAV and IRR. Calculating the MoM of each investment—and ultimately of the fund—is fairly straightforward; it is simply realized plus unrealized equity value divided by the capital invested in the company. Similarly, calculating the NAV is simply the sum of the unrealized equity value in a fund’s portfolio companies. Calculating a fund’s IRR, on the other hand, is far from straightforward and can prove contentious, as different IRR methodologies will produce different fund performance results.
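A hedged sketch of those fund-level calculations (all cash flows invented; the IRR is found by bisection on NPV, one of several methodologies the passage notes can produce different results):

def npv(rate, cashflows):
    """Net present value of annual cash flows (index 0 = today)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=-0.99, hi=10.0, tol=1e-6):
    """Bisection search for the rate where NPV = 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

invested = 100.0      # capital invested (invented)
realized = 80.0       # distributions received so far
unrealized = 90.0     # fair value of remaining holdings, i.e., the NAV

mom = (realized + unrealized) / invested
cashflows = [-100.0, 0.0, 30.0, 50.0, 90.0]   # invented fund cash flows
print(f"MoM {mom:.2f}x  NAV {unrealized:.0f}  IRR {irr(cashflows):.1%}")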


pages: 353 words: 88,376

The Investopedia Guide to Wall Speak: The Terms You Need to Know to Talk Like Cramer, Think Like Soros, and Buy Like Buffett by Jack (edited By) Guinan

Albert Einstein, asset allocation, asset-backed security, Brownian motion, business cycle, business process, buy and hold, capital asset pricing model, clean water, collateralized debt obligation, computerized markets, correlation coefficient, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, discounted cash flows, diversification, diversified portfolio, dividend-yielding stocks, dogs of the Dow, equity premium, fixed income, implied volatility, index fund, intangible asset, interest rate swap, inventory management, London Interbank Offered Rate, margin call, money market fund, mortgage debt, Myron Scholes, passive investing, performance metric, risk tolerance, risk-adjusted returns, risk/return, shareholder value, Sharpe ratio, short selling, statistical model, time value of money, transaction costs, yield curve, zero-coupon bond

Related Terms:
• American Depositary Receipt—ADR
• Correlation
• Exchange-Traded Fund—ETF
• Global Depositary Receipt—GDR
• Index

Multiple

What Does Multiple Mean? A term that measures a particular aspect of a company’s financial well-being, determined by dividing one metric by another metric. The metric in the numerator is typically larger than the one in the denominator, because the top metric usually is supposed to be many times larger than the bottom metric. It is calculated as follows:

Multiple = Performance Metric “A” / Performance Metric “B”

Investopedia explains Multiple: As an example, the term “multiple” can be used to show how much investors are willing to pay per dollar of earnings, as computed by the P/E ratio. Suppose one is analyzing a stock with $2 of earnings per share (EPS) that is trading at $20; this stock has a P/E of 10. This means that investors are willing to pay a multiple of 10 times earnings for the stock.

For example, a European investor purchasing shares of an American company on a foreign exchange (using American dollars to do so) would be exposed to exchange-rate risk while holding that stock. To hedge that risk, the investor could purchase currency futures to lock in a specified exchange rate for the future stock sale and conversion back into the foreign currency.

Related Terms:
• Credit Derivative
• Hedge
• Stock Option
• Forward Contract
• Option

Diluted Earnings per Share (Diluted EPS)

What Does Diluted Earnings per Share (Diluted EPS) Mean? A performance metric used to gauge the quality of a company’s earnings per share (EPS) if all convertible securities were exercised. Convertible securities refer to all outstanding convertible preferred shares, convertible debentures, stock options (primarily employee-based), and warrants. Unless the company has no additional potential shares outstanding (a relatively rare circumstance), the diluted EPS will always be lower than the simple EPS.


There Is No Planet B: A Handbook for the Make or Break Years by Mike Berners-Lee

air freight, autonomous vehicles, call centre, carbon footprint, cloud computing, dematerialisation, Elon Musk, energy security, energy transition, food miles, Gini coefficient, global supply chain, global village, Hans Rosling, income inequality, Intergovernmental Panel on Climate Change (IPCC), land reform, neoliberal agenda, off grid, performance metric, profit motive, shareholder value, Silicon Valley, smart cities, Stephen Hawking, The Spirit Level, The Wealth of Nations by Adam Smith, trickle-down economics, urban planning

The second shortcoming has been that, in an attempt to avoid criticism, it has been inclined to underplay some of the more uncertain climate change risks. Its latest report, ‘Global Warming of 1.5°C’, makes its most compelling call so far for urgent global action.

Jobs
A way of spending time that can be useful, fulfilling and which can be a mechanism for appropriate wealth distribution. Worth having when at least two of these three criteria are met, but otherwise not. Therefore to be used with caution as a national performance metric.

Kids (ours)
The people who will have to understand, better than their parents, the nature of the Anthropocene challenge and how to deal with it.

Leadership
A scarce and much needed quality for dealing with the issues covered in this book. Anyone can display it in any walk of life and small actions can occasionally go viral. Pitifully lacking among most politicians worldwide as I write this, but can be encouraged by simple carrot-and-stick training by voters.

32, 147–48, 227 big picture perspective 186, 191, 195–97 biodiversity 44, 53–54, 101–3, 102–3, 103–4, 214 big picture perspective 195–96 pressure on land 78–79, 91 Bioregional, One Planet Living 160–62, 162 boats/shipping 114–16, 235–36 Brazil 69–70, 70 278 Brexit 214 Buddhism 193, 208 bullshit 179, 214; see also fake news; truth Burning Question (Berners-Lee and Clark) 4, 92, 215 business as usual 8, 128, 204 businesses 158, 215 environmental strategies 163–64 fossil fuel companies 223 perspectives/vision 159 role in wealth distribution 138–39 science-based targets 164–66 systems approaches 159–62, 161–62 technological changes 166–68 useful/beneficial organisations 158–59 values 159, 174 see also food retailers call centres, negative effect of performance metrics 125–26 calorific needs 12, 242–43 carbohydrates, carbon footprint 23–25, 25 carbon budgets 51–52, 88, 146, 169–70, 201–2, 204–5 carbon capture and storage (CCS) 91–92, 141, 211, 215 carbon dioxide emissions, exponential growth 202–4, 203, 220; see also greenhouse gas emissions carbon footprints agriculture 22–25, 23, 29–30 carbohydrates 25 local food/food miles 30–32 population growth 149 protein 24 sea travel 114–16 vegetarianism/veganism 27 INDEX carbon pricing 145–47, 209–10 carbon scrubbing 211, 216 carbon taxes 142–43 CCS see carbon capture and storage celebrities 182 change, embracing see openmindedness chicken farms 25–26 Chilean seabass (Patagonian toothfish) 33–34 China 216 global distribution of fossil fuel reserves 89–90 sunlight/radiant energy 69–70, 70 choice//being in control 266 cities, urban planning and transport 104–6 citizen’s wages 136–39, 153–54 Clark, Duncan: Burning Question (with Berners-Lee) 4, 92, 215 climate change 3–4, 51, 55, 216 big picture perspective 195 biodiversity impacts 53–54 evidence against using fossil fuels 64–66 ocean acidification 54–55 plastics production/pollution 55–58, 56–57 rebound effects 52, 128, 165–66, 206–7, 206 science-based targets 164–66 scientific facts 51–53, 200–11, 203, 206 systems approaches 159–62, 161 values 169–70 coal 216; see also fossil fuels comfort breaks, performance metrics 125–27 Common Cause report (Crompton) 129 community service 174 Index commuting 217; see also travel and transport companies see businesses competence 266 complexity 189, 191, 221; see also simplistic thinking consumption/consumerism 217 ethical 147–48, 168 personal actions 174–75 risks of further growth 121 values 173 corporate responsibility 219; see also businesses critical realism 176 critical thinking skills 188–89, 191 Crompton, Tom (Common Cause report) 129 cruises 115–16 cultural norms big picture perspective 197 values 171–72 cultures of truth 177–79 cumulative carbon budgets 51, 201–2 cycling 4–5, 99–102, 116, 217 dairy industry 230–31; see also animal sources of food democracy 141, 218, 240–41; see also voting denial 198, 227 Denmark, wealth distribution 130–35 Desai, Pooran 161–62 desalination plants, energy use 94 determinism 95, 218 developed countries 218–19 energy use 93 food waste 13, 39–40, 241 diesel vehicles 107–9, 109 diet, sustainable 219; see also vegetarianism/veganism 279 digital information storage, and energy efficiency 84–85 direct air capture, carbon dioxide 211, 216 distance, units of 243 double-sided photocopying metaphor 219 driverless cars 109–10 e-transport e-bikes 101–2, 116 e-boats 115 e-cars 101–2, 106, 220 e-planes 111 investment 141 economic growth 119, 219 big picture perspective 196–97 carbon pricing 145–47 carbon taxes 142–43 consumer power through 
spending practices 147–48 GDP as inadequate metric 123–24, 126–27 investment 140–42 market forces 127–30 need for new metric of healthy growth 124–27 risks and benefits of growth 120–23, 121 trickledown of wealth 130–31, 130 wealth distribution 130–35, 131–40, 132, 134 education 173–74, 219 efficiency 219–20 digital information storage 84–85 energy use 82–85 investment 141 limitations of electricity 73–86, 85–87 meat eating/animal feed 212–13 rebound effects 84, 207 280 electric vehicles see e-transport electricity, limitations of use 73–86, 85–87; see also renewable energy sources empathy 172, 186–87, 191 employment see work/employment enablement, businesses 163–64 energy in a gas analogy of wealth distribution 136–39 energy use 59, 87, 95–96 current usage 59–60 efficiency 82–85 fracking 79–81, 81 growth rates over time see below inequality 60, 90–91, 131 interstellar travel 117–18 limitations of electricity 73–86, 85–87 limits to growth 67–69, 68, 94–95, 208 nuclear fission 75–77 nuclear fusion 77 personal actions and effects 97 risks of further growth 120–21 sources 63–64 supplied by food 12 UK energy by end use 62, 62 units of 242–43 values 169–70 see also fossil fuels; renewable energy sources energy use growth 1–2, 60–62, 61, 220 and energy efficiency 84 future estimates 93–94 limits to growth 67–69, 68, 94–95 and renewables 81–82 enhanced rock weathering 92 enoughness 221; see also limits to growth environmental strategies, businesses 163–64 science-based targets 164–66 INDEX ethical consumerism 147–48, 168 ethics see values evolutionary rebalancing 6, 221 expert opinion 221 exponential growth 120, 121, 149, 202–4, 220–21 extrinsic motivation and values 143–44, 170–73 facts 222 climate change 51–53, 200–11, 203, 206 meaning of 175–76 media roles in promoting 179–80 see also misinformation; truth fake news 170, 175, 222; see also misinformation farming see food and agriculture fast food 238 feedback mechanisms 272; see also rebound effects fish farming 33 fishing industry 32–36, 222–23 flat lining blip, carbon dioxide emissions 203–4, 220 flexibility see open-mindedness flying see air travel food and agriculture 11, 50, 222–23 animal farming 16–21, 29 biofuels 44 carbon footprints 22–25, 23–25, 27 chicken farming 25–26 employment in agriculture 44–45, 222 feeding growing populations 46–47 fish 32–36 global surplus in comparison to needs 12, 13 human calorific needs 12 investment in sustainability 48–50, 141 Index malnutrition and inequalities of distribution 15–16 overeating/obesity 16 personal actions 30, 34–35, 40, 43, 50 research needs 49 rice farming 29–30 soya bean farming 21, 22 supply chains 48 technology in agriculture 45–46 vegetarianism/veganism 26–29 see also waste food food imports, and population growth 150 food markets 130–31 food miles 30–32, 230 food retailers fish 35–36 food wastage 40–42 rice 30 vegetarianism/veganism 28 fossil fuel companies 223 fossil fuels 63–64, 216, 223 carbon pricing 145–47, 209–10 carbon taxes 142–43 evidence against using 64–66 global deals 87–91, 161, 205–6, 208–9 global distribution of reserves 89, 89–90 limitations of using electricity instead 73–86, 85–87 need to leave in the ground 87–91, 161, 205–6, 208–9, 223 sea travel 115 using renewables instead of or as well as 81–82 fracking 79–81, 81, 224 free markets 127–30, 172, 228 free will 95, 167 frog in a pan of water analogy 236, 241 fun 224 281 fundamentalism 176, 192 future scenarios aims and visions 8–9 climate change lag times 204–5 energy use 93–94 planning ahead 204–5 
thinking/caring about 187, 191, 229 travel and transport 100–1, 109–10 gambling industry 139–40, 152, 265 gas analogy of wealth distribution 136–39 gas (natural gas) 224; see also fracking; methane GDP big picture perspective 196–97 as inappropriate metric of healthy growth 123–24, 126–27 risks of further growth 121–22 genetic modification 45–46 genuineness 172 geo engineering solutions 224–25 Germany, tax system 145 Gini coefficient of income inequality 144 global cultural norms 171–72, 197 global deals 163 fossil fuels 87–91, 208–10 inequity 210 global distribution, fossil fuels 89–91, 89 global distribution, solar energy 69–71, 70, 89 global distribution, wind energy 74, 74 global food surplus 12, 13 global governance 127–30, 141, 225 global solutions, big picture perspective 196 global systems 5–6, 186, 225 global temperature increases 200–1 282 global thinking skills 186 global travel, by mode of transport 100 global wealth distribution 130–35, 132, 132, 134, 144, 145 governmental roles big picture perspective 196 climate change policies 51–53, 200–11 energy use policies 59, 97 fishing industry 36 promoting culture of truth 178–80 sustainable farming 29, 45 technological changes 168 wealth distribution 138 see also global governance greed 225–26; see also individualism greenhouse gas emissions 209 exponential growth curves 202–4, 203, 220 food and agriculture 23 market forces 128 measurement 127 mitigation of food waste 42, 43, 43 risks of further growth 120 scientific facts 51–53 units 243 see also carbon dioxide; carbon footprints; methane; nitrogen dioxide greenwash 215, 226 growth 226; see also economic growth; energy use growth; exponential growth hair shirts 212, 224, 226–27 Handy, Charles 236 Happy Planet Index 126 Hardy, Lew 143 Hawking , Stephen 2, 166–67 Hong Kong, population growth 149–50 INDEX How Bad Are Bananas?



pages: 297 words: 91,141

Market Sense and Nonsense by Jack D. Schwager

3Com Palm IPO, asset allocation, Bernie Madoff, Brownian motion, buy and hold, collateralized debt obligation, commodity trading advisor, computerized trading, conceptual framework, correlation coefficient, Credit Default Swap, credit default swaps / collateralized debt obligations, diversification, diversified portfolio, fixed income, high net worth, implied volatility, index arbitrage, index fund, London Interbank Offered Rate, Long Term Capital Management, margin call, market bubble, market fundamentalism, merger arbitrage, negative equity, pattern recognition, performance metric, pets.com, Ponzi scheme, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, risk/return, Robert Shiller, selection bias, Sharpe ratio, short selling, statistical arbitrage, statistical model, survivorship bias, transaction costs, two-sided market, value at risk, yield curve

But the story does not end there.

Figure 3.11 NAV Comparison: Three-Period Prior Best S&P Sector versus Prior Worst and Average (data source: S&P Dow Jones Indices)

So far, the analysis has only considered returns and has shown that choosing the best past sector would have yielded slightly lower returns than an equal-allocation approach (that is, the average). Return, however, is an incomplete performance metric. Any meaningful performance comparison must also consider risk (a concept we will elaborate on in Chapter 4). We use two measures of risk here:

1. Standard deviation. The standard deviation is a volatility measure that indicates how spread out the data is—in this case, how broadly the returns vary. Roughly speaking, we would expect approximately 95 percent of the data points to fall within two standard deviations of the mean.
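
A minimal sketch of the volatility calculation described here, assuming a hypothetical series of monthly returns (the figures are invented for illustration, not taken from the book):

    import numpy as np

    returns = np.array([0.021, -0.013, 0.034, 0.008, -0.027,
                        0.015, 0.042, -0.009, 0.011, -0.018])

    mean = returns.mean()
    std = returns.std(ddof=1)  # sample standard deviation

    # For roughly normal returns, ~95% of observations should fall
    # within two standard deviations of the mean.
    lower, upper = mean - 2 * std, mean + 2 * std
    share = np.mean((returns >= lower) & (returns <= upper))
    print(f"mean={mean:.4f}  std={std:.4f}  within 2 sd: {share:.0%}")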

Based on performance, it would be difficult to justify choosing Manager E over Manager F, even for the most risk-tolerant investor.

Figure 8.12 2DUC: Manager E versus Manager F

Investment Misconceptions

Investment Misconception 23: The average annual return is probably the single most important performance statistic.

Reality: Return alone is a meaningless statistic because return can always be increased by increasing risk. The return/risk ratio should be the primary performance metric.

Investment Misconception 24: For a risk-seeking investor considering two investment alternatives, an investment with expected lower return/risk but higher return may often be preferable to an equivalent-quality investment with the reverse characteristics.

Reality: The higher return/risk alternative would still be preferable, even for risk-seeking investors, because by using leverage it can be translated into an equivalent return with lower risk (or higher return with equal risk).
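
The leverage argument in the Reality for Misconception 24 can be made concrete with a small, hypothetical calculation; it assumes frictionless leverage at a zero funding rate, which is a simplification, not something the book literally stipulates:

    def levered(ret, risk, leverage, funding_rate=0.0):
        # Scaling a position by `leverage` scales both return and risk;
        # borrowing costs reduce the levered return (ignored by default).
        return ret * leverage - funding_rate * (leverage - 1), risk * leverage

    a_ret, a_risk = 0.12, 0.20   # higher return, return/risk = 0.6
    b_ret, b_risk = 0.08, 0.08   # lower return, return/risk = 1.0

    # Lever B up to A's risk level (0.20 / 0.08 = 2.5x):
    ret, risk = levered(b_ret, b_risk, a_risk / b_risk)
    print(ret, risk)             # 0.20 return at 0.20 risk, beating A's 0.12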

However, pro forma results that only adjust for differences between current and past fees and commissions can be more representative than actual results. It is critical to differentiate between these two radically different applications of the same term: pro forma.

16. Return alone is a meaningless statistic because return can be increased by increasing risk. The return/risk ratio should be the primary performance metric.

17. Although the Sharpe ratio is by far the most widely used return/risk measure, return/risk measures based on downside risk come much closer to reflecting risk as it is perceived by most investors.

18. Conventional arithmetic-scale net asset value (NAV) charts provide a distorted picture, especially for longer-term track records that traverse a wide range of NAV levels. A log scale should be used for long-term NAV charts.

19.
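
The log-scale point (item 18) is easy to see with a quick matplotlib sketch on simulated NAV data; the series below is random and purely illustrative. On the log axis, equal vertical distances correspond to equal percentage changes:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    nav = 1000 * np.cumprod(1 + rng.normal(0.01, 0.04, 240))  # 20y monthly

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(nav)
    ax1.set_title("Arithmetic scale (early years look flat)")
    ax2.plot(nav)
    ax2.set_yscale("log")
    ax2.set_title("Log scale (equal % moves look equal)")
    plt.show()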


pages: 561 words: 114,843

Startup CEO: A Field Guide to Scaling Up Your Business, + Website by Matt Blumberg

activist fund / activist shareholder / activist investor, airport security, Albert Einstein, bank run, Ben Horowitz, Broken windows theory, crowdsourcing, deskilling, fear of failure, high batting average, high net worth, hiring and firing, Inbox Zero, James Hargreaves, Jeff Bezos, job satisfaction, Kickstarter, knowledge economy, knowledge worker, Lean Startup, Mark Zuckerberg, minimum viable product, pattern recognition, performance metric, pets.com, rolodex, Rubik’s Cube, shareholder value, Silicon Valley, Skype

This was less surprising about some aspects than others. For example, I wasn’t surprised that there was a high degree of convergence in the way people thought about the organization’s values since we had a strong values-driven culture that people were living every day, even if those values hadn’t been well articulated in the past. But it was a little surprising that we could effectively crowdsource a strategy statement and key performance metrics at a time when the business was at a fork in the road. Given this degree of alignment, our task as an executive team became less about picking concepts and more about picking words. We worked together to come up with a solid draft that took the best of what was submitted to us. We worked with a copywriter to make the statements flow well. Then we shared the results with the company and opened the floor for comments.

How and when is this investment going to pay itself back? What is the capital required to get there and what are your financing requirements from where your balance sheet sits today? The costs are easier to forecast, especially if you carefully articulated your resource requirements. As everybody in the startup world knows, ROI is trickier. You’re not leading an enterprise that has extremely detailed historical performance metrics to rely on in their forecasting. When Schick or Gillette introduces a new razor into the marketplace, they can very accurately forecast how much it’s going to cost them and what their return will be. If you’re creating a new product in a new marketplace, that isn’t the case. While monthly burn and revenue projections will inevitably change, capital expenditures can be more predictable, though you need to make sure you understand the cash flow mechanics of capital expenditure.

Second, those criteria have to be things that will remain in the control of the acquired company for the length of the earn-out; asking an entrepreneur to agree to an earn-out based on sales, for example, when your sales force will be doing all of the selling, doesn’t make sense. Finally, an earn-out can’t be too high a percentage of the deal. The preponderance will have to be cash and stock. Otherwise, the process of judging performance should be shared by both parties. In one of our largest deals at Return Path, each side appointed representatives who met quarterly to agree on performance metrics, adjustments, and so on. We also designated a third representative in advance who was available to adjudicate any disagreements. We never had to use him. Whatever mechanism you put in place, trust plays a huge role here. If it’s not there, this acquisition might not be a good idea. THE FLIP SIDE OF M&A: DIVESTITURE When Return Path turned six years old in 2005, we had gone from being a startup focused on our initial ECOA business to the world’s smallest conglomerate, with five lines of business: in addition to change of address, we were market leaders in email delivery assurance (a market we created), email–based market research (a tiny market when we started) and email list management and list rental (both huge markets when we founded the company).


pages: 49 words: 12,968

Industrial Internet by Jon Bruner

autonomous vehicles, barriers to entry, commoditize, computer vision, data acquisition, demand response, en.wikipedia.org, factory automation, Google X / Alphabet X, industrial robot, Internet of things, job automation, loose coupling, natural language processing, performance metric, Silicon Valley, slashdot, smart grid, smart meter, statistical model, web application

Newer wind turbines use software that acts in real-time to squeeze a little more current out of each revolution, pitching the blades slightly as they rotate to compensate for the fact that gravity shortens them as they approach the top of their spin and lengthens them as they reach the bottom. Power producers use higher-level data analysis to inform longer-range capital strategies. The 150-foot-long blades on a wind turbine, for instance, chop at the air as they move through it, sending turbulence to the next row of turbines and reducing efficiency. By analyzing performance metrics from existing wind installations, planners can recommend new layouts that take into account common wind patterns and minimize interference. Automotive Google captured the public imagination when, in 2010, it announced that its autonomous cars had already driven 140,000 miles of winding California roads without incident. The idea of a car that drives itself was finally realized in a practical way by software that has strong links to the physical world around it: inbound, through computer vision software that takes in images and rangefinder data and builds an accurate model of the environment around the car; and outbound, through a full linkage to the car’s controls.


pages: 571 words: 124,448

Building Habitats on the Moon: Engineering Approaches to Lunar Settlements by Haym Benaroya

3D printing, biofilm, Black Swan, Brownian motion, Buckminster Fuller, carbon-based life, centre right, clean water, Colonization of Mars, Computer Numeric Control, conceptual framework, data acquisition, Elon Musk, fault tolerance, gravity well, inventory management, Johannes Kepler, low earth orbit, orbital mechanics / astrodynamics, performance metric, RAND corporation, risk tolerance, Ronald Reagan, stochastic process, telepresence, telerobotics, the scientific method, urban planning, X Prize, zero-sum game

Examples of metrics include: wear rate of components, materials, and shielding; failure rates – anticipated vs. actual; and a number of biological fragility curves that represent how the human body reacts to the low gravity, radiation dosage, and other psychosocial stressors. Loss Analysis is based on frequency and probability data, and is used to extract performance metrics that are meaningful to facility stakeholders; metrics such as upper bound economic loss during the owner-investor’s planning period. Risk-management decisions can be made based on the loss analysis. The project manager for the design and fabrication of a lunar facility has concerns that go beyond investor metrics. The lunar facility is more than a project; rather, it is something that has an almost metaphysical hold on those who have devoted their lives (figuratively and literally) to its creation. Performance metrics for a lunar base need to be based on survivability and development, as well as technical aspects of its operation. Examples of metrics include: psychological well-being; physical health; timely accomplishment of tasks and goals; and rates of component failures.

The isolated environment that astronauts will face on the Moon, and even more isolated on Mars, makes critical the need for high levels of reliability. Part of that implies a self-healing capability. Such technologies are being studied and developed but are far from being usable. Do you see self-healing as a critical technology? We had a program called InFlex (Intelligent Flexible Materials for Deployable Space Structures) in the early 2000s. The focus was increasing inflatable structures performance metrics from habitats, to space suits, to aeroshells. We categorized each individual threat (Micrometeoroid and Orbital Debris [MMOD], impact from external equipment, impact from inside, material degradation, etc.) in every possible environment (LEO, Moon, Mars, etc.) and looked at what technologies made sense to deal with the threats (self-healing, structural health monitoring, layered materials, etc.).


pages: 291 words: 77,596

Total Recall: How the E-Memory Revolution Will Change Everything by Gordon Bell, Jim Gemmell

airport security, Albert Einstein, book scanning, cloud computing, conceptual framework, Douglas Engelbart, full text search, information retrieval, invention of writing, inventory management, Isaac Newton, John Markoff, lifelogging, Menlo Park, optical character recognition, pattern recognition, performance metric, RAND corporation, RFID, semantic web, Silicon Valley, Skype, social web, statistical model, Stephen Hawking, Steve Ballmer, Ted Nelson, telepresence, Turing test, Vannevar Bush, web application

“Recognizing soldier activities in the field.” Proceedings of International IEEE Workshop on Wearable and Implantable Body Sensor Networks (BSN), Aachen, Germany, March 2007. Schlenoff, Craig, et al. “Overview of the First Advanced Technology Evaluations for ASSIST.” Proceedings of Performance Metrics for Intelligent Systems (PerMIS) 2006, IEEE Press, Gaithersburg, Maryland, August 2006. Stevers, Michelle Potts. “Utility Assessments of Soldier-Worn Sensor Systems for ASSIST.” Proceedings of the Performance Metrics for Intelligent Systems Workshop, 2006. Starner, Thad. “The Virtual Patrol: Capturing and Accessing Information for the Soldier in the Field.” Proceedings of the 3rd ACM Workshop on Continuous Archival and Retrieval of Personal Experiences, Santa Barbara, California, 2006. Glass Box: Cowley, Paula, Jereme Haack, Rik Littlefield, and Ernest Hampson.


pages: 231 words: 71,248

Shipping Greatness by Chris Vander Mey

corporate raider, don't be evil, en.wikipedia.org, fudge factor, Google Chrome, Google Hangouts, Gordon Gekko, Jeff Bezos, Kickstarter, Lean Startup, minimum viable product, performance metric, recommendation engine, Skype, slashdot, sorting algorithm, source of truth, Steve Jobs, Superbowl ad, web application

Some rocket surgeon a while back came up with the notion that goals should be specific, measurable, attainable, reasonable, and time-based. This is a good, but not sufficiently specific, framework. I prefer the Great Delta Convention (described in Chapter 10). If you apply the Great Delta Convention to your goals, nobody will question them—they will almost be S.M.A.R.T. by definition (lacking only the “reasonable” part).

Business Performance

Business performance metrics tell you where your problems are and how you can improve your user’s experience. These metrics are frequently measured as ratios, such as conversion from when a user clicks the Buy button to when the checkout process is complete. Like goal metrics, it’s critical to measure the right aspects of your business. For example, if you want to build a great social product, you don’t need to measure friends—different segments of users have different numbers of friends.
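
As a toy illustration of the ratio-style metrics described above, here is a hypothetical checkout funnel; the step names and counts are invented:

    funnel = {
        "visited": 100_000,
        "clicked_buy": 12_480,
        "completed_checkout": 9_360,
    }

    # Each step-to-step ratio is a business performance metric;
    # clicked_buy -> completed_checkout is the conversion in the text.
    steps = list(funnel.items())
    for (prev_name, prev_count), (name, count) in zip(steps, steps[1:]):
        print(f"{prev_name} -> {name}: {count / prev_count:.1%}")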

Google Analytics provides A/B comparison tools that are incredibly powerful, but they’re just one kind of many tools you can use. Most major websites have testing frameworks that they use to roll out features incrementally and ensure that a new feature or experience has the intended effect. If it’s even remotely possible, try to build an experimentation framework in from the beginning (see Chapter 7’s discussion of launching for other benefits of experiments).

Systems Performance

Systems performance metrics measure the health of your product in real time. Metrics like these include 99.9th-percentile latency, total requests per second, simultaneous users, orders per second, and other time-based metrics. When these metrics go down substantially, something has gone wrong. A pager should go off. If you’re a very fancy person, you’ll want to look at your metrics through the lens of statistical process control (SPC).
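
A minimal statistical-process-control sketch, assuming simulated per-minute latency readings; the 3-sigma control limits are the textbook SPC convention, not something prescribed by this book:

    import numpy as np

    rng = np.random.default_rng(1)
    latency_ms = rng.normal(120, 10, 60)  # simulated per-minute latencies
    latency_ms[42] = 210                  # inject an incident

    center = latency_ms.mean()
    sigma = latency_ms.std(ddof=1)
    ucl, lcl = center + 3 * sigma, center - 3 * sigma  # control limits

    # Any point outside the control limits is a signal, not noise.
    for minute, value in enumerate(latency_ms):
        if not lcl <= value <= ucl:
            print(f"minute {minute}: {value:.0f} ms outside "
                  f"[{lcl:.0f}, {ucl:.0f}] ms - page someone")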


Designing Search: UX Strategies for Ecommerce Success by Greg Nudelman, Pabini Gabriel-Petit

access to a mobile phone, Albert Einstein, AltaVista, augmented reality, barriers to entry, business intelligence, call centre, crowdsourcing, information retrieval, Internet of things, performance metric, QR code, recommendation engine, RFID, search engine result page, semantic web, Silicon Valley, social graph, social web, speech recognition, text mining, the map is not the territory, The Wisdom of Crowds, web application, zero-sum game, Zipcar

A/B Testing and Multivariate Testing

To ensure you create successful design solutions that meet both business and customer goals, your team should conduct frequent, quantitative A/B testing of your no search results page and other search results pages. Follow up with qualitative lab and field testing to help you make sense of your A/B testing results and suggest ideas for future improvements. The central idea behind A/B testing is to have two different user interface designs running on your site at the same time, while collecting key performance metrics (KPMs) that enable you to measure desired customer behavior. For example, say you want to introduce some improvements to your current no search results page, which you can call variant A. To determine whether your redesigned version of the page, variant B, offers any improvement, you can deploy variant B and send a small percentage of site visitors—for example, 1% to 10%—to that server and observe the metrics for variant B: Did those visitors buy more stuff?
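
A hedged sketch of comparing the two variants on a KPM such as conversion rate, using a standard two-proportion z-test; the traffic split and counts are hypothetical:

    from math import sqrt
    from statistics import NormalDist

    def ab_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
        # Two-proportion z-test: is variant B's rate different from A's?
        p_a = conversions_a / visitors_a
        p_b = conversions_b / visitors_b
        pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
        se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
        z = (p_b - p_a) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        return p_a, p_b, p_value

    # Variant A keeps ~95% of traffic; variant B (the redesign) gets ~5%.
    print(ab_z_test(1_900, 95_000, 128, 5_000))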

As for personas, the primary value of ecommerce search roles is team empathy toward the customers interacting with the site. Finally, it is important to note that, although this framework can be helpful and these five ecommerce search roles apply to most ecommerce projects, this generalized framework is not precise. For any framework to be maximally useful for a specific project, you should refine it through direct observation of customers and careful study of key performance metrics. As the prominent philosopher and the father of General Semantics Alfred Korzybski so eloquently stated, “The map is not the territory.” You can neither camp on the little triangles that represent mountains on a map nor go swimming in those blue patches of ink that represent lakes. Rather than viewing this role framework as “reality,” use it as you would a map—to help you navigate your ecommerce search design projects, and as the foundation for developing your own approach—subject to change as you get more data and gain a better understanding of the needs and behaviors of your customers.


Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Geron

Amazon Mechanical Turk, Bayesian statistics, centre right, combinatorial explosion, constrained optimization, correlation coefficient, crowdsourcing, en.wikipedia.org, iterative process, Netflix Prize, NP-complete, optical character recognition, P = NP, p-value, pattern recognition, performance metric, recommendation engine, self-driving car, SpamAssassin, speech recognition, statistical model

., if you ran another clustering algorithm earlier), then you can set the init hyperparameter to a NumPy array containing the list of centroids, and set n_init to 1: good_init = np.array([[-3, 3], [-3, 2], [-3, 1], [-1, 2], [0, 2]]) kmeans = KMeans(n_clusters=5, init=good_init, n_init=1) Another solution is to run the algorithm multiple times with different random initializations and keep the best solution. This is controlled by the n_init hyperparameter: by default, it is equal to 10, which means that the whole algorithm described earlier actually runs 10 times when you call fit(), and Scikit-Learn keeps the best solution. But how exactly does it know which solution is the best? Well of course it uses a performance metric! It is called the model’s inertia: this is the mean squared distance between each instance and its closest centroid. It is roughly equal to 223.3 for the model on the left of Figure 9-5, 237.5 for the model on the right of Figure 9-5, and 211.6 for the model in Figure 9-3. The KMeans class runs the algorithm n_init times and keeps the model with the lowest inertia: in this example, the model in Figure 9-3 will be selected (unless we are very unlucky with n_init consecutive random initializations).
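
A self-contained sketch of the two initialization strategies just described, on synthetic data (the blob layout is arbitrary, so the fixed centroids below are only plausible guesses, not known-good ones):

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=1000, centers=5, random_state=42)

    good_init = np.array([[-3, 3], [-3, 2], [-3, 1], [-1, 2], [0, 2]])
    km_fixed = KMeans(n_clusters=5, init=good_init, n_init=1).fit(X)
    km_multi = KMeans(n_clusters=5, n_init=10, random_state=42).fit(X)

    # Lower inertia means tighter clusters; with n_init=10, scikit-learn
    # silently keeps the best of the ten runs.
    print(km_fixed.inertia_, km_multi.inertia_)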

For example, as you can see in Figure 9-7, setting k to 3 or 8 results in fairly bad models:

Figure 9-7. Bad choices for the number of clusters

You might be thinking that we could just pick the model with the lowest inertia, right? Unfortunately, it is not that simple. The inertia for k=3 is 653.2, which is much higher than for k=5 (which was 211.6), but with k=8, the inertia is just 119.1. The inertia is not a good performance metric when trying to choose k since it keeps getting lower as we increase k. Indeed, the more clusters there are, the closer each instance will be to its closest centroid, and therefore the lower the inertia will be. Let’s plot the inertia as a function of k (see Figure 9-8):

Figure 9-8. Selecting the number of clusters k using the “elbow rule”

As you can see, the inertia drops very quickly as we increase k up to 4, but then it decreases much more slowly as we keep increasing k.
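
The elbow plot itself takes only a few lines; on the same synthetic data as the sketch above, the inertia curve bends near the true number of blobs:

    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=1000, centers=5, random_state=42)

    ks = range(1, 10)
    inertias = [KMeans(n_clusters=k, n_init=10, random_state=42)
                .fit(X).inertia_ for k in ks]

    plt.plot(ks, inertias, "o-")  # look for the bend ("elbow")
    plt.xlabel("k")
    plt.ylabel("inertia")
    plt.show()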


pages: 444 words: 86,565

Investment Banking: Valuation, Leveraged Buyouts, and Mergers and Acquisitions by Joshua Rosenbaum, Joshua Pearl, Joseph R. Perella

asset allocation, asset-backed security, bank run, barriers to entry, business cycle, capital asset pricing model, collateralized debt obligation, corporate governance, credit crunch, discounted cash flows, diversification, fixed income, intangible asset, London Interbank Offered Rate, performance metric, shareholder value, sovereign wealth fund, stocks for the long run, technology bubble, time value of money, transaction costs, yield curve

First, we benchmark the key financial statistics and ratios for the target and its comparables in order to establish relative positioning, with a focus on identifying the closest or “best” comparables and noting potential outliers. Second, we analyze and compare the trading multiples for the peer group, placing particular emphasis on the best comparables.

Benchmark the Financial Statistics and Ratios

The first stage of the benchmarking analysis involves a comparison of the target and comparables universe on the basis of key financial performance metrics. These metrics, as captured in the financial profile framework outlined in Steps I and III, include measures of size, profitability, growth, returns, and credit strength. They are core value drivers and typically translate directly into relative valuation. The results of the benchmarking exercise are displayed on spreadsheet output pages that present the data for each company in an easy-to-compare format (see Exhibits 1.53 and 1.54).

EXHIBIT 3.38 ValueCo Projected Taxes

Capex Projections

We projected ValueCo’s capex as a percentage of sales in line with historical levels. As shown in Exhibit 3.39, this approach led us to hold capex constant throughout the projection period at 2% of sales. Based on this assumption, capex increases from $21.6 million in 2009E to $25.3 million in 2013E.

EXHIBIT 3.39 ValueCo Historical and Projected Capex

Change in Net Working Capital Projections

As with ValueCo’s other financial performance metrics, historical working capital levels normally serve as reliable indicators of future performance. The direct prior year’s ratios are typically the most indicative provided they are consistent with historical levels. This was the case for ValueCo’s 2007 working capital ratios, which we held constant throughout the projection period (see Exhibit 3.40).

EXHIBIT 3.40 ValueCo Historical and Projected Net Working Capital

For A/R, inventory, and A/P, respectively, these ratios are DSO of 60.2, DIH of 76.0, and DPO of 45.6.
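
These working-capital ratios follow standard definitions (a 365-day convention; receivables scaled by sales, inventory and payables by COGS). A sketch with invented balance-sheet figures chosen to reproduce the quoted ratios:

    def dso(receivables, sales, days=365):   # days sales outstanding
        return receivables / sales * days

    def dih(inventory, cogs, days=365):      # days inventory held
        return inventory / cogs * days

    def dpo(payables, cogs, days=365):       # days payable outstanding
        return payables / cogs * days

    sales, cogs = 1080.0, 650.0              # $ millions, illustrative only
    print(round(dso(178.1, sales), 1),       # ~60.2
          round(dih(135.3, cogs), 1),        # ~76.0
          round(dpo(81.2, cogs), 1))         # ~45.6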


pages: 516 words: 157,437

Principles: Life and Work by Ray Dalio

Albert Einstein, asset allocation, autonomous vehicles, backtesting, cognitive bias, Deng Xiaoping, diversification, Elon Musk, follow your passion, hiring and firing, iterative process, Jeff Bezos, Long Term Capital Management, margin call, microcredit, oil shock, performance metric, planetary scale, quantitative easing, risk tolerance, Ronald Reagan, Silicon Valley, Steve Jobs, transaction costs, yield curve

Pay for the person, not the job. Look at what people in comparable jobs with comparable experience and credentials make, add some small premium over that, and build in bonuses or other incentives so they will be motivated to knock the cover off the ball. Never pay based on the job title alone. b. Have performance metrics tied at least loosely to compensation. While you will never fully capture all the aspects that make for a great work relationship in metrics, you should be able to establish many of them. Tying performance metrics to compensation will help crystallize your understanding of your deal with people, provide good ongoing feedback, and influence how the person behaves on an ongoing basis. c. Pay north of fair. By being generous or at least a little north of fair with others I have enhanced both our work and our relationships and most people have responded in kind.

Make sure your people have character and are capable. 8.5 Don’t hire people just to fit the first job they will do; hire people you want to share your life with. a. Look for people who have lots of great questions. b. Show candidates your warts. c. Play jazz with people with whom you are compatible but who will also challenge you. 8.6 When considering compensation, provide both stability and opportunity. a. Pay for the person, not the job. b. Have performance metrics tied at least loosely to compensation. c. Pay north of fair. d. Focus more on making the pie bigger than on exactly how to slice it so that you or anyone else gets the biggest piece. 8.7 Remember that in great partnerships, consideration and generosity are more important than money. a. Be generous and expect generosity from others. 8.8 Great people are hard to find so make sure you think about how to keep them. 9 Constantly Train, Test, Evaluate, and Sort People 9.1 Understand that you and the people you manage will go through a process of personal evolution.


pages: 92 words: 23,741

Lessons From Private Equity Any Company Can Use by Orit Gadiesh, Hugh MacArthur

activist fund / activist shareholder / activist investor, call centre, corporate governance, inventory management, job-hopping, performance metric, shareholder value, telemarketer

The specific metrics used were tailored to customer segments: retention rates and the amount of up-selling of ads to “platinum” customers, cross-selling of multiple products to “gold” customers, and the number of new account sign-ups and penetration by industry vertical in the “bronze” customer category. Additionally, the measurement of performance via quantitative metrics (such as return on investment, or ROI) gave the sales force a new communication tool with their clients. The sales force found that renewal rates were higher in categories where they could articulate a high ROI from the client’s ad expenditure. The new performance metrics also provided a regular report card on whether the SYP turnaround was on track. They made transparent for the first time how much revenue per customer each dollar invested in the sales effort yielded. The clear metrics helped give a powerful signal to other potential investors that the new sales force strategy was working. A little more than a year after making the acquisition, the private equity partners floated an initial public offering of SYP shares, locking in a gain of 2.6 times their original investment while still retaining a 20 percent stake in the company.


pages: 374 words: 94,508

Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage by Douglas B. Laney

3D printing, Affordable Care Act / Obamacare, banking crisis, blockchain, business climate, business intelligence, business process, call centre, chief data officer, Claude Shannon: information theory, commoditize, conceptual framework, crowdsourcing, dark matter, data acquisition, digital twin, discounted cash flows, disintermediation, diversification, en.wikipedia.org, endowment effect, Erik Brynjolfsson, full employment, informal economy, intangible asset, Internet of things, linked data, Lyft, Nash equilibrium, Network effects, new economy, obamacare, performance metric, profit motive, recommendation engine, RFID, semantic web, smart meter, Snapchat, software as a service, source of truth, supply-chain management, text mining, uber lyft, Y2K, yield curve

Although we’ll cover information measurement in part III, these metrics can be useful in understanding and crafting an information supply chain. The high-level metrics framework includes the following attributes, definition, and sample metrics, along with performance measures transposed into sample information supply chain metrics:

Performance Attribute: Reliability
Definition: The ability to perform tasks as expected. Reliability focuses on the predictability of the outcome of a process. Typical metrics for the reliability attribute include: on-time, the right quantity, the right quality.
Sample Information Supply Chain Performance Metrics:
• Query/update performance
• Data quality (accuracy, completeness, timeliness, integrity, etc.)

Performance Attribute: Responsiveness
Definition: The speed at which tasks are performed.

[Flattened back-matter index from the book; entries relevant to this search include: performance metrics, information supply chain 126–7; performance value of information (PVI) 254–5.]


pages: 324 words: 92,805

The Impulse Society: America in the Age of Instant Gratification by Paul Roberts

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, 3D printing, accounting loophole / creative accounting, activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, American Society of Civil Engineers: Report Card, asset allocation, business cycle, business process, Cass Sunstein, centre right, choice architecture, collateralized debt obligation, collective bargaining, computerized trading, corporate governance, corporate raider, corporate social responsibility, creative destruction, crony capitalism, David Brooks, delayed gratification, disruptive innovation, double helix, factory automation, financial deregulation, financial innovation, fixed income, full employment, game design, greed is good, If something cannot go on forever, it will stop - Herbert Stein's Law, impulse control, income inequality, inflation targeting, invisible hand, job automation, John Markoff, Joseph Schumpeter, knowledge worker, late fees, Long Term Capital Management, loss aversion, low skilled workers, mass immigration, new economy, Nicholas Carr, obamacare, Occupy movement, oil shale / tar sands, performance metric, postindustrial economy, profit maximization, Report Card for America’s Infrastructure, reshoring, Richard Thaler, rising living standards, Robert Shiller, Rodney Brooks, Ronald Reagan, shareholder value, Silicon Valley, speech recognition, Steve Jobs, technoutopianism, the built environment, The Predators' Ball, the scientific method, The Wealth of Nations by Adam Smith, Thorstein Veblen, too big to fail, total factor productivity, Tyler Cowen: Great Stagnation, Walter Mischel, winner-take-all economy

On the downside, Autor told me, those jobs will always be low-wage “because the skills they use are generic and almost anyone can be productive at them within a couple of days.”34 And, in fact, there will likely be far more downsides to these jobs than upsides. For example, because Big Data will allow companies to more easily and accurately measure worker productivity, workers will be under constant pressure to meet specific performance metrics and will be subject to constant ratings, just as restaurants and online products are today. Companies will assess every data point that might affect performance, so that every aspect of employment, from applying for a job to the actual performance of duties, will become much more closely scrutinized and assessed. “If you’re a worker, there’ll be, like, credit scores,” Cowen told NPR.35 “There already are, to some extent.

There will be no middle class in the way we now understand the term: median income will be much lower than it is, and many of the poor will lack access to even basic public services, in part because the wealthy will resist tax increases. “Rather than balancing our budget with higher taxes or lower benefits,” Cowen says, “we will allow the real wages of many workers to fall, and thus we will allow the creation of a new underclass.” Certain critics have found such dystopic visions far too grim. And yet, the signs of such a future are everywhere. Already, companies are using Big Data performance metrics to determine whom to cut—meaning that to be laid off is to be branded unemployable. In the ultimate corruption of innovation, a technology that might be used to help workers upgrade their skills and become more secure is instead being used to harass them. To be sure, Big Data will be put to more beneficial uses. Digital technologies will certainly remake the way we deliver education, for example.


pages: 323 words: 90,868

The Wealth of Humans: Work, Power, and Status in the Twenty-First Century by Ryan Avent

"Robert Solow", 3D printing, Airbnb, American energy revolution, assortative mating, autonomous vehicles, Bakken shale, barriers to entry, basic income, Bernie Sanders, BRICs, business cycle, call centre, Capital in the Twenty-First Century by Thomas Piketty, Clayton Christensen, cloud computing, collective bargaining, computer age, creative destruction, dark matter, David Ricardo: comparative advantage, deindustrialization, dematerialisation, Deng Xiaoping, deskilling, disruptive innovation, Dissolution of the Soviet Union, Donald Trump, Downton Abbey, Edward Glaeser, Erik Brynjolfsson, eurozone crisis, everywhere but in the productivity statistics, falling living standards, first square of the chessboard, first square of the chessboard / second half of the chessboard, Ford paid five dollars a day, Francis Fukuyama: the end of history, future of work, gig economy, global supply chain, global value chain, hydraulic fracturing, income inequality, indoor plumbing, industrial robot, intangible asset, interchangeable parts, Internet of things, inventory management, invisible hand, James Watt: steam engine, Jeff Bezos, John Maynard Keynes: Economic Possibilities for our Grandchildren, Joseph-Marie Jacquard, knowledge economy, low skilled workers, lump of labour, Lyft, manufacturing employment, Marc Andreessen, mass immigration, means of production, new economy, performance metric, pets.com, post-work, price mechanism, quantitative easing, Ray Kurzweil, rent-seeking, reshoring, rising living standards, Robert Gordon, Ronald Coase, savings glut, Second Machine Age, secular stagnation, self-driving car, sharing economy, Silicon Valley, single-payer health, software is eating the world, supply-chain management, supply-chain management software, TaskRabbit, The Future of Employment, The Nature of the Firm, The Rise and Fall of American Growth, The Spirit Level, The Wealth of Nations by Adam Smith, trade liberalization, transaction costs, Tyler Cowen: Great Stagnation, Uber and Lyft, Uber for X, uber lyft, very high income, working-age population

That knowledge is absorbed by newer employees over time, through long exposure to the old habits. What our firm is, is not so much a business that produces a weekly magazine, but a way of doing things consisting of an enormous set of processes. You run that programme, and you get a weekly magazine at the end of it. Employees want job security, to advance, to receive pay rises. Those desires are linked to tangible performance metrics; within The Economist, it matters that a writer delivers the expected stories with the expected frequency and with the expected quality. Yet that is not all that matters. Advancement is also about the extent to which a worker thrives within a culture. What constitutes thriving depends on the culture. In some firms, it may mean buttering up the bosses and working long hours. In others, it may mean the practice of Machiavellian office politics.

The information-processing role of the firm can help us to understand the phenomenon of ‘disruption’, in which older businesses struggle to adapt to powerful new technologies or market opportunities. The notion of a ‘disruptive’ technology was first described in detail by Clayton Christensen, a scholar at Harvard Business School.4 Disruption is one of the most important ideas in business and management to emerge over the last generation. A disruptive innovation, in Christensen’s sense, is one that is initially not very good, in the sense that it does badly on the performance metrics that industry leaders care about, but which then catches on rapidly, wrong-footing older firms and upending the industry. Christensen explained his idea through the disk-drive industry, which was once dominated by large, 8-inch disks that could hold lots of information and access it very quickly. Both disk-drive makers and their customers initially thought that smaller drives were of little practical use.


pages: 290 words: 87,549

The Airbnb Story: How Three Ordinary Guys Disrupted an Industry, Made Billions...and Created Plenty of Controversy by Leigh Gallagher

Airbnb, Amazon Web Services, barriers to entry, Ben Horowitz, Bernie Sanders, cloud computing, crowdsourcing, don't be evil, Donald Trump, East Village, Elon Musk, housing crisis, iterative process, Jeff Bezos, Jony Ive, Justin.tv, Lyft, Marc Andreessen, Mark Zuckerberg, medical residency, Menlo Park, Network effects, Paul Buchheit, Paul Graham, performance metric, Peter Thiel, RFID, Sam Altman, Sand Hill Road, Saturday Night Live, sharing economy, side project, Silicon Valley, Silicon Valley startup, South of Market, San Francisco, Startup school, Steve Jobs, TaskRabbit, the payments system, Tony Hsieh, Travis Kalanick, uber lyft, Y Combinator, yield management

That ability could be used as a powerful reward mechanism to its hosts: those who provided positive experiences for guests and received good reviews would get vaulted to the top of search results, giving them greater exposure and increasing their chances of future bookings. But decline too many requests or respond too slowly or cancel too many reservations or simply appear inhospitable in reviews, and Airbnb can drop a powerful hammer: it can lower your listing in search results or even deactivate your account. Behave well, though, and Airbnb will shine its love upon you. If you hit a certain series of performance metrics—in the past year, if you have hosted at least ten trips, if you have maintained a 90 percent response rate or higher, if you have received a five-star review at least 80 percent of the time, and if you’ve canceled a reservation only rarely or in extenuating circumstances, you are automatically elevated to “Superhost” status. That means you get a special logo on your site, your listing will be bumped way up in the rankings, you’ll get access to a dedicated customer-support line, and you might even get the chance to preview new products and attend events.
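
The thresholds in this passage amount to a simple conjunction of checks. A toy encoding, with hypothetical field and function names (nothing here is Airbnb's actual code):

    from dataclasses import dataclass

    @dataclass
    class HostYear:
        trips: int
        response_rate: float     # 0.0-1.0
        five_star_share: float   # share of reviews that are five-star
        cancellations: int       # outside extenuating circumstances

    def is_superhost(h: HostYear) -> bool:
        # All four criteria from the text must hold over the past year.
        return (h.trips >= 10
                and h.response_rate >= 0.90
                and h.five_star_share >= 0.80
                and h.cancellations == 0)

    print(is_superhost(HostYear(14, 0.97, 0.86, 0)))  # True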

[Flattened back-matter index from the book; the entry relevant to this search: performance metrics, 72–73.]


pages: 98 words: 25,753

Ethics of Big Data: Balancing Risk and Innovation by Kord Davis, Doug Patterson

4chan, business process, corporate social responsibility, crowdsourcing, en.wikipedia.org, longitudinal study, Mahatma Gandhi, Mark Zuckerberg, Netflix Prize, Occupy movement, performance metric, Robert Bork, side project, smart grid, urban planning

The volume at which new data is being generated is staggering. We live in an age when the amount of data we expect to be generated in the world is measured in exabytes and zettabytes. By 2025, the forecast is that the Internet will exceed the brain capacity of everyone living on the entire planet. Additionally, the variety of sources and data types being generated expands as fast as new technology can be created. Performance metrics from in-car monitors, manufacturing floor yield measurements, all manner of healthcare devices, and the growing number of Smart Grid energy appliances all generate data. More importantly, they generate data at a rapid pace. The velocity of data generation, acquisition, processing, and output increases exponentially as the number of sources and increasingly wider variety of formats grows over time.


pages: 98 words: 27,201

Are Chief Executives Overpaid? by Deborah Hargreaves

banking crisis, Big bang: deregulation of the City of London, bonus culture, business climate, corporate governance, Donald Trump, G4S, Jeff Bezos, loadsamoney, Mark Zuckerberg, Martin Wolf, performance metric, principal–agent problem, profit maximization, Ronald Reagan, shareholder value, Snapchat, trade liberalization, trickle-down economics, wealth creators

One of the most significant changes was to give shareholders a binding vote every three years on pay policy in addition to their existing annual advisory vote on the remuneration report. All payments, including exit payments, are covered by the new vote. Although at first it seemed as though this binding vote would be a damp squib, it has proved important in some cases and it has focused the attention of companies who might be tempted to manipulate their performance goals. In conjunction with this binding vote, companies are also required to make their performance metrics clearer and more understandable. They also have to present a table about how much will be paid out for certain levels of performance, which is meant to make the remuneration report easier to understand and in an accessible format. These reforms were aimed at simplifying the multiple pages of the remuneration report contained in a company’s annual report, but they have done little to address the complexity of executive pay.


pages: 556 words: 46,885

The World's First Railway System: Enterprise, Competition, and Regulation on the Railway Network in Victorian Britain by Mark Casson

banking crisis, barriers to entry, Beeching cuts, British Empire, business cycle, combinatorial explosion, Corn Laws, corporate social responsibility, David Ricardo: comparative advantage, intermodal, iterative process, joint-stock company, joint-stock limited liability company, Kickstarter, knowledge economy, linear programming, Network effects, New Urbanism, performance metric, railway mania, rent-seeking, strikebreaker, the market place, transaction costs

To this end, the counterfactual has been constructed on very conservative assumptions, which are elaborated below. The engineering assumptions are very conservative relative to actual railway practice, while the use of detailed land surveys and large-scale maps means that major infringements of local parks and amenities have been avoided.

1.4. PERFORMANCE METRICS: DISTANCE AND TIME

Two main performance metrics are used in this study: journey distance and journey time. The most obvious metric by which to compare the actual and counterfactual systems is by the route mileages between pairs of towns. This metric is not quite so useful as it seems, however. For many types of traffic, including passengers, mail, troops, and perishable goods, it is the time taken by the journey that is important and not the distance per se.

In practice the counterfactual system, being smaller, would have been completed much earlier than the actual system, assuming that the pace of construction had been the same. Thus the average working life of the counterfactual system would have been longer—another advantage which is not formally included in the comparison.

3.4. CONSTRUCTION OF THE COUNTERFACTUAL: PERFORMANCE METRICS

To compare the performance of the actual and counterfactual systems a set of 250 representative journeys was examined. Ten different types of journey were distinguished, and sub-samples of 25 journeys of each type were generated. Performance was measured for each type of journey, and an overall measure of performance, based on an arithmetic average, was constructed.

[Flattened back-matter index from the book; entries relevant to this search include: counterfactual railway network, performance metrics 63–6; performance compared to actual system 64–5.]


pages: 289

Hustle and Gig: Struggling and Surviving in the Sharing Economy by Alexandrea J. Ravenelle

"side hustle", active transport: walking or cycling, Affordable Care Act / Obamacare, Airbnb, Amazon Mechanical Turk, barriers to entry, basic income, Broken windows theory, call centre, Capital in the Twenty-First Century by Thomas Piketty, cashless society, Clayton Christensen, clean water, collaborative consumption, collective bargaining, creative destruction, crowdsourcing, disruptive innovation, Downton Abbey, East Village, Erik Brynjolfsson, full employment, future of work, gig economy, Howard Zinn, income inequality, informal economy, job automation, low skilled workers, Lyft, minimum wage unemployment, Mitch Kapor, Network effects, new economy, New Urbanism, obamacare, Panopticon Jeremy Bentham, passive income, peer-to-peer, peer-to-peer model, performance metric, precariat, rent control, ride hailing / ride sharing, Ronald Reagan, sharing economy, Silicon Valley, strikebreaker, TaskRabbit, telemarketer, the payments system, Tim Cook: Apple, transaction costs, Travis Kalanick, Triangle Shirtwaist Factory, Uber and Lyft, Uber for X, uber lyft, ubercab, universal basic income, Upton Sinclair, urban planning, very high income, white flight, working poor, Zipcar

Sarah, twenty-nine, the Tasker profiled in the opening chapter, explained, “There were people complaining and in tears because they were like, ‘I don’t have any income now. I was your top TaskRabbit and you are treating me like shit,’ and [it] kind of felt like a betrayal.” The pivot also introduced strict requirements in terms of response rates and task acceptance. “You know,” said Sarah, “they have these performance metrics, which is not a very good way to measure. You have to accept 85 percent of what’s given to you. The way that the metrics work is a thirty-day kind of thing, so sometimes it just doesn’t add up—and you don’t know what you are going to get or when you are going to get it.” As a result, Taskers who don’t accept several tasks within a relatively short period may find themselves flagged. Sarah explains, “They pause your account, and you have to take some kind of test to say, ‘Yeah, I understand the guidelines,’ And they kind of talk to you like you are an eighteen-year-old.
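
The rolling-window arithmetic Sarah describes is easy to sketch. Here is a minimal, hypothetical version of an 85 percent acceptance rule over a trailing thirty days (the field names and threshold are assumptions for illustration, not TaskRabbit's actual system):

from datetime import date, timedelta

def acceptance_check(offers, today, window_days=30, threshold=0.85):
    """offers: list of (date_offered, accepted) pairs. Returns the
    acceptance rate over the trailing window and whether the account
    would be flagged for falling below the threshold."""
    cutoff = today - timedelta(days=window_days)
    recent = [accepted for d, accepted in offers if d > cutoff]
    rate = sum(recent) / len(recent) if recent else 1.0
    return rate, rate < threshold

offers = [(date(2024, 5, 1), True), (date(2024, 5, 10), False),
          (date(2024, 5, 20), True), (date(2024, 5, 28), True)]
rate, flagged = acceptance_check(offers, today=date(2024, 6, 1))
print(rate, flagged)  # the May 1 acceptance has aged out, so 2/3 -> flagged

The sliding window is why, as Sarah puts it, "sometimes it just doesn't add up": an accepted task silently drops out of the calculation once it is more than thirty days old.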

See also piecemeal system overtime: avoidance of, 36; independent contractor status and, 94; paid overtime, 189 overwork, 6, 15–16 The Overworked American (Schor), 16 owner-occupied move-ins, 41 paid time off, 180, 188, 190 paid travel time, 189 pajama policy, 204–5, 206 participant recruitment and methodology, 223n85, 225n34, 228n30, 229n5; overview, 22; Airbnb, 42–43; Kitchensurfing, 42–43, 57; research methodology, 21–22; shared assets, 42; sharing economy, 42–43; skills issue, 42; TaskRabbit, 42–43; Uber, 42–43; underused asset access, 42 participation barriers: capital, 42–43, 43table1, 160, 166–68, 183; entrepreneurship and, 38; skills, 42–43, 43table1, 160, 166–68, 183 partners, 3, 79 part-time workers, 180 party-line rides, 105–6 payment rate changes: Kitchensurfing, 59; TaskRabbit, 79–80; Uber, 74–79 PayPal, 26 pay-to-work situations, 2–3 Peeple, 156 Peers, 56, 72, 225n31 peer-to-peer connections: lack of full disclosure in, 97–100; marketing as, 21, 182; political language and, 23; sexual behavior and, 134 peer-to-peer firms, 26 performance metrics, 78–79 Perkins, Frances, 93, 227n8 personal assistant services, 42 Peters, Diniece, 228n32 piecemeal system, 5–6, 22, 66. See also outsourcing Piketty, Thomas, 40 Pinkerton National Detective Agency, 68, 179 pivots: effects of, 11; Kitchensurfing, 57, 222n62; TaskRabbit, 1, 17, 55–56, 79–80, 138, 203, 222n62; Uber, 74–79 platform economy: forms of, 26–28, 28fig. 2; growth of, 7; term usage, 5.


Data and the City by Rob Kitchin,Tracey P. Lauriault,Gavin McArdle

A Declaration of the Independence of Cyberspace, bike sharing scheme, bitcoin, blockchain, Bretton Woods, Chelsea Manning, citizen journalism, Claude Shannon: information theory, clean water, cloud computing, complexity theory, conceptual framework, corporate governance, correlation does not imply causation, create, read, update, delete, crowdsourcing, cryptocurrency, dematerialisation, digital map, distributed ledger, fault tolerance, fiat currency, Filter Bubble, floating exchange rates, global value chain, Google Earth, hive mind, Internet of things, Kickstarter, knowledge economy, lifelogging, linked data, loose coupling, new economy, New Urbanism, Nicholas Carr, open economy, openstreetmap, packet switching, pattern recognition, performance metric, place-making, RAND corporation, RFID, Richard Florida, ride hailing / ride sharing, semantic web, sentiment analysis, sharing economy, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, smart contracts, smart grid, smart meter, social graph, software studies, statistical model, TaskRabbit, text mining, The Chicago School, The Death and Life of Great American Cities, the market place, the medium is the message, the scientific method, Toyota Production System, urban planning, urban sprawl, web application

Such dangers are even greater in situations where experiments and data gathering are slow and difficult to replicate, such as policy research (Reichman et al. 2011: 704). Data provenance failures can result in massive societal and economic losses (see, for example, discussions around the liability of geographic information/data in Onsrud 1999 and Phillips 1999). While urban policy and planning have long been guided by data, big urban data, performance metrics and data analytics are increasingly shaping urban policy and planning (Kitchin 2014a; Townsend 2013). The provenance of data, particularly geographic data (Monmonier 1995), must then be firmly established to produce trust and the necessary information required to avoid possible misinterpretation or misuses of the data. Definitions for and examples of data provenance can be found in methodological guides, data dictionaries, metadata abstracts and their accompanying articles, survey questionnaires and elsewhere.

In contrast, some municipalities use dashboards in a more contextual way. Here, it is recognized that cities are not mechanical systems that can be disassembled into their component parts and fixed, or steered and controlled through data levers. Instead, systems and governance are understood as complex and multi-level in nature, the effects of policy measures are diverse and multifaceted, and neither is easily reducible to targets and performance metrics (Van Assche et al. 2010). Indicators highlight trends and potential issues, but do not show their causes or prescribe answers. Conceived in this way, city dashboards provide useful contextual data – data that can be used in conjunction with other data and initiatives – but are not used in a strongly instrumentalist, mechanistic way to direct management practices (Kitchin et al. 2015). A longstanding example of such an approach is that employed within Flanders, Belgium, where since the late 1990s a number of cities have employed a common City Monitor for Sustainable Urban Development, consisting of nearly 200 indicators, to provide contextual evidence for policymaking (Van Assche et al. 2010).


pages: 132 words: 31,976

Getting Real by Jason Fried, David Heinemeier Hansson, Matthew Linderman, 37 Signals

call centre, David Heinemeier Hansson, iterative process, John Gruber, knowledge worker, Merlin Mann, Metcalfe's law, performance metric, post-work, premature optimization, Ruby on Rails, slashdot, Steve Jobs, web application

Complexity Does Not Scale Linearly With Size

The most important rule of software engineering is also the least known: Complexity does not scale linearly with size... A 2000 line program requires more than twice as much development time as one half the size. —The Ganssle Group (from Keep It Small)

Optimize for Happiness

Choose tools that keep your team excited and motivated. A happy programmer is a productive programmer. That's why we optimize for happiness and you should too. Don't just pick tools and practices based on industry standards or performance metrics. Look at the intangibles: Is there passion, pride, and craftsmanship here? Would you truly be happy working in this environment eight hours a day? This is especially important for choosing a programming language. Despite public perception to the contrary, they are not created equal. While just about any language can create just about any application, the right one makes the effort not merely possible or bearable, but pleasant and invigorating.
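
The Ganssle Group's claim quoted above, that doubling a program's size more than doubles the work, is what a superlinear effort model captures. A small numerical illustration (the constants are illustrative, in the spirit of COCOMO-style estimates, not taken from the essay):

# Effort grows as size**b with b > 1, so effort scales superlinearly.
a, b = 3.0, 1.2                     # illustrative calibration constants

def effort(kloc):
    """Estimated development effort for a program of `kloc` thousand lines."""
    return a * kloc ** b

print(effort(2.0) / effort(1.0))    # 2**1.2 = ~2.30: the 2,000-line program
                                    # takes about 2.3x the 1,000-line one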


pages: 484 words: 104,873

Rise of the Robots: Technology and the Threat of a Jobless Future by Martin Ford

"Robert Solow", 3D printing, additive manufacturing, Affordable Care Act / Obamacare, AI winter, algorithmic trading, Amazon Mechanical Turk, artificial general intelligence, assortative mating, autonomous vehicles, banking crisis, basic income, Baxter: Rethink Robotics, Bernie Madoff, Bill Joy: nanobots, business cycle, call centre, Capital in the Twenty-First Century by Thomas Piketty, Chris Urmson, Clayton Christensen, clean water, cloud computing, collateralized debt obligation, commoditize, computer age, creative destruction, debt deflation, deskilling, disruptive innovation, diversified portfolio, Erik Brynjolfsson, factory automation, financial innovation, Flash crash, Fractional reserve banking, Freestyle chess, full employment, Goldman Sachs: Vampire Squid, Gunnar Myrdal, High speed trading, income inequality, indoor plumbing, industrial robot, informal economy, iterative process, Jaron Lanier, job automation, John Markoff, John Maynard Keynes: technological unemployment, John von Neumann, Kenneth Arrow, Khan Academy, knowledge worker, labor-force participation, liquidity trap, low skilled workers, low-wage service sector, Lyft, manufacturing employment, Marc Andreessen, McJob, moral hazard, Narrative Science, Network effects, new economy, Nicholas Carr, Norbert Wiener, obamacare, optical character recognition, passive income, Paul Samuelson, performance metric, Peter Thiel, plutocrats, Plutocrats, post scarcity, precision agriculture, price mechanism, Ray Kurzweil, rent control, rent-seeking, reshoring, RFID, Richard Feynman, Rodney Brooks, Sam Peltzman, secular stagnation, self-driving car, Silicon Valley, Silicon Valley startup, single-payer health, software is eating the world, sovereign wealth fund, speech recognition, Spread Networks laid a new fibre optics cable between New York and Chicago, stealth mode startup, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Steven Pinker, strong AI, Stuxnet, technological singularity, telepresence, telepresence robot, The Bell Curve by Richard Herrnstein and Charles Murray, The Coming Technological Singularity, The Future of Employment, Thomas L Friedman, too big to fail, Tyler Cowen: Great Stagnation, uber lyft, union organizing, Vernor Vinge, very high income, Watson beat the top human players on Jeopardy!, women in the workforce

Police departments across the globe are turning to algorithmic analysis to predict the times and locations where crimes are most likely to occur and then deploying their forces accordingly. The City of Chicago’s data portal allows residents to see both historical trends and real-time data in a range of areas that capture the ebb and flow of life in a major city—including energy usage, crime, performance metrics for transportation, schools and health care, and even the number of potholes patched in a given period of time. Tools that provide new ways to visualize data collected from social media interactions as well as sensors built into doors, turnstiles, and escalators offer urban planners and city managers graphic representations of the way people move, work, and interact in urban environments, a development that may lead directly to more efficient and livable cities.

He received the green light from IBM management in 2007 and set out to build, in his words, “the most sophisticated intelligence architecture the world has ever seen.”18 To do this, he drew on resources from throughout the company and put together a team consisting of artificial intelligence experts from within IBM as well as at top universities, including MIT and Carnegie Mellon.19 Ferrucci’s team, which eventually grew to include about twenty researchers, began by building a massive collection of reference information that would form the basis for Watson’s responses. This amounted to about 200 million pages of information, including dictionaries and reference books, works of literature, newspaper archives, web pages, and nearly the entire content of Wikipedia. Next they collected historical data for the Jeopardy! quiz show. Over 180,000 clues from previously televised matches became fodder for Watson’s machine learning algorithms, while performance metrics from the best human competitors were used to refine the computer’s betting strategy.20 Watson’s development required thousands of separate algorithms, each geared toward a specific task—such as searching within text; comparing dates, times, and locations; analyzing the grammar in clues; and translating raw information into properly formatted candidate responses. Watson begins by pulling apart the clue, analyzing the words, and attempting to understand what exactly it should look for.


pages: 128 words: 38,187

The New Prophets of Capital by Nicole Aschoff

3D printing, affirmative action, Affordable Care Act / Obamacare, Airbnb, American Legislative Exchange Council, basic income, Bretton Woods, clean water, collective bargaining, commoditize, crony capitalism, feminist movement, follow your passion, Food sovereignty, glass ceiling, global supply chain, global value chain, helicopter parent, hiring and firing, income inequality, Khan Academy, late capitalism, Lyft, Mark Zuckerberg, mass incarceration, means of production, performance metric, post-work, profit motive, rent-seeking, Ronald Reagan, Rosa Parks, school vouchers, shareholder value, sharing economy, Silicon Valley, Slavoj Žižek, structural adjustment programs, Tim Cook: Apple, urban renewal, women in the workforce, working poor, zero-sum game

But they are not, and feminist ideals cannot be achieved if they are pursued Sandberg-style. Women who channel their energies toward reaching the top of corporate America undermine the struggles of women trying to realize institutional change by organizing unions and implementing laws that protect women (and men) in the workplace. An anecdote shared by Sandberg illustrates this point: In 2010 Mark Zuckerberg pledged $100 million to improve the performance metrics of the Newark Public Schools. The money would be distributed through a new foundation called Startup: Education. Sandberg recommended Jen Holleran, a woman she knew “with deep knowledge and experience in school reform” to run the foundation. The only problem was that Jen was raising fourteen-month-old twins at the time, working part time, and not getting much help from her husband. Jen hesitated to accept the offer, fearful of “upsetting the current order” at home.


pages: 302 words: 82,233

Beautiful security by Andy Oram, John Viega

Albert Einstein, Amazon Web Services, business intelligence, business process, call centre, cloud computing, corporate governance, credit crunch, crowdsourcing, defense in depth, Donald Davies, en.wikipedia.org, fault tolerance, Firefox, loose coupling, Marc Andreessen, market design, MITM: man-in-the-middle, Monroe Doctrine, new economy, Nicholas Carr, Nick Leeson, Norbert Wiener, optical character recognition, packet switching, peer-to-peer, performance metric, pirate software, Robert Bork, Search for Extraterrestrial Intelligence, security theater, SETI@home, Silicon Valley, Skype, software as a service, statistical model, Steven Levy, The Wisdom of Crowds, Upton Sinclair, web application, web of trust, zero day, Zimmermann PGP

[Diagram text flattened here. Figure 10-3, "Best practices dependencies: Performance and Capacity," traces performance engineering from the requirements phase onward: an operational profile and performance budgets and targets defined during Explore; annotated use cases and user scenarios; releases and iterations prioritized to validate key performance issues early; prototyping, performance estimates, and benchmarks; code instrumentation and automated execution of performance and load tests during Execute; and project management tracking performance metrics through volume deployment. Figure 10-4 shows the parallel reliability dependencies: reliability budgets for failure and recovery rates, fault/failure injection testing, fault detection, isolation, and repair, and field measurement of failures and recovery.]

I initially dreaded this decision since it limited the leverage I had to encourage project leaders to identify and remediate security vulnerabilities. The results proved that this decision actually increased compliance with the security plan. With the requirement to pass the static analysis test still hanging over teams, they felt the need to remove defects earlier in the lifecycle so that they would avoid last-minute rejections. The second decision was the implementation of a detailed reporting framework in which key performance metrics (for instance, percentage of high-risk vulnerabilities per lines of code) were shared with team leaders, their managers, and the CIO on a monthly basis. The vulnerability information from the static code analyzer was summarized at the project, portfolio, and organization level and shared with all three sets of stakeholders. Over time, development leaders focused on the issues that were raising their risk score and essentially competed with each other to achieve better results.
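
A minimal sketch of the metric at the heart of that reporting framework, high-risk findings normalized per 1,000 lines of code and rolled up from project to organization level (the data structures here are hypothetical, not the author's actual tooling):

def high_risk_density(findings, loc):
    """High-risk vulnerabilities per 1,000 lines of code (KLOC)."""
    high = sum(1 for f in findings if f["severity"] == "high")
    return 1000.0 * high / loc

# Hypothetical static-analysis output for two projects:
projects = {
    "billing": {"findings": [{"severity": "high"}, {"severity": "low"}], "loc": 12_000},
    "web-app": {"findings": [{"severity": "high"}] * 3, "loc": 45_000},
}

for name, p in projects.items():          # project-level view
    print(name, round(high_risk_density(p["findings"], p["loc"]), 3))

total_high = sum(sum(1 for f in p["findings"] if f["severity"] == "high")
                 for p in projects.values())
total_loc = sum(p["loc"] for p in projects.values())
print("organization", round(1000.0 * total_high / total_loc, 3))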


Trading Risk: Enhanced Profitability Through Risk Control by Kenneth L. Grant

backtesting, business cycle, buy and hold, commodity trading advisor, correlation coefficient, correlation does not imply causation, delta neutral, diversification, diversified portfolio, fixed income, frictionless, frictionless market, George Santayana, implied volatility, interest rate swap, invisible hand, Isaac Newton, John Meriwether, Long Term Capital Management, market design, Myron Scholes, performance metric, price mechanism, price stability, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, South Sea Bubble, Stephen Hawking, the scientific method, The Wealth of Nations by Adam Smith, transaction costs, two-sided market, value at risk, volatility arbitrage, yield curve, zero-coupon bond

If on each transaction you strive to save yourself a few pennies on commission, achieve a tick or two better on each execution, manage your risk to slightly more precise parameters, and conduct that extra little bit of research, you will achieve a dramatic, positive impact on your performance. I promise that, depending on where you are in your P/L cycle, this will turn good periods into great ones, mediocre periods into respectable ones, and otherwise catastrophic intervals into ones where the consequences are acceptable. Managing these types of performance metrics is hard work, but it is not nearly as difficult as losing lots of money or simply treading water. What is more, you’ll never maximize your returns unless you factor these components in. Improving performance at the margins of your trading activity will be a main theme of this book. Do Not Become Overreliant on Other Market Participants for Comfort or Assistance. Blanche DuBois would not have lasted through the first scene of A Wall Street Car Named Desire because depending on the kindness of strangers is the practical equivalent of financial suicide (except for me—I am your friend).

See Scientific method Optimal f, 245–251 Optimism, importance of, 4 Options: asymmetric payoff functions, 150–151 implications of, generally, 148–149 implied volatility, 86–89, 150 leverage, 151–153 nonlinear pricing dynamics, 149 pricing, 88–89, 106 strike price/underlying price, relationship between, 149–150 volatility arbitrage, 106 Out-of-the-money option, 150 Over-the-counter derivatives, 148 Performance analysis, 7–8 Performance metrics, 16, 35 Performance objectives: “going to the beach,” 32–36 importance of, 19–20, 29 nominal target return, 20, 24–26 optimal target return, 20–24 stop-out level, 20–21, 26–32 Performance ratio, 188–200 Performance success metrics: accuracy ratio (win/loss), 184–186 impact ratio, 186–188 performance ratio, 188–200 profitability concentration (90/10) ratio, 200–208 Planning, importance of, 9.


pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money by Frank J. Ohlhorst

algorithmic trading, bioinformatics, business intelligence, business process, call centre, cloud computing, create, read, update, delete, data acquisition, DevOps, fault tolerance, linked data, natural language processing, Network effects, pattern recognition, performance metric, personalized medicine, RFID, sentiment analysis, six sigma, smart meter, statistical model, supply-chain management, Watson beat the top human players on Jeopardy!, web application

That is why it is important to build objectives, measurements, and milestones that demonstrate the benefits of a team focused on Big Data analytics. Developing performance measurements is an important part of designing a business plan. With Big Data, those metrics can be assigned to the specific goal in mind. For example, if an organization is looking to bring efficiency to a warehouse, a performance metric may be measuring the amount of empty shelf space and what the cost of that empty shelf space means to the company. Analytics can be used to identify product movement, sales predictions, and so forth to move product into that shelf space to better service the needs of customers. It is a simple comparison of the percentage of space used before the analytics process and the percentage of space used after the analytics team has tackled the issue.
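
To make the before/after comparison concrete, a minimal sketch with hypothetical numbers (the book describes the metric in prose only):

def shelf_metrics(used_sq_ft, total_sq_ft, monthly_cost_per_sq_ft):
    """Shelf utilization and the monthly carrying cost of empty space."""
    utilization = used_sq_ft / total_sq_ft
    empty_cost = (total_sq_ft - used_sq_ft) * monthly_cost_per_sq_ft
    return utilization, empty_cost

before = shelf_metrics(used_sq_ft=6_200, total_sq_ft=10_000, monthly_cost_per_sq_ft=4.50)
after = shelf_metrics(used_sq_ft=8_900, total_sq_ft=10_000, monthly_cost_per_sq_ft=4.50)
print(before)  # (0.62, 17100.0): 62% utilized, $17,100/month of idle space
print(after)   # (0.89, 4950.0): 89% utilized after the analytics work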


pages: 320 words: 33,385

Market Risk Analysis, Quantitative Methods in Finance by Carol Alexander

asset allocation, backtesting, barriers to entry, Brownian motion, capital asset pricing model, constrained optimization, credit crunch, Credit Default Swap, discounted cash flows, discrete time, diversification, diversified portfolio, en.wikipedia.org, fixed income, implied volatility, interest rate swap, market friction, market microstructure, p-value, performance metric, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, risk/return, Sharpe ratio, statistical arbitrage, statistical model, stochastic process, stochastic volatility, Thomas Bayes, transaction costs, value at risk, volatility smile, Wiener process, yield curve, zero-sum game

We describe some standard utility functions that display different risk aversion characteristics and show how an investor's utility determines his optimal portfolio. Then we solve the portfolio allocation decision for a risk averse investor, following and then generalizing the classical problem of portfolio selection that was introduced by Markowitz (1959). This lays the foundation for our review of the theory of asset pricing, and our critique of the many risk adjusted performance metrics that are commonly used by asset managers.

ABOUT THE CD-ROM

My golden rule of teaching has always been to provide copious examples, and whenever possible to illustrate every formula by replicating it in an Excel spreadsheet. Virtually all the concepts in this book are illustrated using numerical and empirical examples, and the Excel workbooks for each chapter may be found on the accompanying CD-ROM.

Many risk adjusted performance measures that are commonly used today are either not linked to a utility function at all, or, if they are associated with a utility function, it is one that assumes the investor cares nothing at all about the gains he makes above a certain threshold. Kappa indices can be loosely tailored to the degree of risk aversion of the investor, but otherwise the rankings produced by a risk adjusted performance measure may not match the ordering of the investor's preferences! The only universal risk adjusted performance metric, i.e. one that can rank investments having any returns distributions for investors having any type of utility function, is the certain equivalent. The certain equivalent of an uncertain investment is the amount of money, received for certain, that gives the same utility to the investor as the uncertain investment.
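
In symbols (the standard definition, consistent with the passage): for an uncertain payoff X and an investor with increasing utility function u, the certain equivalent CE is defined by

u(\mathrm{CE}) = \mathbb{E}\left[u(X)\right]
\qquad\Longleftrightarrow\qquad
\mathrm{CE} = u^{-1}\!\left(\mathbb{E}\left[u(X)\right]\right).

Because the definition uses the investor's own utility function directly, ranking investments by CE respects any increasing utility and any returns distribution, which is exactly the sense in which the passage calls it universal.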


pages: 410 words: 119,823

Radical Technologies: The Design of Everyday Life by Adam Greenfield

3D printing, Airbnb, augmented reality, autonomous vehicles, bank run, barriers to entry, basic income, bitcoin, blockchain, business intelligence, business process, call centre, cellular automata, centralized clearinghouse, centre right, Chuck Templeton: OpenTable:, cloud computing, collective bargaining, combinatorial explosion, Computer Numeric Control, computer vision, Conway's Game of Life, cryptocurrency, David Graeber, dematerialisation, digital map, disruptive innovation, distributed ledger, drone strike, Elon Musk, Ethereum, ethereum blockchain, facts on the ground, fiat currency, global supply chain, global village, Google Glasses, IBM and the Holocaust, industrial robot, informal economy, information retrieval, Internet of things, James Watt: steam engine, Jane Jacobs, Jeff Bezos, job automation, John Conway, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John von Neumann, joint-stock company, Kevin Kelly, Kickstarter, late capitalism, license plate recognition, lifelogging, M-Pesa, Mark Zuckerberg, means of production, megacity, megastructure, minimum viable product, money: store of value / unit of account / medium of exchange, natural language processing, Network effects, New Urbanism, Occupy movement, Oculus Rift, Pareto efficiency, pattern recognition, Pearl River Delta, performance metric, Peter Eisenman, Peter Thiel, planetary scale, Ponzi scheme, post scarcity, post-work, RAND corporation, recommendation engine, RFID, rolodex, Satoshi Nakamoto, self-driving car, sentiment analysis, shareholder value, sharing economy, Silicon Valley, smart cities, smart contracts, social intelligence, sorting algorithm, special economic zone, speech recognition, stakhanovite, statistical model, stem cell, technoutopianism, Tesla Model S, the built environment, The Death and Life of Great American Cities, The Future of Employment, transaction costs, Uber for X, undersea cable, universal basic income, urban planning, urban sprawl, Whole Earth Review, WikiLeaks, women in the workforce

This shrunken workforce will be asked to do more, for lower wages, at a yet higher pace. Amazon is again the leading indicator here.28 Its warehouse workers are hired on fixed, short-term contracts, through a deniable outsourcing agency, and precluded from raises, benefits, opportunities for advancement or the meaningful prospect of permanent employment. They work under conditions of “rationalized” oversight in the form of performance metrics that are calibrated in real time. Any degree of discretion or autonomy they might have retained is ruthlessly pared away by efficiency algorithm. The point couldn’t be made much more clearly: these facilities are places that no one sane would choose to be if they had any other option at all. And this is only the most obvious sort of technological intervention in the workplace. We barely have words for what happens when an algorithm breaks down jobs into tasks that are simple enough that they don’t call for any particular expertise—just about anybody will suffice to perform them—and outsources them to a global network of individuals made precarious and therefore willing to work for very little.

The company uses the accompanying analytic suite to “identify top performers” (and, by implication, those at the bottom as well), and plan schedules and distribute assignments in the store accordingly. Theatro’s devices are less elaborate than a Hitachi wearable called Business Microscope, which aims to capture, quantify and make inferences from several dimensions of employee behavior.33 As grim as call-center work is, a Hitachi press release brags about their ability to render it more dystopian yet via the use of this tool—improving performance metrics not by reducing employees’ workload, but by compelling them to be more physically active during their allotted break periods.34 Hitachi’s wearables, in turn, are less capable than the badges offered by Cambridge, MA, startup Sociometric Solutions, which are “equipped with two microphones, a location sensor and an accelerometer” and are capable of registering “tone of voice, posture and body language, as well as who spoke to whom for how long.”35 As with all of these devices, the aim is to continuously monitor (and eventually regulate) employee behavior.


pages: 493 words: 139,845

Women Leaders at Work: Untold Tales of Women Achieving Their Ambitions by Elizabeth Ghaffari

Albert Einstein, AltaVista, business cycle, business process, cloud computing, Columbine, corporate governance, corporate social responsibility, dark matter, family office, Fellow of the Royal Society, financial independence, follow your passion, glass ceiling, Grace Hopper, high net worth, knowledge worker, Long Term Capital Management, longitudinal study, performance metric, pink-collar, profit maximization, profit motive, recommendation engine, Ronald Reagan, shareholder value, Silicon Valley, Silicon Valley startup, Steve Ballmer, Steve Jobs, thinkpad, trickle-down economics, urban planning, women in the workforce, young professional

Trying to do the best for them. My whole academic and personal upbringing was working with physicians. So I don’t view physicians as the enemy. It just doesn’t make good business sense. Ghaffari: How many departments did you end up having under you? Luttgens: I had a total of ten professional services departments. Most of them were physician-led or physician-supported. Ghaffari: What was your performance metric that you did for them? Luttgens: Back in those days, the early eighties, we didn’t have quality management or outcomes as we do today. You needed to control expenses, enhance revenue, increase patient volume, and get along. I was well-known around the medical center for getting substantial capital funding for items in my capital budgets each year. Most of my departments were very capital-intensive.

It was a big change to run a nonprofit where a major part of your job is fundraising. That taught me that I was both good at, and enjoyed, fundraising because I understood the customer and believed in the product. Ghaffari: Was your primary responsibility there in an executive director role? What were some of your key accomplishments? Roden: Yes. Regarding accomplishments, we tracked several metrics. First of all, sponsorship was an important performance metric. When I started, SVASE was bringing in about $10,000 a year in sponsorship. When I left, it was $300,000 a year. Another key metric was the mailing list. When I started, we had about two thousand people on our e-mail list. When I left, it was about twenty thousand people. When I started, we had about twenty volunteers. When I left, we had about two hundred and fifty volunteers, meaning people actively engaged in running parts of the organization.


How I Became a Quant: Insights From 25 of Wall Street's Elite by Richard R. Lindsey, Barry Schachter

Albert Einstein, algorithmic trading, Andrew Wiles, Antoine Gombaud: Chevalier de Méré, asset allocation, asset-backed security, backtesting, bank run, banking crisis, Black-Scholes formula, Bonfire of the Vanities, Bretton Woods, Brownian motion, business cycle, business process, butter production in bangladesh, buy and hold, buy low sell high, capital asset pricing model, centre right, collateralized debt obligation, commoditize, computerized markets, corporate governance, correlation coefficient, creative destruction, Credit Default Swap, credit default swaps / collateralized debt obligations, currency manipulation / currency intervention, discounted cash flows, disintermediation, diversification, Donald Knuth, Edward Thorp, Emanuel Derman, en.wikipedia.org, Eugene Fama: efficient market hypothesis, financial innovation, fixed income, full employment, George Akerlof, Gordon Gekko, hiring and firing, implied volatility, index fund, interest rate derivative, interest rate swap, John von Neumann, linear programming, Loma Prieta earthquake, Long Term Capital Management, margin call, market friction, market microstructure, martingale, merger arbitrage, Myron Scholes, Nick Leeson, P = NP, pattern recognition, Paul Samuelson, pensions crisis, performance metric, prediction markets, profit maximization, purchasing power parity, quantitative trading / quantitative finance, QWERTY keyboard, RAND corporation, random walk, Ray Kurzweil, Richard Feynman, Richard Stallman, risk-adjusted returns, risk/return, shareholder value, Sharpe ratio, short selling, Silicon Valley, six sigma, sorting algorithm, statistical arbitrage, statistical model, stem cell, Steven Levy, stochastic process, systematic trading, technology bubble, The Great Moderation, the scientific method, too big to fail, trade route, transaction costs, transfer pricing, value at risk, volatility smile, Wiener process, yield curve, young professional

In the early 1990s, the entire banking industry was moving headlong toward Raroc as a pricing and performance measurement framework. However, as early as 1992, I recognized that the common Raroc measure based on own portfolio risk or VaR was at odds with equilibrium and arbitrage pricing theory (see Wilson (1992)). Using classical finance to make the point, I recast a simple CAPM model into a Raroc performance metric and showed that Raroc based on own portfolio risk without the recognition of funding was inherently biased. In the years since 1992, many other authors have followed a similar line of thought. What is the appropriate cost of capital, by line of business, if capital is allocated based on the standalone risk of each underlying business? And, what role does earnings volatility play in the valuation of a bank or insurance company?
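
One way to see the bias, sketched here as a plausible reconstruction rather than Wilson's exact derivation: charge capital in proportion to standalone risk sigma_i and substitute the CAPM expected excess return,

\mathrm{RAROC}_i
= \frac{\mathbb{E}[r_i]-r_f}{k\,\sigma_i}
= \frac{\beta_i\,(\mathbb{E}[r_m]-r_f)}{k\,\sigma_i}
= \rho_{im}\,\frac{\mathbb{E}[r_m]-r_f}{k\,\sigma_m},

using \beta_i = \rho_{im}\,\sigma_i/\sigma_m. Under CAPM every fairly priced business then shows a RAROC proportional to its correlation with the market, so a single hurdle rate applied to standalone-risk RAROC systematically penalizes low-correlation businesses: the bias the passage describes.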

See Credit risk integrated tool set, application, 80 technology, usage, 134–135 Portfolio optimization, 281–283 “Portfolio Optimization with Factors, Scenarios, and Realistic Short Positions,” 281 Portfolio Theory (Levy/Sarnat), 228 Portfolio trading, mathematics, 128–130 Positive interest rates, ensuring, 161–162 Prepayment data, study, 183 Press, Bill, 36 Price/book controls, pure return, 272 Price data, study, 183 Price/earnings ratios, correlation, 269 Price limits, impact, 77 Primitive polynomial modulo two, 170 Prisoner’s dilemma, 160 Private equity returns, benchmarks (finding), 145 Private signals, quality (improvement), 159–160 Publicly traded contingent claims, combinations (determination), 249 Public pension funds, investment, 25 Pure mathematics, 119, 126 Quantitative active management, growth, 46–47 Quantitative approach, usage, 26–27 Quantitative finance, 237–238 purpose, 96–98 Quantitative Financial Research (Bloomberg), 137 Quantitative investing, limitation, 209 Quantitative label, implication, 25–26 Quantitative methods, role, 96–97 Quantitative Methods for Financial Analysis (Stephen/Kritzman), 253 Quantitative models, enthusiasm, 234 Quantitative portfolio management, 130–131 Quantitative strategies, usage, 240 Quantitative Strategies (SAC Capital Management, LLC), 107 Quantitative training, application process, 255–260 Quants business sense, discussion, 240–241 characteristics/description, 208–210 conjecture, 177–179 conversion, 327 data mining, 209–210 description, New York Times article, 32 due diligence, requirement, 169 future, 13–16, 261 innovations, 255–258 myths, dispelling, 258–260 perspective, change, 134–135 process, 92–93 research, 127–128 Quigg, Laura, 156–158, 160 RAND Corporation, 13–17 Rahl, Leslie, 83–93 Ramaswamy, Krishna, 253 Raroc models, usage/development, 102–103 Raroc performance metric, 103 Reagan, Ronald, 15 Real economic behavior, level (usefulness), 101 Real options (literature), study, 149 Real-time artificial intelligence, 16 Rebonato, Riccardo, 168, 169, 232 Reed, John, 89 Registered investment advisors, 79 Regression, time-varying, 239 Renaissance Medallion fund, 310 Representation Theory and Complex Geometry, 122–125 Resampling statistics, usage, 239–240 Research collaboration, type, 157–158 Research Objectivity Standards, 280–281 Retail markets, changes, 148–149 Return, examination, 71–72 Return-predictor relationships, 269 Returns separation, 34–35 variance, increasing, 72 “Revenue Recognition Certificates: A New Security” (LeClair/Schulman), 82 Rich, Don, 256 Riemann Hypothesis, solution, 108 Risk analytics, sale, 301 bank rating, 216 buckets, 71 cost, 129 examination, 70–71 forecast, BARRA bond model (usage), 39 importance, 34–35 manager, role, 302–303 reversal, 299 worries, 39 Risk-adjusted return, 102 Risk management, 233 consulting firm, 293 technology, usage, 134–135 world developments, 96 Risk Management (Clinton Group), 295 Risk Management & Quantitative Research (Permal Group), 227 RiskMetrics, 300–301 business, improvement, 301 computational device, 240 Technical Document, publication (1996), 66 Risk/return trade-off, 259 RJR Nabisco, LBO, 39 Roll, Richard, 140 Ronn, Ehud, 157, 160–162 Rosenberg, Barr, 34–42 models, development, 34–37 Rosenbluth, Jeff, 132 Ross, Stephen A., 141, 254, 336 arbitrage pricing model, development, 147–148 Rubinstein, Mark, 278, 336 Rudd, Andrew, 35, 307 historical performance analysis, 44 Rudy, Rob, 219 Russell 3000, constitution, 275 Salomon Brothers, Bloomberg (employ), 73 Samuelson, Paul, 256–257 time demonstration, 258 Sankar, L., 162 Sargent, Thomas, 188 Savine, Antoine, 167 Sayles, Loomis, 33 SBCC, 285 Scholes, Myron, 11, 88, 177, 336 input, 217 Schulman, Evan, 67–82 Schwartz, Robert J., 293, 320 Secret, classification, 16–18 Securities Act of 1933, 147 Securities Exchange Act of 1934, 147 Security replication, probability (usage), 122 SETS, 77 Settlement delays, 174 Seymour, Carl, 175–176 Shareholder value creation, questions, 98 Sharpe, William, 34, 254 algorithm, 257–258 modification, 258 Shaw, Julian, 227–242 Sherring, Mike, 232 Short selling, 275–276 Short selling, risk-reducing/return-enhancing benefits, 277 Short-term reversal strategy, 198–199 Shubik, Martin, 288–289, 291, 293 Siegel’s Paradox, 321–322 Sklar’s theorem, 240 Slawsky, Al, 40–41 Small-cap stocks, purchase, 268 Smoothing, 192–193 Sobol’ numbers, 173–173 Social Sciences Research Network (SSRN), 122 Social Security system, bankruptcy, 148 Society for Quantitative Analysis (SQA), 253 Spatt, Chester, 252 Spot volatility, parameter, 89–90 Standard & Poor’s futures, price (relationship), 75 Start-up company, excitement, 24–25 Statistical data analysis, 213–214 Statistical error, 228 Sterge, Andrew J., 317–327 Stevens, Ross, 201 Stochastic calculus, 239 Stock market crash (1987), 282 Stocks portfolio trading, path trace, 129 stories, analogy, 23–26 Strategic Business Development (RiskMetrics Group), 49 Sugimoto, E., 171 Summer experience, impact, 57 Sun Unix workstation, 22 Surplus insurance, usage, 255–256 Swaps rate, Black volatilities, 172 usage, 292–293 Sweeney, Richard, 190 Symbolics, 16, 18 Taleb, Nassim, 132 Tenenbein, Aaron, 252 Textbook learning, expansion, 144 Theoretical biases, 103 Theory, usage/improvement, 182–185 Thornton, Dan, 139 Time diversification, myths, 258 Top secret, classification, 16–18 Tracking error, focus, 80–81 Trading, 72–73 Transaction cost, 129 absence, 247 impact, 273–274 Transaction pricing, decision-making process, 248 Transistor experiment (TX), 11 Transistorized Experimental Computer Zero (tixo), usage, 86 Treynor, Jack, 34, 254 Trigger, usage, 117–118 Trimability, 281 TRS-80 (Radio Shack), usage, 50, 52, 113 Trust companies, individually managed accounts (growth), 79 Tucker, Alan, 334 Uncertainty examination, 149–150 resolution, 323–324 Unit initialization, 172 Universal Investment Reasoning, 19–20 Upstream Technologies, LLC, 67 U.S. individual stock data, research, 201–202 Value-at-Risk (VaR), 195 calculation possibility tails, changes, 100 design, 293 evolution, 235 measurement, 196 number, emergence, 235 van Eyseren, Olivier, 173–175 Vanilla interest swaptions, 172 VarianceCoVariance (VCV), 235 Variance reduction techniques, 174 Vector auto-regression (VAR), 188 Venture capital investments, call options (analogy), 145–146 Volatility, 100, 174, 193–194 Volcker, Paul, 32 von Neumann, John, 319 Waddill, Marcellus, 318 Wall Street business, arrival, 61–65 interest, 160–162 move, shift, 125–127 quant search, genesis, 32 roots, 83–85 Wanless, Derek, 173 Wavelets, 239 Weisman, Andrew B., 187–196 Wells Fargo Nikko Investment Advisors, Grinold (attachment), 44 Westlaw database, 146–148 “What Practitioners Need to Know” (Kritzman), 255 Wigner, Eugene, 54 Wiles, Andrew, 112 Wilson, Thomas C., 95–105 Windham Capital Management LLC, 251, 254 Wires, rat consumption (prevention), 20–23 Within-horizon risk, usage, 256 Worker longevity, increase, 148 Wyckoff, Richard D., 321 Wyle, Steven, 18 Yield, defining, 182 Yield curve, 89–90, 174 Zimmer, Bob, 131–132


pages: 892 words: 91,000

Valuation: Measuring and Managing the Value of Companies by Tim Koller, McKinsey, Company Inc., Marc Goedhart, David Wessels, Barbara Schwimmer, Franziska Manoury

activist fund / activist shareholder / activist investor, air freight, barriers to entry, Basel III, BRICs, business climate, business cycle, business process, capital asset pricing model, capital controls, Chuck Templeton: OpenTable:, cloud computing, commoditize, compound rate of return, conceptual framework, corporate governance, corporate social responsibility, creative destruction, credit crunch, Credit Default Swap, discounted cash flows, distributed generation, diversified portfolio, energy security, equity premium, fixed income, index fund, intangible asset, iterative process, Long Term Capital Management, market bubble, market friction, Myron Scholes, negative equity, new economy, p-value, performance metric, Ponzi scheme, price anchoring, purchasing power parity, quantitative easing, risk/return, Robert Shiller, Robert Shiller, shareholder value, six sigma, sovereign wealth fund, speech recognition, stocks for the long run, survivorship bias, technology bubble, time value of money, too big to fail, transaction costs, transfer pricing, value at risk, yield curve, zero-coupon bond

Equal attention is paid to the long-term value-creating intent behind short-term profit targets, and people across the company are in constant communication about the adjustments needed to stay in line with long-term performance goals. We approach performance management from both an analytical and an organizational perspective. The analytical perspective focuses first on ensuring that companies use the right metrics at the right level in the organization. Companies should not just rely on performance metrics for divisions or business units, but disaggregate performance to the level of individual business segments. In addition to historical performance measures, companies need to use diagnostic metrics that help them understand and manage their ability to create value over the longer term. Second, we analyze how to set appropriate targets, giving examples of analytically sound performance measurement in action.

At some point, expansion of market share and sales will require additional production capacity. Once that point is reached, the associated investments and operating costs need to be factored in for target setting in individual business segments. [Footnote 6: For example, declining sales in one segment would imply increasing capital allocated to other segments even if their sales would be unchanged.]

The Right Metrics in Action

Choosing the right performance metrics can provide new insights into how a company might improve its performance in the future. For instance, Exhibit 26.8 illustrates the most important value drivers for a pharmaceutical company. The exhibit shows the key value drivers, the company’s current performance relative to best- and worst-in-class benchmarks, its aspirations for each driver, and the potential value impact from meeting its targets.

The greatest value creation would come from three areas: accelerating the rate of release of new products from 0.5 to 0.8 per year, reducing from six years to four the time it takes for a new drug to reach 80 percent of peak sales, and cutting the cost of goods sold from 26 percent to 23 percent of sales. Some of the value drivers (such as new-drug development) are long-term, whereas others (such as reducing cost of goods sold) have a shorter-term focus. Similarly, focusing on the right performance metrics can help reveal what may be driving underperformance. A consumer goods company we know illustrates the importance of having a tailored set of key value metrics. For several years, a business unit showed consistent double-digit growth in economic profit. Since the financial results were consistently strong—in fact, the strongest across all the business units—corporate managers were pleased and did not ask many questions of the business unit.


Android Developer Tools Essentials: Android Studio to Zipalign by Mike Wolfson, Donn Felker

Debian, performance metric, pez dispenser

Unused layouts in your hierarchy are a common problem with potentially big performance impacts, as each additional ViewGroup makes the measure pass described in Two-pass layout take more time (and it’s already the bulk of the time required to render the screen). It is reasonably easy to identify unused layouts. In this case, there is one LinearLayout (in the middle towards the left) that doesn’t show any performance metrics (there is just a blank space where the colored balls and timing information would be). This indicates that it is not being rendered and should be removed.

Figure 13-15. Hierarchy View: bad detail

Using the Tree tool to inspect the good UI

The performance indicators for the good UI look much better than those of the bad one in the Tree View. Most of the indicators in Figure 13-16 are green.


pages: 204 words: 54,395

Drive: The Surprising Truth About What Motivates Us by Daniel H. Pink

affirmative action, call centre, Daniel Kahneman / Amos Tversky, Dean Kamen, deliberate practice, Firefox, Frederick Winslow Taylor, functional fixedness, game design, George Akerlof, Isaac Newton, Jean Tirole, job satisfaction, knowledge worker, longitudinal study, performance metric, profit maximization, profit motive, Results Only Work Environment, side project, the built environment, Tony Hsieh, transaction costs, zero-sum game

It's another way to allow people to focus on the work itself. Indeed, other economists have shown that providing an employee a high level of base pay does more to boost performance and organizational commitment than an attractive bonus structure. Of course, by the very nature of the exercise, paying above the average will work for only about half of you. So get going before your competitors do.

3. IF YOU USE PERFORMANCE METRICS, MAKE THEM WIDE-RANGING, RELEVANT, AND HARD TO GAME

Imagine you're a product manager and your pay depends largely on reaching a particular sales goal for the next quarter. If you're smart, or if you've got a family to feed, you're going to try mightily to hit that number. You probably won't concern yourself much with the quarter after that or the health of the company or whether the firm is investing enough in research and development.


pages: 188 words: 54,942

Drone Warfare: Killing by Remote Control by Medea Benjamin

airport security, autonomous vehicles, Chelsea Manning, clean water, Clive Stafford Smith, crowdsourcing, drone strike, friendly fire, illegal immigration, Khyber Pass, megacity, nuremberg principles, performance metric, private military company, Ralph Nader, WikiLeaks

Some of the private contractors who are hired to participate in the CIA’s drone program have another incentive. Joshua Foust of the American Security Project discovered that in some targeting programs, contracted staffers have review quotas—that is, they must review a certain number of possible targets per given length of time. “Because they are contractors, their continued employment depends on their ability to satisfy the stated performance metrics,” Foust explained.256 “So they have a financial incentive to make life-or-death decisions about possible kill targets just to stay employed. This should be an intolerable situation, but because the system lacks transparency or outside review it is almost impossible to monitor or alter.” A policy paper on UAVs by the United Kingdom’s Ministry of Defence asked questions rarely heard in US government circles.


pages: 195 words: 52,701

Better Buses, Better Cities by Steven Higashide

Affordable Care Act / Obamacare, autonomous vehicles, business process, congestion charging, decarbonisation, Elon Musk, Hyperloop, income inequality, intermodal, jitney, Lyft, mass incarceration, Pareto efficiency, performance metric, place-making, self-driving car, Silicon Valley, six sigma, smart cities, transportation-network company, Uber and Lyft, Uber for X, uber lyft, urban planning, urban sprawl, walkable city, white flight, young professional

It also doesn’t sync with the agency’s work orders; if it did, the agency would be able to see at a glance when shelters were last cleaned and maintained and know whether certain shelters received more maintenance than others. A new data support tool, built with open-source software, may allow the agency to see at a glance which stops should have shelters based on the new guidelines (based on ridership and land use, such as whether the stop is near a hospital), as well as stops where shelters should be removed. Internally, the agency has also developed an equity performance metric to determine whether the percentage of customer boardings at stops with shelters is equal systemwide and in census blocks that are predominantly low-income and communities of color.26 Metro Transit is on track to meet its goal of installing 150 shelters in priority neighborhoods in 2019. As of December 2018, 64 percent of boardings in racially concentrated areas of poverty take place at a sheltered bus stop, similar to the systemwide average, according to Farrington.


pages: 519 words: 155,332

Tailspin: The People and Forces Behind America's Fifty-Year Fall--And Those Fighting to Reverse It by Steven Brill

2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, activist fund / activist shareholder / activist investor, affirmative action, Affordable Care Act / Obamacare, airport security, American Society of Civil Engineers: Report Card, asset allocation, Bernie Madoff, Bernie Sanders, Blythe Masters, Bretton Woods, business process, call centre, Capital in the Twenty-First Century by Thomas Piketty, carried interest, clean water, collapse of Lehman Brothers, collective bargaining, computerized trading, corporate governance, corporate raider, corporate social responsibility, Credit Default Swap, currency manipulation / currency intervention, Donald Trump, ending welfare as we know it, failed state, financial deregulation, financial innovation, future of work, ghettoisation, Gordon Gekko, hiring and firing, Home mortgage interest deduction, immigration reform, income inequality, invention of radio, job automation, knowledge economy, knowledge worker, labor-force participation, laissez-faire capitalism, Mahatma Gandhi, Mark Zuckerberg, mortgage tax deduction, new economy, obamacare, old-boy network, paper trading, performance metric, post-work, Potemkin village, Powell Memorandum, quantitative hedge fund, Ralph Nader, ride hailing / ride sharing, Robert Bork, Robert Gordon, Robert Mercer, Ronald Reagan, shareholder value, Silicon Valley, Social Responsibility of Business Is to Increase Its Profits, telemarketer, too big to fail, trade liberalization, union organizing, Unsafe at Any Speed, War on Poverty, women in the workforce, working poor

Except for the most civic minded among them, corporate executives—who spend millions to lobby against employment laws forcing even a fraction of these due process protections on their companies when they hire or fire their own employees—are not likely to worry about the straitjacket their government faces in recruiting talent or in training or in dismissing the untalented. Nor do they care much that their government doesn’t produce a budget or performance metrics, or pay enough to hire and keep competent people in jobs managing billions of dollars’ worth of programs. Similarly, there is an imbalance of passion and interest when it comes to perhaps the most obvious common good: the nation’s infrastructure. America’s deteriorating roads and power grids, and broken mass transit systems, are daily reminders of how the protected have undermined the government’s ability to fulfill its most basic purpose.

Or, as suggested, one or more news organizations could finally generate broad disgust over the auctioning of American democracy by routinely attaching to the name of every politician, whenever he or she holds a hearing or votes on an issue, a tally of the campaign contributions received from the special interests involved. Maybe an elderly woman’s struggle to get the Social Security money she is owed will go viral and snowball into a meme that taps everyone’s frustrations about broken government—and makes the Partnership for Public Service’s wonky plans for civil service reform, agency performance metrics, and a rational budgeting process a cause that politicians have to latch on to. Perhaps one of the cable news networks could televise Stier’s annual “Sammy Awards” dinner for stellar public servants to drive home the message that government can work if the right people are attracted to the mission. A protest launched by a daughter who cannot get care for her elderly mother, joined by a mother who cannot get care for her toddler, could mushroom into a national caregiver movement that seizes on Peter Edelman’s blueprint for a new caregiver workforce infrastructure.


pages: 559 words: 155,372

Chaos Monkeys: Obscene Fortune and Random Failure in Silicon Valley by Antonio Garcia Martinez

Airbnb, airport security, always be closing, Amazon Web Services, Burning Man, Celtic Tiger, centralized clearinghouse, cognitive dissonance, collective bargaining, corporate governance, Credit Default Swap, crowdsourcing, death of newspapers, disruptive innovation, drone strike, El Camino Real, Elon Musk, Emanuel Derman, financial independence, global supply chain, Goldman Sachs: Vampire Squid, hive mind, income inequality, information asymmetry, interest rate swap, intermodal, Jeff Bezos, Kickstarter, Malcom McLean invented shipping containers, Marc Andreessen, Mark Zuckerberg, Maui Hawaii, means of production, Menlo Park, minimum viable product, MITM: man-in-the-middle, move fast and break things, move fast and break things, Network effects, orbital mechanics / astrodynamics, Paul Graham, performance metric, Peter Thiel, Ponzi scheme, pre–internet, Ralph Waldo Emerson, random walk, Ruby on Rails, Sam Altman, Sand Hill Road, Scientific racism, second-price auction, self-driving car, Silicon Valley, Silicon Valley startup, Skype, Snapchat, social graph, social web, Socratic dialogue, source of truth, Steve Jobs, telemarketer, undersea cable, urban renewal, Y Combinator, zero-sum game, éminence grise

I even hung a real length of Spanish chorizo from my monitor, as a rallying symbol, and the targeting team got down to the serious business of monetizing every last user action on Facebook. Just as my first view of Facebook’s high-level revenue dashboard proved a dispiriting exercise, Chorizo’s final results, which took months to produce, were a similar tale of woe. No user data we had, if fed freely into the topics that Facebook’s savviest marketers used to target their ads, improved any performance metric we had access to. That meant that advertisers trying to find someone who, say, wanted to buy a car, benefited not at all from all the car chatter taking place on Facebook. It was as if we had fed a mile-long trainful of meat cows into a slaughterhouse, and had come out with one measly sausage to show for it. It was incomprehensible, and it tested my faith (which, believe it or not, I certainly had at that time) in Facebook’s claim to unique primacy in the realm of user data.

Immature advertising markets, the embryonic state of their e-commerce infrastructure, and their lower general wealth meant the impact of new optimization tricks or targeting data on those countries was minimal. And so the Ads team would slice off tranches of the FB user base in rich ads markets and dose them with different versions of the ads system to measure the effect of a new feature, as you would test subjects in a clinical drug trial.* The performance metrics of interest included clickthrough rates, which are a coarse measure of user interest. More convincing is the actual downstream monetization resulting from someone clicking through and buying something—assuming Facebook got the conversion data, which it often didn’t, given that Facebook didn’t have a conversion-tracking system. Also important, and not related to money at all, was overall usage.
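
To make the mechanics of that experiment concrete, here is a minimal sketch of a holdout test keyed to clickthrough rate; the bucket sizes, click counts, and function names are invented for illustration, not Facebook's actual system:

    # Users are first split deterministically into control/treatment buckets
    # (real systems use a salted hash of the user id); each bucket is served
    # a different version of the ads system, and clickthrough rates compared.
    def clickthrough_rate(impressions: int, clicks: int) -> float:
        return clicks / impressions if impressions else 0.0

    # Hypothetical aggregated results per bucket.
    results = {
        "control":   {"impressions": 100_000, "clicks": 820},
        "treatment": {"impressions": 100_000, "clicks": 910},
    }

    ctr = {bucket: clickthrough_rate(r["impressions"], r["clicks"])
           for bucket, r in results.items()}
    lift = (ctr["treatment"] - ctr["control"]) / ctr["control"]
    print(f"control={ctr['control']:.3%} treatment={ctr['treatment']:.3%} lift={lift:.1%}")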


pages: 262 words: 60,248

Python Tricks: The Book by Dan Bader

anti-pattern, domain-specific language, don't repeat yourself, linked data, pattern recognition, performance metric

A Dog-free Example

While no dogs were harmed in the making of this chapter (it’s all fun and games until someone sprouts an extra pair of legs), I wanted to give you one more practical example of the useful things you can do with class variables. Something that’s a little closer to the real-world applications for class variables. So here it is. The following CountedObject class keeps track of how many times it was instantiated over the lifetime of a program (which might actually be an interesting performance metric to know):

    class CountedObject:
        num_instances = 0

        def __init__(self):
            self.__class__.num_instances += 1

CountedObject keeps a num_instances class variable that serves as a shared counter. When the class is declared, it initializes the counter to zero and then leaves it alone. Every time you create a new instance of this class, it increments the shared counter by one when the __init__ constructor runs:

    >>> CountedObject.num_instances
    0
    >>> CountedObject().num_instances
    1
    >>> CountedObject().num_instances
    2
    >>> CountedObject().num_instances
    3
    >>> CountedObject.num_instances
    3

Notice how this code needs to jump through a little hoop to make sure it increments the counter variable on the class.
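
That hoop is the self.__class__ indirection. A minimal counter-example (the class name here is ours, not the book's) shows what goes wrong with a plain augmented assignment through self: Python reads the class attribute but writes a new instance attribute, so the shared counter never moves:

    class BuggyCountedObject:
        num_instances = 0

        def __init__(self):
            # Reads the class attribute (0), then binds the result (1)
            # onto this instance only -- the class attribute is shadowed.
            self.num_instances += 1

    a, b = BuggyCountedObject(), BuggyCountedObject()
    print(BuggyCountedObject.num_instances)  # 0 (shared counter untouched)
    print(a.num_instances, b.num_instances)  # 1 1 (per-instance copies)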


pages: 204 words: 58,565

Keeping Up With the Quants: Your Guide to Understanding and Using Analytics by Thomas H. Davenport, Jinho Kim

Black-Scholes formula, business intelligence, business process, call centre, computer age, correlation coefficient, correlation does not imply causation, Credit Default Swap, en.wikipedia.org, feminist movement, Florence Nightingale: pie chart, forensic accounting, global supply chain, Hans Rosling, hypertext link, invention of the telescope, inventory management, Jeff Bezos, Johannes Kepler, longitudinal study, margin call, Moneyball by Michael Lewis explains big data, Myron Scholes, Netflix Prize, p-value, performance metric, publish or perish, quantitative hedge fund, random walk, Renaissance Technologies, Robert Shiller, self-driving car, sentiment analysis, six sigma, Skype, statistical model, supply-chain management, text mining, the scientific method, Thomas Davenport

MODELING (VARIABLE SELECTION). The variables in deciding whether to acquire Battier from the Grizzlies would be the cost of acquiring him (outright or in trade for other players), the amount that he would be paid going forward, various individual performance measures, and ideally some measure of team performance while Battier was on the court versus when he was not. DATA COLLECTION (MEASUREMENT). The individual performance metrics and financials were easy to gather. And there is a way to measure an individual player’s impact on team performance. The “plus/minus” statistic, adapted by Roland Beech of 82games.com from a similar statistic used in hockey, compares how a team performs with a particular player in the game versus its performance when he is on the bench. DATA ANALYSIS. Morey and his statisticians decided to use plus/minus analysis to evaluate Battier.
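
For readers who want the arithmetic, here is a minimal sketch of a plus/minus computation over stints; the stint data and lineups below are invented, and real NBA plus/minus bookkeeping is more involved:

    # Credit a player with the team's point differential while he is on
    # the court; stints are (players_on_court, team_pts, opp_pts) tuples.
    def plus_minus(stints, player):
        return sum(team - opp
                   for on_court, team, opp in stints
                   if player in on_court)

    stints = [
        ({"Battier", "Yao", "Brooks", "Artest", "Scola"}, 12, 7),
        ({"Yao", "Brooks", "Artest", "Scola", "Landry"}, 8, 11),
        ({"Battier", "Yao", "Brooks", "Artest", "Landry"}, 9, 9),
    ]
    print(plus_minus(stints, "Battier"))  # +5 across his two stints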


pages: 394 words: 57,287

Unleashed by Anne Morriss, Frances Frei

"side hustle", Airbnb, Donald Trump, future of work, gig economy, glass ceiling, Grace Hopper, Jeff Bezos, Netflix Prize, Network effects, performance metric, race to the bottom, ride hailing / ride sharing, Silicon Valley, Steve Jobs, TaskRabbit, Tony Hsieh, Toyota Production System, Travis Kalanick, Uber for X, women in the workforce

If you give constructive advice once or twice a week, for example, look for daily opportunities for sincere, specific, positive reinforcement. Monthly constructive advice? Shoot for the positive stuff slightly more than weekly. And so on. You get the math. Once you get hooked on the uptick in improvement, we’re confident you’ll be converted. How will you know if you’re getting it right? Your performance metric is someone else’s improvement. If you’re not seeing improvement, then it’s your job to try another way. Restate your observations with more specificity. Build more trust so that your recipient(s) can actually hear what you have to say. You don’t get credit for trying regardless of whether you were effective. Your job as a leader is to make others better. If the feedback you’re giving has a neutral to negative impact, then you’re not doing your job.


Digital Accounting: The Effects of the Internet and Erp on Accounting by Ashutosh Deshmukh

accounting loophole / creative accounting, AltaVista, business continuity plan, business intelligence, business process, call centre, computer age, conceptual framework, corporate governance, data acquisition, dumpster diving, fixed income, hypertext link, interest rate swap, inventory management, iterative process, late fees, money market fund, new economy, New Journalism, optical character recognition, packet switching, performance metric, profit maximization, semantic web, shareholder value, six sigma, statistical model, supply-chain management, supply-chain management software, telemarketer, transaction costs, value at risk, web application, Y2K

Supply Chain Management

Supplier Relationship Management/E-Procurement
• Supports multiple standards for security and data communication
• Support for sourcing
• Support for online negotiation
• Support for catalog development and hosting
• Support for auctions
• Support for customization and automation according to trading agreements, workflows and business rules

Vendor Self-Service
• Get information on request for quotation or proposals, purchase order revision, receipt or return of goods and payments
• Performance metrics for quality and delivery
• Inventory requirements available through EDI or Internet posting
• Drill-down capabilities
• Chat facilities with purchase people

Expenditures
• Online forms for claiming travel and entertainment expenses
• Online management of expense reimbursements
• Online payroll
• Online time sheets
• Support for online travel centers

Product Development
• Online design and development tools
• Sharing of product design over the Internet
• Virtual testing and collaboration
• Integration or interface with Computer Aided Design/Computer Aided Manufacturing (CAD/CAM) software

Human Resources
• Access to personal files, job performance and company policies
• Access to 401K funds

Quantitative data includes patterns in spending and supplier performance, projections of future performance based on available data and changes in existing products and business, among other things. Analytical facilities in SRM enable detailed analysis of supplier contracts. Illustrative examples in the contract analysis include calculating the use of the supplier by organization, evaluating gaps in contract coverage and supplier performance, identifying purchases that do not conform to contracts and investigating purchases made from non-approved suppliers.

Exhibit 4. Supplier selection strategy

Non-quantifiable factors
• Changes in the economic and market conditions
• Changes in corporate objectives
• Political imperatives
• Personal opinions and beliefs

Quantifiable factors
• Contract usage
• Contract compliance
• Supplier performance metrics
• Maverick purchases


pages: 265 words: 71,143

Empires of the Weak: The Real Story of European Expansion and the Creation of the New World Order by Jason Sharman

British Empire, cognitive dissonance, colonial rule, corporate social responsibility, death of newspapers, European colonialism, joint-stock company, joint-stock limited liability company, land tenure, offshore financial centre, passive investing, Peace of Westphalia, performance metric, profit maximization, Scramble for Africa, South China Sea, spice trade, trade route, transaction costs

According to this logic, organizations are the way they are and do things the way they do for a reason, and that reason is to efficiently and effectively achieve the tasks they were designed for. Meyer and his colleagues are very much focused on the contemporary period, so there is a need to exercise caution when reading their insights back into previous centuries. Yet if organizations in an era of detailed performance metrics, huge data-processing and analytical capacities, and a whole industry of professional managers and consultants nevertheless can deviate so fundamentally from the rational ideal, there are good reasons to think that their early modern counterparts would have had an even harder time coming anywhere near this idealized mark. This is especially so as early modern actors in all regions tended to explain success and failure by divine providence and supernatural interventions.67 Rather than distinguishing modern, professional, rational organizations from their backward, primitive equivalents in centuries past, however, Meyer believes that the former are just as likely to be in thrall to myth and ritual as the latter.68 For sociologists like Meyer, whether they be government departments, hospitals, universities or firms, organizations are said to be generally indifferent to matters of efficiency and effectiveness, even if they could work out how to achieve these goals, which they probably couldn’t.


The Ethical Algorithm: The Science of Socially Aware Algorithm Design by Michael Kearns, Aaron Roth

23andMe, affirmative action, algorithmic trading, Alvin Roth, Bayesian statistics, bitcoin, cloud computing, computer vision, crowdsourcing, Edward Snowden, Elon Musk, Filter Bubble, general-purpose programming language, Google Chrome, ImageNet competition, Lyft, medical residency, Nash equilibrium, Netflix Prize, p-value, Pareto efficiency, performance metric, personalized medicine, pre–internet, profit motive, quantitative trading / quantitative finance, RAND corporation, recommendation engine, replication crisis, ride hailing / ride sharing, Robert Bork, Ronald Coase, self-driving car, short selling, sorting algorithm, speech recognition, statistical model, Stephen Hawking, superintelligent machines, telemarketer, Turing machine, two-sided market, Vilfredo Pareto

Some require less memory than others, at the expense of being slower. Some excel when we can assume that each number in the list is unique (as with social security numbers). So even within the constraint of developing a precise recipe for a precise computational task, there may be choices and trade-offs to confront. As the previous paragraph suggests, computer science has traditionally focused on algorithmic trade-offs related to what we might consider performance metrics, including computational speed, the amount of memory required, or the amount of communication required between algorithms running on separate computers. But this book, and the emerging research it describes, is about an entirely new dimension in algorithm design: the explicit consideration of social values such as privacy and fairness. Man Versus Machine (Learning) Algorithms such as the sorting algorithm we describe above are typically coded by the scientists and engineers who design them: every step of the procedure is explicitly specified by its human designers, and written down in a general-purpose programming language such as Python or C++.
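
As one concrete instance of those trade-offs, consider counting sort versus an in-place comparison sort: counting sort buys linear-time speed by spending memory proportional to the key range, while in-place sorts spend almost no extra memory but more time. A minimal sketch, with invented data:

    def counting_sort(nums, max_key):
        # Auxiliary array sized to the key range: the memory we trade for speed.
        counts = [0] * (max_key + 1)
        for n in nums:
            counts[n] += 1
        out = []
        for value, count in enumerate(counts):
            out.extend([value] * count)
        return out

    print(counting_sort([3, 1, 4, 1, 5, 9, 2, 6], max_key=9))
    # [1, 1, 2, 3, 4, 5, 6, 9]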


pages: 253 words: 65,834

Mastering the VC Game: A Venture Capital Insider Reveals How to Get From Start-Up to IPO on Your Terms by Jeffrey Bussgang

business cycle, business process, carried interest, digital map, discounted cash flows, hiring and firing, Jeff Bezos, Kickstarter, Marc Andreessen, Mark Zuckerberg, Menlo Park, moveable type in China, pattern recognition, Paul Graham, performance metric, Peter Thiel, pets.com, risk tolerance, rolodex, Ronald Reagan, Sand Hill Road, selection bias, shareholder value, Silicon Valley, Skype, software as a service, sovereign wealth fund, Steve Jobs, technology bubble, The Wisdom of Crowds

Gail made a point of previewing news with each director right before the board meeting to allow them to have some reflection time before the meeting, and to alert her to any of their initial concerns. “I wanted them to know exactly what they were going to hear in the meeting. By the time we got into the board meeting, everybody was informed and we could really get into the meat of whatever the issue was.” The “no surprises” rule applies to changes in management as well as performance metrics. “The board would lose confidence in some team members at different times,” Gail told me. “So I was very clear about saying, ‘I see the same weaknesses. But here’s what they’re doing. And I’ll make the decision about this person at the right time.’ You can’t fool these guys. If you have an executive that has weaknesses and you try to deny it, it erodes the board’s confidence, makes them think you don’t have good judgment when it comes to people.


pages: 229 words: 67,752

The Quantum Curators and the Fabergé Egg: A Fast Paced Portal Adventure by Eva St. John

3D printing, Berlin Wall, clean water, double helix, Fall of the Berlin Wall, off grid, performance metric

Julius’ tone didn’t seem to suggest he was taking me seriously. ‘While I’m dropping to the ground and finding cover, do you think I can do so, in a constructive, manly fashion?’ ‘I don’t care what fashion you do it in so long as you don’t get in my way or end up dead. Both options would be hugely helpful. If you do choose to die, could you do it after I have retrieved the egg? A failed recovery and a dead civilian will play hell with my performance metrics.’ ‘I’ll try not to spoil your clean-up rate. That’s obviously at the top of my priorities.’ I sighed. This wasn’t what I needed. We both needed to be alert, not at odds. I was used to telling people what they needed to do, and then they did it. I didn’t mind when Clio or Ramin talked back because I valued their insight and experience. Julius had none of that. Clearly, he had native intelligence, but he had zero experience in self-preservation.


pages: 261 words: 16,734

Peopleware: Productive Projects and Teams by Tom Demarco, Timothy Lister

A Pattern Language, cognitive dissonance, interchangeable parts, job satisfaction, knowledge worker, lateral thinking, Parkinson's law, performance metric, skunkworks, supply-chain management, women in the workforce

Three rules of thumb seem to apply whenever you measure variations in performance over a sample of individuals.
• Count on the best people outperforming the worst by about 10:1.
• Count on the best performer being about 2.5 times better than the median performer.
• Count on the half that are better-than-median performers outdoing the other half by more than 2:1.
These rules apply for virtually any performance metric you define. So, for instance, the better half of a sample will do a given job in less than half the time the others take; the more defect-prone half will put in more than two thirds of the defects, and so on. Results of the Coding War Games were very much in line with this profile. Take as an example Figure 8–2, which shows the performance spread of time to achieve the first milestone (clean compile, ready for test) in one year’s games.
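
A quick sanity check of the three rules against a made-up sample of task completion times (lower is better) shows how the ratios fall out:

    # Invented completion times in hours, sorted best (fastest) to worst.
    times = sorted([4, 5, 6, 8, 10, 12, 15, 20, 28, 40])

    best, worst = times[0], times[-1]
    median = (times[4] + times[5]) / 2           # even-sized sample
    better_half, worse_half = times[:5], times[5:]

    print(worst / best)                          # 10.0 -> ~10:1 best vs. worst
    print(median / best)                         # 2.75 -> ~2.5x median vs. best
    print(sum(worse_half) / sum(better_half))    # ~3.5 -> >2:1 half vs. half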


pages: 256 words: 15,765

The New Elite: Inside the Minds of the Truly Wealthy by Dr. Jim Taylor

British Empire, business cycle, call centre, dark matter, Donald Trump, estate planning, full employment, glass ceiling, income inequality, Jeff Bezos, longitudinal study, Louis Pasteur, Maui Hawaii, McMansion, means of production, passive income, performance metric, plutocrats, Plutocrats, Plutonomy: Buying Luxury, Explaining Global Imbalances, Ronald Reagan, stealth mode startup, Steve Jobs, Thorstein Veblen, trickle-down economics, women in the workforce, zero-sum game

For any respondent who wanted it, we provided a coded identification number that enabled the individual to examine the results and reports for personal reasons. In some cases, we even let them examine their own data in comparison to others in the financial elite. For a generation of business men and women who believe in measurement, and who grew up with IQ tests, SAT scores, and other performance metrics, this quantitative capability was an often irresistible source of pleasure. This was particularly true because the individuals had been on a special journey, one their upbringings had left them largely unprepared for, and so understanding the journeys of others was a means for understanding their own trips and themselves. But there is a deeper, more telling reason the wealthy volunteered hours of their time for us.


The Fix: How Bankers Lied, Cheated and Colluded to Rig the World's Most Important Number (Bloomberg) by Liam Vaughan, Gavin Finch

asset allocation, asset-backed security, bank run, banking crisis, Bernie Sanders, Big bang: deregulation of the City of London, buy low sell high, call centre, central bank independence, collapse of Lehman Brothers, corporate governance, credit crunch, Credit Default Swap, eurozone crisis, fear of failure, financial deregulation, financial innovation, fixed income, interest rate derivative, interest rate swap, Kickstarter, light touch regulation, London Interbank Offered Rate, London Whale, mortgage debt, Northern Rock, performance metric, Ponzi scheme, Ronald Reagan, social intelligence, sovereign wealth fund, urban sprawl

His voice sped up when he talked about heady days piling into positions, squeezing the best prices from brokers and playing traders off against each other. “The first thing you think is where’s the edge, where can I make a bit more money, how can I push, push the boundaries, maybe you know a bit of a gray area, push the edge of the envelope,” he said in one early interview. “But the point is, you are greedy, you want every little bit of money that you can possibly get because, like I say, that is how you are judged, that is your performance metric.” Paper coffee cups piled up as Hayes went over the minutiae of the case: how to hedge a forward rate agreement; the nuances of Libor and Tibor; why he and Darin hated each other so much. One of the interviews was conducted in the dark so Hayes could talk the investigators through his trading book, which was beamed onto a wall. At one stage, Hayes was asked about how he viewed his attempts to move Libor around.


pages: 290 words: 73,000

Algorithms of Oppression: How Search Engines Reinforce Racism by Safiya Umoja Noble

A Declaration of the Independence of Cyberspace, affirmative action, Airbnb, borderless world, cloud computing, conceptual framework, crowdsourcing, desegregation, Donald Trump, Edward Snowden, Filter Bubble, Firefox, Google Earth, Google Glasses, housing crisis, illegal immigration, immigration reform, information retrieval, Internet Archive, Jaron Lanier, Mitch Kapor, Naomi Klein, new economy, PageRank, performance metric, phenotype, profit motive, Silicon Valley, Silicon Valley ideology, Snapchat, Tim Cook: Apple, union organizing, women in the workforce, yellow journalism

What is most popular on the Internet is not wholly a matter of what users click on and how websites are hyperlinked—there are a variety of processes at play. Max Holloway of Search Engine Watch notes, “Similarly, with Google, when you click on a result—or, for that matter, don’t click on a result—that behavior impacts future results. One consequence of this complexity is difficulty in explaining system behavior. We primarily rely on performance metrics to quantify the success or failure of retrieval results, or to tell us which variations of a system work better than others. Such metrics allow the system to be continuously improved upon.”52 The goal of combining search terms, then, in the context of the landscape of the search engine optimization logic, is only the beginning. Much research has now been done to dispel the notion that users of the Internet have the ability to “vote” with their clicks and express interest in individual content and information, resulting in democratic practices online.53 Research shows the ways that political news and information in the blogosphere are mediated and directed such that major news outlets surface to the top of the information pile over less well-known websites and alternative news sites in the blogosphere, to the benefit of elites.54 In the case of political information seeking, research has shown how Google directs web traffic to mainstream corporate news conglomerates, which increases their ability to shape the political discourse.


pages: 345 words: 75,660

Prediction Machines: The Simple Economics of Artificial Intelligence by Ajay Agrawal, Joshua Gans, Avi Goldfarb

"Robert Solow", Ada Lovelace, AI winter, Air France Flight 447, Airbus A320, artificial general intelligence, autonomous vehicles, basic income, Bayesian statistics, Black Swan, blockchain, call centre, Capital in the Twenty-First Century by Thomas Piketty, Captain Sullenberger Hudson, collateralized debt obligation, computer age, creative destruction, Daniel Kahneman / Amos Tversky, data acquisition, data is the new oil, deskilling, disruptive innovation, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, everywhere but in the productivity statistics, Google Glasses, high net worth, ImageNet competition, income inequality, information retrieval, inventory management, invisible hand, job automation, John Markoff, Joseph Schumpeter, Kevin Kelly, Lyft, Minecraft, Mitch Kapor, Moneyball by Michael Lewis explains big data, Nate Silver, new economy, On the Economy of Machinery and Manufactures, pattern recognition, performance metric, profit maximization, QWERTY keyboard, race to the bottom, randomized controlled trial, Ray Kurzweil, ride hailing / ride sharing, Second Machine Age, self-driving car, shareholder value, Silicon Valley, statistical model, Stephen Hawking, Steve Jobs, Steven Levy, strong AI, The Future of Employment, The Signal and the Noise by Nate Silver, Tim Cook: Apple, Turing test, Uber and Lyft, uber lyft, US Airways Flight 1549, Vernor Vinge, Watson beat the top human players on Jeopardy!, William Langewiesche, Y Combinator, zero-sum game

For instance, one of the practical applications of recent AI is in radiology. Much of what radiologists currently do involves taking images and then identifying issues of concern. They predict abnormalities in images. AIs are increasingly able to perform that prediction function at human levels of accuracy or better, which can assist radiologists and other medical specialists in making decisions that have an impact on patients. The critical performance metric is the accuracy of the diagnosis: whether the machine predicts a disease when the patient is ill and predicts no disease when the patient is healthy. But we must consider what such decisions involve. Suppose doctors suspect a lump and must decide how to determine if it is cancerous. One option is medical imaging. Another option is something more invasive, like a biopsy. A biopsy has the advantage of being highly likely to provide an accurate diagnosis.
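
That accuracy notion is usually split into the two rates the paragraph describes informally: sensitivity (predicting disease when the patient is ill) and specificity (predicting no disease when the patient is healthy). A minimal sketch with invented counts:

    def sensitivity_specificity(tp, fn, tn, fp):
        sensitivity = tp / (tp + fn)   # share of ill patients correctly flagged
        specificity = tn / (tn + fp)   # share of healthy patients correctly cleared
        return sensitivity, specificity

    # Hypothetical results for an imaging model on 1,000 scans.
    sens, spec = sensitivity_specificity(tp=90, fn=10, tn=855, fp=45)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")  # 0.90 0.95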


pages: 300 words: 76,638

The War on Normal People: The Truth About America's Disappearing Jobs and Why Universal Basic Income Is Our Future by Andrew Yang

3D printing, Airbnb, assortative mating, augmented reality, autonomous vehicles, basic income, Ben Horowitz, Bernie Sanders, call centre, corporate governance, cryptocurrency, David Brooks, Donald Trump, Elon Musk, falling living standards, financial deregulation, full employment, future of work, global reserve currency, income inequality, Internet of things, invisible hand, Jeff Bezos, job automation, John Maynard Keynes: technological unemployment, Khan Academy, labor-force participation, longitudinal study, low skilled workers, Lyft, manufacturing employment, Mark Zuckerberg, megacity, Narrative Science, new economy, passive income, performance metric, post-work, quantitative easing, reserve currency, Richard Florida, ride hailing / ride sharing, risk tolerance, Ronald Reagan, Sam Altman, self-driving car, shareholder value, Silicon Valley, Simon Kuznets, single-payer health, Stephen Hawking, Steve Ballmer, supercomputer in your pocket, technoutopianism, telemarketer, The Wealth of Nations by Adam Smith, Tyler Cowen: Great Stagnation, Uber and Lyft, uber lyft, unemployed young men, universal basic income, urban renewal, white flight, winner-take-all economy, Y Combinator

Perhaps the most interesting application of technology in college education is the Minerva Project, a startup university now entering its fifth year. At Minerva, students take classes online, but they do so while living together in dorm-style housing. Minerva’s online interface is unusual in that the student’s face is shown the whole time, and they get called on to ensure accountability and engagement. This “facetime” is even the main performance metric—there aren’t final exams. Professors review the classes to see if individual students are demonstrating the right “habits of mind.” Minerva saves money by not investing in libraries, athletic facilities, sports teams, and the like. Students spend up to one year each in different dorms in San Francisco, Buenos Aires, Berlin, Seoul, and Istanbul. Minerva is selective—the acceptance rate for the latest class was only 1.9 percent.


pages: 264 words: 76,643

The Growth Delusion: Wealth, Poverty, and the Well-Being of Nations by David Pilling

Airbnb, banking crisis, Bernie Sanders, Big bang: deregulation of the City of London, Branko Milanovic, call centre, centre right, clean water, collapse of Lehman Brothers, collateralized debt obligation, commoditize, Credit Default Swap, credit default swaps / collateralized debt obligations, dark matter, Deng Xiaoping, Diane Coyle, Donald Trump, double entry bookkeeping, Erik Brynjolfsson, falling living standards, financial deregulation, financial intermediation, financial repression, Gini coefficient, Goldman Sachs: Vampire Squid, Google Hangouts, Hans Rosling, happiness index / gross national happiness, income inequality, income per capita, informal economy, invisible hand, job satisfaction, Mahatma Gandhi, market fundamentalism, Martin Wolf, means of production, Monkeys Reject Unequal Pay, mortgage debt, off grid, old-boy network, Panopticon Jeremy Bentham, peak oil, performance metric, pez dispenser, profit motive, purchasing power parity, race to the bottom, rent-seeking, Robert Gordon, Ronald Reagan, Rory Sutherland, science of happiness, shareholder value, sharing economy, Simon Kuznets, sovereign wealth fund, The Great Moderation, The Wealth of Nations by Adam Smith, Thomas Malthus, total factor productivity, transaction costs, transfer pricing, trickle-down economics, urban sprawl, women in the workforce, World Values Survey

“In reality, our CO2 emissions are still increasing, but the efforts we are making are very big.”17 Things were beginning to change politically as well. Niu’s green GDP was beginning to catch on. In 2015 China’s environment ministry again floated the idea that the performance of provincial officials should be judged partly by progress on improving the environment. In 2014 more than seventy smaller cities and counties jettisoned GDP as a performance metric for government officials, prioritizing environmental protection and poverty reduction instead. That summer President Xi had told party officials, “We need to look at obvious achievements as well as hidden achievements. We can no longer simply use GDP growth rates to decide who the [party] heroes are.”18 Internationally too, Beijing had gone from laggard to putative world leader. Even as Donald Trump was pulling America out of the Paris Climate Agreement, Beijing agreed with the European Union to accelerate what was called a historic shift away from fossil fuels.


pages: 257 words: 76,785

Shorter by Alex Soojung-Kim Pang

8-hour work day, airport security, Albert Einstein, Bertrand Russell: In Praise of Idleness, business process, call centre, carbon footprint, centre right, cloud computing, colonial rule, disruptive innovation, Erik Brynjolfsson, future of work, game design, gig economy, Henri Poincaré, IKEA effect, iterative process, job automation, job satisfaction, job-hopping, Johannes Kepler, Kickstarter, labor-force participation, longitudinal study, means of production, neurotypical, performance metric, race to the bottom, remote working, Second Machine Age, side project, Silicon Valley, Steve Jobs, telemarketer, The Wealth of Nations by Adam Smith, women in the workforce, young professional, zero-sum game

But they weren’t sure how they could make a four-day week work for everyone, or whether a four-day week would actually improve everyone’s working life, so they chose to remain on a five-day week. Having said that, the question “Would it be politically feasible and functionally possible to implement a shorter workweek for some but not all in your organization?” is one you have to answer for yourself, and the answer is going to be different with every organization. RUN A TRIAL Even after discussing it with employees, writing up contingency plans, and establishing performance metrics, very few companies make a permanent switch to a shorter workweek immediately. They first start with a trial period during which they give people time to adjust to the new schedule, observe, and solve unexpected problems. They then review the experiment at regular intervals to assess how things are going, absorb new lessons, and make course corrections. Three months or ninety days is the most popular length for a trial.


pages: 772 words: 203,182

What Went Wrong: How the 1% Hijacked the American Middle Class . . . And What Other Countries Got Right by George R. Tyler

8-hour work day, active measures, activist fund / activist shareholder / activist investor, affirmative action, Affordable Care Act / Obamacare, bank run, banking crisis, Basel III, Black Swan, blood diamonds, blue-collar work, Bolshevik threat, bonus culture, British Empire, business cycle, business process, buy and hold, capital controls, Carmen Reinhart, carried interest, cognitive dissonance, collateralized debt obligation, collective bargaining, commoditize, corporate governance, corporate personhood, corporate raider, corporate social responsibility, creative destruction, credit crunch, crony capitalism, crowdsourcing, currency manipulation / currency intervention, David Brooks, David Graeber, David Ricardo: comparative advantage, declining real wages, deindustrialization, Diane Coyle, disruptive innovation, Double Irish / Dutch Sandwich, eurozone crisis, financial deregulation, financial innovation, fixed income, Francis Fukuyama: the end of history, full employment, George Akerlof, George Gilder, Gini coefficient, Gordon Gekko, hiring and firing, income inequality, invisible hand, job satisfaction, John Markoff, joint-stock company, Joseph Schumpeter, Kenneth Rogoff, labor-force participation, laissez-faire capitalism, lake wobegon effect, light touch regulation, Long Term Capital Management, manufacturing employment, market clearing, market fundamentalism, Martin Wolf, minimum wage unemployment, mittelstand, moral hazard, Myron Scholes, Naomi Klein, Northern Rock, obamacare, offshore financial centre, Paul Samuelson, pension reform, performance metric, pirate software, plutocrats, Plutocrats, Ponzi scheme, precariat, price stability, profit maximization, profit motive, purchasing power parity, race to the bottom, Ralph Nader, rent-seeking, reshoring, Richard Thaler, rising living standards, road to serfdom, Robert Gordon, Robert Shiller, Ronald Reagan, Sand Hill Road, shareholder value, Silicon Valley, Social Responsibility of Business Is to Increase Its Profits, South Sea Bubble, sovereign wealth fund, Steve Ballmer, Steve Jobs, The Chicago School, The Spirit Level, The Wealth of Nations by Adam Smith, Thorstein Veblen, too big to fail, transcontinental railway, transfer pricing, trickle-down economics, tulip mania, Tyler Cowen: Great Stagnation, union organizing, Upton Sinclair, upwardly mobile, women in the workforce, working poor, zero-sum game

European and Asian executives, even those running multinational corporations, are paid a fraction of the salaries paid in the Anglo sphere.”41 CEO Lemons: The Collapse of Pay-for-Performance in America Foreign scholars describe American firms as providing “pathological overcompensation of fair-weather captains.”42 They are correct: the rise in US executive compensation of recent decades is unjustified by any performance metric, vastly outstripping indices like sales, profits, or returns to shareholders. The Clinton administration’s Secretary of Labor, Robert Reich, unearthed the smoking gun evidence: “By 2006, CEOs were earning, on average, eight times as much per dollar of corporate profits as they did in the 1980s.”43 A vast disparity like this in trend lines is powerful evidence that executive pay suffers from market failure.

They are more common in Europe and even in the United Kingdom since the Financial Services Authority in London imposed clawback rules in 2009, targeted at bad apples. That is perhaps why Lloyds Banking Group reclaimed bonuses paid to senior executives who engineered a consumer scam.107 And it seems likely that the LIBOR scandal will eventually involve clawbacks at firms such as Barclays. The general framework just outlined, with modest bonuses featuring delayed vesting and dependent on long-term performance metrics, has been endorsed by academics, notably the Squam Lake Group of economists including Kenneth French of Dartmouth and Robert Shiller of Yale.108 And its principles are reflected in Germany’s VorstAG law enacted in July 2009, explicitly intended to lengthen executive time horizons, with incentive pay vesting only after four years.109 Moreover, risky decision-making is discouraged by its legal provisions precluding management profiting from extraordinary developments such as takeovers or other realization of hidden assets.


Toast by Stross, Charles

anthropic principle, Buckminster Fuller, cosmological principle, dark matter, double helix, Ernest Rutherford, Extropian, Francis Fukuyama: the end of history, glass ceiling, gravity well, Khyber Pass, Mars Rover, Mikhail Gorbachev, NP-complete, oil shale / tar sands, peak oil, performance metric, phenotype, plutocrats, Plutocrats, Ronald Reagan, Silicon Valley, slashdot, speech recognition, strong AI, traveling salesman, Turing test, urban renewal, Vernor Vinge, Whole Earth Review, Y2K

It was a woman I’d met somewhere—some conference or other—lanky blonde hair, pallid skin, and far too evangelical about formal methods. “Feel free.” She pulled a chair out and sat down and the steward poured her a cup of coffee immediately. I noticed that even on a cruise ship she was dressed in a business suit, although it looked somewhat the worse for wear. “Coffee, please,” I called after the retreating steward. “We met in Darmstadt, ’97,” she said. “You’re Marcus Jackman? I critiqued your paper on performance metrics for IEEE maintenance transactions.” The penny dropped. “Karla . . . Carrol?” I asked. She smiled. “Yes, I remember your review.” I did indeed, and nearly burned my tongue on the coffee trying not to let slip precisely how I remembered it. I’m not fit to be rude until after at least the third cup of the morning. “Most interesting. What brings you here?” “The usual risk contingency planning.


pages: 360 words: 85,321

The Perfect Bet: How Science and Math Are Taking the Luck Out of Gambling by Adam Kucharski

Ada Lovelace, Albert Einstein, Antoine Gombaud: Chevalier de Méré, beat the dealer, Benoit Mandelbrot, butterfly effect, call centre, Chance favours the prepared mind, Claude Shannon: information theory, collateralized debt obligation, correlation does not imply causation, diversification, Edward Lorenz: Chaos theory, Edward Thorp, Everything should be made as simple as possible, Flash crash, Gerolamo Cardano, Henri Poincaré, Hibernia Atlantic: Project Express, if you build it, they will come, invention of the telegraph, Isaac Newton, Johannes Kepler, John Nash: game theory, John von Neumann, locking in a profit, Louis Pasteur, Nash equilibrium, Norbert Wiener, p-value, performance metric, Pierre-Simon Laplace, probability theory / Blaise Pascal / Pierre de Fermat, quantitative trading / quantitative finance, random walk, Richard Feynman, Ronald Reagan, Rubik’s Cube, statistical model, The Design of Experiments, Watson beat the top human players on Jeopardy!, zero-sum game

1398870164. 205hockey analyst Brian King suggested a way: Charron, Cam. “Analytics Mailbag: Save Percentages, PDO, and Repeatability.” TheLeafsNation.com. May 27, 2014. http://theleafsnation.com/2014/5/27/analytics-mailbag-save-percentages-pdo-and-repeatability. 205The statistic, later dubbed PDO: Details on PDO and NHL statistics given in: Weissbock, Joshua, Herna Viktor, and Diana Inkpen. “Use of Performance Metrics to Forecast Success in the National Hockey League” (paper presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Prague, September 23–27, 2013). 205England had the lowest PDO: Burn-Murdoch, John. “Were England the Unluckiest Team in the World Cup Group Stages?” FT Data Blog. June 29, 2014. http://blogs.ft.com/ftdata/2014/06/29/were-england-the-unluckiest-team-in-the-world-cup-group-stages/. 206Cambridge college spent on wine: “In Vino Veritas, Redux.”


pages: 294 words: 82,438

Simple Rules: How to Thrive in a Complex World by Donald Sull, Kathleen M. Eisenhardt

Affordable Care Act / Obamacare, Airbnb, asset allocation, Atul Gawande, barriers to entry, Basel III, Berlin Wall, carbon footprint, Checklist Manifesto, complexity theory, Craig Reynolds: boids flock, Credit Default Swap, Daniel Kahneman / Amos Tversky, diversification, drone strike, en.wikipedia.org, European colonialism, Exxon Valdez, facts on the ground, Fall of the Berlin Wall, haute cuisine, invention of the printing press, Isaac Newton, Kickstarter, late fees, Lean Startup, Louis Pasteur, Lyft, Moneyball by Michael Lewis explains big data, Nate Silver, Network effects, obamacare, Paul Graham, performance metric, price anchoring, RAND corporation, risk/return, Saturday Night Live, sharing economy, Silicon Valley, Startup school, statistical model, Steve Jobs, TaskRabbit, The Signal and the Noise by Nate Silver, transportation-network company, two-sided market, Wall-E, web application, Y Combinator, Zipcar

You can also limit your rules to two or three, as we have seen elsewhere in the book, to increase the odds that you will remember and follow them. After crafting your preliminary rules, it is helpful to measure how well they are working. Measuring impact allows you to pinpoint what is and isn’t working, and evidence of success also provides more motivation to stick with the rules. The best performance metrics are tightly linked to what will move the needles for you—pounds lost for a dieter, or dollars invested if you are trying to save for retirement. Apps have made collecting data and tracking progress easier than at any other time in history. Imagine what the legendary self-improver Benjamin Franklin could have accomplished if he’d had an iPhone. To measure the impact of your simple rules, it helps to collect some data before you start using your rules.


pages: 304 words: 82,395

Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schonberger, Kenneth Cukier

23andMe, Affordable Care Act / Obamacare, airport security, barriers to entry, Berlin Wall, big data - Walmart - Pop Tarts, Black Swan, book scanning, business intelligence, business process, call centre, cloud computing, computer age, correlation does not imply causation, dark matter, double entry bookkeeping, Eratosthenes, Erik Brynjolfsson, game design, IBM and the Holocaust, index card, informal economy, intangible asset, Internet of things, invention of the printing press, Jeff Bezos, Joi Ito, lifelogging, Louis Pasteur, Mark Zuckerberg, Menlo Park, Moneyball by Michael Lewis explains big data, Nate Silver, natural language processing, Netflix Prize, Network effects, obamacare, optical character recognition, PageRank, paypal mafia, performance metric, Peter Thiel, post-materialism, random walk, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, Silicon Valley startup, smart grid, smart meter, social graph, speech recognition, Steve Jobs, Steven Levy, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, Thomas Davenport, Turing test, Watson beat the top human players on Jeopardy!

Grigsby, Pamela Ann Nesbitt, and Lisa Anne Seacat. “Securing premises using surfaced-based computing technology,” U.S. Patent number: 8138882. Issue date: March 20, 2012. The quantified-self movement—“Counting Every Moment,” The Economist, March 3, 2012. Apple earbuds for bio-measurements—Jesse Lee Dorogusker, Anthony Fadell, Donald J. Novotney, and Nicholas R Kalayjian, “Integrated Sensors for Tracking Performance Metrics,” U.S. Patent Application 20090287067. Assignee: Apple. Application Date: 2009-07-23. Publication Date: 2009-11-19. Derawi Biometrics, “Your Walk Is Your PIN-Code,” press release, February 21, 2011 (http://biometrics.derawi.com/?p=175). iTrem information—See the iTrem project page of the Landmarc Research Center at Georgia Tech (http://eosl.gtri.gatech.edu/Capabilities/LandmarcResearchCenter/LandmarcProjects/iTrem/tabid/798/Default.aspx) and email exchange.


pages: 245 words: 83,272

Artificial Unintelligence: How Computers Misunderstand the World by Meredith Broussard

1960s counterculture, A Declaration of the Independence of Cyberspace, Ada Lovelace, AI winter, Airbnb, Amazon Web Services, autonomous vehicles, availability heuristic, barriers to entry, Bernie Sanders, bitcoin, Buckminster Fuller, Chris Urmson, Clayton Christensen, cloud computing, cognitive bias, complexity theory, computer vision, crowdsourcing, Danny Hillis, DARPA: Urban Challenge, digital map, disruptive innovation, Donald Trump, Douglas Engelbart, easy for humans, difficult for computers, Electric Kool-Aid Acid Test, Elon Musk, Firefox, gig economy, global supply chain, Google Glasses, Google X / Alphabet X, Hacker Ethic, Jaron Lanier, Jeff Bezos, John von Neumann, Joi Ito, Joseph-Marie Jacquard, life extension, Lyft, Mark Zuckerberg, mass incarceration, Minecraft, minimum viable product, Mother of all demos, move fast and break things, Nate Silver, natural language processing, PageRank, payday loans, paypal mafia, performance metric, Peter Thiel, price discrimination, Ray Kurzweil, ride hailing / ride sharing, Ross Ulbricht, Saturday Night Live, school choice, self-driving car, Silicon Valley, speech recognition, statistical model, Steve Jobs, Steven Levy, Stewart Brand, Tesla Model S, the High Line, The Signal and the Noise by Nate Silver, theory of mind, Travis Kalanick, Turing test, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, Whole Earth Catalog, women in the workforce

However, machine learning is so new, and there is so little consensus, that it’s not surprising that the linguistic definitions haven’t caught up to reality. Tom M. Mitchell, the E. Fredkin University Professor in the Machine Learning Department of Carnegie Mellon University’s School of Computer Science, offers a good definition of machine learning in “The Discipline of Machine Learning.” He writes: “We say that a machine learns with respect to a particular task T, performance metric P, and type of experience E, if the system reliably improves its performance P at task T, following experience E. Depending on how we specify T, P, and E, the learning task might also be called by names such as data mining, autonomous discovery, database updating, programming by example, etc.”9 I think this is a good definition because Mitchell uses very precise language to define learning.
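
Mitchell's T/P/E definition can be made concrete in a few lines. In this sketch (entirely ours, not Mitchell's), the task T is labeling a number, the performance metric P is accuracy on a held-out set, and the experience E is a growing stream of labeled examples; the learner's P at T improves as E grows:

    def train_threshold(examples):
        # Learn a cutoff halfway between the two class means.
        pos = [x for x, y in examples if y == 1]
        neg = [x for x, y in examples if y == 0]
        return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

    def accuracy(threshold, examples):           # the performance metric P
        return sum((x >= threshold) == bool(y) for x, y in examples) / len(examples)

    experience = [(3.0, 0), (4.8, 1), (0.5, 0), (4.2, 1),   # the experience E
                  (1.5, 0), (5.0, 1), (1.0, 0), (4.0, 1)]
    held_out = [(2.0, 0), (2.8, 0), (3.5, 1), (4.5, 1), (5.5, 1)]

    for n in (2, 4, 8):                          # more experience -> better P
        t = train_threshold(experience[:n])
        print(n, accuracy(t, held_out))          # 0.8, then 1.0, 1.0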


The Buddha and the Badass: The Secret Spiritual Art of Succeeding at Work by Vishen Lakhiani

Buckminster Fuller, Burning Man, call centre, Colonization of Mars, crowdsourcing, deliberate practice, Elon Musk, fundamental attribution error, future of work, Google Glasses, Google X / Alphabet X, iterative process, Jeff Bezos, meta analysis, meta-analysis, microbiome, performance metric, Peter Thiel, profit motive, Ralph Waldo Emerson, Silicon Valley, Silicon Valley startup, skunkworks, Skype, Steve Jobs, Steven Levy, web application, white picket fence

If you think these ideas are fluff, this data will change your mind. Gallup’s Q12 Employee Engagement Survey completely shatters the idea that friendships at work are unproductive. The study concludes that one of the key determinants to engagement at work is having a workplace bestie. Coworkers who report a best friend at work are seven times more engaged at work than their disconnected counterparts. They score higher on all performance metrics. They are better with customers, bring more innovation to projects and have superior mental acuity and reduced rates of error and injury. This is because having social bonds at work makes people feel good. It makes them happy. In 2014, I interviewed Shawn Achor, Harvard researcher and bestselling author of The Happiness Advantage, Before Happiness and Big Potential. Get a load of this data: When the brain is in a positive state,
• Productivity rises by 31 percent
• Sales success increases by 37 percent
• Intelligence, creativity, and memory all improve dramatically
• Doctors primed to be happy are 19 percent better at making the right diagnosis
Social bonds raise positivity, and that’s important.


Seeking SRE: Conversations About Running Production Systems at Scale by David N. Blank-Edelman

Affordable Care Act / Obamacare, algorithmic trading, Amazon Web Services, bounce rate, business continuity plan, business process, cloud computing, cognitive bias, cognitive dissonance, commoditize, continuous integration, crowdsourcing, dark matter, database schema, Debian, defense in depth, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, fear of failure, friendly fire, game design, Grace Hopper, information retrieval, Infrastructure as a Service, Internet of things, invisible hand, iterative process, Kubernetes, loose coupling, Lyft, Marc Andreessen, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, pull request, RAND corporation, remote working, Richard Feynman, risk tolerance, Ruby on Rails, search engine result page, self-driving car, sentiment analysis, Silicon Valley, single page application, Snapchat, software as a service, software is eating the world, source of truth, the scientific method, Toyota Production System, web application, WebSocket, zero day

Keep in mind that all three criteria must be verified simultaneously, because receiving messages and sending them further might be done asynchronously by different processes, and a healthy value for one metric does not imply the same status for the other. Back to our scaling. We previously said that we need to double the throughput. Now, this statement needs to be revised because we changed the testing procedure by appending new requirements, and the results might not be the same as before. The forecasted input will be received by the first “Data receiver” component. Knowing particular values of expected traffic and component performance metrics, we can finally estimate required capacity adjustments. We can calculate the potential maximum capacity we have now, the required capacity that can handle the maximum traffic expected, and then find the delta between the two. But this will be true only for the “Data receiver,” because the forecasted input defines the size for this component only. Do not forget that, for instance, doubling the throughput for a component does not necessarily mean that we need to double its fleet.
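
The delta calculation is simple arithmetic once you have a per-node throughput figure. A minimal sketch with invented numbers (the headroom policy is our assumption, since the text does not specify one):

    import math

    per_node_rps  = 2_500      # measured max throughput of one node (msgs/sec)
    current_nodes = 8
    forecast_peak = 48_000     # forecasted peak input (msgs/sec)
    headroom      = 0.30       # assumed spare capacity for spikes and node loss

    current_capacity  = per_node_rps * current_nodes                 # 20,000
    required_capacity = forecast_peak * (1 + headroom)               # 62,400
    required_nodes    = math.ceil(required_capacity / per_node_rps)  # 25
    delta             = required_nodes - current_nodes               # 17 to add

    print(f"add {delta} nodes to go from {current_capacity:,} "
          f"to {required_capacity:,.0f} msgs/sec of capacity")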

That is, how do per-transaction latency numbers at P50 compare to P99 or even P99.9? Although throughput is important, in practice it really matters for only the largest companies. This is because for smaller companies, developer time is almost always worth more than infrastructure costs. It’s only at very large scale that overall throughput really begins to matter when considering total cost of ownership (TCO). Instead, tail latency ends up being the most important performance metric for both small and large companies. This is because the causes of high tail latency are hard to understand and lead to a large amount of developer and operator cognitive load. Engineer time is typically the most precious resource for an organization, and debugging tail latency issues is one of the most time-intensive things that engineers do. As a result, the sidecar proxy ends up being a double-edged sword.
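
For the latency comparison, a nearest-rank percentile over raw samples is enough to see the effect: a synthetic 1 percent slow tail barely moves P50 while P99 and P99.9 blow up. A sketch with synthetic data and a simplified percentile method:

    import random

    random.seed(42)
    # Mostly fast requests plus a slow 1% tail (retries, GC pauses, etc.).
    latencies_ms = ([random.gauss(20, 3) for _ in range(9_900)] +
                    [random.gauss(400, 80) for _ in range(100)])

    def percentile(samples, p):
        # Nearest-rank method; production code would use a proper quantile sketch.
        ordered = sorted(samples)
        return ordered[min(int(p / 100 * len(ordered)), len(ordered) - 1)]

    for p in (50, 99, 99.9):
        print(f"P{p}: {percentile(latencies_ms, p):.1f} ms")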


pages: 292 words: 81,699

More Joel on Software by Joel Spolsky

a long time ago in a galaxy far, far away, barriers to entry, Black Swan, Build a better mousetrap, business process, call centre, Danny Hillis, David Heinemeier Hansson, failed state, Firefox, fixed income, George Gilder, Larry Wall, low cost airline, low cost carrier, Mars Rover, Network effects, Paul Graham, performance metric, place-making, price discrimination, prisoner's dilemma, Ray Oldenburg, Ruby on Rails, Sand Hill Road, Silicon Valley, slashdot, social software, Steve Ballmer, Steve Jobs, Superbowl ad, The Great Good Place, type inference, unpaid internship, wage slave, web application, Y Combinator

Or the tester agrees to report the bug “informally” to the developer before writing it up in the bug tracking system. And now nobody uses the bug tracking system. The bug count goes way down, but the number of bugs stays the same. Developers are clever this way. Whatever you try to measure, they’ll find a way to maximize, and you’ll never quite get what you want. Robert D. Austin, in his book Measuring and Managing Performance in Organizations, says there are two phases when you introduce new performance metrics. At first, you actually get what you want, because nobody has figured out how to cheat. In the second phase, you actually get something worse, as everyone figures out the trick to maximizing the thing that you’re measuring, even at the cost of ruining the company. Worse, Econ 101 managers think that they can somehow avoid this situation just by tweaking the metrics. Dr. Austin’s conclusion is that you just can’t.


pages: 309 words: 91,581

The Great Divergence: America's Growing Inequality Crisis and What We Can Do About It by Timothy Noah

assortative mating, autonomous vehicles, blue-collar work, Bonfire of the Vanities, Branko Milanovic, business cycle, call centre, collective bargaining, computer age, corporate governance, Credit Default Swap, David Ricardo: comparative advantage, Deng Xiaoping, easy for humans, difficult for computers, Erik Brynjolfsson, Everybody Ought to Be Rich, feminist movement, Frank Levy and Richard Murnane: The New Division of Labor, Gini coefficient, Gunnar Myrdal, income inequality, industrial robot, invisible hand, job automation, Joseph Schumpeter, longitudinal study, low skilled workers, lump of labour, manufacturing employment, moral hazard, oil shock, pattern recognition, Paul Samuelson, performance metric, positional goods, post-industrial society, postindustrial economy, purchasing power parity, refrigerator car, rent control, Richard Feynman, Ronald Reagan, shareholder value, Silicon Valley, Simon Kuznets, Stephen Hawking, Steve Jobs, The Spirit Level, too big to fail, trickle-down economics, Tyler Cowen: Great Stagnation, union organizing, upwardly mobile, very high income, Vilfredo Pareto, War on Poverty, We are the 99%, women in the workforce, Works Progress Administration, Yom Kippur War

But the bill exempted performance-based bonuses and stock options, on the theory that these tied chief executives’ compensation to company profitability. Corporate compensation committees responded in three ways. First, “everybody got a raise to $1 million,” Nell Minow, a corporate governance critic, told me.16 Next, corporate compensation committees, which remained bent on showering chief executives indiscriminately with cash, started inventing make-believe performance metrics. For instance, AES Corp., a firm based in Arlington, Virginia, that operates power plants, made it one of chief executive Dennis Bakke’s performance goals to ensure that AES remained a “fun” place to work. (“To some, it’s soft,” the fun-loving Bakke told Businessweek. “To me, it’s a vision of the world.”) Third, and most important, corporations showered top executives with so many stock options that this form of compensation came to account, on average, for the majority of CEO pay.


pages: 343 words: 91,080

Uberland: How Algorithms Are Rewriting the Rules of Work by Alex Rosenblat

"side hustle", Affordable Care Act / Obamacare, Airbnb, Amazon Mechanical Turk, autonomous vehicles, barriers to entry, basic income, big-box store, call centre, cashless society, Cass Sunstein, choice architecture, collaborative economy, collective bargaining, creative destruction, crowdsourcing, disruptive innovation, don't be evil, Donald Trump, en.wikipedia.org, future of work, gender pay gap, gig economy, Google Chrome, income inequality, information asymmetry, Jaron Lanier, job automation, job satisfaction, Lyft, marginal employment, Mark Zuckerberg, move fast and break things, Network effects, new economy, obamacare, performance metric, Peter Thiel, price discrimination, Ralph Waldo Emerson, regulatory arbitrage, ride hailing / ride sharing, self-driving car, sharing economy, Silicon Valley, Silicon Valley ideology, Skype, social software, stealth mode startup, Steve Jobs, strikebreaker, TaskRabbit, Tim Cook: Apple, transportation-network company, Travis Kalanick, Uber and Lyft, Uber for X, uber lyft, union organizing, universal basic income, urban planning, Wolfgang Streeck, Zipcar

While the rating system is described as a simple way to compare Driver X to Driver Y across Uberland and to scale trust between drivers and passengers, in practice its implementation has troubling implications.15 In our case study of Uber’s drivers, Luke Stark and I found that passengers effectively perform one of the roles of middle managers, because they are responsible for evaluating worker performance.16 When workers are monitored through an opaque system like Uber’s, it’s much harder to see the extent to which control and power dynamics are at play.17 In addition to sending in-the-moment nudges to drivers, Uber also exerts longer-term performance management through weekly performance metrics. The company tracks a combination of personalized stats, including ratings, ride acceptance rates, cancellation rates, hours online, number of trips, and comparisons to other drivers (such as the driver’s personal rating compared to the ratings of top drivers). Historically, drivers risked being deactivated if their ratings fell below a certain threshold, such as 4.6/5 stars; if their ride-acceptance rate fell below 80–90 percent; or if their cancellation rate climbed above 5 percent.
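
Expressed as a rule check, the historical thresholds quoted above look roughly like this sketch (the thresholds are the ones reported in the text; the driver stats and the function itself are invented):

    def deactivation_risk(rating, acceptance_rate, cancellation_rate,
                          min_rating=4.6, min_acceptance=0.80, max_cancel=0.05):
        reasons = []
        if rating < min_rating:
            reasons.append(f"rating {rating} below {min_rating}")
        if acceptance_rate < min_acceptance:
            reasons.append(f"acceptance {acceptance_rate:.0%} below {min_acceptance:.0%}")
        if cancellation_rate > max_cancel:
            reasons.append(f"cancellation {cancellation_rate:.0%} above {max_cancel:.0%}")
        return reasons

    print(deactivation_risk(rating=4.55, acceptance_rate=0.92,
                            cancellation_rate=0.03))
    # ['rating 4.55 below 4.6']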


Driverless: Intelligent Cars and the Road Ahead by Hod Lipson, Melba Kurman

AI winter, Air France Flight 447, Amazon Mechanical Turk, autonomous vehicles, barriers to entry, butterfly effect, carbon footprint, Chris Urmson, cloud computing, computer vision, connected car, creative destruction, crowdsourcing, DARPA: Urban Challenge, digital map, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Google Earth, Google X / Alphabet X, high net worth, hive mind, ImageNet competition, income inequality, industrial robot, intermodal, Internet of things, job automation, Joseph Schumpeter, lone genius, Lyft, megacity, Network effects, New Urbanism, Oculus Rift, pattern recognition, performance metric, precision agriculture, RFID, ride hailing / ride sharing, Second Machine Age, self-driving car, Silicon Valley, smart cities, speech recognition, statistical model, Steve Jobs, technoutopianism, Tesla Model S, Travis Kalanick, Uber and Lyft, uber lyft, Unsafe at Any Speed

The AVA should lead states to define and clarify another core challenge related to safety: liability. Who is at fault in a driverless-car accident needs to be clarified. While accidents involving driverless cars are likely to be relatively rare and the question may wind up being moot, the issue still requires examination. In the United States, insurance law is defined and enforced at the state level. If the federal government can clarify standard performance metrics for each major system in a driverless car (i.e., the software, hardware sensors, and the automotive body), insurance companies will have a clear framework for quantifying risk, and manufacturers will be protected from frivolous lawsuits. Another issue where regulation will be needed, at the federal as well as the state and city level, is privacy. Driverless cars will be a goldmine of data on passenger comings and goings and a treasure trove of visual data of roads and roadsides.


pages: 338 words: 92,465

Reskilling America: Learning to Labor in the Twenty-First Century by Katherine S. Newman, Hella Winston

active measures, blue-collar work, business cycle, collective bargaining, Computer Numeric Control, deindustrialization, desegregation, factory automation, interchangeable parts, invisible hand, job-hopping, knowledge economy, longitudinal study, low skilled workers, performance metric, reshoring, Ronald Reagan, Silicon Valley, social intelligence, two tier labour market, union organizing, upwardly mobile, War on Poverty, Wolfgang Streeck, working poor

“The crossover between the two sides has been excellent.”4 Even though some students need to take the MCAS multiple times before they pass—vocational schools are particularly committed to offering help and remediation for students who fail—only three seniors did not receive diplomas in 2002. Moreover, Massachusetts vocational schools do far better than comprehensive high schools on crucial performance metrics.5 The statewide dropout rate at regular/comprehensive high schools averaged 2.8 percent in 2011 but was only 1.6 percent among the thirty-nine vocational technical schools and averaged 0.9 percent among regional vocational technical schools. (Massachusetts requires every school district to offer students a career vocational technical education option, either by providing it themselves—common among the larger districts—or as part of a regional career vocational technical high school system.)


Concentrated Investing by Allen C. Benello

activist fund / activist shareholder / activist investor, asset allocation, barriers to entry, beat the dealer, Benoit Mandelbrot, Bob Noyce, business cycle, buy and hold, carried interest, Claude Shannon: information theory, corporate governance, corporate raider, delta neutral, discounted cash flows, diversification, diversified portfolio, Edward Thorp, family office, fixed income, high net worth, index fund, John von Neumann, Louis Bachelier, margin call, merger arbitrage, Paul Samuelson, performance metric, random walk, risk tolerance, risk-adjusted returns, risk/return, Robert Shiller, Robert Shiller, shareholder value, Sharpe ratio, short selling, survivorship bias, technology bubble, transaction costs, zero-sum game

This no doubt contributed to tensions and to the eventual split between Greenberg and the other partners a year later, in 2009. After his campaign went public, Comcast quietly began to make a lot of changes. A new chief financial officer, Michael Angelakis, joined the company and brought discipline to capital spending, operating budgets, and acquisitions. Compensation, which had been based entirely on EBITDA size, was broadened to include more appropriate performance metrics. Share retirement and dividend increases became regular and predictable, while remaining somewhat anemic. Free cash flow became, for the first time, the company’s mantra. Although Comcast generates a lot of free cash flow, they don’t return a significant amount of it. They just move up the dividend and buy back gently over time. They’re massively under-levered relative to where the rest of the cable industry is, and where they were historically.


pages: 358 words: 93,969

Climate Change by Joseph Romm

carbon footprint, Climatic Research Unit, decarbonisation, demand response, Douglas Hofstadter, Elon Musk, energy security, energy transition, failed state, hydraulic fracturing, hydrogen economy, Intergovernmental Panel on Climate Change (IPCC), knowledge worker, mass immigration, performance metric, renewable energy transition, ride hailing / ride sharing, Ronald Reagan, Silicon Valley, Silicon Valley startup, the scientific method

There were “statistically significant and meaningful reductions in decision-making performance” in the test subjects, based on a standard assessment of cognitive function: At 1,000 ppm CO2, compared with 600 ppm, performance was significantly diminished on six of nine metrics of decision-making performance. At 2,500 ppm CO2, compared with 600 ppm, performance was significantly reduced on seven of nine metrics, with percentile ranks for some performance metrics decreasing to levels associated with marginal or dysfunctional performance. Dr. Wargocki told me the original LBNL study had an “exemplary design” with “systematic results” showing that “high cognitive skills were most affected” by high CO2 levels. His team decided to repeat the experiment but was unable to use tests that measured high cognitive skills. Instead they mainly looked at more basic task performance, including typing, addition, and proofreading.


pages: 347 words: 91,318

Netflixed: The Epic Battle for America's Eyeballs by Gina Keating

activist fund / activist shareholder / activist investor, barriers to entry, business intelligence, collaborative consumption, corporate raider, inventory management, Jeff Bezos, late fees, Mark Zuckerberg, McMansion, Menlo Park, Netflix Prize, new economy, out of africa, performance metric, Ponzi scheme, pre–internet, price stability, recommendation engine, Saturday Night Live, shareholder value, Silicon Valley, Silicon Valley startup, Steve Jobs, subscription business, Superbowl ad, telemarketer, X Prize

Mario Cibelli of Marathon Partners spent one of the more interesting days of his career as a hedge fund manager at a Netflix distribution center in Long Island, and came away with a different opinion. “There’s not a snowball’s chance in hell that Blockbuster can do this,” he told his colleagues when he returned to work later that day. The warehouse manager, a former aerospace engineer, had shown Cibelli a series of charts posted on the wall tracking about two dozen optimum performance metrics. “As long as my performance is within this band, I won’t hear from senior management,” the man said, indicating the charts. “As soon as I move out of this band, I will get a call.” Hastings and his team had spent the time and thought to build a quality business, and management clearly was running Netflix for the long term, Cibelli thought. He laid out his reasons for taking a long position in Netflix shares in an internal memo: On the surface, Netflix is a massive video store, taking in cash for monthly rental rights and loaning out DVDs to the consumer.


Learn Algorithmic Trading by Sebastien Donadio

active measures, algorithmic trading, automated trading system, backtesting, Bayesian statistics, buy and hold, buy low sell high, cryptocurrency, DevOps, en.wikipedia.org, fixed income, Flash crash, Guido van Rossum, latency arbitrage, locking in a profit, market fundamentalism, market microstructure, martingale, natural language processing, p-value, paper trading, performance metric, prediction markets, quantitative trading / quantitative finance, random walk, risk tolerance, risk-adjusted returns, Sharpe ratio, short selling, sorting algorithm, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, type inference, WebSocket, zero-sum game

This is known as testing your model, and the datasets used for it are known as test data. The task of using a model whose parameters were learned by statistical inference to make predictions on previously unseen data is known as statistical prediction or forecasting. We need to be able to understand the metrics that differentiate a good model from a bad one. There are several well-known and well-understood performance metrics for different models. For regression prediction problems, we should try to minimize the differences between the predicted value and the actual value of the target variable. These differences are known as residual errors; larger errors mean worse models, and in regression we try to minimize the sum of these residual errors, or the sum of the squares of these residual errors (squaring has the effect of penalizing large outliers more strongly, but more on that later).
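A minimal Python sketch of these residual-error metrics, using made-up numbers rather than code from the book:

import numpy as np

# Residual errors are the differences between predicted and actual values.
actual = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.1, 1.9, 3.2, 3.7])

residuals = predicted - actual
sum_abs_errors = np.sum(np.abs(residuals))  # sum of absolute residual errors
sum_sq_errors = np.sum(residuals ** 2)      # sum of squared residual errors;
                                            # squaring penalizes outliers more
print(sum_abs_errors, sum_sq_errors)        # smaller is better for both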


pages: 317 words: 89,825

No Rules Rules: Netflix and the Culture of Reinvention by Reed Hastings, Erin Meyer

Airbnb, Downton Abbey, Elon Musk, en.wikipedia.org, global village, hiring and firing, job-hopping, late fees, loose coupling, loss aversion, out of africa, performance metric, Saturday Night Live, Silicon Valley, Skype, Stephen Hawking, Steve Ballmer, Steve Jobs, subscription business

But at Netflix, where we have to be able to adapt direction quickly in response to rapid changes, the last thing we want is our employees rewarded in December for attaining some goal fixed the previous January. The risk is that employees will focus on a target instead of spotting what’s best for the company in the present moment. Many of our Hollywood-based employees come from studios like WarnerMedia or NBC, where a big part of executive compensation is based on specific financial performance metrics. If this year’s target is to increase operating profit by 5 percent, the way to get your bonus—often a quarter of annual pay—is to focus doggedly on increasing operating profit. But what if, in order to be competitive five years down the line, a division needs to change course? Changing course involves investment and risk that may reduce this year’s profit margin. The stock price might go down with it.


pages: 1,088 words: 228,743

Expected Returns: An Investor's Guide to Harvesting Market Rewards by Antti Ilmanen

Andrei Shleifer, asset allocation, asset-backed security, availability heuristic, backtesting, balance sheet recession, bank run, banking crisis, barriers to entry, Bernie Madoff, Black Swan, Bretton Woods, business cycle, buy and hold, buy low sell high, capital asset pricing model, capital controls, Carmen Reinhart, central bank independence, collateralized debt obligation, commoditize, commodity trading advisor, corporate governance, credit crunch, Credit Default Swap, credit default swaps / collateralized debt obligations, debt deflation, deglobalization, delta neutral, demand response, discounted cash flows, disintermediation, diversification, diversified portfolio, dividend-yielding stocks, equity premium, Eugene Fama: efficient market hypothesis, fiat currency, financial deregulation, financial innovation, financial intermediation, fixed income, Flash crash, framing effect, frictionless, frictionless market, G4S, George Akerlof, global reserve currency, Google Earth, high net worth, hindsight bias, Hyman Minsky, implied volatility, income inequality, incomplete markets, index fund, inflation targeting, information asymmetry, interest rate swap, invisible hand, Kenneth Rogoff, laissez-faire capitalism, law of one price, London Interbank Offered Rate, Long Term Capital Management, loss aversion, margin call, market bubble, market clearing, market friction, market fundamentalism, market microstructure, mental accounting, merger arbitrage, mittelstand, moral hazard, Myron Scholes, negative equity, New Journalism, oil shock, p-value, passive investing, Paul Samuelson, performance metric, Ponzi scheme, prediction markets, price anchoring, price stability, principal–agent problem, private sector deleveraging, purchasing power parity, quantitative easing, quantitative trading / quantitative finance, random walk, reserve currency, Richard Thaler, risk tolerance, risk-adjusted returns, risk/return, riskless arbitrage, Robert Shiller, Robert Shiller, savings glut, selection bias, Sharpe ratio, short selling, sovereign wealth fund, statistical arbitrage, statistical model, stochastic volatility, stocks for the long run, survivorship bias, systematic trading, The Great Moderation, The Myth of the Rational Market, too big to fail, transaction costs, tulip mania, value at risk, volatility arbitrage, volatility smile, working-age population, Y2K, yield curve, zero-coupon bond, zero-sum game

Most studies conclude that irrational mispricing contributes importantly to observed option market regularities. The rational camp responds that risk stories can explain a surprisingly large part of observed returns without resorting to irrationality—and that various market frictions can make exploiting any remaining opportunities difficult. Specifically, Broadie–Chernov–Johannes (2009) argue that options are often thought to be mispriced because the performance metrics that are used (Sharpe ratios and CAPM alphas) are ill-suited for option analysis, especially over short samples. After documenting the huge challenge for rational models—massively negative average returns for long index puts, losses of 30% per month, or worse, as noted earlier—they proceed to show that standard option-pricing models can largely explain these average returns. OTM puts are especially highly levered positions on the underlying index; during a period of high realized equity premium, OTM puts with large negative betas can be expected to have large negative returns.
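To see why short samples flatter metrics such as the Sharpe ratio, consider a minimal Python sketch with entirely hypothetical numbers (not from Broadie–Chernov–Johannes): a strategy earns steady small monthly gains until a single crash month of roughly the magnitude cited above, and its measured Sharpe ratio collapses.

import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """Monthly Sharpe ratio: mean excess return over its standard deviation."""
    excess = np.asarray(returns) - risk_free
    return float(np.mean(excess) / np.std(excess, ddof=1))

rng = np.random.default_rng(0)
calm = rng.normal(0.02, 0.01, 36)   # 36 crash-free months of small gains
print(sharpe_ratio(calm))           # looks superb over this short sample
crashed = np.append(calm, -0.30)    # add one hypothetical -30% crash month
print(sharpe_ratio(crashed))        # the same strategy now looks dismal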

Operational risks (errors and fraud) are a good example; the SR of Madoff’s track record was hard to beat, but it came with huge operational risk.

Conclusions

The portfolio SR is a good starting point, but it needs to be supplemented with other portfolio attributes. All of the desirable attributes discussed above may be worth some SR sacrifice. However, no single risk-adjusted return measure can capture them all, and many of these tradeoffs can only be assessed in a qualitative fashion. Multiple performance metrics are needed, given the multi-dimensional nature of the problem.

28.2.4 Smart risk taking and portfolio construction

There now follow some intuitive rules of thumb for smart investing: a recipe for optimal diversification and the “fundamental law of active management”. First, here is a recipe for smart portfolio construction, which sums up mean-variance optimization in a nutshell: allocate equal volatility to each asset class (or return source) in a portfolio, unless some assets’ exceptional SRs or diversification abilities justify deviating from equal volatility weightings.
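The equal-volatility recipe amounts to inverse-volatility weighting. A minimal Python sketch, with assumed volatilities and ignoring the correlation and exceptional-SR caveats the text mentions:

import numpy as np

# Weight each asset class in inverse proportion to its volatility, so each
# contributes the same standalone volatility to the portfolio.
vols = np.array([0.15, 0.05, 0.20])          # assumed annualized volatilities
weights = (1.0 / vols) / np.sum(1.0 / vols)  # normalize weights to sum to 1
print(weights.round(3))                      # -> [0.211 0.632 0.158]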


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann (O’Reilly, 2017)

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, Ethereum, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, general-purpose programming language, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, Kubernetes, loose coupling, Marc Andreessen, microservices, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, undersea cable, web application, WebSocket, wikimedia commons

• Allow quick and easy recovery from human errors, to minimize the impact in the case of a failure. For example, make it fast to roll back configuration changes, roll out new code gradually (so that any unexpected bugs affect only a small subset of users), and provide tools to recompute data (in case it turns out that the old computation was incorrect).
• Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry. (Once a rocket has left the ground, telemetry is essential for tracking what is happening, and for understanding failures [14].) Monitoring can show us early warning signals and allow us to check whether any assumptions or constraints are being violated. When a problem occurs, metrics can be invaluable in diagnosing the issue.
• Implement good management practices and training—a complex and important aspect, and beyond the scope of this book.
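As a toy illustration of the telemetry described in the monitoring point above (a hypothetical sketch, not code from the book), tracking an error rate and a latency percentile:

import statistics

durations_ms, errors = [], 0

def record(duration_ms, ok):
    """Record one request's duration and whether it succeeded."""
    global errors
    durations_ms.append(duration_ms)
    if not ok:
        errors += 1

for d, ok in [(12.0, True), (250.0, True), (8.0, False), (15.0, True)]:
    record(d, ok)

error_rate = errors / len(durations_ms)
p95 = statistics.quantiles(durations_ms, n=20)[-1]  # rough 95th percentile
print(f"error rate {error_rate:.0%}, p95 latency {p95:.1f} ms")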

read committed isolation, 236 memtable (in LSM-trees), 78 Mercurial (version control system), 463 merge joins, MapReduce map-side, 410 mergeable persistent data structures, 174 merging sorted files, 76, 402, 405 Merkle trees, 532 Mesos (cluster manager), 418, 506 message brokers (see messaging systems) message-passing, 136-139 advantages over direct RPC, 137 distributed actor frameworks, 138 evolvability, 138 MessagePack (encoding format), 116 messages exactly-once semantics, 360, 476 loss of, 442 using total order broadcast, 348 messaging systems, 440-451 (see also streams) backpressure, buffering, or dropping mes‐ sages, 441 brokerless messaging, 442 event logs, 446-451 comparison to traditional messaging, 448, 451 consumer offsets, 449 replaying old messages, 451, 496, 498 slow consumers, 450 message brokers, 443-446 acknowledgements and redelivery, 445 comparison to event logs, 448, 451 multiple consumers of same topic, 444 reliability, 442 uniqueness in log-based messaging, 522 Meteor (web framework), 456 microbatching, 477, 495 microservices, 132 (see also services) causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 Microsoft Azure Service Bus (messaging), 444 Azure Storage, 155, 398 Azure Stream Analytics, 466 DCOM (Distributed Component Object Model), 134 MSDTC (transaction coordinator), 356 Orleans (see Orleans) SQL Server (see SQL Server) migrating (rewriting) data, 40, 130, 461, 497 modulus operator (%), 210 MongoDB (database) aggregation pipeline, 48 atomic operations, 243 BSON, 41 document data model, 31 hash partitioning (sharding), 203-204 key-range partitioning, 202 lack of join support, 34, 42 leader-based replication, 153 MapReduce support, 46, 400 oplog parsing, 455, 456 partition splitting, 212 request routing, 216 secondary indexes, 207 Mongoriver (change data capture), 455 monitoring, 10, 19 monotonic clocks, 288 monotonic reads, 164 MPP (see massively parallel processing) MSMQ (messaging), 361 multi-column indexes, 87 multi-leader replication, 168-177 (see also replication) handling write conflicts, 171 conflict avoidance, 172 converging toward a consistent state, 172 custom conflict resolution logic, 173 determining what is a conflict, 174 linearizability, lack of, 333 replication topologies, 175-177 use cases, 168 clients with offline operation, 170 collaborative editing, 170 multi-datacenter replication, 168, 335 multi-object transactions, 228 need for, 231 Multi-Paxos (total order broadcast), 367 multi-table index cluster tables (Oracle), 41 multi-tenancy, 284 multi-version concurrency control (MVCC), 239, 266 detecting stale MVCC reads, 263 indexes and snapshot isolation, 241 mutual exclusion, 261 (see also locks) MySQL (database) binlog coordinates, 156 binlog parsing for change data capture, 455 circular replication topology, 175 consistent snapshots, 156 distributed transaction support, 361 InnoDB storage engine (see InnoDB) JSON support, 30, 42 leader-based replication, 153 performance of XA transactions, 360 row-based replication, 160 schema changes in, 40 snapshot isolation support, 242 (see also InnoDB) statement-based replication, 159 Tungsten Replicator (multi-leader replica‐ tion), 170 conflict detection, 177 N nanomsg (messaging library), 442 Narayana (transaction coordinator), 356 NATS (messaging), 137 near-real-time (nearline) processing, 390 (see also stream processing) Neo4j (database) Cypher query language, 52 graph data model, 50 Nephele (dataflow engine), 421 netcat (Unix tool), 397 
Netflix Chaos Monkey, 7, 280 Network Attached Storage (NAS), 146, 398 network model, 36 Index | 577 graph databases versus, 60 imperative query APIs, 46 Network Time Protocol (see NTP) networks congestion and queueing, 282 datacenter network topologies, 276 faults (see faults) linearizability and network delays, 338 network partitions, 279, 337 timeouts and unbounded delays, 281 next-key locking, 260 nodes (in graphs) (see vertices) nodes (processes), 556 handling outages in leader-based replica‐ tion, 156 system models for failure, 307 noisy neighbors, 284 nonblocking atomic commit, 359 nondeterministic operations accidental nondeterminism, 423 partial failures in distributed systems, 275 nonfunctional requirements, 22 nonrepeatable reads, 238 (see also read skew) normalization (data representation), 33, 556 executing joins, 39, 42, 403 foreign key references, 231 in systems of record, 386 versus denormalization, 462 NoSQL, 29, 499 transactions and, 223 Notation3 (N3), 56 npm (package manager), 428 NTP (Network Time Protocol), 287 accuracy, 289, 293 adjustments to monotonic clocks, 289 multiple server addresses, 306 numbers, in XML and JSON encodings, 114 O object-relational mapping (ORM) frameworks, 30 error handling and aborted transactions, 232 unsafe read-modify-write cycle code, 244 object-relational mismatch, 29 observer pattern, 506 offline systems, 390 (see also batch processing) 578 | Index stateful, offline-capable clients, 170, 511 offline-first applications, 511 offsets consumer offsets in partitioned logs, 449 messages in partitioned logs, 447 OLAP (online analytic processing), 91, 556 data cubes, 102 OLTP (online transaction processing), 90, 556 analytics queries versus, 411 workload characteristics, 253 one-to-many relationships, 30 JSON representation, 32 online systems, 389 (see also services) Oozie (workflow scheduler), 402 OpenAPI (service definition format), 133 OpenStack Nova (cloud infrastructure) use of ZooKeeper, 370 Swift (object storage), 398 operability, 19 operating systems versus databases, 499 operation identifiers, 518, 522 operational transformation, 174 operators, 421 flow of data between, 424 in stream processing, 464 optimistic concurrency control, 261 Oracle (database) distributed transaction support, 361 GoldenGate (change data capture), 161, 170, 455 lack of serializability, 226 leader-based replication, 153 multi-table index cluster tables, 41 not preventing write skew, 248 partitioned indexes, 209 PL/SQL language, 255 preventing lost updates, 245 read committed isolation, 236 Real Application Clusters (RAC), 330 recursive query support, 54 snapshot isolation support, 239, 242 TimesTen (in-memory database), 89 WAL-based replication, 160 XML support, 30 ordering, 339-352 by sequence numbers, 343-348 causal ordering, 339-343 partial order, 341 limits of total ordering, 493 total order broadcast, 348-352 Orleans (actor framework), 139 outliers (response time), 14 Oz (programming language), 504 P package managers, 428, 505 packet switching, 285 packets corruption of, 306 sending via UDP, 442 PageRank (algorithm), 49, 424 paging (see virtual memory) ParAccel (database), 93 parallel databases (see massively parallel pro‐ cessing) parallel execution of graph analysis algorithms, 426 queries in MPP databases, 216 Parquet (data format), 96, 131 (see also column-oriented storage) use in Hadoop, 414 partial failures, 275, 310 limping, 311 partial order, 341 partitioning, 199-218, 556 and replication, 200 in batch processing, 429 multi-partition operations, 514 
enforcing constraints, 522 secondary index maintenance, 495 of key-value data, 201-205 by key range, 202 skew and hot spots, 205 rebalancing partitions, 209-214 automatic or manual rebalancing, 213 problems with hash mod N, 210 using dynamic partitioning, 212 using fixed number of partitions, 210 using N partitions per node, 212 replication and, 147 request routing, 214-216 secondary indexes, 206-209 document-based partitioning, 206 term-based partitioning, 208 serial execution of transactions and, 255 Paxos (consensus algorithm), 366 ballot number, 368 Multi-Paxos (total order broadcast), 367 percentiles, 14, 556 calculating efficiently, 16 importance of high percentiles, 16 use in service level agreements (SLAs), 15 Percona XtraBackup (MySQL tool), 156 performance describing, 13 of distributed transactions, 360 of in-memory databases, 89 of linearizability, 338 of multi-leader replication, 169 perpetual inconsistency, 525 pessimistic concurrency control, 261 phantoms (transaction isolation), 250 materializing conflicts, 251 preventing, in serializability, 259 physical clocks (see clocks) pickle (Python), 113 Pig (dataflow language), 419, 427 replicated joins, 409 skewed joins, 407 workflows, 403 Pinball (workflow scheduler), 402 pipelined execution, 423 in Unix, 394 point in time, 287 polyglot persistence, 29 polystores, 501 PostgreSQL (database) BDR (multi-leader replication), 170 causal ordering of writes, 177 Bottled Water (change data capture), 455 Bucardo (trigger-based replication), 161, 173 distributed transaction support, 361 foreign data wrappers, 501 full text search support, 490 leader-based replication, 153 log sequence number, 156 MVCC implementation, 239, 241 PL/pgSQL language, 255 PostGIS geospatial indexes, 87 preventing lost updates, 245 preventing write skew, 248, 261 read committed isolation, 236 recursive query support, 54 representing graphs, 51 Index | 579 serializable snapshot isolation (SSI), 261 snapshot isolation support, 239, 242 WAL-based replication, 160 XML and JSON support, 30, 42 pre-splitting, 212 Precision Time Protocol (PTP), 290 predicate locks, 259 predictive analytics, 533-536 amplifying bias, 534 ethics of (see ethics) feedback loops, 536 preemption of datacenter resources, 418 of threads, 298 Pregel processing model, 425 primary keys, 85, 556 compound primary key (Cassandra), 204 primary-secondary replication (see leaderbased replication) privacy, 536-543 consent and freedom of choice, 538 data as assets and power, 540 deleting data, 463 ethical considerations (see ethics) legislation and self-regulation, 542 meaning of, 539 surveillance, 537 tracking behavioral data, 536 probabilistic algorithms, 16, 466 process pauses, 295-299 processing time (of events), 469 producers (message streams), 440 programming languages dataflow languages, 504 for stored procedures, 255 functional reactive programming (FRP), 504 logic programming, 504 Prolog (language), 61 (see also Datalog) promises (asynchronous operations), 135 property graphs, 50 Cypher query language, 52 Protocol Buffers (data format), 117-121 field tags and schema evolution, 120 provenance of data, 531 publish/subscribe model, 441 publishers (message streams), 440 punch card tabulating machines, 390 580 | Index pure functions, 48 putting computation near data, 400 Q Qpid (messaging), 444 quality of service (QoS), 285 Quantcast File System (distributed filesystem), 398 query languages, 42-48 aggregation pipeline, 48 CSS and XSL, 44 Cypher, 52 Datalog, 60 Juttle, 504 MapReduce querying, 46-48 
recursive SQL queries, 53 relational algebra and SQL, 42 SPARQL, 59 query optimizers, 37, 427 queueing delays (networks), 282 head-of-line blocking, 15 latency and response time, 14 queues (messaging), 137 quorums, 179-182, 556 for leaderless replication, 179 in consensus algorithms, 368 limitations of consistency, 181-183, 334 making decisions in distributed systems, 301 monitoring staleness, 182 multi-datacenter replication, 184 relying on durability, 309 sloppy quorums and hinted handoff, 183 R R-trees (indexes), 87 RabbitMQ (messaging), 137, 444 leader-based replication, 153 race conditions, 225 (see also concurrency) avoiding with linearizability, 331 caused by dual writes, 452 dirty writes, 235 in counter increments, 235 lost updates, 242-246 preventing with event logs, 462, 507 preventing with serializable isolation, 252 write skew, 246-251 Raft (consensus algorithm), 366 sensitivity to network problems, 369 term number, 368 use in etcd, 353 RAID (Redundant Array of Independent Disks), 7, 398 railways, schema migration on, 496 RAMCloud (in-memory storage), 89 ranking algorithms, 424 RDF (Resource Description Framework), 57 querying with SPARQL, 59 RDMA (Remote Direct Memory Access), 276 read committed isolation level, 234-237 implementing, 236 multi-version concurrency control (MVCC), 239 no dirty reads, 234 no dirty writes, 235 read path (derived data), 509 read repair (leaderless replication), 178 for linearizability, 335 read replicas (see leader-based replication) read skew (transaction isolation), 238, 266 as violation of causality, 340 read-after-write consistency, 163, 524 cross-device, 164 read-modify-write cycle, 243 read-scaling architecture, 161 reads as events, 513 real-time collaborative editing, 170 near-real-time processing, 390 (see also stream processing) publish/subscribe dataflow, 513 response time guarantees, 298 time-of-day clocks, 288 rebalancing partitions, 209-214, 556 (see also partitioning) automatic or manual rebalancing, 213 dynamic partitioning, 212 fixed number of partitions, 210 fixed number of partitions per node, 212 problems with hash mod N, 210 recency guarantee, 324 recommendation engines batch process outputs, 412 batch workflows, 403, 420 iterative processing, 424 statistical and numerical algorithms, 428 records, 399 events in stream processing, 440 recursive common table expressions (SQL), 54 redelivery (messaging), 445 Redis (database) atomic operations, 243 durability, 89 Lua scripting, 255 single-threaded execution, 253 usage example, 4 redundancy hardware components, 7 of derived data, 386 (see also derived data) Reed–Solomon codes (error correction), 398 refactoring, 22 (see also evolvability) regions (partitioning), 199 register (data structure), 325 relational data model, 28-42 comparison to document model, 38-42 graph queries in SQL, 53 in-memory databases with, 89 many-to-one and many-to-many relation‐ ships, 33 multi-object transactions, need for, 231 NoSQL as alternative to, 29 object-relational mismatch, 29 relational algebra and SQL, 42 versus document model convergence of models, 41 data locality, 41 relational databases eventual consistency, 162 history, 28 leader-based replication, 153 logical logs, 160 philosophy compared to Unix, 499, 501 schema changes, 40, 111, 130 statement-based replication, 158 use of B-tree indexes, 80 relationships (see edges) reliability, 6-10, 489 building a reliable system from unreliable components, 276 defined, 6, 22 hardware faults, 7 human errors, 9 importance of, 10 of messaging systems, 442 
Index | 581 software errors, 8 Remote Method Invocation (Java RMI), 134 remote procedure calls (RPCs), 134-136 (see also services) based on futures, 135 data encoding and evolution, 136 issues with, 134 using Avro, 126, 135 using Thrift, 135 versus message brokers, 137 repeatable reads (transaction isolation), 242 replicas, 152 replication, 151-193, 556 and durability, 227 chain replication, 155 conflict resolution and, 246 consistency properties, 161-167 consistent prefix reads, 165 monotonic reads, 164 reading your own writes, 162 in distributed filesystems, 398 leaderless, 177-191 detecting concurrent writes, 184-191 limitations of quorum consistency, 181-183, 334 sloppy quorums and hinted handoff, 183 monitoring staleness, 182 multi-leader, 168-177 across multiple datacenters, 168, 335 handling write conflicts, 171-175 replication topologies, 175-177 partitioning and, 147, 200 reasons for using, 145, 151 single-leader, 152-161 failover, 157 implementation of replication logs, 158-161 relation to consensus, 367 setting up new followers, 155 synchronous versus asynchronous, 153-155 state machine replication, 349, 452 using erasure coding, 398 with heterogeneous data systems, 453 replication logs (see logs) reprocessing data, 496, 498 (see also evolvability) from log-based messaging, 451 request routing, 214-216 582 | Index approaches to, 214 parallel query execution, 216 resilient systems, 6 (see also fault tolerance) response time as performance metric for services, 13, 389 guarantees on, 298 latency versus, 14 mean and percentiles, 14 user experience, 15 responsibility and accountability, 535 REST (Representational State Transfer), 133 (see also services) RethinkDB (database) document data model, 31 dynamic partitioning, 212 join support, 34, 42 key-range partitioning, 202 leader-based replication, 153 subscribing to changes, 456 Riak (database) Bitcask storage engine, 72 CRDTs, 174, 191 dotted version vectors, 191 gossip protocol, 216 hash partitioning, 203-204, 211 last-write-wins conflict resolution, 186 leaderless replication, 177 LevelDB storage engine, 78 linearizability, lack of, 335 multi-datacenter support, 184 preventing lost updates across replicas, 246 rebalancing, 213 search feature, 209 secondary indexes, 207 siblings (concurrently written values), 190 sloppy quorums, 184 ring buffers, 450 Ripple (cryptocurrency), 532 rockets, 10, 36, 305 RocksDB (storage engine), 78 leveled compaction, 79 rollbacks (transactions), 222 rolling upgrades, 8, 112 routing (see request routing) row-oriented storage, 96 row-based replication, 160 rowhammer (memory corruption), 529 RPCs (see remote procedure calls) Rubygems (package manager), 428 rules (Datalog), 61 S safety and liveness properties, 308 in consensus algorithms, 366 in transactions, 222 sagas (see compensating transactions) Samza (stream processor), 466, 467 fault tolerance, 479 streaming SQL support, 466 sandboxes, 9 SAP HANA (database), 93 scalability, 10-18, 489 approaches for coping with load, 17 defined, 22 describing load, 11 describing performance, 13 partitioning and, 199 replication and, 161 scaling up versus scaling out, 146 scaling out, 17, 146 (see also shared-nothing architecture) scaling up, 17, 146 scatter/gather approach, querying partitioned databases, 207 SCD (slowly changing dimension), 476 schema-on-read, 39 comparison to evolvable schema, 128 in distributed filesystems, 415 schema-on-write, 39 schemaless databases (see schema-on-read) schemas, 557 Avro, 122-127 reader determining writer’s schema, 125 schema 
evolution, 123 dynamically generated, 126 evolution of, 496 affecting application code, 111 compatibility checking, 126 in databases, 129-131 in message-passing, 138 in service calls, 136 flexibility in document model, 39 for analytics, 93-95 for JSON and XML, 115 merits of, 127 schema migration on railways, 496 Thrift and Protocol Buffers, 117-121 schema evolution, 120 traditional approach to design, fallacy in, 462 searches building search indexes in batch processes, 411 k-nearest neighbors, 429 on streams, 467 partitioned secondary indexes, 206 secondaries (see leader-based replication) secondary indexes, 85, 557 partitioning, 206-209, 217 document-partitioned, 206 index maintenance, 495 term-partitioned, 208 problems with dual writes, 452, 491 updating, transaction isolation and, 231 secondary sorts, 405 sed (Unix tool), 392 self-describing files, 127 self-joins, 480 self-validating systems, 530 semantic web, 57 semi-synchronous replication, 154 sequence number ordering, 343-348 generators, 294, 344 insufficiency for enforcing constraints, 347 Lamport timestamps, 345 use of timestamps, 291, 295, 345 sequential consistency, 351 serializability, 225, 233, 251-266, 557 linearizability versus, 329 pessimistic versus optimistic concurrency control, 261 serial execution, 252-256 partitioning, 255 using stored procedures, 253, 349 serializable snapshot isolation (SSI), 261-266 detecting stale MVCC reads, 263 detecting writes that affect prior reads, 264 distributed execution, 265, 364 performance of SSI, 265 preventing write skew, 262-265 two-phase locking (2PL), 257-261 index-range locks, 260 performance, 258 Serializable (Java), 113 Index | 583 serialization, 113 (see also encoding) service discovery, 135, 214, 372 using DNS, 216, 372 service level agreements (SLAs), 15 service-oriented architecture (SOA), 132 (see also services) services, 131-136 microservices, 132 causal dependencies across services, 493 loose coupling, 502 relation to batch/stream processors, 389, 508 remote procedure calls (RPCs), 134-136 issues with, 134 similarity to databases, 132 web services, 132, 135 session windows (stream processing), 472 (see also windows) sessionization, 407 sharding (see partitioning) shared mode (locks), 258 shared-disk architecture, 146, 398 shared-memory architecture, 146 shared-nothing architecture, 17, 146-147, 557 (see also replication) distributed filesystems, 398 (see also distributed filesystems) partitioning, 199 use of network, 277 sharks biting undersea cables, 279 counting (example), 46-48 finding (example), 42 website about (example), 44 shredding (in relational model), 38 siblings (concurrent values), 190, 246 (see also conflicts) similarity search edit distance, 88 genome data, 63 k-nearest neighbors, 429 single-leader replication (see leader-based rep‐ lication) single-threaded execution, 243, 252 in batch processing, 406, 421, 426 in stream processing, 448, 463, 522 size-tiered compaction, 79 skew, 557 584 | Index clock skew, 291-294, 334 in transaction isolation read skew, 238, 266 write skew, 246-251, 262-265 (see also write skew) meanings of, 238 unbalanced workload, 201 compensating for, 205 due to celebrities, 205 for time-series data, 203 in batch processing, 407 slaves (see leader-based replication) sliding windows (stream processing), 472 (see also windows) sloppy quorums, 183 (see also quorums) lack of linearizability, 334 slowly changing dimension (data warehouses), 476 smearing (leap seconds adjustments), 290 snapshots (databases) causal consistency, 340 computing 
derived data, 500 in change data capture, 455 serializable snapshot isolation (SSI), 261-266, 329 setting up a new replica, 156 snapshot isolation and repeatable read, 237-242 implementing with MVCC, 239 indexes and MVCC, 241 visibility rules, 240 synchronized clocks for global snapshots, 294 snowflake schemas, 95 SOAP, 133 (see also services) evolvability, 136 software bugs, 8 maintaining integrity, 529 solid state drives (SSDs) access patterns, 84 detecting corruption, 519, 530 faults in, 227 sequential write throughput, 75 Solr (search server) building indexes in batch processes, 411 document-partitioned indexes, 207 request routing, 216 usage example, 4 use of Lucene, 79 sort (Unix tool), 392, 394, 395 sort-merge joins (MapReduce), 405 Sorted String Tables (see SSTables) sorting sort order in column storage, 99 source of truth (see systems of record) Spanner (database) data locality, 41 snapshot isolation using clocks, 295 TrueTime API, 294 Spark (processing framework), 421-423 bytecode generation, 428 dataflow APIs, 427 fault tolerance, 422 for data warehouses, 93 GraphX API (graph processing), 425 machine learning, 428 query optimizer, 427 Spark Streaming, 466 microbatching, 477 stream processing on top of batch process‐ ing, 495 SPARQL (query language), 59 spatial algorithms, 429 split brain, 158, 557 in consensus algorithms, 352, 367 preventing, 322, 333 using fencing tokens to avoid, 302-304 spreadsheets, dataflow programming capabili‐ ties, 504 SQL (Structured Query Language), 21, 28, 43 advantages and limitations of, 416 distributed query execution, 48 graph queries in, 53 isolation levels standard, issues with, 242 query execution on Hadoop, 416 résumé (example), 30 SQL injection vulnerability, 305 SQL on Hadoop, 93 statement-based replication, 158 stored procedures, 255 SQL Server (database) data warehousing support, 93 distributed transaction support, 361 leader-based replication, 153 preventing lost updates, 245 preventing write skew, 248, 257 read committed isolation, 236 recursive query support, 54 serializable isolation, 257 snapshot isolation support, 239 T-SQL language, 255 XML support, 30 SQLstream (stream analytics), 466 SSDs (see solid state drives) SSTables (storage format), 76-79 advantages over hash indexes, 76 concatenated index, 204 constructing and maintaining, 78 making LSM-Tree from, 78 staleness (old data), 162 cross-channel timing dependencies, 331 in leaderless databases, 178 in multi-version concurrency control, 263 monitoring for, 182 of client state, 512 versus linearizability, 324 versus timeliness, 524 standbys (see leader-based replication) star replication topologies, 175 star schemas, 93-95 similarity to event sourcing, 458 Star Wars analogy (event time versus process‐ ing time), 469 state derived from log of immutable events, 459 deriving current state from the event log, 458 interplay between state changes and appli‐ cation code, 507 maintaining derived state, 495 maintenance by stream processor in streamstream joins, 473 observing derived state, 509-515 rebuilding after stream processor failure, 478 separation of application code and, 505 state machine replication, 349, 452 statement-based replication, 158 statically typed languages analogy to schema-on-write, 40 code generation and, 127 statistical and numerical algorithms, 428 StatsD (metrics aggregator), 442 stdin, stdout, 395, 396 Stellar (cryptocurrency), 532 Index | 585 stock market feeds, 442 STONITH (Shoot The Other Node In The Head), 158 stop-the-world (see garbage collection) storage 
composing data storage technologies, 499-504 diversity of, in MapReduce, 415 Storage Area Network (SAN), 146, 398 storage engines, 69-104 column-oriented, 95-101 column compression, 97-99 defined, 96 distinction between column families and, 99 Parquet, 96, 131 sort order in, 99-100 writing to, 101 comparing requirements for transaction processing and analytics, 90-96 in-memory storage, 88 durability, 227 row-oriented, 70-90 B-trees, 79-83 comparing B-trees and LSM-trees, 83-85 defined, 96 log-structured, 72-79 stored procedures, 161, 253-255, 557 and total order broadcast, 349 pros and cons of, 255 similarity to stream processors, 505 Storm (stream processor), 466 distributed RPC, 468, 514 Trident state handling, 478 straggler events, 470, 498 stream processing, 464-481, 557 accessing external services within job, 474, 477, 478, 517 combining with batch processing lambda architecture, 497 unifying technologies, 498 comparison to batch processing, 464 complex event processing (CEP), 465 fault tolerance, 476-479 atomic commit, 477 idempotence, 478 microbatching and checkpointing, 477 rebuilding state after a failure, 478 for data integration, 494-498 586 | Index maintaining derived state, 495 maintenance of materialized views, 467 messaging systems (see messaging systems) reasoning about time, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 types of windows, 472 relation to databases (see streams) relation to services, 508 search on streams, 467 single-threaded execution, 448, 463 stream analytics, 466 stream joins, 472-476 stream-stream join, 473 stream-table join, 473 table-table join, 474 time-dependence of, 475 streams, 440-451 end-to-end, pushing events to clients, 512 messaging systems (see messaging systems) processing (see stream processing) relation to databases, 451-464 (see also changelogs) API support for change streams, 456 change data capture, 454-457 derivative of state by time, 460 event sourcing, 457-459 keeping systems in sync, 452-453 philosophy of immutable events, 459-464 topics, 440 strict serializability, 329 strong consistency (see linearizability) strong one-copy serializability, 329 subjects, predicates, and objects (in triplestores), 55 subscribers (message streams), 440 (see also consumers) supercomputers, 275 surveillance, 537 (see also privacy) Swagger (service definition format), 133 swapping to disk (see virtual memory) synchronous networks, 285, 557 comparison to asynchronous networks, 284 formal model, 307 synchronous replication, 154, 557 chain replication, 155 conflict detection, 172 system models, 300, 306-310 assumptions in, 528 correctness of algorithms, 308 mapping to the real world, 309 safety and liveness, 308 systems of record, 386, 557 change data capture, 454, 491 treating event log as, 460 systems thinking, 536 T t-digest (algorithm), 16 table-table joins, 474 Tableau (data visualization software), 416 tail (Unix tool), 447 tail vertex (property graphs), 51 Tajo (query engine), 93 Tandem NonStop SQL (database), 200 TCP (Transmission Control Protocol), 277 comparison to circuit switching, 285 comparison to UDP, 283 connection failures, 280 flow control, 282, 441 packet checksums, 306, 519, 529 reliability and duplicate suppression, 517 retransmission timeouts, 284 use for transaction sessions, 229 telemetry (see monitoring) Teradata (database), 93, 200 term-partitioned indexes, 208, 217 termination (consensus), 365 Terrapin (database), 413 Tez (dataflow engine), 421-423 fault tolerance, 422 support by 
higher-level tools, 427 thrashing (out of memory), 297 threads (concurrency) actor model, 138, 468 (see also message-passing) atomic operations, 223 background threads, 73, 85 execution pauses, 286, 296-298 memory barriers, 338 preemption, 298 single (see single-threaded execution) three-phase commit, 359 Thrift (data format), 117-121 BinaryProtocol, 118 CompactProtocol, 119 field tags and schema evolution, 120 throughput, 13, 390 TIBCO, 137 Enterprise Message Service, 444 StreamBase (stream analytics), 466 time concurrency and, 187 cross-channel timing dependencies, 331 in distributed systems, 287-299 (see also clocks) clock synchronization and accuracy, 289 relying on synchronized clocks, 291-295 process pauses, 295-299 reasoning about, in stream processors, 468-472 event time versus processing time, 469, 477, 498 knowing when window is ready, 470 timestamp of events, 471 types of windows, 472 system models for distributed systems, 307 time-dependence in stream joins, 475 time-of-day clocks, 288 timeliness, 524 coordination-avoiding data systems, 528 correctness of dataflow systems, 525 timeouts, 279, 557 dynamic configuration of, 284 for failover, 158 length of, 281 timestamps, 343 assigning to events in stream processing, 471 for read-after-write consistency, 163 for transaction ordering, 295 insufficiency for enforcing constraints, 347 key range partitioning by, 203 Lamport, 345 logical, 494 ordering events, 291, 345 Titan (database), 50 tombstones, 74, 191, 456 topics (messaging), 137, 440 total order, 341, 557 limits of, 493 sequence numbers or timestamps, 344 total order broadcast, 348-352, 493, 522 consensus algorithms and, 366-368 Index | 587 implementation in ZooKeeper and etcd, 370 implementing with linearizable storage, 351 using, 349 using to implement linearizable storage, 350 tracking behavioral data, 536 (see also privacy) transaction coordinator (see coordinator) transaction manager (see coordinator) transaction processing, 28, 90-95 comparison to analytics, 91 comparison to data warehousing, 93 transactions, 221-267, 558 ACID properties of, 223 atomicity, 223 consistency, 224 durability, 226 isolation, 225 compensating (see compensating transac‐ tions) concept of, 222 distributed transactions, 352-364 avoiding, 492, 502, 521-528 failure amplification, 364, 495 in doubt/uncertain status, 358, 362 two-phase commit, 354-359 use of, 360-361 XA transactions, 361-364 OLTP versus analytics queries, 411 purpose of, 222 serializability, 251-266 actual serial execution, 252-256 pessimistic versus optimistic concur‐ rency control, 261 serializable snapshot isolation (SSI), 261-266 two-phase locking (2PL), 257-261 single-object and multi-object, 228-232 handling errors and aborts, 231 need for multi-object transactions, 231 single-object writes, 230 snapshot isolation (see snapshots) weak isolation levels, 233-251 preventing lost updates, 242-246 read committed, 234-238 transitive closure (graph algorithm), 424 trie (data structure), 88 triggers (databases), 161, 441 implementing change data capture, 455 implementing replication, 161 588 | Index triple-stores, 55-59 SPARQL query language, 59 tumbling windows (stream processing), 472 (see also windows) in microbatching, 477 tuple spaces (programming model), 507 Turtle (RDF data format), 56 Twitter constructing home timelines (example), 11, 462, 474, 511 DistributedLog (event log), 448 Finagle (RPC framework), 135 Snowflake (sequence number generator), 294 Summingbird (processing library), 497 two-phase commit (2PC), 353, 355-359, 558 
confusion with two-phase locking, 356 coordinator failure, 358 coordinator recovery, 363 how it works, 357 issues in practice, 363 performance cost, 360 transactions holding locks, 362 two-phase locking (2PL), 257-261, 329, 558 confusion with two-phase commit, 356 index-range locks, 260 performance of, 258 type checking, dynamic versus static, 40 U UDP (User Datagram Protocol) comparison to TCP, 283 multicast, 442 unbounded datasets, 439, 558 (see also streams) unbounded delays, 558 in networks, 282 process pauses, 296 unbundling databases, 499-515 composing data storage technologies, 499-504 federation versus unbundling, 501 need for high-level language, 503 designing applications around dataflow, 504-509 observing derived state, 509-515 materialized views and caching, 510 multi-partition data processing, 514 pushing state changes to clients, 512 uncertain (transaction status) (see in doubt) uniform consensus, 365 (see also consensus) uniform interfaces, 395 union type (in Avro), 125 uniq (Unix tool), 392 uniqueness constraints asynchronously checked, 526 requiring consensus, 521 requiring linearizability, 330 uniqueness in log-based messaging, 522 Unix philosophy, 394-397 command-line batch processing, 391-394 Unix pipes versus dataflow engines, 423 comparison to Hadoop, 413-414 comparison to relational databases, 499, 501 comparison to stream processing, 464 composability and uniform interfaces, 395 loose coupling, 396 pipes, 394 relation to Hadoop, 499 UPDATE statement (SQL), 40 updates preventing lost updates, 242-246 atomic write operations, 243 automatically detecting lost updates, 245 compare-and-set operations, 245 conflict resolution and replication, 246 using explicit locking, 244 preventing write skew, 246-251 V validity (consensus), 365 vBuckets (partitioning), 199 vector clocks, 191 (see also version vectors) vectorized processing, 99, 428 verification, 528-533 avoiding blind trust, 530 culture of, 530 designing for auditability, 531 end-to-end integrity checks, 531 tools for auditable data systems, 532 version control systems, reliance on immutable data, 463 version vectors, 177, 191 capturing causal dependencies, 343 versus vector clocks, 191 Vertica (database), 93 handling writes, 101 replicas using different sort orders, 100 vertical scaling (see scaling up) vertices (in graphs), 49 property graph model, 50 Viewstamped Replication (consensus algo‐ rithm), 366 view number, 368 virtual machines, 146 (see also cloud computing) context switches, 297 network performance, 282 noisy neighbors, 284 reliability in cloud services, 8 virtualized clocks in, 290 virtual memory process pauses due to page faults, 14, 297 versus memory management by databases, 89 VisiCalc (spreadsheets), 504 vnodes (partitioning), 199 Voice over IP (VoIP), 283 Voldemort (database) building read-only stores in batch processes, 413 hash partitioning, 203-204, 211 leaderless replication, 177 multi-datacenter support, 184 rebalancing, 213 reliance on read repair, 179 sloppy quorums, 184 VoltDB (database) cross-partition serializability, 256 deterministic stored procedures, 255 in-memory storage, 89 output streams, 456 secondary indexes, 207 serial execution of transactions, 253 statement-based replication, 159, 479 transactions in stream processing, 477 W WAL (write-ahead log), 82 web services (see services) Web Services Description Language (WSDL), 133 webhooks, 443 webMethods (messaging), 137 WebSocket (protocol), 512 Index | 589 windows (stream processing), 466, 468-472 infinite windows for changelogs, 
467, 474 knowing when all events have arrived, 470 stream joins within a window, 473 types of windows, 472 winners (conflict resolution), 173 WITH RECURSIVE syntax (SQL), 54 workflows (MapReduce), 402 outputs, 411-414 key-value stores, 412 search indexes, 411 with map-side joins, 410 working set, 393 write amplification, 84 write path (derived data), 509 write skew (transaction isolation), 246-251 characterizing, 246-251, 262 examples of, 247, 249 materializing conflicts, 251 occurrence in practice, 529 phantoms, 250 preventing in snapshot isolation, 262-265 in two-phase locking, 259-261 options for, 248 write-ahead log (WAL), 82, 159 writes (database) atomic write operations, 243 detecting writes affecting prior reads, 264 preventing dirty writes with read commit‐ ted, 235 WS-* framework, 133 (see also services) WS-AtomicTransaction (2PC), 355 590 | Index X XA transactions, 355, 361-364 heuristic decisions, 363 limitations of, 363 xargs (Unix tool), 392, 396 XML binary variants, 115 encoding RDF data, 57 for application data, issues with, 114 in relational databases, 30, 41 XSL/XPath, 45 Y Yahoo!


pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, bitcoin, blockchain, business intelligence, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, database schema, DevOps, distributed ledger, Donald Knuth, Edward Snowden, Ethereum, ethereum blockchain, fault tolerance, finite state, Flash crash, full text search, general-purpose programming language, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Kubernetes, loose coupling, Marc Andreessen, microservices, natural language processing, Network effects, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, statistical model, undersea cable, web application, WebSocket, wikimedia commons

Allow quick and easy recovery from human errors, to minimize the impact in the case of a failure. For example, make it fast to roll back configuration changes, roll out new code gradually (so that any unexpected bugs affect only a small subset of users), and provide tools to recompute data (in case it turns out that the old computation was incorrect). Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry. (Once a rocket has left the ground, telemetry is essential for tracking what is happening, and for understanding failures [14].) Monitoring can show us early warning signals and allow us to check whether any assumptions or constraints are being violated. When a problem occurs, metrics can be invaluable in diagnosing the issue.
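
As a minimal sketch of the kind of telemetry this excerpt describes, the following Python fragment computes response-time percentiles from raw request timings. The percentile helper and the sample timings are hypothetical illustrations, not code from the book:

    # A minimal sketch, not from the book: nearest-rank percentiles over
    # recorded response times, the kind of performance metric the excerpt
    # recommends monitoring. The sample timings are hypothetical.
    import random

    def percentile(sorted_values, p):
        # Nearest-rank percentile of a pre-sorted list, for 0 < p <= 100.
        k = max(0, int(len(sorted_values) * p / 100) - 1)
        return sorted_values[k]

    # Hypothetical per-request response times in milliseconds.
    timings = sorted(random.lognormvariate(3, 0.5) for _ in range(10_000))
    for p in (50, 95, 99):
        print(f"p{p}: {percentile(timings, p):.1f} ms")

Tracking the high percentiles (p95, p99) rather than the mean is what surfaces the early warning signals the passage mentions, since tail latencies degrade before averages do.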

replication topologies, Multi-Leader Replication Topologies-Multi-Leader Replication Topologies partitioning and, Distributed Data, Partitioning and Replication reasons for using, Distributed Data, Replication single-leader, Leaders and Followers-Trigger-based replication failover, Leader failure: Failover implementation of replication logs, Implementation of Replication Logs-Trigger-based replication relation to consensus, Single-leader replication and consensus setting up new followers, Setting Up New Followers synchronous versus asynchronous, Synchronous Versus Asynchronous Replication-Synchronous Versus Asynchronous Replication state machine replication, Using total order broadcast, Databases and Streams using erasure coding, MapReduce and Distributed Filesystems with heterogeneous data systems, Keeping Systems in Sync replication logs (see logs) reprocessing data, Reprocessing data for application evolution, Unifying batch and stream processing (see also evolvability) from log-based messaging, Replaying old messages request routing, Request Routing-Parallel Query Execution approaches to, Request Routing parallel query execution, Parallel Query Execution resilient systems, Reliability (see also fault tolerance) response time as performance metric for services, Describing Performance, Batch Processing guarantees on, Response time guarantees latency versus, Describing Performance mean and percentiles, Describing Performance user experience, Describing Performance responsibility and accountability, Responsibility and accountability REST (Representational State Transfer), Web services (see also services) RethinkDB (database) document data model, The Object-Relational Mismatch dynamic partitioning, Dynamic partitioning join support, Many-to-One and Many-to-Many Relationships, Convergence of document and relational databases key-range partitioning, Partitioning by Key Range leader-based replication, Leaders and Followers subscribing to changes, API support for change streams Riak (database) Bitcask storage engine, Hash Indexes CRDTs, Custom conflict resolution logic, Merging concurrently written values dotted version vectors, Version vectors gossip protocol, Request Routing hash partitioning, Partitioning by Hash of Key-Partitioning by Hash of Key, Fixed number of partitions last-write-wins conflict resolution, Last write wins (discarding concurrent writes) leaderless replication, Leaderless Replication LevelDB storage engine, Making an LSM-tree out of SSTables linearizability, lack of, Linearizability and quorums multi-datacenter support, Multi-datacenter operation preventing lost updates across replicas, Conflict resolution and replication rebalancing, Operations: Automatic or Manual Rebalancing search feature, Partitioning Secondary Indexes by Term secondary indexes, Partitioning Secondary Indexes by Document siblings (concurrently written values), Merging concurrently written values sloppy quorums, Sloppy Quorums and Hinted Handoff ring buffers, Disk space usage Ripple (cryptocurrency), Tools for auditable data systems rockets, Human Errors, Are Document Databases Repeating History?


pages: 327 words: 103,336

Everything Is Obvious: *Once You Know the Answer by Duncan J. Watts

active measures, affirmative action, Albert Einstein, Amazon Mechanical Turk, Black Swan, business cycle, butterfly effect, Carmen Reinhart, Cass Sunstein, clockwork universe, cognitive dissonance, coherent worldview, collapse of Lehman Brothers, complexity theory, correlation does not imply causation, crowdsourcing, death of newspapers, discovery of DNA, East Village, easy for humans, difficult for computers, edge city, en.wikipedia.org, Erik Brynjolfsson, framing effect, Geoffrey West, Santa Fe Institute, George Santayana, happiness index / gross national happiness, high batting average, hindsight bias, illegal immigration, industrial cluster, interest rate swap, invention of the printing press, invention of the telescope, invisible hand, Isaac Newton, Jane Jacobs, Jeff Bezos, Joseph Schumpeter, Kenneth Rogoff, lake wobegon effect, Laplace demon, Long Term Capital Management, loss aversion, medical malpractice, meta analysis, meta-analysis, Milgram experiment, natural language processing, Netflix Prize, Network effects, oil shock, packet switching, pattern recognition, performance metric, phenotype, Pierre-Simon Laplace, planetary scale, prediction markets, pre–internet, RAND corporation, random walk, RFID, school choice, Silicon Valley, social intelligence, statistical model, Steve Ballmer, Steve Jobs, Steve Wozniak, supply-chain management, The Death and Life of Great American Cities, the scientific method, The Wisdom of Crowds, too big to fail, Toyota Production System, ultimatum game, urban planning, Vincenzo Peruggia: Mona Lisa, Watson beat the top human players on Jeopardy!, X Prize

The problem is therefore not that planning of any kind is impossible, any more than prediction of any kind is impossible, but rather that certain kinds of plans can be made reliably and others can’t be, and that planners need to be able to tell the difference. 3. See Helft (2008) for a story about the Yahoo! home page overhaul. 4. See Kohavi et al. (2010) and Tang et al. (2010). 5. See Clifford (2009) for a story about startup companies using quantitative performance metrics to substitute for design instinct. 6. See Alterman (2008) for Peretti’s original description of the Mullet Strategy. See Dholakia and Vianello (2009) for a discussion of how the same approach can work for communities built around brands, and the associated tradeoff between control and insight. 7. See Howe (2008, 2006) for a general discussion of crowdsourcing. See Rice (2010) for examples of recent trends in online journalism. 8.


Beautiful Visualization by Julie Steele

barriers to entry, correlation does not imply causation, data acquisition, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta analysis, meta-analysis, natural language processing, Netflix Prize, pattern recognition, peer-to-peer, performance metric, QR code, recommendation engine, semantic web, social graph, sorting algorithm, Steve Jobs, web application, wikimedia commons

For example, how and what movies you rate on Netflix will influence what movies are recommended to other users, and on Amazon, reviewing a book, buying a book, or even adding a book to your cart but later removing it can affect the recommendations given to others. Similarly, with Google, when you click on a result—or, for that matter, don’t click on a result—that behavior impacts future search results. One consequence of this complexity is difficulty in explaining system behavior. We primarily rely on performance metrics to quantify the success or failure of retrieval results, or to tell us which variations of a system work better than others. Such metrics allow the system to be continuously improved upon. A supplementary approach to understanding the behavior of these systems is to use information visualization. With visualization, we can sometimes gain insights not available from metrics alone. In this chapter, I’ll show how one particular visualization technique can provide large-scale views of certain system dynamics.
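
The retrieval performance metrics the excerpt alludes to are typically precision- and recall-style measures. A minimal sketch of one common choice, precision at k, is below; the metric selection, ranked results, and relevance judgments are all hypothetical, not taken from the chapter:

    # A minimal sketch of one common retrieval metric, precision at k.
    # The ranked results and relevance judgments below are hypothetical.
    def precision_at_k(retrieved, relevant, k):
        # Fraction of the top-k retrieved items judged relevant.
        return sum(1 for item in retrieved[:k] if item in relevant) / k

    retrieved = ["a", "b", "c", "d", "e"]  # ranked results from the system
    relevant = {"a", "c", "f"}             # items a human judged relevant
    print(precision_at_k(retrieved, relevant, k=3))  # 2 of top 3 -> 0.67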


pages: 368 words: 96,825

Bold: How to Go Big, Create Wealth and Impact the World by Peter H. Diamandis, Steven Kotler

3D printing, additive manufacturing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, augmented reality, autonomous vehicles, Charles Lindbergh, cloud computing, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, dematerialisation, deskilling, disruptive innovation, Elon Musk, en.wikipedia.org, Exxon Valdez, fear of failure, Firefox, Galaxy Zoo, Google Glasses, Google Hangouts, gravity well, ImageNet competition, industrial robot, Internet of things, Jeff Bezos, John Harrison: Longitude, John Markoff, Jono Bacon, Just-in-time delivery, Kickstarter, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, loss aversion, Louis Pasteur, low earth orbit, Mahatma Gandhi, Marc Andreessen, Mark Zuckerberg, Mars Rover, meta analysis, meta-analysis, microbiome, minimum viable product, move fast and break things, Narrative Science, Netflix Prize, Network effects, Oculus Rift, optical character recognition, packet switching, PageRank, pattern recognition, performance metric, Peter H. Diamandis: Planetary Resources, Peter Thiel, pre–internet, Ray Kurzweil, recommendation engine, Richard Feynman, ride hailing / ride sharing, risk tolerance, rolodex, self-driving car, sentiment analysis, shareholder value, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart grid, stem cell, Stephen Hawking, Steve Jobs, Steven Levy, Stewart Brand, superconnector, technoutopianism, telepresence, telepresence robot, Turing test, urban renewal, web application, X Prize, Y Combinator, zero-sum game

You’ve probably heard about hackathons—those mysterious tournaments where coders compete to see who can hack together the best piece of software in a weekend. Well, with TopCoder, now you can have over 600,000 developers, designers, and data scientists hacking away to create solutions just for you. In fields like software and algorithm development, where there are many ways to solve a problem, having multiple submissions lets you compare performance metrics and choose the best one. Or take Gigwalk, a crowdsourced information-gathering platform that pays a small denomination to incentivize the crowd (i.e., anyone who has the Gigwalk app) to perform a simple task at a particular place and time. “Crowdsourced platforms are being quickly adopted in the retail and consumer products industry,” says Marcus Shingles, a principal with Deloitte Consulting.


pages: 347 words: 97,721

Only Humans Need Apply: Winners and Losers in the Age of Smart Machines by Thomas H. Davenport, Julia Kirby

AI winter, Andy Kessler, artificial general intelligence, asset allocation, Automated Insights, autonomous vehicles, basic income, Baxter: Rethink Robotics, business intelligence, business process, call centre, carbon-based life, Clayton Christensen, clockwork universe, commoditize, conceptual framework, dark matter, David Brooks, deliberate practice, deskilling, digital map, disruptive innovation, Douglas Engelbart, Edward Lloyd's coffeehouse, Elon Musk, Erik Brynjolfsson, estate planning, fixed income, follow your passion, Frank Levy and Richard Murnane: The New Division of Labor, Freestyle chess, game design, general-purpose programming language, global pandemic, Google Glasses, Hans Lippershey, haute cuisine, income inequality, index fund, industrial robot, information retrieval, intermodal, Internet of things, inventory management, Isaac Newton, job automation, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Joi Ito, Khan Academy, knowledge worker, labor-force participation, lifelogging, longitudinal study, loss aversion, Mark Zuckerberg, Narrative Science, natural language processing, Norbert Wiener, nuclear winter, pattern recognition, performance metric, Peter Thiel, precariat, quantitative trading / quantitative finance, Ray Kurzweil, Richard Feynman, risk tolerance, Robert Shiller, Robert Shiller, Rodney Brooks, Second Machine Age, self-driving car, Silicon Valley, six sigma, Skype, social intelligence, speech recognition, spinning jenny, statistical model, Stephen Hawking, Steve Jobs, Steve Wozniak, strong AI, superintelligent machines, supply-chain management, transaction costs, Tyler Cowen: Great Stagnation, Watson beat the top human players on Jeopardy!, Works Progress Administration, Zipcar

The important thing, for the individual learner, is to adopt some framework like this that can bring discipline to the task of focusing on a strength and building it.16 Part of any conscious attempt to build a strength should be a defensible way of measuring progress. We suspect that one reason why “left brain” skills so dominate discussions of human intelligence is simply that they are so easily assessed and compared. The yardsticks we use to measure human achievement—our “performance metrics,” to use business parlance—always push us back to believing that more hard skills training is the answer. Yet that belief constrains us to a narrow track, the same track we have designed computers to dominate. We are limiting ourselves to running a race we have already determined we cannot win. It might even be that our attempts to have humans keep pace with machines militate against the development of other human strengths.


Systematic Trading: A Unique New Method for Designing Trading and Investing Systems by Robert Carver

asset allocation, automated trading system, backtesting, barriers to entry, Black Swan, buy and hold, cognitive bias, commodity trading advisor, Credit Default Swap, diversification, diversified portfolio, easy for humans, difficult for computers, Edward Thorp, Elliott wave, fixed income, implied volatility, index fund, interest rate swap, Long Term Capital Management, margin call, merger arbitrage, Nick Leeson, paper trading, performance metric, risk tolerance, risk-adjusted returns, risk/return, Sharpe ratio, short selling, survivorship bias, systematic trading, technology bubble, transaction costs, Y Combinator, yield curve

If you do get creative and make your own rules the number of options will explode. Which rules and variations should you use in practice? As I said in chapter three, ‘Fitting’, I don’t recommend using back-tested profitability to select trading rules or variations. Instead you should focus on behaviour such as the correlation between variations, and the speed they trade at, whilst ignoring Sharpe ratio and other performance metrics. You should then use two selection criteria. The first is to avoid including any two variations with more than a 95% correlation to each other, as one of them will not be adding any value to the system. I discuss how to prune possible variations of the EWMAC rule in appendix B (in the section on EWMAC, from page 282). Secondly, you must exclude anything which trades too slowly, or too quickly.
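A minimal sketch of those two selection criteria, assuming you already have a back-tested daily return series and an annual turnover figure for each variation. All names and the turnover band are illustrative; Carver ties the actual speed limits to trading costs.

```python
import pandas as pd

def prune_variations(returns, turnover, max_corr=0.95,
                     min_turnover=2, max_turnover=50):
    """returns: DataFrame of daily returns, one column per variation.
    turnover: dict mapping variation name to trades per year
    (assumed precomputed).  Drops anything trading too slowly or
    too quickly, then greedily keeps only variations that are not
    correlated above max_corr with an already-kept variation."""
    # Speed filter first: exclude variations outside the usable band.
    kept = [v for v in returns.columns
            if min_turnover <= turnover[v] <= max_turnover]
    corr = returns[kept].corr()
    selected = []
    for v in kept:
        if all(corr.loc[v, s] <= max_corr for s in selected):
            selected.append(v)
    return selected
```

Note that the result depends on column order: the first variation of a highly correlated pair wins, which is fine for the stated purpose, since either member adds the same information.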


pages: 831 words: 98,409

SUPERHUBS: How the Financial Elite and Their Networks Rule Our World by Sandra Navidi

activist fund / activist shareholder / activist investor, assortative mating, bank run, barriers to entry, Bernie Sanders, Black Swan, Blythe Masters, Bretton Woods, butterfly effect, Capital in the Twenty-First Century by Thomas Piketty, Carmen Reinhart, central bank independence, cognitive bias, collapse of Lehman Brothers, collateralized debt obligation, commoditize, conceptual framework, corporate governance, Credit Default Swap, credit default swaps / collateralized debt obligations, crony capitalism, diversification, East Village, Elon Musk, eurozone crisis, family office, financial repression, Gini coefficient, glass ceiling, Goldman Sachs: Vampire Squid, Google bus, Gordon Gekko, haute cuisine, high net worth, hindsight bias, income inequality, index fund, intangible asset, Jaron Lanier, John Meriwether, Kenneth Arrow, Kenneth Rogoff, knowledge economy, London Whale, Long Term Capital Management, longitudinal study, Mark Zuckerberg, mass immigration, McMansion, mittelstand, money market fund, Myron Scholes, NetJets, Network effects, offshore financial centre, old-boy network, Parag Khanna, Paul Samuelson, peer-to-peer, performance metric, Peter Thiel, plutocrats, Plutocrats, Ponzi scheme, quantitative easing, Renaissance Technologies, rent-seeking, reserve currency, risk tolerance, Robert Gordon, Robert Shiller, rolodex, Satyajit Das, shareholder value, Silicon Valley, social intelligence, sovereign wealth fund, Stephen Hawking, Steve Jobs, The Future of Employment, The Predators' Ball, The Rise and Fall of American Growth, too big to fail, women in the workforce, young professional

However, in the complex and opaque world of finance, objective performance measurement is challenging. There are many unknown variables beyond executive control, such as the blowup of a previously hailed asset class, like energy, or the bursting of a bubble like the Internet. A systemic financial crisis may even reveal that supposedly independent asset classes are in fact highly correlated. The application of performance metrics has been questioned in view of the recent billion-dollar losses and fines running into the hundreds of millions. Yet CEOs still receive rising pay. Proponents argue that winner-takes-all compensation is simply the result of market forces and freely agreed contracts, and that competitive salaries are necessary to obtain and retain top talent. According to them, paying finance executives handsomely is less costly and disruptive than losing them.


pages: 317 words: 100,414

Superforecasting: The Art and Science of Prediction by Philip Tetlock, Dan Gardner

Affordable Care Act / Obamacare, Any sufficiently advanced technology is indistinguishable from magic, availability heuristic, Black Swan, butterfly effect, buy and hold, cloud computing, cuban missile crisis, Daniel Kahneman / Amos Tversky, desegregation, drone strike, Edward Lorenz: Chaos theory, forward guidance, Freestyle chess, fundamental attribution error, germ theory of disease, hindsight bias, index fund, Jane Jacobs, Jeff Bezos, Kenneth Arrow, Laplace demon, longitudinal study, Mikhail Gorbachev, Mohammed Bouazizi, Nash equilibrium, Nate Silver, Nelson Mandela, obamacare, pattern recognition, performance metric, Pierre-Simon Laplace, place-making, placebo effect, prediction markets, quantitative easing, random walk, randomized controlled trial, Richard Feynman, Richard Thaler, Robert Shiller, Ronald Reagan, Saturday Night Live, scientific worldview, Silicon Valley, Skype, statistical model, stem cell, Steve Ballmer, Steve Jobs, Steven Pinker, the scientific method, The Signal and the Noise by Nate Silver, The Wisdom of Crowds, Thomas Bayes, Watson beat the top human players on Jeopardy!

Elisabeth Rosenthal, “The Hype over Hospital Rankings,” New York Times, July 27, 2013. Efforts to identify “supers”—superhospitals or superteachers or super–intelligence analysts—are easy to dismiss for two reasons: (1) excellence is multidimensional and we can only imperfectly capture some dimensions (patient longevity or test results or Brier scores); (2) as soon as we anoint an official performance metric, we create incentives to game the new system by rejecting very sick patients or ejecting troublesome students. But the solution is not to abandon metrics. It is to resist overinterpreting them.
16. Thomas Friedman, “Iraq Without Saddam,” New York Times, September 1, 2002.
17. Thomas Friedman, “Is Vacation Over?,” New York Times, December 23, 2014.
18. Caleb Melby, Laura Marcinek, and Danielle Burger, “Fed Critics Say ’10 Letter Warning Inflation Still Right,” Bloomberg, October 2, 2014, http://www.bloomberg.com/news/articles/2014-10-02/fed-critics-say-10-letter-warning-inflation-still-right.
19.
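The Brier score mentioned in the first note above has a simple, standard definition: the mean squared difference between forecast probabilities and what actually happened. A minimal sketch:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities (0..1)
    and observed outcomes (1 = event occurred, 0 = it didn't).
    0.0 is a perfect score; always forecasting 50% earns 0.25."""
    return sum((f - o) ** 2
               for f, o in zip(forecasts, outcomes)) / len(forecasts)

print(brier_score([0.9, 0.7, 0.2], [1, 1, 0]))  # ~0.047
```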


pages: 350 words: 98,077

Artificial Intelligence: A Guide for Thinking Humans by Melanie Mitchell

Ada Lovelace, AI winter, Amazon Mechanical Turk, Apple's 1984 Super Bowl advert, artificial general intelligence, autonomous vehicles, Bernie Sanders, Claude Shannon: information theory, cognitive dissonance, computer age, computer vision, dark matter, Douglas Hofstadter, Elon Musk, en.wikipedia.org, Gödel, Escher, Bach, I think there is a world market for maybe five computers, ImageNet competition, Jaron Lanier, job automation, John Markoff, John von Neumann, Kevin Kelly, Kickstarter, license plate recognition, Mark Zuckerberg, natural language processing, Norbert Wiener, ought to be enough for anybody, pattern recognition, performance metric, RAND corporation, Ray Kurzweil, recommendation engine, ride hailing / ride sharing, Rodney Brooks, self-driving car, sentiment analysis, Silicon Valley, Singularitarianism, Skype, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Pinker, strong AI, superintelligent machines, theory of mind, There's no reason for any individual to have a computer in his home - Ken Olsen, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!

Dean, “Large Scale Deep Learning,” slides from keynote lecture, Conference on Information and Knowledge Management (CIKM), Nov. 2014, accessed Dec. 7, 2018, static.googleusercontent.com/media/research.google.com/en//people/jeff/CIKM-keynote-Nov2014.pdf.
4. S. Levy, “The iBrain Is Here, and It’s Already in Your Phone,” Wired, Aug. 24, 2016, www.wired.com/2016/08/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple.
5. In the speech-recognition literature, the most commonly used performance metric is “word-error rate” on large collections of short audio segments. While the word-error-rate performance of state-of-the-art speech-recognition systems applied to these collections is at or above “human level,” there are several reasons to argue that when more realistic measures are used (for example, noisy or accented speech, important words, ambiguous language), speech-recognition performance by machines is still significantly below that of humans.
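Word-error rate itself is just word-level edit distance (substitutions, insertions, deletions) divided by the number of words in the reference transcript. A minimal sketch:

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by reference length:
    the standard WER metric reported in the speech literature."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn first i ref words into first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / len(ref)

# Two deletions ("on", "the") against a six-word reference: ~0.33.
print(word_error_rate("the cat sat on the mat", "the cat sat mat"))
```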


RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

Amazon Web Services, bioinformatics, business intelligence, cloud computing, database schema, fault tolerance, full text search, information retrieval, Internet Archive, Internet of things, linked data, NP-complete, peer-to-peer, performance metric, random walk, recommendation engine, RFID, semantic web, Silicon Valley, social intelligence, software as a service, SPARQL, web application

The data generated can be split into multiple files. The data generator may, finally, include update transactions and may use named graphs. The Lehigh University Benchmark (LUBM; http://swat.cse.lehigh.edu/projects/lubm/) is built around the university domain and includes a university domain ontology, customizable and repeatable synthetic data, a set of 14 test queries, and several performance metrics. The data generator allows us to indicate the number of universities to generate and the seed used for random data generation (for repeatability). The corresponding queries bear large inputs with both high and low selectivities, triangular patterns of relationships, explicit and implicit subClassOf and subPropertyOf, and some OWL reasoning (transitive and inverse properties and some inferences based on domain and range). The University Ontology Benchmark (UOBM; http://www.springerlink.com/content/l0wu543x26350462/University) is designed as an extension of LUBM with a more complex ontology (i.e., supporting OWL Lite and OWL DL).
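As a rough illustration of running a LUBM-style test query, here is a sketch using the Python rdflib library. The file name stands in for one of the generator's output files, and the query follows the shape of LUBM's published Query 1 (large input, high selectivity); treat both as assumptions rather than exact benchmark artifacts.

```python
from rdflib import Graph

# Load one generated university file (name is illustrative; the LUBM
# generator emits one RDF/XML file per university department).
g = Graph()
g.parse("University0_0.owl", format="xml")

# LUBM-style query: graduate students taking one specific course.
q = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ub:  <http://swat.cse.lehigh.edu/onto/univ-bench.owl#>
SELECT ?x WHERE {
  ?x rdf:type ub:GraduateStudent .
  ?x ub:takesCourse <http://www.Department0.University0.edu/GraduateCourse0> .
}
"""
for row in g.query(q):
    print(row.x)
```

Note that rdflib alone performs no OWL reasoning, so the queries relying on implicit subClassOf or inverse properties would need an inferencing layer on top, which is exactly the capability the benchmark is probing.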


pages: 463 words: 105,197

Radical Markets: Uprooting Capitalism and Democracy for a Just Society by Eric Posner, E. Weyl

3D printing, activist fund / activist shareholder / activist investor, Affordable Care Act / Obamacare, Airbnb, Amazon Mechanical Turk, anti-communist, augmented reality, basic income, Berlin Wall, Bernie Sanders, Branko Milanovic, business process, buy and hold, carbon footprint, Cass Sunstein, Clayton Christensen, cloud computing, collective bargaining, commoditize, Corn Laws, corporate governance, crowdsourcing, cryptocurrency, Donald Trump, Elon Musk, endowment effect, Erik Brynjolfsson, Ethereum, feminist movement, financial deregulation, Francis Fukuyama: the end of history, full employment, George Akerlof, global supply chain, guest worker program, hydraulic fracturing, Hyperloop, illegal immigration, immigration reform, income inequality, income per capita, index fund, informal economy, information asymmetry, invisible hand, Jane Jacobs, Jaron Lanier, Jean Tirole, Joseph Schumpeter, Kenneth Arrow, labor-force participation, laissez-faire capitalism, Landlord’s Game, liberal capitalism, low skilled workers, Lyft, market bubble, market design, market friction, market fundamentalism, mass immigration, negative equity, Network effects, obamacare, offshore financial centre, open borders, Pareto efficiency, passive investing, patent troll, Paul Samuelson, performance metric, plutocrats, Plutocrats, pre–internet, random walk, randomized controlled trial, Ray Kurzweil, recommendation engine, rent-seeking, Richard Thaler, ride hailing / ride sharing, risk tolerance, road to serfdom, Robert Shiller, Ronald Coase, Rory Sutherland, Second Machine Age, second-price auction, self-driving car, shareholder value, sharing economy, Silicon Valley, Skype, special economic zone, spectrum auction, speech recognition, statistical model, stem cell, telepresence, Thales and the olive presses, Thales of Miletus, The Death and Life of Great American Cities, The Future of Employment, The Market for Lemons, The Nature of the Firm, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Thorstein Veblen, trade route, transaction costs, trickle-down economics, Uber and Lyft, uber lyft, universal basic income, urban planning, Vanguard fund, women in the workforce, Zipcar

Yet our policy affects issues other than competition, and we now consider its likely effects on these other areas. GOVERNANCE Beyond the competition benefits of our proposal, it would also greatly improve corporate governance. Commentators have noted that the current system of institutional-investor dominance harms corporate governance. In the words of law professors Ronald Gilson and Jeffrey Gordon: Institutional intermediaries compete and are rewarded on the basis of “relative performance” metrics that give them little incentive to engage in shareholder activism that could address shortfalls in managerial performance; such activity can improve absolute but not relative performance [of the institution].38 In other words, if a large investor spends time and resources improving the performance of Firm X, the higher stock price of Firm X benefits all owners of Firm X. Because a large institutional investor owns roughly the same shares, including the shares of Firm X, as other large institutional investors, it has gained nothing relative to its own competitors in the financial services industry.


pages: 903 words: 235,753

The Stack: On Software and Sovereignty by Benjamin H. Bratton

1960s counterculture, 3D printing, 4chan, Ada Lovelace, additive manufacturing, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic trading, Amazon Mechanical Turk, Amazon Web Services, augmented reality, autonomous vehicles, basic income, Benevolent Dictator For Life (BDFL), Berlin Wall, bioinformatics, bitcoin, blockchain, Buckminster Fuller, Burning Man, call centre, carbon footprint, carbon-based life, Cass Sunstein, Celebration, Florida, charter city, clean water, cloud computing, connected car, corporate governance, crowdsourcing, cryptocurrency, dark matter, David Graeber, deglobalization, dematerialisation, disintermediation, distributed generation, don't be evil, Douglas Engelbart, Edward Snowden, Elon Musk, en.wikipedia.org, Eratosthenes, Ethereum, ethereum blockchain, facts on the ground, Flash crash, Frank Gehry, Frederick Winslow Taylor, future of work, Georg Cantor, gig economy, global supply chain, Google Earth, Google Glasses, Guggenheim Bilbao, High speed trading, Hyperloop, illegal immigration, industrial robot, information retrieval, Intergovernmental Panel on Climate Change (IPCC), intermodal, Internet of things, invisible hand, Jacob Appelbaum, Jaron Lanier, Joan Didion, John Markoff, Joi Ito, Jony Ive, Julian Assange, Khan Academy, liberal capitalism, lifelogging, linked data, Mark Zuckerberg, market fundamentalism, Marshall McLuhan, Masdar, McMansion, means of production, megacity, megastructure, Menlo Park, Minecraft, MITM: man-in-the-middle, Monroe Doctrine, Network effects, new economy, offshore financial centre, oil shale / tar sands, packet switching, PageRank, pattern recognition, peak oil, peer-to-peer, performance metric, personalized medicine, Peter Eisenman, Peter Thiel, phenotype, Philip Mirowski, Pierre-Simon Laplace, place-making, planetary scale, RAND corporation, recommendation engine, reserve currency, RFID, Robert Bork, Sand Hill Road, self-driving car, semantic web, sharing economy, Silicon Valley, Silicon Valley ideology, Slavoj Žižek, smart cities, smart grid, smart meter, social graph, software studies, South China Sea, sovereign wealth fund, special economic zone, spectrum auction, Startup school, statistical arbitrage, Steve Jobs, Steven Levy, Stewart Brand, Stuxnet, Superbowl ad, supply-chain management, supply-chain management software, TaskRabbit, the built environment, The Chicago School, the scientific method, Torches of Freedom, transaction costs, Turing complete, Turing machine, Turing test, undersea cable, universal basic income, urban planning, Vernor Vinge, Washington Consensus, web application, Westphalian system, WikiLeaks, working poor, Y Combinator

We see this play out with the absolute User's slide into an abyssal dissolution of the self when confronted with the potential totality of virtualized experiences. In response to the white noise of his infinitely refracted subjectivity, he reflects this entropy by sliding back into perceptual incoherency (or potentially stumbling toward secular hypermaterialism). It's true that the real purpose of QS is not to provide all possible information at once, but to reduce systemic complexity with summary diagrammatic accounts of one's inputs, states, and performance metrics. But adding more and more data sources to the mix and providing greater multivariate fidelity also produces other pathways of dissolution. By tracking external forces (e.g., environmental, microbial, economic) and their role in the formation of the User-subject's state and performance, the boundaries between internal and external systems are perforated and blurred. Those external variables not only act on you; in effect they are you as well, and so the profile reflecting back at the User is both more and less than a single figure (and as we'll see, sometimes those extrinsic forces live inside one's own body).

As discussed in the Interfaces chapter, the images of systemic interrelationality found in GUI and in dynamic visualizations not only diagram how platforms operate; they are the very instruments with which a User interacts with those platforms and with other Users in the first place. At stake for the redesign of the User is not only the subjective (QS) and objective (Exit) reflections of her inputs, states, and performance metrics within local/global and intrinsic/extrinsic variations, but also that the profiles of these traces are the medium through which those interactions are realized. The recursion is not only between scales of action; it is also between event and its mediation. Put differently, the composition with which (and into which) the tangled positions of Users draw their own maps (the sum of the parts that busily sum themselves) is always both more and less whole than the whole that sums their sums!


pages: 445 words: 105,255

Radical Abundance: How a Revolution in Nanotechnology Will Change Civilization by K. Eric Drexler

3D printing, additive manufacturing, agricultural Revolution, Bill Joy: nanobots, Brownian motion, carbon footprint, Cass Sunstein, conceptual framework, continuation of politics by other means, crowdsourcing, dark matter, double helix, failed state, global supply chain, industrial robot, iterative process, Mars Rover, means of production, Menlo Park, mutually assured destruction, New Journalism, performance metric, reversible computing, Richard Feynman, Silicon Valley, South China Sea, Thomas Malthus, V2 rocket, Vannevar Bush, zero-sum game

Participants in the ITRS can safely assume that silicon will rule for years to come, but the QISTR collaboration faced a range of fundamentally different competing approaches: quantum bits represented by the states of (pick one or more) trapped atoms in a vacuum, spin states of atoms embedded in silicon, nuclear spins in solution-phase molecules, or photons in purely photonic systems. These approaches differ radically in scalability and manufacturability as well as in the range of functions that each can implement. The QISTR document must rise to a higher level of abstraction than ITRS. Rather than focusing on performance metrics, it adopts the “DiVincenzo promise criteria” (including scalability, gate universality, decoherence times, and suitable means for input and output) and through these criteria for essential functional capabilities, QISTR then compares diverse approaches and their potential to serve as more than dead-end demos. QISTR shows how a community can explore fields that are rich in alternatives, identifying the technologies that have a genuine potential to serve a role in a functional system, setting others aside as unpromising.


pages: 350 words: 109,379

How to Run a Government: So That Citizens Benefit and Taxpayers Don't Go Crazy by Michael Barber

Affordable Care Act / Obamacare, Atul Gawande, battle of ideas, Berlin Wall, Black Swan, Checklist Manifesto, collapse of Lehman Brothers, collective bargaining, deliberate practice, facts on the ground, failed state, fear of failure, full employment, G4S, illegal immigration, invisible hand, libertarian paternalism, Mark Zuckerberg, Nate Silver, North Sea oil, obamacare, performance metric, Potemkin village, Ronald Reagan, school choice, The Signal and the Noise by Nate Silver, transaction costs, WikiLeaks

question. The review has a rubric divided into five sections and fifteen modules. The five headings are:
Develop a foundation for delivery
Understand the delivery challenge
Plan for delivery
Drive delivery
Create an irreversible delivery culture.*
The rubric simply asks a set of questions under each of the five sections, such as (under ‘Plan for delivery’): Do plans track relevant performance metrics, leading indicators and implementation indicators for each intervention? That is hardly a question designed to set the pulse racing, but the point of the rubric is not to emulate a good thriller, but to be thorough, to make sure, in the classic phrase, that no stone is left unturned in checking out whether a government machine or an individual department is ready to deliver or not. The rubric then offers best-case and worst-case options to help those responsible answer the questions for themselves.


pages: 370 words: 107,983

Rage Inside the Machine: The Prejudice of Algorithms, and How to Stop the Internet Making Bigots of Us All by Robert Elliott Smith

Ada Lovelace, affirmative action, AI winter, Alfred Russel Wallace, Amazon Mechanical Turk, animal electricity, autonomous vehicles, Black Swan, British Empire, cellular automata, citizen journalism, Claude Shannon: information theory, combinatorial explosion, corporate personhood, correlation coefficient, crowdsourcing, Daniel Kahneman / Amos Tversky, desegregation, discovery of DNA, Douglas Hofstadter, Elon Musk, Fellow of the Royal Society, feminist movement, Filter Bubble, Flash crash, Gerolamo Cardano, gig economy, Gödel, Escher, Bach, invention of the wheel, invisible hand, Jacquard loom, Jacques de Vaucanson, John Harrison: Longitude, John von Neumann, Kenneth Arrow, low skilled workers, Mark Zuckerberg, mass immigration, meta analysis, meta-analysis, mutually assured destruction, natural language processing, new economy, On the Economy of Machinery and Manufactures, p-value, pattern recognition, Paul Samuelson, performance metric, Pierre-Simon Laplace, precariat, profit maximization, profit motive, Silicon Valley, social intelligence, statistical model, Stephen Hawking, stochastic process, telemarketer, The Bell Curve by Richard Herrnstein and Charles Murray, The Future of Employment, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, theory of mind, Thomas Bayes, Thomas Malthus, traveling salesman, Turing machine, Turing test, twin studies, Vilfredo Pareto, Von Neumann architecture, women in the workforce

Likewise, Amazon warehouse workers select products from shelves that would be difficult for robots to identify or handle, but feed an otherwise mechanized system of sales, order and delivery (excepting those final delivery steps, which are also increasingly done by algorithmically controlled humans). Everything in these systems is managed algorithmically with incentives and punishments hard-wired to performance metrics. Deliveroo riders who do the most drops get benefits (like the most lucrative time slots), but those who fail to meet their targets get a lower rating and a less lucrative shift next time. Marx’s vision of people as the ‘mere living appendage’ of a mechanism is fully realized here. The number of people doing this simplified, but difficult to computerize, Victorian-style piecework is growing rapidly, having increased at least by a factor of ten in just the last three years.


The Targeter: My Life in the CIA, Hunting Terrorists and Challenging the White House by Nada Bakos

Chelsea Manning, Edward Snowden, fear of failure, feminist movement, meta analysis, meta-analysis, performance metric, place-making, RAND corporation, WikiLeaks

“not the group responsible for their actions”: Liz Sly, “Al-Qaeda Disavows Any Ties with Radical Islamist ISIS Group in Syria, Iraq,” Washington Post, February 3, 2014.
73. school-age children gun down its prisoners: Lizzie Dearden, “ISIS Uses Young Boys to Hunt Down and Kill Prisoners in Ruined Syrian Castle for Gory Propaganda Video,” Independent, December 5, 2015.
74. publishes annual reports for its financial backers: Allison Hoffman, “1,083 Assassinations and Other Performance Metrics: ISIS’s Year in Review,” Bloomberg Businessweek, June 18, 2014.
75. a high perception of the threat: Christoph O. Meyer, “International Terrorism as a Force of Homogenization? A Constructivist Approach to Understanding Cross-National Threat Perceptions and Responses,” Cambridge Review of International Affairs 22, no. 4 (2009): 647–66.


pages: 461 words: 106,027

Zero to Sold: How to Start, Run, and Sell a Bootstrapped Business by Arvid Kahl

"side hustle", business process, centre right, Chuck Templeton: OpenTable:, continuous integration, coronavirus, COVID-19, Covid-19, crowdsourcing, domain-specific language, financial independence, Google Chrome, if you build it, they will come, information asymmetry, information retrieval, inventory management, Jeff Bezos, job automation, Kubernetes, minimum viable product, Network effects, performance metric, post-work, premature optimization, risk tolerance, Ruby on Rails, sentiment analysis, Silicon Valley, software as a service, source of truth, statistical model, subscription business, supply-chain management, trickle-down economics, web application

When we noticed that teachers were copying and pasting this little bit of text into their finished feedback, we quickly added a centralized "signature" feature, where they could add their sign-off to their generated feedback automatically, saving them a few seconds every time they used our product. Over a day, that's a few minutes, and it quickly adds up. By adding lots of features like this, we shaved off a few hours a week just by optimizing the hot path. Within your application, hot paths can be found by looking at performance metrics. If a screen that your users view 100 times a day loads a second faster, that's a few minutes saved at the end of the day. If you measure which parts of your service your customers use most often, you will know where your optimization efforts will be most impactful. You can accomplish this on the user-experience level by using behavior analytics tools like Hotjar and CrazyEgg, which record your users' sessions.
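A toy sketch of that hot-path arithmetic: weight each screen's load time by how often it is viewed, and optimize where the aggregate waiting time is largest. The screen names and telemetry numbers are invented.

```python
# Hypothetical per-screen telemetry: (views per day, avg load seconds).
screens = {
    "feedback_editor": (100, 2.4),
    "settings":        (5,   3.0),
    "dashboard":       (60,  1.1),
}

# Rank by total seconds users spend waiting per day: the hot path is
# where shaving one second has the biggest aggregate payoff.
for name, (views, load) in sorted(
        screens.items(), key=lambda kv: -(kv[1][0] * kv[1][1])):
    print(f"{name}: {views * load:.0f} user-seconds/day")
```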


pages: 470 words: 109,589

Apache Solr 3 Enterprise Search Server by Unknown

bioinformatics, continuous integration, database schema, en.wikipedia.org, fault tolerance, Firefox, full text search, information retrieval, natural language processing, performance metric, platform as a service, Ruby on Rails, web application

Summary We briefly covered a wide variety of the issues that surround taking a Solr configuration that works in a development environment and getting it ready for the rigors of a production environment. Solr's modular nature and stripped down focus on search allows it to be compatible with a broad variety of deployment platforms. Solr offers a wealth of monitoring options, from log files, to HTTP request logs, to JMX options. Nonetheless, for a really robust solution, you must define what the key performance metrics are that concern you, and then implement automated solutions for tracking them. Now that we have set up our Solr server, we need to take advantage of it to build better applications. In the next chapter, we'll look at how to easily integrate Solr search through various client libraries. Chapter 9. Integrating Solr As the saying goes, if a tree falls in the woods and no one hears it, did it make a sound?
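As one hedged sketch of such automated tracking, the loop below polls a latency figure and alerts on a threshold. The stats URL and the 'avgTimePerRequest' field name are assumptions about a stock Solr 3.x install's admin stats page; substitute whichever key metrics and endpoints you defined for your deployment.

```python
import time
import xml.etree.ElementTree as ET
import requests

STATS_URL = "http://localhost:8983/solr/admin/stats.jsp"  # assumed endpoint
THRESHOLD_MS = 200.0  # our chosen key metric: average query latency

def avg_query_time_ms():
    """Scrape the stats page and return the first avgTimePerRequest
    value found (field name assumed from Solr's handler statistics)."""
    root = ET.fromstring(requests.get(STATS_URL, timeout=5).content)
    for stat in root.iter("stat"):
        if stat.get("name") == "avgTimePerRequest":
            return float(stat.text.strip())
    raise RuntimeError("metric not found in stats output")

while True:
    latency = avg_query_time_ms()
    if latency > THRESHOLD_MS:
        print(f"ALERT: avg query time {latency:.1f} ms")
    time.sleep(60)  # poll once a minute; feed a dashboard or pager
```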


pages: 429 words: 114,726

The Computer Boys Take Over: Computers, Programmers, and the Politics of Technical Expertise by Nathan L. Ensmenger

barriers to entry, business process, Claude Shannon: information theory, computer age, deskilling, Donald Knuth, Firefox, Frederick Winslow Taylor, future of work, Grace Hopper, informal economy, information retrieval, interchangeable parts, Isaac Newton, Jacquard loom, job satisfaction, John von Neumann, knowledge worker, loose coupling, new economy, Norbert Wiener, pattern recognition, performance metric, Philip Mirowski, post-industrial society, Productivity paradox, RAND corporation, Robert Gordon, Shoshana Zuboff, sorting algorithm, Steve Jobs, Steven Levy, the market place, Thomas Kuhn: the structure of scientific revolutions, Thorstein Veblen, Turing machine, Von Neumann architecture, Y2K

One guidebook from 1969 for managers captured the essence of this adversarial approach to programmer management by describing the successful computer manager as the “one whose grasp of the job is reflected in simple work units that are in the hand[s] of simple programmers; not one who, with control lost, is held in contempt by clever programmers dangerously maintaining control on his behalf.”32 An uncritical reading of this and other similar management perspectives on the process of software development, with their confident claims about the value and efficacy of various performance metrics, development methodologies, and programming languages, might suggest that Kraft and Greenbaum are correct in their assessments. In fact, many of these methodologies do indeed represent “elaborate efforts” that “are being made to develop ways of gradually eliminating programmers, or at least reduce their average skill levels, required training, experience, and so on.”33 Their authors would be the first to admit it.


pages: 401 words: 115,959

Philanthrocapitalism by Matthew Bishop, Michael Green, Bill Clinton

Albert Einstein, anti-communist, barriers to entry, battle of ideas, Bernie Madoff, Bob Geldof, Bonfire of the Vanities, business process, business process outsourcing, Charles Lindbergh, clean water, cleantech, corporate governance, corporate social responsibility, Dava Sobel, David Ricardo: comparative advantage, don't be evil, family office, financial innovation, full employment, global pandemic, global village, God and Mammon, Hernando de Soto, high net worth, Intergovernmental Panel on Climate Change (IPCC), invisible hand, James Dyson, John Harrison: Longitude, joint-stock company, knowledge economy, knowledge worker, Live Aid, lone genius, Marc Andreessen, market bubble, mass affluent, microcredit, Mikhail Gorbachev, Nelson Mandela, new economy, offshore financial centre, old-boy network, peer-to-peer lending, performance metric, Peter Singer: altruism, plutocrats, Plutocrats, profit maximization, profit motive, Richard Feynman, risk tolerance, risk-adjusted returns, Ronald Coase, Ronald Reagan, shareholder value, Silicon Valley, Slavoj Žižek, South Sea Bubble, sovereign wealth fund, stem cell, Steve Jobs, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Malthus, Thorstein Veblen, trade liberalization, transaction costs, trickle-down economics, wealth creators, winner-take-all economy, working poor, World Values Survey, X Prize

At least what data there is can now be viewed relatively easily using the Guide Star Web site, a sort of “Bloomberg screen” for philanthropy. Launched in America in 1994 by entrepreneur Buzz Schmidt, and since rolled out in several other countries, Guide Star publishes raw data from tax returns, covering more than 1.7 million nonprofits in America alone. Yet this data is often too limited to provide an accurate assessment of performance. Hence the need for research firms such as NPC. At its best, NPC provides the sort of performance metrics that new philanthropists love. It does such things as calculate the rate of return on supporting a charity that gets a persistent truant to attend school regularly (1,160 percent, it turns out). In a similar spirit, Geneva Global, an American firm founded in 2000 that focuses on effective giving to developing countries, measures the performance of donations using units of what it calls Life Change.


pages: 424 words: 114,905

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again by Eric Topol

23andMe, Affordable Care Act / Obamacare, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, artificial general intelligence, augmented reality, autonomous vehicles, bioinformatics, blockchain, cloud computing, cognitive bias, Colonization of Mars, computer age, computer vision, conceptual framework, creative destruction, crowdsourcing, Daniel Kahneman / Amos Tversky, dark matter, David Brooks, digital twin, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, fault tolerance, George Santayana, Google Glasses, ImageNet competition, Jeff Bezos, job automation, job satisfaction, Joi Ito, Mark Zuckerberg, medical residency, meta analysis, meta-analysis, microbiome, natural language processing, new economy, Nicholas Carr, nudge unit, pattern recognition, performance metric, personalized medicine, phenotype, placebo effect, randomized controlled trial, recommendation engine, Rubik’s Cube, Sam Altman, self-driving car, Silicon Valley, speech recognition, Stephen Hawking, text mining, the scientific method, Tim Cook: Apple, War on Poverty, Watson beat the top human players on Jeopardy!, working-age population

The error rate dropped to 1 percent, and the area under the receiver operating characteristic (ROC) curve, a measure of predictive accuracy where 1.0 is perfect, rose from 0.63 at the time of the scatterplot (Figure 4.1) to 0.86. We’ll be referring to ROC curves a lot throughout the book, since they are considered one of the best ways to show (underscoring one, since the method has been sharply criticized and there are ongoing efforts to develop better performance metrics) and quantify accuracy—plotting the true positive rate against the false positive rate (Figure 4.2). The value denoting accuracy is the area under the curve, whereby 1.0 is perfect and 0.50, the diagonal line, is “worthless,” the equivalent of a coin toss. The area of 0.63 that AliveCor initially obtained is deemed poor. Generally, 0.80–0.90 is considered good, 0.70–0.80 fair. They further prospectively validated their algorithm in forty dialysis patients with simultaneous ECGs and potassium levels.
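For readers who want to reproduce the metric, a minimal sketch with scikit-learn, using invented labels and model scores (1 could stand for an abnormal potassium reading):

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical ground-truth labels and model scores.
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7]

# Area under the ROC curve: 1.0 perfect, 0.5 a coin toss (0.875 here).
print("AUC:", roc_auc_score(y_true, y_score))

# The curve itself: true positive rate vs. false positive rate at
# every score threshold, which is what gets plotted in such figures.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t in zip(fpr, tpr):
    print(f"FPR={f:.2f}  TPR={t:.2f}")
```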


Sam Friedman and Daniel Laurison by The Class Ceiling Why it Pays to be Privileged (2019, Policy Press)

affirmative action, Boris Johnson, discrete time, Donald Trump, Downton Abbey, equal pay for equal work, gender pay gap, gig economy, Gini coefficient, glass ceiling, Hyperloop, if you build it, they will come, income inequality, invisible hand, job satisfaction, knowledge economy, longitudinal study, meta analysis, meta-analysis, nudge unit, old-boy network, performance metric, psychological pricing, school choice, Skype, starchitect, The Spirit Level, unpaid internship, upwardly mobile

I remember at one point he was like, “You’re not ready yet, but I’ll tell you what I’ll do, we’ll see if we can find someone who will do this role for two to three years, possibly been at a large firm, has reached mandatory retirement but could come in and be a contractor for us, you could learn a lot from that, with a view to, not making any promises, bringing you through in three years’ time …” What is significant here is how sponsors like Martin were able to single-handedly change the course of a person’s career, here placing James on an accelerated pathway – often referred to as the ‘Partner track’. TC facilitated this kind of sponsorship by allowing Partners to intervene, influence or “game” promotion decisions. Many of our interviewees told us this was a fairly standard practice – an “open secret”, as Jane, an Audit Partner noted – and involved Partners circumventing or bypassing “objective” performance metrics or ratings in order to secure promotions for those they wanted to promote – their “favourites”, as Tax Partner Karen put it. Significantly, when people told us about opportunities brokered by sponsors, they almost always narrated this in terms of rewarding skill or competence – of someone “spotting talent” or “believing in me”. Yet we were particularly interested in how these relationships were established in the first place.


eBoys by Randall E. Stross

barriers to entry, business cycle, call centre, carried interest, cognitive dissonance, disintermediation, edge city, high net worth, hiring and firing, Jeff Bezos, job-hopping, knowledge worker, late capitalism, market bubble, Menlo Park, new economy, old-boy network, passive investing, performance metric, pez dispenser, railway mania, rolodex, Sand Hill Road, shareholder value, Silicon Valley, Silicon Valley startup, Steve Ballmer, Steve Jobs, Y2K

Pat Cloherty, of New York–based Patricof & Company (whose Mephistophelean partners were dished up in Michael Wolff’s 1998 book, Burn Rate), spoke of her firm’s highly evolved hierarchy: analysts, associates, senior associates, principals, and at the apex, partners. Sonja Hoel, a young venture capitalist at California-based Menlo Ventures, described how her firm’s year-end bonuses were based on a point system that would do a Pentagon-sized bureaucracy proud, carefully documenting “deals out the door,” replete with negative points, too, assigned for undesirable “performance metrics.” As a profession, venture firms were most comfortable with wide skews of compensation; in one case, the most senior general partner arrogated two thirds of the entire firm’s carry just for himself. (In 1995, when Benchmark announced its structure of equal compensation for all partners, a partner at Boston-based TA Associates sniffed in the Venture Capital Journal that that was “communism”; Bruce Dunlevie, upon being told by the reporter of the comment, guessed that “the guy who said that must have been a senior partner.”)


pages: 413 words: 117,782

What Happened to Goldman Sachs: An Insider's Story of Organizational Drift and Its Unintended Consequences by Steven G. Mandis

activist fund / activist shareholder / activist investor, algorithmic trading, Berlin Wall, bonus culture, BRICs, business process, buy and hold, collapse of Lehman Brothers, collateralized debt obligation, commoditize, complexity theory, corporate governance, corporate raider, Credit Default Swap, credit default swaps / collateralized debt obligations, crony capitalism, disintermediation, diversification, Emanuel Derman, financial innovation, fixed income, friendly fire, Goldman Sachs: Vampire Squid, high net worth, housing crisis, London Whale, Long Term Capital Management, merger arbitrage, Myron Scholes, new economy, passive investing, performance metric, risk tolerance, Ronald Reagan, Saturday Night Live, Satyajit Das, shareholder value, short selling, sovereign wealth fund, The Nature of the Firm, too big to fail, value at risk

Although my new bosses were smart, sophisticated, and supportive, and as demanding as my investment banking bosses, there was an intense focus on measuring relatively short-term results because they were measurable. Our performance as investors was marked to market every day, meaning that the value of the trades we made was calculated every day, so there was total transparency about how much money we’d made or lost for the firm each and every day. This isn’t done in investment banking, although each year new performance metrics were being added by the time I left for FICC. Typically in banking, relationships take a long time to develop and pay off. A bad day in banking may mean that, after years of meetings and presentations performed for free, a client didn’t select you to execute a transaction. You could offer excuses: “The other bank offered to loan them money,” “They were willing to do it much cheaper,” and so on.


pages: 478 words: 126,416

Other People's Money: Masters of the Universe or Servants of the People? by John Kay

Affordable Care Act / Obamacare, asset-backed security, bank run, banking crisis, Basel III, Bernie Madoff, Big bang: deregulation of the City of London, bitcoin, Black Swan, Bonfire of the Vanities, bonus culture, Bretton Woods, buy and hold, call centre, capital asset pricing model, Capital in the Twenty-First Century by Thomas Piketty, cognitive dissonance, corporate governance, Credit Default Swap, cross-subsidies, dematerialisation, disruptive innovation, diversification, diversified portfolio, Edward Lloyd's coffeehouse, Elon Musk, Eugene Fama: efficient market hypothesis, eurozone crisis, financial innovation, financial intermediation, financial thriller, fixed income, Flash crash, forward guidance, Fractional reserve banking, full employment, George Akerlof, German hyperinflation, Goldman Sachs: Vampire Squid, Growth in a Time of Debt, income inequality, index fund, inflation targeting, information asymmetry, intangible asset, interest rate derivative, interest rate swap, invention of the wheel, Irish property bubble, Isaac Newton, John Meriwether, light touch regulation, London Whale, Long Term Capital Management, loose coupling, low cost airline, low cost carrier, M-Pesa, market design, millennium bug, mittelstand, money market fund, moral hazard, mortgage debt, Myron Scholes, NetJets, new economy, Nick Leeson, Northern Rock, obamacare, Occupy movement, offshore financial centre, oil shock, passive investing, Paul Samuelson, peer-to-peer lending, performance metric, Peter Thiel, Piper Alpha, Ponzi scheme, price mechanism, purchasing power parity, quantitative easing, quantitative trading / quantitative finance, railway mania, Ralph Waldo Emerson, random walk, regulatory arbitrage, Renaissance Technologies, rent control, risk tolerance, road to serfdom, Robert Shiller, Ronald Reagan, Schrödinger's Cat, shareholder value, Silicon Valley, Simon Kuznets, South Sea Bubble, sovereign wealth fund, Spread Networks laid a new fibre optics cable between New York and Chicago, Steve Jobs, Steve Wozniak, The Great Moderation, The Market for Lemons, the market place, The Myth of the Rational Market, the payments system, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Tobin tax, too big to fail, transaction costs, tulip mania, Upton Sinclair, Vanguard fund, Washington Consensus, We are the 99%, Yom Kippur War

Even as the thinly capitalised Deutsche Bank was benefiting from state guarantees of its liabilities, it was buying back its own shares to reduce its capital base. And whatever return on equity was claimed by the financial officers of Deutsche Bank, the shareholder returns told a different, and more enlightening, story: the average annual total return on its shares (in US dollars with dividends re-invested) over the period May 2002 to May 2012 (Ackermann’s tenure as chief executive of the bank) was around minus 2 per cent. RoE is an inappropriate performance metric for any company, but especially for a bank, and it is bizarre that its use should have been championed by people who profess particular expertise in financial and risk management. Banks still proclaim return on equity targets: less ambitious, but nevertheless fanciful. In recent discussions of the implications of imposing more extensive capital requirements on banks, a figure of 15 per cent has been proposed and endorsed as a measure of the cost of equity capital to conglomerate banks.28 If these companies were really likely to earn 15 per cent rates of return for the benefit of their shareholders, there would be long queues of investors seeking these attractive returns.
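The two numbers contrasted here are easy to compute. A sketch, with an invented ten-year price and dividend series standing in for the actual Deutsche Bank data:

```python
def return_on_equity(net_income, shareholder_equity):
    """RoE: the metric the passage criticizes as a target."""
    return net_income / shareholder_equity

def annualized_total_return(prices, dividends):
    """Total shareholder return with dividends reinvested, annualized.
    prices: year-end share prices; dividends: dividend paid each year."""
    shares = 1.0
    for i in range(1, len(prices)):
        shares += shares * dividends[i] / prices[i]  # reinvest that year
    years = len(prices) - 1
    return (shares * prices[-1] / prices[0]) ** (1 / years) - 1

# Hypothetical series: a boom, a crisis-era collapse, a weak recovery.
prices = [80, 75, 90, 110, 120, 40, 55, 60, 45, 38, 65]
dividends = [0, 2, 2, 3, 4, 1, 1, 1, 1, 1, 1]
print(f"{annualized_total_return(prices, dividends):.1%}")
```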


pages: 382 words: 120,064

Bank 3.0: Why Banking Is No Longer Somewhere You Go but Something You Do by Brett King

3D printing, additive manufacturing, Airbus A320, Albert Einstein, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, asset-backed security, augmented reality, barriers to entry, bitcoin, bounce rate, business intelligence, business process, business process outsourcing, call centre, capital controls, citizen journalism, Clayton Christensen, cloud computing, credit crunch, crowdsourcing, disintermediation, en.wikipedia.org, fixed income, George Gilder, Google Glasses, high net worth, I think there is a world market for maybe five computers, Infrastructure as a Service, invention of the printing press, Jeff Bezos, jimmy wales, Kickstarter, London Interbank Offered Rate, M-Pesa, Mark Zuckerberg, mass affluent, Metcalfe’s law, microcredit, mobile money, more computing power than Apollo, Northern Rock, Occupy movement, optical character recognition, peer-to-peer, performance metric, Pingit, platform as a service, QR code, QWERTY keyboard, Ray Kurzweil, recommendation engine, RFID, risk tolerance, Robert Metcalfe, self-driving car, Skype, speech recognition, stem cell, telepresence, Tim Cook: Apple, transaction costs, underbanked, US Airways Flight 1549, web application

There are, however, two sides of Big Data that are consistently discussed in the industry as having strong business benefit. The first is the ability to make better trading decisions, and the second, the ability to connect with customers in the retail environment. In a trading environment, the financial benefits of Big Data appear extremely compelling. The ability, for example, to understand trading cost analytics, capacity of a trade, performance metrics of traders, etc. could be massively profitable to a trading business. How do you create alpha opportunities to outperform, based on that data? The ability to create algorithms that forecast prices in the near term and then make trading decisions accordingly is what will likely drive the profits of banking and trading firms in the near term. Speed of execution is, of course, another key platform capability to leverage this learning and has spawned a raft of low-latency platform investments designed to capture the value of these so-called “alpha” data points.


pages: 472 words: 117,093

Machine, Platform, Crowd: Harnessing Our Digital Future by Andrew McAfee, Erik Brynjolfsson

"Robert Solow", 3D printing, additive manufacturing, AI winter, Airbnb, airline deregulation, airport security, Albert Einstein, Amazon Mechanical Turk, Amazon Web Services, artificial general intelligence, augmented reality, autonomous vehicles, backtesting, barriers to entry, bitcoin, blockchain, British Empire, business cycle, business process, carbon footprint, Cass Sunstein, centralized clearinghouse, Chris Urmson, cloud computing, cognitive bias, commoditize, complexity theory, computer age, creative destruction, crony capitalism, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, Dean Kamen, discovery of DNA, disintermediation, disruptive innovation, distributed ledger, double helix, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Ethereum, ethereum blockchain, everywhere but in the productivity statistics, family office, fiat currency, financial innovation, George Akerlof, global supply chain, Hernando de Soto, hive mind, information asymmetry, Internet of things, inventory management, iterative process, Jean Tirole, Jeff Bezos, jimmy wales, John Markoff, joint-stock company, Joseph Schumpeter, Kickstarter, law of one price, longitudinal study, Lyft, Machine translation of "The spirit is willing, but the flesh is weak." to Russian and back, Marc Andreessen, Mark Zuckerberg, meta analysis, meta-analysis, Mitch Kapor, moral hazard, multi-sided market, Myron Scholes, natural language processing, Network effects, new economy, Norbert Wiener, Oculus Rift, PageRank, pattern recognition, peer-to-peer lending, performance metric, plutocrats, Plutocrats, precision agriculture, prediction markets, pre–internet, price stability, principal–agent problem, Ray Kurzweil, Renaissance Technologies, Richard Stallman, ride hailing / ride sharing, risk tolerance, Ronald Coase, Satoshi Nakamoto, Second Machine Age, self-driving car, sharing economy, Silicon Valley, Skype, slashdot, smart contracts, Snapchat, speech recognition, statistical model, Steve Ballmer, Steve Jobs, Steven Pinker, supply-chain management, TaskRabbit, Ted Nelson, The Market for Lemons, The Nature of the Firm, Thomas Davenport, Thomas L Friedman, too big to fail, transaction costs, transportation-network company, traveling salesman, Travis Kalanick, two-sided market, Uber and Lyft, Uber for X, uber lyft, ubercab, Watson beat the top human players on Jeopardy!, winner-take-all economy, yield management, zero day

US presidential elections are determined by the electoral college, not the national popular vote, and that calls for a more nuanced, state-by-state strategy. Similarly, it’s easy to measure page views or click-through generated by an online advertising campaign, but most companies care more about long-term sales, which are usually maximized by a different kind of campaign. Careful selection of the right data inputs and the right performance metrics, especially the overall evaluation criterion, is a key characteristic of successful data-driven decision makers. Algorithms Behaving Badly A real risk of turning over decisions to machines is that bias in algorithmic systems can perpetuate or even amplify some of the pernicious biases that exist in our society. For instance, Latanya Sweeney, a widely cited professor at Harvard, had a disturbing experience when she entered her own name into the Google search engine.
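A toy illustration of why the choice of overall evaluation criterion matters: the same A/B data picks different winners depending on the metric. All campaign names and figures are invented.

```python
# Hypothetical A/B results for two ad campaigns.
campaigns = {
    "flashy_banner": {"clicks": 900, "views": 10_000, "repeat_sales_90d": 40},
    "plain_review":  {"clicks": 300, "views": 10_000, "repeat_sales_90d": 95},
}

def best_by(metric):
    return max(campaigns, key=lambda c: metric(campaigns[c]))

# The easy metric and the overall evaluation criterion disagree:
print(best_by(lambda c: c["clicks"] / c["views"]))  # flashy_banner
print(best_by(lambda c: c["repeat_sales_90d"]))     # plain_review
```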


pages: 756 words: 120,818

The Levelling: What’s Next After Globalization by Michael O’sullivan

"Robert Solow", 3D printing, Airbnb, algorithmic trading, bank run, banking crisis, barriers to entry, Bernie Sanders, bitcoin, Black Swan, blockchain, Boris Johnson, Branko Milanovic, Bretton Woods, British Empire, business cycle, business process, capital controls, Celtic Tiger, central bank independence, cloud computing, continuation of politics by other means, corporate governance, credit crunch, cryptocurrency, deglobalization, deindustrialization, disruptive innovation, distributed ledger, Donald Trump, eurozone crisis, financial innovation, first-past-the-post, fixed income, Geoffrey West, Santa Fe Institute, Gini coefficient, global value chain, housing crisis, income inequality, Intergovernmental Panel on Climate Change (IPCC), knowledge economy, liberal world order, Long Term Capital Management, longitudinal study, market bubble, minimum wage unemployment, new economy, Northern Rock, offshore financial centre, open economy, pattern recognition, Peace of Westphalia, performance metric, private military company, quantitative easing, race to the bottom, reserve currency, Robert Gordon, Robert Shiller, Robert Shiller, Ronald Reagan, Scramble for Africa, secular stagnation, Silicon Valley, Sinatra Doctrine, South China Sea, South Sea Bubble, special drawing rights, supply-chain management, The inhabitant of London could order by telephone, sipping his morning tea in bed, the various products of the whole earth, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Thomas Kuhn: the structure of scientific revolutions, total factor productivity, trade liberalization, tulip mania, Valery Gerasimov, Washington Consensus

The last time the United States was troubled by a corporate governance crisis was in the first years of the twenty-first century, when WorldCom and Enron Corporation became prominent examples of poor governance. In both cases, suspect accounting and high debt led to high-profile bankruptcies. The title of a book about Enron, The Smartest Guys in the Room, gives a sense of the hubris involved.38 At the time of writing, corporate debt ratios and levels and executive pay and its relation to performance metrics are more stretched than they were in the run-up to the Enron scandal in 2001, and the overall pattern suggests that there is a growing, systemic governance problem in the United States. There are two ways to resolve this. One is the traditional approach of letting markets discover and price corporate governance issues. The other is to try to frame the governance problems and ensure they do not happen again.


pages: 476 words: 132,042

What Technology Wants by Kevin Kelly

Albert Einstein, Alfred Russel Wallace, Buckminster Fuller, c2.com, carbon-based life, Cass Sunstein, charter city, Clayton Christensen, cloud computing, computer vision, Danny Hillis, dematerialisation, demographic transition, double entry bookkeeping, Douglas Engelbart, en.wikipedia.org, Exxon Valdez, George Gilder, gravity well, hive mind, Howard Rheingold, interchangeable parts, invention of air conditioning, invention of writing, Isaac Newton, Jaron Lanier, Joan Didion, John Conway, John Markoff, John von Neumann, Kevin Kelly, knowledge economy, Lao Tzu, life extension, Louis Daguerre, Marshall McLuhan, megacity, meta analysis, meta-analysis, new economy, off grid, out of africa, performance metric, personalized medicine, phenotype, Picturephone, planetary scale, RAND corporation, random walk, Ray Kurzweil, recommendation engine, refrigerator car, Richard Florida, Rubik’s Cube, Silicon Valley, silicon-based life, Skype, speech recognition, Stephen Hawking, Steve Jobs, Stewart Brand, Ted Kaczynski, the built environment, the scientific method, Thomas Malthus, Vernor Vinge, wealth creators, Whole Earth Catalog, Y2K

As one exponential boom is subsumed into the next, an established technology relays its momentum to the next paradigm and carries forward an unrelenting growth. The exact unit of what is being measured can also morph from one subcurve to the next. We may start out counting pixel size, then shift to pixel density, then to pixel speed. The final performance trait may not be evident in the initial technologies and reveal itself only over the long term, perhaps as a macrotrend that continues indefinitely. In the case of computers, as the performance metric of chips is constantly recalibrated from one technological stage to the next, Moore’s Law—redefined—will never end. Compound S Curves. On this idealized chart, technological performance is measured on the vertical axis and time or engineering effort captured on the horizontal. A series of sub-S curves create an emergent larger-scale invariant slope. The slow demise of the more-transistors-per-chip trend is inevitable.
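That compounding is easy to simulate: stack a few logistic sub-curves whose ceilings rise by an order of magnitude each, and the envelope approximates a straight line on a log scale. A sketch with NumPy; all midpoints, ceilings, and rates are invented parameters.

```python
import numpy as np

def logistic(t, midpoint, ceiling, rate=1.0):
    """One technology's S-curve: slow start, boom, saturation."""
    return ceiling / (1 + np.exp(-rate * (t - midpoint)))

t = np.linspace(0, 40, 400)
# Each successor paradigm saturates ten times higher than the last.
performance = sum(logistic(t, midpoint=m, ceiling=10 ** (m // 10 + 1))
                  for m in (10, 20, 30))

# On a log axis the stacked sub-curves climb by roughly one decade per
# sub-curve: the emergent, larger-scale invariant slope described above.
for ti in (5, 15, 25, 35):
    i = np.searchsorted(t, ti)
    print(f"t={ti}: perf={performance[i]:.1f} "
          f"(log10={np.log10(performance[i]):.2f})")
```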


pages: 497 words: 130,817

Pedigree: How Elite Students Get Elite Jobs by Lauren A. Rivera

affirmative action, availability heuristic, barriers to entry, Donald Trump, fundamental attribution error, glass ceiling, income inequality, job satisfaction, knowledge economy, meta analysis, meta-analysis, new economy, performance metric, profit maximization, profit motive, school choice, Silicon Valley, Silicon Valley startup, The Wisdom of Crowds, unpaid internship, women in the workforce, young professional

Although cultural similarity can facilitate trust and communication, it often does so at the expense of group effectiveness and high-quality team decision making.39 Furthermore, the emphasis on super-elite schools and the lack of systematic structures in place to reduce the use of gender and race stereotypes in candidate evaluation push qualified women and minorities out of the pool in favor of males and whites. Such patterns could adversely affect organizational performance not only because of the relationship between demographic diversity and higher-quality decision making but also because gender and racial diversity have become key performance metrics that clients and future job candidates use to evaluate firm quality and status. Likewise, the subjective nature of the hiring process can leave employers open to costly gender and racial discrimination lawsuits. EPS firms have faced such suits in the past and continue to face them in the present. Finally, although screening on socioeconomic status may enhance a firm’s status and facilitate client comfort, it excludes individuals who have critical skills relevant for successful job performance.


pages: 303 words: 67,891

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms: Proceedings of the Agi Workshop 2006 by Ben Goertzel, Pei Wang

AI winter, artificial general intelligence, bioinformatics, brain emulation, combinatorial explosion, complexity theory, computer vision, conceptual framework, correlation coefficient, epigenetics, friendly AI, G4S, information retrieval, Isaac Newton, John Conway, Loebner Prize, Menlo Park, natural language processing, Occam's razor, p-value, pattern recognition, performance metric, Ray Kurzweil, Rodney Brooks, semantic web, statistical model, strong AI, theory of mind, traveling salesman, Turing machine, Turing test, Von Neumann architecture, Y2K

R. R. Gudwin. Evaluating intelligence: A computational semiotics perspective. In IEEE International Conference on Systems, Man and Cybernetics, pages 2080–2085, Nashville, Tennessee, USA, 2000. [30] J. Horst. A native intelligence metric for artificial systems. In Performance Metrics for Intelligent Systems Workshop, Gaithersburg, MD, USA, 2002. [31] D. Lenat and E. Feigenbaum. On the thresholds of knowledge. Artificial Intelligence, 47:185–250, 1991. [32] H. Masum, S. Christensen, and F. Oppacher. The Turing ratio: Metrics for open-ended tasks. In GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, pages 973–980, New York, 2002.


pages: 370 words: 129,096

Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future by Ashlee Vance

addicted to oil, Burning Man, cleantech, digital map, El Camino Real, Elon Musk, global supply chain, Hyperloop, industrial robot, Jeff Bezos, Kickstarter, low earth orbit, Mark Zuckerberg, Maui Hawaii, Menlo Park, Mercator projection, money market fund, multiplanetary species, optical character recognition, orbital mechanics / astrodynamics, paypal mafia, performance metric, Peter Thiel, pre–internet, risk tolerance, Ronald Reagan, Sand Hill Road, self-driving car, side project, Silicon Valley, Silicon Valley startup, Steve Jobs, technoutopianism, Tesla Model S, transaction costs, Tyler Cowen: Great Stagnation, We wanted flying cars, instead we got 140 characters, X Prize

Tesla had thought about doing a hybrid like Fisker where a gas engine would be present to recharge the car’s batteries after they had consumed an initial charge. The car would be able to travel fifty to eighty miles after being plugged into an outlet and then take advantage of ubiquitous gas stations as needed to top up the batteries, eliminating range anxiety. Tesla’s engineers prototyped the hybrid vehicle and ran all sorts of cost and performance metrics. In the end, they found the hybrid to be too much of a compromise. “It would be expensive, and the performance would not be as good as the all-electric car,” said J. B. Straubel. “And we would have needed to build a team to compete with the core competency of every car company in the world. We would have been betting against all the things we believe in, like the power electronics and batteries improving.


pages: 469 words: 132,438

Taming the Sun: Innovations to Harness Solar Energy and Power the Planet by Varun Sivaram

addicted to oil, Albert Einstein, asset-backed security, autonomous vehicles, bitcoin, blockchain, carbon footprint, cleantech, collateralized debt obligation, Colonization of Mars, decarbonisation, demand response, disruptive innovation, distributed generation, diversified portfolio, Donald Trump, Elon Musk, energy security, energy transition, financial innovation, fixed income, global supply chain, global village, Google Earth, hive mind, hydrogen economy, index fund, Indoor air pollution, Intergovernmental Panel on Climate Change (IPCC), Internet of things, M-Pesa, market clearing, market design, mass immigration, megacity, mobile money, Negawatt, off grid, oil shock, peer-to-peer lending, performance metric, renewable energy transition, Richard Feynman, ride hailing / ride sharing, Ronald Reagan, Silicon Valley, Silicon Valley startup, smart grid, smart meter, sovereign wealth fund, Tesla Model S, time value of money, undersea cable, wikimedia commons

The bottom chart compares the cost of solar panels, per watt of power-generating capacity, with the cost of memory microchips, per gigabyte of data-storage capacity. Sources: U.S. Energy Information Administration, IC Knowledge.

But technology experts consider solar’s cost declines to be unremarkable. Instead of comparing the cost of electricity from solar with that from fossil fuels, they might compare the cost of 1 watt of power generated from solar with the cost to achieve an analogous performance metric in a microchip, like a gigabyte of memory storage. That comparison, in the bottom panel of figure 0.1, is unflattering. Microchip costs have fallen a million times faster than those of solar panels. That rapid decline is a corollary of Moore’s Law. (Moore’s Law, by some reports, is dying for computer chips known as microprocessors. But it is alive and well for memory chips, as I’ve learned from my father—the technology chief at a major chipmaker—who recently unveiled a new flash memory chip design that is actually beating Moore’s Law and is already in your iPhone.)2 Recognizing the value of different perspectives, I’ve eagerly sought out more over the last decade.
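To make the scale of that gap concrete, here is a back-of-the-envelope sketch; the halving times are invented placeholders, not figures from the book or from figure 0.1.

```python
# Back-of-the-envelope with invented halving times (not the book's data).
YEARS = 40
CHIP_HALVING_YEARS = 2.0    # assumed Moore's-Law-style cost halving
SOLAR_HALVING_YEARS = 10.0  # assumed much slower halving for $/watt

chip_cost_factor = 0.5 ** (YEARS / CHIP_HALVING_YEARS)    # fraction of starting cost
solar_cost_factor = 0.5 ** (YEARS / SOLAR_HALVING_YEARS)
print(solar_cost_factor / chip_cost_factor)  # 65536.0: chips fall 2**16 times further
```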


pages: 513 words: 141,153

The Spider Network: The Wild Story of a Math Genius, a Gang of Backstabbing Bankers, and One of the Greatest Scams in Financial History by David Enrich

Bernie Sanders, call centre, centralized clearinghouse, computerized trading, Credit Default Swap, Downton Abbey, Flash crash, Goldman Sachs: Vampire Squid, information asymmetry, interest rate derivative, interest rate swap, London Interbank Offered Rate, London Whale, Long Term Capital Management, Nick Leeson, Northern Rock, Occupy movement, performance metric, profit maximization, tulip mania, zero-sum game

He painted detailed portraits of bank trading technology, the mechanics of the derivatives market, how his Excel models worked, how traders and brokers communicated with each other, how traders like him thought and felt. “The first thing you think is, where’s the edge, where can I make a bit more money, how can I push the boundaries, maybe, you know, a bit of a grey area, push the edge of the envelope?” he explained. He added: “The point is, you’re greedy. You want every little bit of money that you can possibly get because, like I say, that is how you’re judged. That’s your performance metric.” And then, one by one, Hayes went through all the people he’d worked with over the years, the colleagues and brokers and competitors whom he’d chewed out or begged for favors or bossed around. Any time he was tempted to hold back or spin a conversation in a slightly more favorable light, he remembered what was riding on this process: If the SFO perceived him as being dishonest or uncooperative, the agency could pull the plug on the interviews and throw him to the American wolves.


pages: 420 words: 130,503

Actionable Gamification: Beyond Points, Badges and Leaderboards by Yu-Kai Chou

Apple's 1984 Super Bowl advert, barriers to entry, bitcoin, Burning Man, Cass Sunstein, crowdsourcing, Daniel Kahneman / Amos Tversky, delayed gratification, don't be evil, en.wikipedia.org, endowment effect, Firefox, functional fixedness, game design, IKEA effect, Internet of things, Kickstarter, late fees, lifelogging, loss aversion, Maui Hawaii, Minecraft, pattern recognition, peer-to-peer, performance metric, QR code, recommendation engine, Richard Thaler, Silicon Valley, Skype, software as a service, Stanford prison experiment, Steve Jobs, The Wealth of Nations by Adam Smith, transaction costs

The fundamental design of an effective corporation taps into its collective talent to build something greater than its individual parts. If members of a basketball team were competing against each other instead of their opponents in an important match, they would play more selfishly, avoid passing the ball, and try to feature themselves as the star player. In fact, in both professional and collegiate basketball, besides standard stats such as 2-Point Shots, 3-Point Shots, Rebounds, and other personal performance metrics, there is an important stat called Assists. Assists represent the number of passes to teammates that immediately led to a score. Studies have shown that most successful offensive teams have a high percentage of assists associated with their scoring efforts. This is because assists lead to higher-quality shots, which, in turn, result in higher shooting percentages and greater success on the floor.
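One common way to express that team-level stat is the share of made baskets that were assisted. A minimal sketch, with the function name and box-score numbers as illustrative assumptions:

```python
def assisted_share(assists: int, field_goals_made: int) -> float:
    """Fraction of made field goals that came directly off a pass."""
    return assists / field_goals_made if field_goals_made else 0.0

# Hypothetical box score: 25 assists on 40 made field goals.
print(assisted_share(25, 40))  # 0.625
```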


pages: 461 words: 128,421

The Myth of the Rational Market: A History of Risk, Reward, and Delusion on Wall Street by Justin Fox

activist fund / activist shareholder / activist investor, Albert Einstein, Andrei Shleifer, asset allocation, asset-backed security, bank run, beat the dealer, Benoit Mandelbrot, Black-Scholes formula, Bretton Woods, Brownian motion, business cycle, buy and hold, capital asset pricing model, card file, Cass Sunstein, collateralized debt obligation, complexity theory, corporate governance, corporate raider, Credit Default Swap, credit default swaps / collateralized debt obligations, Daniel Kahneman / Amos Tversky, David Ricardo: comparative advantage, discovery of the americas, diversification, diversified portfolio, Edward Glaeser, Edward Thorp, endowment effect, Eugene Fama: efficient market hypothesis, experimental economics, financial innovation, Financial Instability Hypothesis, fixed income, floating exchange rates, George Akerlof, Henri Poincaré, Hyman Minsky, implied volatility, impulse control, index arbitrage, index card, index fund, information asymmetry, invisible hand, Isaac Newton, John Meriwether, John Nash: game theory, John von Neumann, joint-stock company, Joseph Schumpeter, Kenneth Arrow, libertarian paternalism, linear programming, Long Term Capital Management, Louis Bachelier, mandelbrot fractal, market bubble, market design, Myron Scholes, New Journalism, Nikolai Kondratiev, Paul Lévy, Paul Samuelson, pension reform, performance metric, Ponzi scheme, prediction markets, pushing on a string, quantitative trading / quantitative finance, Ralph Nader, RAND corporation, random walk, Richard Thaler, risk/return, road to serfdom, Robert Bork, Robert Shiller, rolodex, Ronald Reagan, shareholder value, Sharpe ratio, short selling, side project, Silicon Valley, Social Responsibility of Business Is to Increase Its Profits, South Sea Bubble, statistical model, stocks for the long run, The Chicago School, The Myth of the Rational Market, The Predators' Ball, the scientific method, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas L Friedman, Thorstein Veblen, Tobin tax, transaction costs, tulip mania, value at risk, Vanguard fund, Vilfredo Pareto, volatility smile, Yogi Berra

Gerd Gigerenzer, Zeno Swijtink, Theodore Porter, Lorraine Daston, John Beatty, Lorenz Krüger, The Empire of Chance: How Probability Changed Science and Everyday Life (Cambridge: Cambridge University Press, 1989), 3–4. 23. A crucial intermediate step between Markowitz and Treynor was James Tobin, “Liquidity Preference as Behavior Towards Risk,” Review of Economic Studies 25, no. 1 (1958): 65–86. 24. Jack L. Treynor, “Towards a Theory of Market Value of Risky Assets,” in Asset Pricing and Portfolio Performance: Models, Strategy and Performance Metrics, Robert A. Korajczyk, ed. (London: Risk Books, 1999). 25. William F. Sharpe, “A Simplified Model for Portfolio Analysis,” Management Science (Jan. 1963): 281. 26. William F. Sharpe, “Capital Asset Prices: A Theory of Market Equilibrium Under Conditions of Risk,” Journal of Finance (Sept. 1964): 425–42. 27. John Lintner, “The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets,” Review of Economics and Statistics (Feb. 1965): 13–37.


pages: 567 words: 122,311

Lean Analytics: Use Data to Build a Better Startup Faster by Alistair Croll, Benjamin Yoskovitz

Airbnb, Amazon Mechanical Turk, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, barriers to entry, Bay Area Rapid Transit, Ben Horowitz, bounce rate, business intelligence, call centre, cloud computing, cognitive bias, commoditize, constrained optimization, en.wikipedia.org, Firefox, Frederick Winslow Taylor, frictionless, frictionless market, game design, Google X / Alphabet X, Infrastructure as a Service, Internet of things, inventory management, Kickstarter, lateral thinking, Lean Startup, lifelogging, longitudinal study, Marshall McLuhan, minimum viable product, Network effects, pattern recognition, Paul Graham, performance metric, place-making, platform as a service, recommendation engine, ride hailing / ride sharing, rolodex, sentiment analysis, skunkworks, Skype, social graph, social software, software as a service, Steve Jobs, subscription business, telemarketer, transaction costs, two-sided market, Uber for X, web application, Y Combinator

, Stage Three: Virality; business model comparisons, Model + Stage Drives the Metric You Track; depicted, The Lean Analytics Stages and Gates; enterprise startups and, Stickiness: Standardization and Integration; exercise for, A Summary of the Virality Stage; growth hacking, Instrumenting the Viral Pattern; instrumenting viral pattern, Timehop Experiments with Content Sharing to Achieve Virality; intrapreneurs and, Stickiness: Know Your Real Minimum; metrics for, Metrics for the Viral Phase, Instrumenting the Viral Pattern; summary of, Causality Hacks the Future; Timehop case study, Beyond the Viral Coefficient; usage example, What Stage Are You At?, What Stage Are You At?; ways things spread, Stage Three: Virality; visit frequency metric, Measuring Engagement; W: Wang, ChenLi, Attacking the Leading Indicator; Wardley, Simon, Pitch Success; web performance metric, Site Engagement; Webtrends tool, What DuProprio Watches; Wegener, Jonathan, Beyond the Viral Coefficient; WiderFunnel Marketing agency, WineExpress Increases Revenue by 41% Per Visitor; Widrich, Leo, Is My Business Model Right?; Wikipedia site: content creation and interaction, Content Creation and Interaction; user-generated content model and, Model Five: User-Generated Content; value of created content, Engagement Funnel Changes; Williams, Alex, Selling into Enterprise Markets; Wilson, Fred, Keywords and Search Terms, Notification Effectiveness, Is Growth at All Costs a Good Thing?


The New Enclosure: The Appropriation of Public Land in Neoliberal Britain by Brett Christophers

Boris Johnson, Capital in the Twenty-First Century by Thomas Piketty, Corn Laws, credit crunch, cross-subsidies, Diane Coyle, estate planning, ghettoisation, Hernando de Soto, housing crisis, income inequality, invisible hand, land reform, land tenure, land value tax, late capitalism, market clearing, Martin Wolf, New Journalism, New Urbanism, off grid, offshore financial centre, performance metric, Philip Mirowski, price mechanism, price stability, profit motive, Right to Buy, Skype, sovereign wealth fund, special economic zone, the built environment, The Wealth of Nations by Adam Smith, Thorstein Veblen, urban sprawl, wealth creators

The NAO set the trend – commissioning a private-sector consultancy, Concerto, to look at ‘best practice in property management’ – and others have dutifully followed suit.1 The abiding principle, says Alan White, is that ‘asset management efficiency must be regularly benchmarked against best performers in the private sector’ (and of course the best performers are always in the private sector).2 Take a look at the GPU’s homepage, and the all-consuming nature of this disciplinary regime is clearly apparent:3 ‘Departments and their arm’s length bodies are required to measure performance on all buildings larger than 500 square metres’; ‘The Civil Estate Property Benchmarking Service measures the performance against private sector benchmarks [and] government targets and standards’; ‘Key Performance Indicators allow reliable, like-for-like comparisons between individual buildings, as well as across property portfolios’ – and so it goes on. White aptly summarizes the implications: ‘Today, every chief executive and senior departmental manager is under no illusion that their performance on effective real estate asset management will be scrutinised and their local authority or departmental performance metrics will be tested and benchmarked.’4 Of course, the critical problem with all these collective measures to improve property efficiencies, and thus free up surplus land – and especially with those focused on capital charging (asset rents, and so on) – is that they treat public bodies as what they are not: private-sector bodies. This is not to say that ‘efficiency’ is necessarily a misplaced objective for the public sector; it is not.
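For flavor, a like-for-like benchmark of the kind the excerpt describes might be computed as below; the buildings, costs, and floor areas are invented for illustration.

```python
# Invented figures: (annual running cost in GBP, floor area in square metres).
buildings = {"Building A": (1_200_000, 6_000), "Building B": (800_000, 3_500)}

# Cost per square metre is one simple like-for-like KPI across a portfolio.
for name, (annual_cost, floor_area) in buildings.items():
    print(f"{name}: {annual_cost / floor_area:,.2f} GBP per square metre")
```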


pages: 515 words: 126,820

Blockchain Revolution: How the Technology Behind Bitcoin Is Changing Money, Business, and the World by Don Tapscott, Alex Tapscott

Airbnb, altcoin, asset-backed security, autonomous vehicles, barriers to entry, bitcoin, blockchain, Blythe Masters, Bretton Woods, business process, buy and hold, Capital in the Twenty-First Century by Thomas Piketty, carbon footprint, clean water, cloud computing, cognitive dissonance, commoditize, corporate governance, corporate social responsibility, creative destruction, Credit Default Swap, crowdsourcing, cryptocurrency, disintermediation, disruptive innovation, distributed ledger, Donald Trump, double entry bookkeeping, Edward Snowden, Elon Musk, Erik Brynjolfsson, Ethereum, ethereum blockchain, failed state, fiat currency, financial innovation, Firefox, first square of the chessboard, first square of the chessboard / second half of the chessboard, future of work, Galaxy Zoo, George Gilder, glass ceiling, Google bus, Hernando de Soto, income inequality, informal economy, information asymmetry, intangible asset, interest rate swap, Internet of things, Jeff Bezos, jimmy wales, Kickstarter, knowledge worker, Kodak vs Instagram, Lean Startup, litecoin, Lyft, M-Pesa, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, means of production, microcredit, mobile money, money market fund, Network effects, new economy, Oculus Rift, off grid, pattern recognition, peer-to-peer, peer-to-peer lending, peer-to-peer model, performance metric, Peter Thiel, planetary scale, Ponzi scheme, prediction markets, price mechanism, Productivity paradox, QR code, quantitative easing, ransomware, Ray Kurzweil, renewable energy credits, rent-seeking, ride hailing / ride sharing, Ronald Coase, Ronald Reagan, Satoshi Nakamoto, Second Machine Age, seigniorage, self-driving car, sharing economy, Silicon Valley, Skype, smart contracts, smart grid, social graph, social intelligence, social software, standardized shipping container, Stephen Hawking, Steve Jobs, Steve Wozniak, Stewart Brand, supply-chain management, TaskRabbit, The Fortune at the Bottom of the Pyramid, The Nature of the Firm, The Wisdom of Crowds, transaction costs, Turing complete, Turing test, Uber and Lyft, uber lyft, unbanked and underbanked, underbanked, unorthodox policies, wealth creators, X Prize, Y2K, Zipcar

When they do the job as specified, they are instantly paid—perhaps not biweekly but daily, hourly, or in microseconds. As the entity wouldn’t necessarily have an anthropomorphic body, employees might not even know that algorithms are managing them. But they would know the rules and norms for good behavior. Given that the smart contract could encode the collective knowledge of management science and that their assignments and performance metrics would be transparent, people could love to work. Customers would provide feedback that the enterprise would apply dispassionately and instantly to correct course. Shareholders would receive dividends, perhaps frequently, as real-time accounting would obviate the need for year-end reports. The organization would perform all these activities under the guidance of incorruptible business rules that are as transparent as the open source software that its founders used to set it in motion.
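As a toy illustration of the paid-the-instant-the-job-is-done idea, here is a minimal, non-blockchain sketch; the task fields, rate, and acceptance rule are hypothetical stand-ins for what an actual smart contract would encode.

```python
from dataclasses import dataclass

@dataclass
class Task:
    spec_met: bool  # did the work pass the encoded acceptance rule?
    hours: float    # effort metered under the agreement

RATE_PER_HOUR = 50.0  # assumed rate fixed up front

def settle(task: Task) -> float:
    """Pay immediately and automatically when the job is done as specified."""
    return task.hours * RATE_PER_HOUR if task.spec_met else 0.0

print(settle(Task(spec_met=True, hours=2.5)))  # 125.0
```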


pages: 520 words: 134,627

Unacceptable: Privilege, Deceit & the Making of the College Admissions Scandal by Melissa Korn, Jennifer Levitz

"side hustle", affirmative action, barriers to entry, blockchain, call centre, Donald Trump, Gordon Gekko, helicopter parent, high net worth, Jeffrey Epstein, Maui Hawaii, medical residency, Menlo Park, performance metric, rolodex, Ronald Reagan, Sand Hill Road, Saturday Night Live, side project, Silicon Valley, Snapchat, stealth mode startup, Steve Jobs, telemarketer, Thorstein Veblen, unpaid internship, upwardly mobile, yield management, young professional, zero-sum game

Though the call center was set in a modern technology park, leading a business in the fast-growing but poor country came with operational obstacles. No public transportation went to the call center, so managers ran vans back and forth to Bangalore between shifts. The roads were bad and the vans were constantly breaking down, preventing people from getting to work. Cultural challenges, some of them comical, abounded. Managers were judged on performance metrics, including how fast employees dealt with calls. But many Indians have long, complicated names, and trying to relay them to customers would waste valuable seconds, Singh recalls. When managers told employees to pick American names, the workers wound up using the same handful of well-known monikers. That would waste even more time as surprised customers learned they were on the phone with “Tom Cruise” or “Johnny Walker.”


pages: 475 words: 134,707