chief data officer

16 results


pages: 374 words: 94,508

Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage by Douglas B. Laney

3D printing, Affordable Care Act / Obamacare, banking crisis, blockchain, business climate, business intelligence, business process, call centre, chief data officer, Claude Shannon: information theory, commoditize, conceptual framework, crowdsourcing, dark matter, data acquisition, digital twin, discounted cash flows, disintermediation, diversification, en.wikipedia.org, endowment effect, Erik Brynjolfsson, full employment, informal economy, intangible asset, Internet of things, linked data, Lyft, Nash equilibrium, Network effects, new economy, obamacare, performance metric, profit motive, recommendation engine, RFID, semantic web, smart meter, Snapchat, software as a service, source of truth, supply-chain management, text mining, uber lyft, Y2K, yield curve

In addition, there have been sightings out there in the wild of “data journalists,” “algorithm librarians,” “information attorneys,” “digital ethicists,” and even “digital prophets,” “hackers in residence,” and rumors of an ominously titled “lord of dark data” somewhere out there. The Chief Data Officer: Foresight, Not Fad The chief data officer role is foresight, not fad. To demonstrate, let’s start with looking at the path to the emergence of the chief data officer role itself. I often speak to individuals including other executives who scoff at the notion of needing another chief somethingorother. Advances in business and management science have always required new kinds of specialist leaders.

—Judd Williams, Chief Information Officer, NCAA Laney’s work redefines information as a true strategic asset, and shows how we CDOs can be instrumental in unlocking new ways for companies to grow and be relevant in the new connected modern economy. —Rajeev Kapur, Chief Data Officer, Kimberly-Clark We will one day look back at Doug’s work and say, it is the groundbreaking work that firmly put data and data leadership in the middle of the business arena not as the white elephant, but as the phoenix: a formal player at the boardroom table. —Althea Davis, Chief Data Officer, ABN AMRO Doug Laney has put together a smart, practical book that applies traditional rules of business economics to the emerging information marketplace.

—Gokula Mishra, Senior Director, Global Data & Analytics, Supply Chain, McDonald’s Corporation Through a myriad of relevant examples, Doug successfully brings together data management, analytics, and economics in a book that offers practical guidelines to manage, improve, and monetize an organization’s data assets. The book is not only a must-read for Chief Data Officers, but for any other executive interested in succeeding in the Information Age. —Leandro Dallemule, Chief Data Officer, AIG Thank you, Doug, for an engaging read and for giving Data a well-deserved “seat at the table.” This is a must-have book, not only for CDOs, CIOs, and Data Strategists, but for any executive interested in creating a data-driven, info-savvy company.


pages: 292 words: 85,151

Exponential Organizations: Why New Organizations Are Ten Times Better, Faster, and Cheaper Than Yours (And What to Do About It) by Salim Ismail, Yuri van Geest

23andMe, 3D printing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, augmented reality, autonomous vehicles, Baxter: Rethink Robotics, Ben Horowitz, bioinformatics, bitcoin, Black Swan, blockchain, Burning Man, business intelligence, business process, call centre, chief data officer, Chris Wanstrath, Clayton Christensen, clean water, cloud computing, cognitive bias, collaborative consumption, collaborative economy, commoditize, corporate social responsibility, cross-subsidies, crowdsourcing, cryptocurrency, dark matter, Dean Kamen, dematerialisation, discounted cash flows, disruptive innovation, distributed ledger, Edward Snowden, Elon Musk, en.wikipedia.org, Ethereum, ethereum blockchain, game design, Google Glasses, Google Hangouts, Google X / Alphabet X, gravity well, hiring and firing, Hyperloop, industrial robot, Innovator's Dilemma, intangible asset, Internet of things, Iridium satellite, Isaac Newton, Jeff Bezos, Joi Ito, Kevin Kelly, Kickstarter, knowledge worker, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, lifelogging, loose coupling, loss aversion, low earth orbit, Lyft, Marc Andreessen, Mark Zuckerberg, market design, means of production, minimum viable product, natural language processing, Netflix Prize, NetJets, Network effects, new economy, Oculus Rift, offshore financial centre, PageRank, pattern recognition, Paul Graham, paypal mafia, peer-to-peer, peer-to-peer model, Peter H. Diamandis: Planetary Resources, Peter Thiel, prediction markets, profit motive, publish or perish, Ray Kurzweil, recommendation engine, RFID, ride hailing / ride sharing, risk tolerance, Ronald Coase, Second Machine Age, self-driving car, sharing economy, Silicon Valley, skunkworks, Skype, smart contracts, Snapchat, social software, software is eating the world, speech recognition, stealth mode startup, Stephen Hawking, Steve Jobs, subscription business, supply-chain management, TaskRabbit, telepresence, telepresence robot, Tony Hsieh, transaction costs, Travis Kalanick, Tyler Cowen: Great Stagnation, uber lyft, urban planning, WikiLeaks, winner-take-all economy, X Prize, Y Combinator, zero-sum game

One example: large software implementations, such as ERP systems, are being replaced to a certain degree by specialized SaaS startups that align horizontally with other software offerings via open APIs. As ExOs scale beyond their traditional boundaries, the number of integration and data handoff points is set to explode, making fault traceability increasingly difficult. CDO – Chief Data Officer Brad Peters, co-founder and chairman of Birst and a columnist at Forbes.com, has defined the chief data officer as the newest C-Level profession. Throughout the course of this book we’ve mentioned data extensively: billions of sensors churning out data for algorithms, Big Data solutions, data-driven decisions and value (or Lean) metrics. All organizations today have a dire need to manage and make sense of all this data and to somehow do so without breaching privacy and security laws and customer trust.

Step 2: Join or Create Relevant MTP Communities Step 3: Compose a Team Step 4: Breakthrough Idea Step 5: Build a Business Model Canvas Step 6: Find a Business Model Step 7: Build the MVP Step 8: Validate Marketing and Sales Step 9: Implement SCALE and IDEAS Step 10: Establish the Culture Step 11: Ask Key Questions Periodically Step 12: Building and Maintaining a Platform In Concert Lessons for Enterprise ExOs (EExOs) Chapter Seven: ExOs and Mid-Market Companies Example 1: TED Example 2: GitHub Example 3: Coyote Logistics Example 4: Studio Roosegaarde Retrofitting an ExO Example 5: GoPro Chapter Eight: ExOs for Large Organizations Transform Leadership Education Board Management Implement Diversity Skills and Leadership Partner with, Invest in or Acquire ExOs Disrupt[X]—set up Edge ExOs Inspire ExOs at the Edge Hire a Black Ops Team Copy Google[X] Partner with Accelerators, Incubators and Hackerspaces ExO Lite (The Gentle Cycle) Migrate towards an MTP Community & Crowd Algorithms Engagement Dashboards Experimentation Social Technologies Conclusion Chapter Nine: Big Companies Adapt The Coca-Cola Company – Exponential Pop Haier – Higher and Higher Xiaomi – Showing You and Me The Guardian – Guarding Journalism General Electric – General Excellence Amazon – Clearing the Rainforest of “No” Zappos – Zapping Boredom ING Direct Canada (now Tangerine) – BankING Autonomy Google Ventures – The Almost Perfect EExO Growing with the Crowd Chapter Ten: The Exponential Executive CEO – Chief Executive Officer CMO – Chief Marketing Officer CFO – Chief Financial Officer CTO/CIO – Chief Technology Officer/Chief Information Officer CDO – Chief Data Officer CIO – Chief Innovation Officer COO – Chief Operating Officer CLO - Chief Legal Officer CHRO - Chief Human Resources Officer The World’s Most Important Job Epilogue: A New Cambrian Explosion Afterword Appendix A: What is your Exponential Quotient? 
Appendix B: Sources and Inspirations About the Authors Acknowledgements Foreword Welcome to a time of exponential change, the most amazing time ever to be alive.

The first Quantified Self conference was held in May 2011, and today the QS community has more than 32,000 members in thirty-eight countries. Many new devices have been spun out of this movement. One of them is Spire, a QS device that measures respiration. Singularity University alumnus Francesco Mosconi is the chief data officer of Spire. The analytics and software he has written are all about real-time feedback regarding breath and how it relates to stress and focus—not unlike the way sensor feedback in a BMW’s traction control system reduces wheel slip. With more than seven billion mobile phones in use globally, many equipped with a high-resolution camera, anything and everything can be recorded in real time, from a baby’s first words to the events of the Arab Spring.


pages: 161 words: 39,526

Applied Artificial Intelligence: A Handbook for Business Leaders by Mariya Yao, Adelyn Zhou, Marlene Jia

Airbnb, algorithmic bias, Amazon Web Services, artificial general intelligence, autonomous vehicles, backpropagation, business intelligence, business process, call centre, chief data officer, computer vision, conceptual framework, en.wikipedia.org, future of work, industrial robot, Internet of things, iterative process, Jeff Bezos, job automation, Marc Andreessen, natural language processing, new economy, pattern recognition, performance metric, price discrimination, randomized controlled trial, recommendation engine, robotic process automation, self-driving car, sentiment analysis, Silicon Valley, skunkworks, software is eating the world, source of truth, speech recognition, statistical model, strong AI, technological singularity, The future is already here

Retrieved from http://futurism.com/amazons-ceo-sayswere-living-in-the-golden-age-of-ai/ (45) Zilis, S., & Cham, J. (2016, November 7). The Current State of Machine Intelligence 3.0. O’Reilly. Retrieved from https://www.oreilly.com/ideas/the-current-state-of-machine-intelligence-3-0 (46) Leaper, B. (2014, July). The Rise of the Chief Data Officer. Wired. Retrieved from https://www.wired.com/insights/2014/07/rise-chief-data-officer/ (47) Purdy, M., & Daugherty, P. (2016). Artificial intelligence is the future of growth. Retrieved from https://www.accenture.com/ca-en/insight-artificialintelligence-future-growth (48) Ng, A. (2017, April 21). Hiring Your First Chief AI Officer.

Regardless of whether they are your primary AI champion, CIOs will likely play a vital role in implementing AI in an organization due to the need to develop and integrate infrastructure to support AI. ML systems and data mining systems require complex storage, networking, and computing systems that will require the CIO’s input to implement in many enterprises. CDO Since data touches all aspects of enterprises, Chief Data Officers (CDOs) are becoming increasingly common,(46) but their mandate is more often the security, regulation, and governance of enterprise data. Depending on their focus, they typically report to CIOs, CFOs, Chief Risk Officers (CRO), or Chief Security Officers (CSO). Companies that have the CDO report directly to the CEO tend to value data and analytics more highly than those that don’t.

For example, Product Management, Sales, and Marketing all want chatbot data on customer interactions, but there is no clear owner when the results are important to all three units. As a result of the need to manage data that is increasing in scope and complexity and being generated by multiple business units across an organization, new jobs specializing in the care and feeding of shared data have appeared. Chief Data Officer (CDO) and Chief Data Scientist positions are now becoming common in companies, especially those interested in championing new AI investments. For companies that do not have the capacity or the desire to tackle data silos on their own, companies like Maana, Alation, and Tamr offer ML-powered data unification and cataloguing services.


Big Data at Work: Dispelling the Myths, Uncovering the Opportunities by Thomas H. Davenport

Automated Insights, autonomous vehicles, bioinformatics, business intelligence, business process, call centre, chief data officer, cloud computing, commoditize, data acquisition, disruptive innovation, Edward Snowden, Erik Brynjolfsson, intermodal, Internet of things, Jeff Bezos, knowledge worker, lifelogging, Mark Zuckerberg, move fast and break things, Narrative Science, natural language processing, Netflix Prize, New Journalism, recommendation engine, RFID, self-driving car, sentiment analysis, Silicon Valley, smart grid, smart meter, social graph, sorting algorithm, statistical model, Tesla Model S, text mining, Thomas Davenport

We’re working on traditional insurance analytics issues like pricing optimization, and some exotic big data problems in collaboration with MIT. It was and will continue to be an integrated approach.”5 We’re already beginning to see more roles of this type, with a variety of specific titles. One variation is the chief data officer (CDO) role, which is pretty common in large banks. In principle, I think it is a fine idea to combine the responsibility for data management and governance with the application of data—that is, analytics.

The wider the company’s reach, the greater the role of big data in its strategy. The subject of corporate information—its integration, its quality, and its growing volumes—was a natural by-product of executive-level conversations around regulatory and competitive demands. In 2010, the company established a Chief Data Office. Shortly thereafter, the company downloaded Hadoop and began reengineering computation-heavy data transformations using the big data environment. A major focus of the Hadoop implementation is cost reduction. The firm’s plans include expanding that environment to refine its understanding of customer relationships and behaviors.

See analytics business intelligence (BI), 7, 10, 10t, 14, 18, 23, 93, 102, 124, 128, 129, 130 business models, 41–42, 57, 168, 173, 188 business-to-business (B2B) firms, 42t, 43, 45–46 business-to-business-to-consumer (B2B2C) firms, 43, 46 business view, in big data stack, 119t, 123 Caesars Entertainment, 42, 179–180 Cafarella, Mike, 157 Capital One, 42 Carolinas HealthCare, 121, 122 cars driving data on, 52, 198 self-driving, 35, 41, 42, 65, 83, 148 Carter, John, 143 casino industry, 42, 179–180 Charles Schwab Corporation, 143 chief analytics officers, 143, 202 chief data officers (CDOs), 142–143 chief science officers, 142 Chrysler, 83 CIA, 19 Cisco Systems, 47 Citigroup, 187–188 Cleveland, William S., 195 cloud-based computing, 55, 89, 117, 163, 169, 192, 200, 208 Cloudera Hadoop, 115 commitment, culture of, 148 communication skills, 88, 92, 93, 99, 102–103 Competing on Analytics (Davenport and Harris), 2, 43 Compute Engine, 163 Concept 2, 12 conservative approach to big data adoption, 80, 81 consultants, data scientists as, 81, 98–99, 103–104, 112, 209 consumer products companies, 42, 42t, 43, 46, 54, 71, 82 Consumers Union, 67 Corporate Insight, 109 cost-reduction, 21, 60–63, 145 Coursera, 41 cows, data from, 11–12 credit card data, 37, 38, 42, 42t, 46, 164 culture for big data in organizations, 147–149, 152 customer relationship management (CRM), 54, 129f customers banking industry and, 9, 44, 49, 133 big data’s effect on relationships with, 26–27 business-to-business (B2B) firms and, 43, 45–46 business-to-business-to-consumer (B2B2C) firms and, 43, 46 data-based products and services for, 16, 23–24, 26, 66, 106, 155, 195 as focus of big data efforts, 16 future scenario of big data’s effect on relationships with, 35–38, 41–42, 58 identification of dissatisfaction and possible attrition of, 23, 48, 67, 68, 72, 78, 96, 179, 180, 181, 191 intermediaries reporting information about, 46 managers’ attention to, 21 marketing efforts targeted to, 27, 
55, 63–64, 65, 67, 72, 79, 107, 108–109, 128, 142, 144, 179, 180, 197 media and entertainment firms and, 48, 49 multichannel relationships with, 51, 67, 177, 186 Netflix Prize’s focus on, 16, 22, 66 overachievers and, 42, 42t, 46 regulatory environment for data from, 27 research on website behavior of, 164 sentiment analysis of, 17, 27, 107, 118, 123 service transaction histories from, 23 sharing data with, 167–168 social media and, 48, 50–51, 107 travel industry and, 75–76 underachievers and, 42t, 43–44 unstructured data from, 51, 67, 68, 69, 180, 186 volume of data warehoused from, 116–117, 168 Cutting, Doug, 157 CycleOps, 12 dashboards, 109, 128, 129, 130, 137, 167, 185, 198 data in big data stack, 119t, 121–122 success of big data initiatives and, 136–138 data disadvantaged organizations, 42t, 43 data discovery process big data strategy and, 70–72, 74–75, 75f, 84 enterprise orientation for, 139 focus of architecture on, 20, 201 GE’s experience with, 75 leadership and, 140 management orientation toward, 18–19 model generation for, 64 moderately aggressive approach to big data and, 82 objectives and, 75, 75f, 84 research on, 3 responsibility locus for, 76–77, 77f technical platform for, 131, 201 Data Lab product, 160 data mining, 122–123, 128, 183, 184 data production process big data strategy and, 70, 72–75, 75f, 84 data scientists and teams and, 201 enterprise orientation for, 139 GE’s experience with, 74–75 highly ambitious approach to big data and, 83 moderately aggressive approach to big data and, 82 objectives and, 75, 75f, 84 responsibility locus for, 76–77, 77f technical platform for, 74, 127, 129–130, 132, 133, 201 Data Science Central, 97 data scientists activities performed by, 15, 137–138, 148, 159–160, 199 analysts differentiated from, 15 background to, 86–87, 196–197 business expert traits of, 88 classic model of, 87–97 collaboration by, 165–167, 173, 176 development of products and 
services and, 16, 18, 20, 24, 61–62, 65, 66, 71, 79–80, 106, 161 education and training of, 14, 91, 92, 104, 184, 209 future for, 110–111 hacker traits of, 88–91 horizontal versus vertical, 97–99 job growth for, 111, 111f, 184–185 in large companies, 201 LinkedIn’s use of, 158, 160, 161 motivation of, 106 organizational structure with, 16, 61, 82, 140, 141, 142, 152, 153, 158, 173, 180, 187, 202, 207, 209 quantitative analyst traits of, 88, 93–97 research on, 3 retention of, 104–106, 112, 161 role of, 14, 209 scientist traits of, 88, 91–92 skills of, 71, 79, 88, 145, 147, 182–184, 185 sources of, for hiring, 101–105 start-ups using, 16, 157–158 team approach using, 99–101, 165–167, 181, 201, 209 traits of, 87, 88 trusted adviser traits of, 88, 92–93 data visualization, 124–125, 125f Davis, Jim, 163–164 DB2, 183.


pages: 296 words: 66,815

The AI-First Company by Ash Fontana

23andMe, Amazon Mechanical Turk, Amazon Web Services, autonomous vehicles, barriers to entry, blockchain, business intelligence, business process, business process outsourcing, call centre, chief data officer, Clayton Christensen, cloud computing, combinatorial explosion, computer vision, crowdsourcing, data acquisition, DevOps, en.wikipedia.org, independent contractor, industrial robot, inventory management, John Conway, knowledge economy, Kubernetes, Lean Startup, minimum viable product, natural language processing, Network effects, optical character recognition, Pareto efficiency, performance metric, price discrimination, recommendation engine, Ronald Coase, software as a service, source of truth, speech recognition, the scientific method, transaction costs, yield management

Specifically, the degree of difference in managing engineers and managing data scientists depends on the specific role in the AI-First team. Data infrastructure engineer: little difference. Managed by those that otherwise manage engineers. Data engineer: some difference. Managed by those that otherwise manage engineers but may require coordination by those managing a company’s data assets, such as a chief data officer. Data analyst: little difference. Management by analytics or business intelligence leaders, or by a general manager within a business unit. Data scientist: different. Management by nonanalytics leaders is difficult because the work is more experimental and involves advanced analytical methods.

Designing an organizational structure that positions the best data science and ML staff close to business units makes it an AI-First company. The choice to make is where to sit on the spectrum of centralized to decentralized. Centralized AI teams have an executive-level leader, such as a chief data officer, who manages all of the data science, analytics, and ML people in the company. That CDO collaborates with the chief technology officer (CTO) and the chief information officer (CIO) to decide which data infrastructure to use. Requests for data, reports, analytical tools, and predictive models go to this central unit, and the unit decides which requests to fulfill.

A/B test, 271 accessibility of data, 72, 107 accuracy, 175, 203–4 in proof of concept phase, 59–60 active learning-based systems, 94–95 acyclic, 150, 271 advertising, 227, 240 agent-based models (ABMs), 103–5, 271 simulations versus, 105 aggregated data, 81, 83 aggregating advantages, 222–65 branding and, 255–56 data aggregation and, 241–45 on demand side, 225 disruption and, 239–41 first-mover advantage and, 253–55 and integrating incumbents, 244–45 and leveraging the loop against incumbents, 256–61 positioning and, 245–56 ecosystem, 251–53 staging, 249–51 standardization, 247–51 storage, 246–47 pricing and, 236–39 customer data contribution, 237 features, 238–39 transactional, 237, 281 updating, 238 usage-based, 237–38, 281 on supply side, 224–25 talent loop and, 260–61 traditional forms of competitive advantage versus, 224–25 with vertical integration, see vertical integration aggregation theory, 243–44, 271 agreement rate, 216 AI (artificial intelligence), 1–3 coining of term, 5 definitions and analogies regarding, 15–16 investment in, 7 lean, see Lean AI AI-First Century, 3 first half of (1950–2000), 3–9 cost and power of computers and, 8 progression to practice, 5–7 theoretical foundations, 4–5 second half of (2000–2050), 9 AI-First companies, 1, 9, 10, 44 eight-part framework for, 10–13 learning journey of, 44–45 AI-First teams, 127–42 centralized, 138–39 decentralized, 139 management of, 135–38 organization structure of, 138–39 outsourcing, 131 support for, 134–35 when to hire, 130–32 where to find people for, 133 who to hire, 128–30 airlines, 42 Alexa, 8, , 228, 230 algorithms, 23, 58, 200–201 evolutionary, 150–51, 153 alliances of corporate and noncorporate organizations, 251 Amazon, 34, 37, 84, 112, 226 Alexa, 8, 228, 230 Mechanical Turk, 98, 99, 215 analytics, 50–52 anonymized data, 81, 83 Apple, 8, 226 iPhone, 252 application programming interfaces (APIs), 86, 118–22, 159, 172, 236, 271 applications, 171 area underneath the curve (AUC), 206, 272 
artificial intelligence, see AI artificial neural network, 5 Atlassian Corporation, 243 augmentation, 172 automation versus, 163 availability of data, 72–73 Babbage, Charles, 2 Bank of England, 104–5 Bayesian networks, 150, 201 Bengio, Yoshua, 7 bias, 177 big-data era, 28 BillGuard, 112 binary classification, 204–6 blockchain, 109–10, 117, 272 Bloomberg, 73, 121 brain, 5, 15, 31–32 shared, 31–33 branding, 256–57 breadth of data, 76 business goal, in proof of concept phase, 60 business software companies, 113 buying data, 119–22 data brokers, 119–22 financial, 120–21 marketing, 120 car insurance, 85 Carnegie, Andrew, 226 cars, 6, 254 causes, 145 census, 118 centrifugal process, 49–50 centripetal process, 50 chess, 6 chief data officer (CDO), 138 chief information officer (CIO), 138 chief technology officer (CTO), 139 Christensen, Clay, 239 cloud computing, 8, 22, 78–79, 87, 242, 248, 257 Cloudflare, 35–36 clustering, 53, 64, 95, 272 Coase, Ronald, 226 compatibility, 251–52 competitions, 117–18 competitive advantages, 16, 20, 22 in DLEs, 24, 33 traditional forms of, 224–25 see also aggregating advantages complementarity, 253 complementary data, 89, 124, 272 compliance concerns, 80 computer chips, 7, 22, 250 computers, 2, 3, 6 cost of, 8–9 power of, 7, 8, 19, 22 computer vision, 90 concave payoffs, 195–98, 272 concept drift, 175–76, 272 confusion matrix, 173–74 consistency, 256–57 consultants, 117–18, 131 consumer apps, 111–13, 272 consumer data, 109–14 apps, 111–13 customer-contributed data versus, 109 sensor networks, 113–14 token-based incentives for, 109–10 consumer reviews, 29, 43 contractual rights, 78–82 clean start advantage and, 78–79 negotiating, 79 structuring, 79–82 contribution margin, 214, 272 convex payoffs, 195–97, 202, 272 convolutional neural networks (CNNs), 151, 153 Conway, John, 104 cost of data labeling, 108 in ML management, 158 in proof of concept phase, 60 cost leadership, 272 DLEs and, 39–41 cost of goods sold (COGS), 217 crawling, 115–16, 
281 Credit Karma, 112 credit scores, 36–37 CRM (customer relationship management), 159, 230–31, 255, 260, 272 Salesforce, 159, 212, 243, 248, 258 cryptography, 272 crypto tokens, 109–10, 272 CUDA, 250 customer-generated data, 77–91 consumer data versus, 109 contractual rights and, 78–82 clean start advantage and, 78–79 negotiating, 79 structuring, 79–82 customer data coalitions, 82–84 data integrators and, 86–89 partnerships and, 89–91 pricing and, 237 workflow applications for, 84–86 customers costs to serve, 242 direct relationship with, 242 needs of, 49–50 customer support agents, 232, 272 customer support tickets, 260, 272 cybernetics, 4, 273 Dark Sky, 112, 113 DARPA (Defense Advanced Research Projects Agency), 5 dashboards, 171 data, 1, 8, 69, 273 aggregation of, 241–45 big-data era, 28 complementary, 89 harvesting from multiple sources, 57 incomplete, 178 information versus, 22–23 missing sources of, 177 in proof of concept phase, 60 quality of, 177–78 scale effects with, 22 sensitive, 57 starting small with, 56–58 vertical integration and, 231–32 data acquisition, 69–126, 134 buying data, 119–22 consumer data, 109–14 apps, 111–13 customer-contributed data versus, 109 sensor networks, 113–14 token-based incentives for, 109–10 customer-generated data, see customer-generated data human-generated data, see human-generated data machine-generated data, 102–8 agent-based models, 103–5 simulation, 103–4 synthetic, 105–8 partnerships for, 89–91 public data, 115–22 buying, see buying data consulting and competitions, 117–18 crawling, 115–16, 281 governments, 118–19 media, 118 valuation of, 71–77 accessibility, 72, 107 availability, 72–73 breadth, 76 cost, 73 determination, 74–76 dimensionality, 75 discrimination, 72–74 fungibility, 74 perishability and relevance, 74–75, 201 self-reinforcement, 76 time, 73–74 veracity, 75 volume of, 76–77 data analysts, 128–30, 132, 133, 137, 273 data as a service (DaaS), 116, 120 databases, 258 data brokers, 119–22 financial, 120–21 
marketing, 120 data cleaning, 162–63 data distribution drift, 178 data drift, 176, 273 data-driven media, 118 data engineering, 52 data engineers, 128–30, 132, 133, 137, 161, 273 data exhaust, 80, 257–58, 273 data infrastructure engineers, 129–32, 137, 273 data integration and integrators, 86–90, 276 data labeling, 57, 58, 92–100, 273 best practices for, 98 human-in-the-loop (HIL) systems, 100–101, 276 management of, 98–99 measurement in, 99–100 missing labels, 178 outsourcing of, 101–2 profitability metrics and, 215–16 tools for, 93–97 data lake, 57, 163 data learning effects (DLEs), 15–47, 48, 69, 222, 273 competitive advantages of, 24, 33 data network effects, 19, 26–33, 44, 273 edges of, 24 entry-level, 26–29, 31–33, 36–37, 274 network effects versus, 24–25 next-level, 26–27, 29–33, 36–37, 278 what type to build, 33 economies of scale in, 34 formula for, 17–20 information accumulation and, 21 learning effects and, 20–21 limitations of, 21, 42–43 loops around, see loops network effects and, 24–26 powers of, 34–42 compounding, 36–38 cost leadership, 39–41 flywheels, 37–38 price optimization, 41–42 product utility, 35–36 winner-take-all dynamics, 34–35 product value and, 39 scale effects and, 21–23 variety and, 34–35 data learning loops, see loops data lock-in, 247–48 data networks, 109–10, 143–44, 273 normal networks versus, 26 underneath products, 25–26 data pipelines, 181, 216 breaks in, 87, 181 data platform, 57 data processing capabilities (computing power), 7, 8, 19, 22 data product managers, 129–32, 274 data rights, 78–82, 246 data science, 52–56 decoupling software engineering from, 133 data scientists, 54–56, 117, 128–30, 132–39, 161, 274 data stewards, 58, 274 data storage, 57, 81, 246–47, 257 data validators, 161 data valuation, 71–77 accessibility in, 72, 107 availability in, 72–73 breadth in, 76 cost in, 73 determination in, 74–76 dimensionality in, 75 discrimination in, 72–74 fungibility in, 74 perishability and relevance in, 74–75, 201 
self-reinforcement in, 76 time in, 73–74 veracity in, 75 decision networks, 150, 153 decision trees, 149–50, 153 deduction and induction, 49–50 deep learning, 7, 147–48, 274 defensibility, 200, 274 defensible assets, 25 Dell, Michael, 226 Dell Technologies, 226 demand, 225 denial-of-service (DoS) attacks, 36 designers, 129 differential privacy, 117, 274 dimensionality reduction, 53, 274 disruption, 239–41 disruption theory, 239, 274 distributed systems, 8, 9 distribution costs, 243 DLEs, see data learning effects DoS (denial-of-service) attacks, 36 drift, 175–77, 203, 274 concept, 175–76 data, 176 minimizing, 201 e-commerce, 29, 31, 34, 37, 41, 84 economies of scale, 19, 34 ecosystem, 251–52 edges, 24, 274 enterprise resource planning (ERP), 161, 250, 274 entry-level data network effects, 26–29, 31–33, 36–37, 274 epochs, 173, 275 equity capital, 230 ETL (extract, transform, and load), 58, 275 evolutionary algorithms, 150–51, 153 expected error reduction, 96 expected model change, 96 Expensify, 85–86 Facebook, 25, 43, 112, 119, 122 features, 63–64, 145, 275 finding, 64–65 pricing and, 238–39 federated learning, 117, 275 feedback data, 36, 199–200 feed-forward networks, 151, 153 financial data brokers for, 120–21 stock market, 72, 74, 120–21 first-movers, 253–55, 275 flywheels, 37–38, 243–44 Ford, Henry, 49 fungibility of data, 74 Game of Life, 104 Gaussian mixture model, 275 generative adversarial networks (GANs), 152, 153 give-to-get model, 36 global multiuser models, 275 glossary, 271–82 Google, 111–12, 115, 195, 241, 251, 253–54 governments, 118–19 gradient boosted tree, 53, 275 gradient descent, 208 graph, 275 Gulf War, 6 hedge funds, 227 heuristics, 139, 231, 275 Hinton, Geoffrey, 7 histogram, 53, 275 holdout data, 199 horizontal products, 210–12, 276 HTML (hypertext markup language), 116, 276 human-generated data, 91–102 data labeling in, 57, 58, 92–100, 273 best practices for, 98 human-in-the-loop (HIL) systems, 100–101, 276 management of, 98–99 measurement 
in, 99–100 missing labels, 178 outsourcing of, 101–2 profitability metrics and, 215–16 tools for, 93–97 human learning, 16–17 hyperparameters, 173, 276 hypertext markup language (HTML), 116, 276 IBM (International Business Machines), 5–8, 255 image recognition, 76–77, 146 optical character, 72, 278 incumbents, 276 integrating, 245–46 leveraging the loop against, 256–61 independent software vendors (ISVs), 161, 248, 276 induction and deduction, 49–50 inductive logic programming (ILP), 149, 153 Informatica, 86 information, 1, 2, 276 data versus, 22–23 informational leverage, 3 Innovator’s Dilemma, The (Christensen), 239 input cost analysis, 215–16 input data, 199 insourcing, 102, 276 integration, 86–90, 276 predictions and, 171 testing, 174 integrations-first versus workflow-first companies, 88–89 intellectual leverage, 3 intellectual property (IP), 25, 251 intelligence, 1, 2, 5, 15, 16 artificial, see AI intelligent applications, 257–60, 276 intelligent systems, 19 interaction frequency, 197 interactive machine learning (IML), 96–97, 276 International Telecommunications Union (ITU), 250–51 Internet, 8, 19, 32, 69, 241–42, 244 inventory management software, 260 investment firms, 232 iPhone, 252 JIRA, 243 Kaggle, 9, 56, 117 Keras, 251 k-means, 276 knowledge economy, 21 Kubernetes, 251 language processing, 77, 94 latency, 158 layers of neurons, 7, 277 Lean AI, 48–68, 277 customer needs and, 49–50 decision tree for, 50–52 determining customer need for AI, 50–60 data and, 56–58 data science and, 54–56 sales and, 58–60 statistics and, 53–54 lean start-up versus, 61–62 levels in, 65–66 milestones for, 61 minimum viable product and, 62–63 model features lean start-ups, 61–62 learning human formula for, 16–17 machine formula for, 17–20 learning effects, 20–22, 277 moving beyond, 20–21 legacy applications, 257, 277 leverage, 3 linear optimization, 42 LinkedIn, 122 loans, 35, 37, 227 lock-in, 247–48 loops, 187–221, 273 drift and, 201 entropy and, 191–92 good versus bad, 191–92 
metrics for measuring, see metrics moats versus, 187–88, 192–94 physics of, 190–92 prediction and, 202–3 product payoffs and, 195–98, 202 concave, 195–98 convex, 195–97, 202 picking the product to build, 198 repeatability in, 188–89 scale and, 198–201, 203 and data that doesn’t contribute to output, 199–200 loss, 207–8, 277 loss function, 275, 277 machine-generated data, 102–8 agent-based models, 103–5 simulation, 103–4 synthetic, 105–8 machine learning (ML), 9, 145–47, 277 types of, 147–48 machine learning engineers, 39, 56, 117–18, 129, 130, 132, 138, 139, 161, 277 machine learning management loop, 277 machine learning models (ML models), 9, 26, 27, 31, 52–56, 59, 61, 134 customer predictions and, 80–81 features of, 61, 63 machine learning models, building, 64–65, 143–54 compounding, 148–52 diverse disciplines in, 149–51 convolutional neural networks in, 151, 153 decision networks in, 150, 153 decision trees in, 149–50, 153 defining features, 146–47 evolutionary algorithms in, 150–51, 153 feed-forward networks in, 151, 153 generative adversarial networks in, 152, 153 inductive logic programming in, 149, 153 machine learning in, 151–52 primer for, 145–47 recurrent neural networks in, 151, 153 reinforcement learning in, 152, 153 statistical analysis in, 149, 153 machine learning models, managing, 155–86 acceptance, 157, 162–66 accountability and, 164 and augmentation versus automation, 163 budget and, 164 data cleaning and, 162–63 distribution and, 165 executive education and, 165–66 experiments and, 165 explainability and, 166 feature development and, 163 incentives and, 164 politics and, 163–66 product enhancements and, 165 retraining and, 163 and revenues versus costs, 164 schedule and, 163 technical, 162–63 and time to value, 164 usage tracking and, 166 decentralization versus centralization in, 156 experimentation versus implementation in, 155 implementation, 158–66 data, 158–59 security, 159–60 sensors, 160 services, 161 software, 159 staffing, 161–162 loop 
in, 156, 166–81 deployment, 171–72 monitoring, see monitoring model performance training, 168–69 redeploying, 181 reproducibility and, 170 rethinking, 181 reworking, 179–80 testing, 172–74 versioning, 169–70, 281 ROI in, 164, 176, 181 testing and observing in, 156 machine learning researchers, 129–34, 135–36, 138, 277 management of AI-First teams, 135–38 of data labeling, 98–99 of machine learning models, see machine learning models, managing manual acceptance, 208–9 manufacturing, 6 marketing, customer data coalitions and, 83 marketing segmentation, 277 McCulloch, Warren, 4–5 McDonald’s, 256 Mechanical Turk, 98, 99, 215 media, 118 medical applications, 90–91, 145, 208 metrics, 203 measurement, 203–9 accuracy, 203–4 area underneath the curve, 206, 272 binary classification, 204–6 loss, 207–8 manual acceptance, 208–9 precision and recall, 206–7 receiver operating characteristic, 205–6, 279 usage, 209 profitability, 209–18 data labeling and, 215–16 data pipes and, 216 input cost analysis, 215–16 research cost analysis, 217–18 unit analysis, 213–14 and vertical versus horizontal products, 210–12 Microsoft, 8, 247 Access, 257 Outlook, 252 military, 6, 7 minimum viable product (MVP), 62–63, 277 MIT (Massachusetts Institute of Technology), 4, 5 ML models, see machine learning models moats, 277 loops versus, 187–88, 192–94 mobile phones, 113 iPhone, 252 monitoring, 277 monitoring model performance, 174–78 accuracy, 175 bias, 177 data quality, 177–78 reworking and, 179–80 stability, 175–77 MuleSoft, 86, 87 negotiating data rights, 79, 80 Netflix, 242, 243 network effects, 15–16, 20, 22, 23, 44, 278 compounding of, 36 data network effects versus, 24–25 edges of, 24 limits to, 42–43 moving beyond, 24–26 products with versus without, 26 scale effects versus, 24 traditional, 27 value of, 27 networks, 7, 15, 17 data networks versus, 26 neural networks, 5, 7, 8, 19, 23, 53, 54, 277–78 neurons, 5, 7, 15 layers of, 7, 276 next-level data network effects, 26–27, 29–33, 36–37, 278 
nodes, 21, 23–25, 27, 44, 278 NVIDIA, 250 Obama administration, 118 Onavo, 112 optical character recognition software, 72, 278 Oracle, 247, 248 outsourcing, 216 data labeling, 101–2 team members, 131 overfitting, 82 Pareto optimal solution, 56, 278 partial plots, 53, 278 payoffs, 195–98 concave, 195–98 convex, 195–97, 202 perceptron algorithm, 5 perishability of data, 74–75, 201 personalization, 255–56 personally identifiable information (PII), 81, 278 personnel lock-in, 248 perturbation, 178, 278 physical leverage, 3 Pitts, Walter, 4–5 POC (proof of concept), 59–60, 63, 278 positioning, 245–56 power generators, 209, 278 power teachers, 209 precision, 278 precision and recall, 206–7 prediction usability threshold (PUT), 62–64, 90, 91, 173, 200–202, 279 predictions, 34–35, 48, 63, 65, 148, 202–3 predictive pricing, 41, 42 prices charged by data vendors, 73 pricing of AI-First products, 236–39 customer data contribution and, 237 features and, 238–39 transactional, 237, 280 updating and, 238 usage-based, 237–38, 281 of data integration products, 87 optimization of, 41–42 personalized, 41 predictive, 41, 42 ROI-based, 235–36, 279 Principia Mathematica, 4 prisoner’s dilemma, 104 probability, in data labeling, 107 process automation, 6 process lock-in, 248 products, 59 features of, 61, 63 lock-in and, 248 utility of, 35–36 value of, 39 profit, 213 profitability metrics, 209–18 data labeling and, 215–16 data pipes and, 216 input cost analysis, 215–16 research cost analysis, 217–18 unit analysis, 213–14 and vertical versus horizontal products, 210–12 proof of concept (POC), 59–60, 63, 278 proprietary information, 44, 279 feedback data, 199–200 protocols, 248 public data, 115–22 buying, see buying data consulting and competitions, 117–18 crawling, 115–16, 281 governments, 118–19 media, 118 PUT (prediction usability threshold), 62–64, 90, 91, 173, 200–202, 278 quality, 175, 177–78 query by committee, 96 query languages, 279 random forest, 53, 64, 279 recall, 279 receiver 
operating characteristic (ROC) curve, 205–6, 279 recurrent neural networks (RNNs), 151, 153 recursion, 150, 279 regression, 64 reinforcement learning (RL), 103, 147–48, 152, 153, 279 relevance of data, 74–75 reliability, 175 reports, 171 research and development (R & D), 42 cost analysis, 217–18 revolutionary products, 252 robots, 6 ROI (return on investment), 55, 63–65, 93, 164, 176, 181, 198, 279 pricing based on, 235–36, 279 Russell, Bertrand, 4 sales, 58–60 Salesforce, 159, 212, 243, 248, 258 SAP (Systems Applications and Products in Data Processing), 6, 159, 161, 247, 248 SAS, 253 scalability, in data labeling, 106 scale, 20–22, 227, 279 economies of, 19, 34 loops and, 198–200, 203 in ML management, 158 moving beyond, 21–23 network effects versus, 24 scatter plot, 53, 280 scheme, 279 search engines, 31 secure multiparty computation, 117, 279 security, 159 Segment, 87–88 self-reinforcing data, 76 selling data, 122 sensors, 113–14, 160, 280 shopping online, 29, 31, 34, 37, 41, 84 simulation, 103–4, 280 ABMs versus, 105 social networks, 16, 20, 44 Facebook, 25, 43, 112, 119, 122 LinkedIn, 122 software, 159 traditional business models for, 233–34 software-as-a-service (SaaS), 87, 280 software development kits (SDKs), 112, 280 software engineering, decoupling data science from, 133 software engineers, 139, 134–37 Sony, 7 speed of data labeling, 108 spreadsheets, 171 Square Capital, 35 stability, 175–77 staging, 249–51 standardization, 247–48, 249–50 statistical analysis, 149, 153 statistical process control (SPC), 156, 173, 280 statistics, 53–54 stocks, 72, 74, 120–21 supervised machine learning, 147–48, 280 supply, 225 supply-chain tracking, 260 support vector machines, 280 synthetic data, 105–8, 216 system of engagement, 280 system of record, 243, 281 systems integrators (SIs), 161, 248, 281 Tableau, 253 talent loop, 260–61, 281 Taylor, Frederick W., 6 teams in proof of concept phase, 60 see also AI-First teams telecommunications industry, 250–51 telephones 
mobile, 113 iPhone, 253 networks, 23–25 templates, 171 temporal leverage, 3 threshold logic unit (TLU), 5 ticker data, 120–21 token-based incentives, 109–10 tools, 2–3, 93–97 training data, 199 transactional pricing, 237, 280 transaction costs, 243 transfer learning, 147–48 true and false, 204–6 Turing, Alan, 5 23andMe, 112 Twilio, 87 uncertainty sampling, 96 unit analysis, 213–14 United Nations, 250 unsupervised machine learning, 53, 147–48, 281 Upwork, 99 usability, 255–56 usage-based pricing, 237–38, 281 usage metrics, 209 user interface (UI), 89, 159, 281 utility of network effects, 42 of products, 35–36 validation data, 199 value chain, 18–19, 281 value proposition, 59 values, missing, 178 variable importance plots, 53, 281 variance reduction, 96 Veeva Systems, 212 vendors, 73, 161 data, prices charged by, 73 independent software, 161, 248, 276 lock-in and, 247–48 venture capital, 230 veracity of data, 75 versioning, 169–70, 281 vertical integration, 226–37, 239, 244, 252, 281 vertical products, 210–12, 282 VMWare, 248 waterfall charts, 282 Web crawlers, 115–16, 282 weights, 150, 281 workflow applications, 84–86, 253, 259, 282 workflow-first versus integrations-first companies, 88–89 yield management systems, 42 Zapier, 87 Zendesk, 233 zettabyte, 8, 282 Zetta Venture Partners, 8–9 ABOUT THE AUTHOR Ash Fontana became one of the most recognized startup investors in the world after launching online investing at AngelList.


pages: 398 words: 86,855

Bad Data Handbook by Q. Ethan McCallum

Amazon Mechanical Turk, asset allocation, barriers to entry, Benoit Mandelbrot, business intelligence, cellular automata, chief data officer, Chuck Templeton: OpenTable:, cloud computing, cognitive dissonance, combinatorial explosion, commoditize, conceptual framework, database schema, DevOps, en.wikipedia.org, Firefox, Flash crash, functional programming, Gini coefficient, illegal immigration, iterative process, labor-force participation, loose coupling, natural language processing, Netflix Prize, quantitative trading / quantitative finance, recommendation engine, selection bias, sentiment analysis, statistical model, supply-chain management, survivorship bias, text mining, too big to fail, web application

He earned his Ph.D. in economics from Syracuse University and his undergraduate degree in economics from the University of Wisconsin at Madison. Brett Goldstein is the Commissioner of the Department of Innovation and Technology for the City of Chicago. He has been in that role since June of 2012. Brett was previously the city’s Chief Data Officer. In this role, he led the city’s approach to using data to help improve the way the government works for its residents. Before coming to City Hall as Chief Data Officer, he founded and commanded the Chicago Police Department’s Predictive Analytics Group, which aims to predict when and where crime will happen. Prior to entering the public sector, he was an early employee with OpenTable and helped build the company for seven years.


pages: 464 words: 127,283

Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia by Anthony M. Townsend

1960s counterculture, 4chan, A Pattern Language, Airbnb, Amazon Web Services, anti-communist, Apple II, Bay Area Rapid Transit, Burning Man, business process, call centre, carbon footprint, charter city, chief data officer, clean water, cleantech, cloud computing, computer age, congestion charging, connected car, crack epidemic, crowdsourcing, DARPA: Urban Challenge, data acquisition, Deng Xiaoping, digital map, Donald Davies, East Village, Edward Glaeser, game design, garden city movement, Geoffrey West, Santa Fe Institute, George Gilder, ghettoisation, global supply chain, Grace Hopper, Haight Ashbury, Hedy Lamarr / George Antheil, hive mind, Howard Rheingold, interchangeable parts, Internet Archive, Internet of things, Jacquard loom, Jane Jacobs, jitney, John Snow's cholera map, Joi Ito, Khan Academy, Kibera, Kickstarter, knowledge worker, load shedding, M-Pesa, Mark Zuckerberg, megacity, mobile money, mutually assured destruction, new economy, New Urbanism, Norbert Wiener, Occupy movement, off grid, openstreetmap, packet switching, Panopticon Jeremy Bentham, Parag Khanna, patent troll, Pearl River Delta, place-making, planetary scale, popular electronics, RFC: Request For Comment, RFID, ride hailing / ride sharing, Robert Gordon, self-driving car, sharing economy, Shenzhen special economic zone, Silicon Valley, Skype, smart cities, smart grid, smart meter, social graph, social software, social web, special economic zone, Steve Jobs, Steve Wozniak, Stuxnet, supply-chain management, technoutopianism, Ted Kaczynski, telepresence, The Death and Life of Great American Cities, too big to fail, trade route, Tyler Cowen: Great Stagnation, undersea cable, Upton Sinclair, uranium enrichment, urban decay, urban planning, urban renewal, Vannevar Bush, working poor, working-age population, X Prize, Y2K, zero day, Zipcar

When I asked him to speculate on what big data means for cities in the future, his response was quick and terse. “Governing and policy making based on what the vital signs are telling us, not anecdote,” he said.39 Perhaps not surprisingly, his partner in reinventing Chicago’s government as a data-driven enterprise is himself a crime mapper. The country’s first municipal chief data officer, Brett Goldstein was brought over from the Chicago Police Department where, Tolva says, “he was crunching huge amounts of past crime data to nightly redeploy squads based on probability curves of incidents.” But in his new role Goldstein can look beyond just police reports, at the many other socioeconomic indicators that can help suss out the conditions that foster crime.

Watchdog groups will need to step in and identify where the crucial conflicts lie. (And in fact, the Electronic Frontier Foundation is doing just this on behalf of a number of transit agencies being sued by another transit-arrival patent troll, Luxembourg-based ArrivalStar).18 Cities will need regular audits, perhaps conducted by a chief privacy officer or chief data officer charged with extending public control over government- and citizen-generated data. An intriguing option is to hand off this data to a trust equipped to manage it on behalf of citizens, covering its costs—and possibly generating a revenue stream for the city—by licensing the data. A growing number of start-ups and open-source projects, like the Personal Locker project started by Jeremie Miller, are exploring ways for individuals to control and even pool their private data to trade with companies.


pages: 157 words: 53,125

The Fifth Risk by Michael Lewis

Albert Einstein, Bernie Sanders, Biosphere 2, chief data officer, cloud computing, Donald Trump, Ferguson, Missouri, Silicon Valley, Steve Bannon, tail risk, the new new thing, uranium enrichment

They were on tapes in a basement of a NOAA office in Asheville, North Carolina. To get the data into a form he could use, Friedberg paid NOAA to put it on hard drives and ship them to him. He then moved the data, for free, to the cloud. “That was the first data set we were able to get onto the cloud,” said Ed Kearns, chief data officer at NOAA. “David showed Google and Amazon and Microsoft that there was a business case for taking it. Until we got it up, no one was able to reprocess the data.” Of course, without cloud computing there would have been no place to put the radar data. But once it was on the cloud it was generally accessible and could be used for any purpose.


Digital Transformation at Scale: Why the Strategy Is Delivery by Andrew Greenway, Ben Terrett, Mike Bracken, Tom Loosemore

Airbnb, bitcoin, blockchain, butterfly effect, call centre, chief data officer, choice architecture, cognitive dissonance, cryptocurrency, Diane Coyle, en.wikipedia.org, G4S, Internet of things, Kevin Kelly, Kickstarter, loose coupling, M-Pesa, minimum viable product, nudge unit, performance metric, ransomware, robotic process automation, Silicon Valley, social web, The future is already here, the market place, The Wisdom of Crowds

He is a Governor of the University of the Arts London, a member of the HS2 Design Panel and an advisor to the London Design Festival. He was inducted into the Design Week Hall of Fame in 2017. Mike Bracken was appointed Executive Director of Digital for the UK government in 2011 and the Chief Data Officer in 2014. He was responsible for overseeing and improving the government’s digital delivery of public services. After government, he sat on the board of the Co-operative Group as Chief Digital Officer. Before joining the civil service, Mike ran transformations in a variety of sectors in more than a dozen countries, including as Digital Development Director at Guardian News & Media.


Demystifying Smart Cities by Anders Lisdorf

3D printing, artificial general intelligence, autonomous vehicles, backpropagation, bitcoin, business intelligence, business process, chief data officer, clean water, cloud computing, computer vision, continuous integration, crowdsourcing, data is the new oil, digital twin, distributed ledger, don't be evil, Elon Musk, en.wikipedia.org, facts on the ground, Google Glasses, income inequality, Infrastructure as a Service, Internet of things, Masdar, microservices, Minecraft, platform as a service, ransomware, RFID, ride hailing / ride sharing, risk tolerance, self-driving car, smart cities, smart meter, software as a service, speech recognition, Stephen Hawking, Steve Jobs, Steve Wozniak, Stuxnet, Thomas Bayes, Turing test, urban sprawl, zero-sum game

Data governance Just as other forms of governance, data governance has to do with policies, processes, and decisions. In data governance, we look for who has what authority to create, change, and view specific types of data. For governance to work, we need someone to be responsible and make decisions. There will typically be a Chief Data Officer or CDO at the top. There will also be a governance board with key stakeholders. First it needs to be determined at a general logical level what data exists and is of relevance to the organization. What are the key concepts that exist, like buildings, persons, payments, devices, and so on? Once this has been determined, the responsibility for each concept has to be delegated to an owner.


pages: 296 words: 78,631

Hello World: Being Human in the Age of Algorithms by Hannah Fry

23andMe, 3D printing, Air France Flight 447, Airbnb, airport security, algorithmic bias, augmented reality, autonomous vehicles, backpropagation, Brixton riot, chief data officer, computer vision, crowdsourcing, DARPA: Urban Challenge, Douglas Hofstadter, Elon Musk, Firefox, Google Chrome, Gödel, Escher, Bach, Ignaz Semmelweis: hand washing, John Markoff, Mark Zuckerberg, meta-analysis, pattern recognition, Peter Thiel, RAND corporation, ransomware, recommendation engine, ride hailing / ride sharing, selection bias, self-driving car, Shai Danziger, Silicon Valley, Silicon Valley startup, Snapchat, speech recognition, Stanislav Petrov, statistical model, Stephen Hawking, Steven Levy, Tesla Model S, The Wisdom of Crowds, Thomas Bayes, Watson beat the top human players on Jeopardy!, web of trust, William Langewiesche, you are the product

Or a coupon for baby clothes will run alongside an ad for some cologne. Target is not alone in using these methods. Stories of what can be inferred from your data rarely hit the press, but the algorithms are out there, quietly hiding behind the corporate front lines. About a year ago, I got chatting to a chief data officer of a company that sells insurance. They had access to the full detail of people’s shopping habits via a supermarket loyalty scheme. In their analysis, they’d discovered that home cooks were less likely to claim on their home insurance, and were therefore more profitable. It’s a finding that makes good intuitive sense.


Industry 4.0: The Industrial Internet of Things by Alasdair Gilchrist

3D printing, additive manufacturing, Amazon Web Services, augmented reality, autonomous vehicles, barriers to entry, business intelligence, business process, chief data officer, cloud computing, connected car, cyber-physical system, deindustrialization, DevOps, digital twin, fault tolerance, global value chain, Google Glasses, hiring and firing, industrial robot, inflight wifi, Infrastructure as a Service, Internet of things, inventory management, job automation, low cost airline, low skilled workers, microservices, millennium bug, pattern recognition, peer-to-peer, platform as a service, pre–internet, race to the bottom, RFID, Skype, smart cities, smart grid, smart meter, smart transportation, software as a service, stealth mode startup, supply-chain management, The future is already here, trade route, undersea cable, web application, WebRTC, Y2K

However, this is not just any information but information that returns true value, aligned to the business strategy and goals. That requires data scientists with expert business knowledge regarding the company strategy and short-medium-long term goals. This is why there is a new C-suite position called the Chief Data Officer. Commitment to Innovation A company adopting IIoT has to make a commitment to innovation, as well as taking a long-term perspective to the IIoT project’s return on investments. Funding will be required for the capital outlay for sensors, devices, machines, and systems. Funding and patience will be required as performing the data capture and configuring the analytics’ parameters and algorithms might not result in immediate results; success may take some time to realize.


Mindf*ck: Cambridge Analytica and the Plot to Break America by Christopher Wylie

4chan, affirmative action, Affordable Care Act / Obamacare, availability heuristic, Berlin Wall, Bernie Sanders, big-box store, Boris Johnson, British Empire, call centre, Chelsea Manning, chief data officer, cognitive bias, cognitive dissonance, colonial rule, computer vision, conceptual framework, cryptocurrency, Daniel Kahneman / Amos Tversky, desegregation, disinformation, Dominic Cummings, Donald Trump, Downton Abbey, Edward Snowden, Elon Musk, Etonian, first-past-the-post, Google Earth, housing crisis, income inequality, indoor plumbing, information asymmetry, Internet of things, Julian Assange, Lyft, Marc Andreessen, Mark Zuckerberg, Menlo Park, move fast and break things, Network effects, new economy, obamacare, Peter Thiel, Potemkin village, recommendation engine, Renaissance Technologies, Robert Mercer, Ronald Reagan, Rosa Parks, Sand Hill Road, Scientific racism, Shoshana Zuboff, side project, Silicon Valley, Skype, Steve Bannon, surveillance capitalism, uber lyft, unpaid internship, Valery Gerasimov, web application, WikiLeaks, zero-sum game

Channel 4 had to do a huge amount of detailed advance research, because any misstep could potentially blow the whole sting up. The carrot for Cambridge Analytica was 5 percent of the value of the man’s assets, if they succeeded in getting the (imaginary) funds released. We knew Alexander wouldn’t be able to resist. At the first two meetings, Ranjan met with chief data officer Alexander Tayler and managing director Mark Turnbull in private rooms at a hotel near Westminster. The executives pitched Cambridge Analytica’s data analysis work and suggested intelligence-gathering services, but nothing concrete came out of the meetings. They seemed cagey, hedging in how they talked about what Cambridge Analytica really did.


pages: 335 words: 97,468

Uncharted: How to Map the Future by Margaret Heffernan

23andMe, Affordable Care Act / Obamacare, Airbnb, Anne Wojcicki, anti-communist, Atul Gawande, autonomous vehicles, banking crisis, Berlin Wall, Boris Johnson, chief data officer, Chris Urmson, clean water, complexity theory, conceptual framework, cosmic microwave background, creative destruction, crowdsourcing, David Attenborough, discovery of penicillin, epigenetics, Fall of the Berlin Wall, fear of failure, George Santayana, gig economy, Google Glasses, index card, Internet of things, Jaron Lanier, job automation, Kickstarter, late capitalism, lateral thinking, Law of Accelerating Returns, liberation theology, mass immigration, mass incarceration, Murray Gell-Mann, Nate Silver, obamacare, oil shale / tar sands, passive investing, pattern recognition, Peter Thiel, prediction markets, RAND corporation, Ray Kurzweil, Rosa Parks, Sam Altman, Shoshana Zuboff, Silicon Valley, smart meter, Stephen Hawking, Steve Ballmer, Steve Jobs, surveillance capitalism, The Signal and the Noise by Nate Silver, Tim Cook: Apple, twin studies, University of East Anglia

Life is a lot easier when you think the system locks you in,’ is how Oliver Burrows describes the push to experiment versus the pull of the status quo. ‘So I’ve been quite deliberate, provoking people to challenge the constraints they work under, to ask if they’re real or meaningful – or just an excuse for keeping life simple.’ Burrows is scarcely your traditional picture of a revolutionary; he is the chief data officer at the Bank of England. His department of 150 already processes over a billion pieces of data a year. But he knew that the workload was bound to increase. A tougher regulatory environment now demands more data, analysed with increasing granularity. Resources wouldn’t increase in proportion to workload, and just making people work harder wasn’t a sustainable strategy.


pages: 309 words: 96,168

Masters of Scale: Surprising Truths From the World's Most Successful Entrepreneurs by Reid Hoffman, June Cohen, Deron Triff

23andMe, 3D printing, Airbnb, Anne Wojcicki, Ben Horowitz, bitcoin, Broken windows theory, Burning Man, call centre, chief data officer, clean water, collaborative consumption, COVID-19, crowdsourcing, desegregation, Elon Musk, financial independence, gender pay gap, hockey-stick growth, Internet of things, knowledge economy, late fees, Lean Startup, lone genius, Mark Zuckerberg, minimum viable product, move fast and break things, Network effects, Paul Graham, Peter Thiel, polynesian navigation, race to the bottom, remote working, RFID, Ronald Reagan, Rubik’s Cube, Ruby on Rails, Sam Altman, Silicon Valley, Silicon Valley startup, Steve Jobs, TaskRabbit, the scientific method, Tim Cook: Apple, Travis Kalanick, two and twenty, Y Combinator, zero day, Zipcar

But you might be surprised to learn that Jenn Hyman’s clothing rental biz, Rent the Runway, is also deep into data, and always has been. “Actually, 80 percent of our corporate employees are engineers, data scientists, and product managers,” says Jenn. “We have very few people in merchandising and marketing. The first C-level hire that I made was a chief data officer, and he was in my first ten employees. From the very beginning of the company, we were thinking about data. “We are getting data from our customer over a hundred times a year,” says Jenn. “And she’s letting us know: Did she wear it? How many times did she wear it? Did she love it? Was it just okay?


pages: 385 words: 111,113

Augmented: Life in the Smart Lane by Brett King

23andMe, 3D printing, additive manufacturing, Affordable Care Act / Obamacare, agricultural Revolution, Airbnb, Albert Einstein, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, Apple II, artificial general intelligence, asset allocation, augmented reality, autonomous vehicles, barriers to entry, bitcoin, blockchain, business intelligence, business process, call centre, chief data officer, Chris Urmson, Clayton Christensen, clean water, congestion charging, crowdsourcing, cryptocurrency, deskilling, different worldview, disruptive innovation, distributed generation, distributed ledger, double helix, drone strike, Elon Musk, Erik Brynjolfsson, Fellow of the Royal Society, fiat currency, financial exclusion, Flash crash, Flynn Effect, future of work, gig economy, Google Glasses, Google X / Alphabet X, Hans Lippershey, Hyperloop, income inequality, industrial robot, information asymmetry, Internet of things, invention of movable type, invention of the printing press, invention of the telephone, invention of the wheel, James Dyson, Jeff Bezos, job automation, job-hopping, John Markoff, John von Neumann, Kevin Kelly, Kickstarter, Kim Stanley Robinson, Kodak vs Instagram, Leonard Kleinrock, lifelogging, low earth orbit, low skilled workers, Lyft, M-Pesa, Mark Zuckerberg, Marshall McLuhan, megacity, Metcalfe’s law, Minecraft, mobile money, money market fund, more computing power than Apollo, Network effects, new economy, obamacare, Occupy movement, Oculus Rift, off grid, packet switching, pattern recognition, peer-to-peer, Ray Kurzweil, RFID, ride hailing / ride sharing, Robert Metcalfe, Satoshi Nakamoto, Second Machine Age, selective serotonin reuptake inhibitor (SSRI), self-driving car, sharing economy, Shoshana Zuboff, Silicon Valley, Silicon Valley startup, Skype, smart cities, smart grid, smart transportation, Snapchat, social graph, software as a service, speech recognition, statistical model, stem cell, Stephen Hawking, 
Steve Jobs, Steve Wozniak, strong AI, TaskRabbit, technological singularity, telemarketer, telepresence, telepresence robot, Tesla Model S, The future is already here, The Future of Employment, Tim Cook: Apple, trade route, Travis Kalanick, Turing complete, Turing test, uber lyft, undersea cable, urban sprawl, V2 rocket, Watson beat the top human players on Jeopardy!, white picket fence, WikiLeaks

He is a graduate of MIT and attended graduate school at Harvard’s Kennedy School of Government. JP Rangaswami Born in Calcutta, JP Rangaswami (@jobsworth) read economics and worked as a financial journalist before changing careers over three decades ago to enter that strange space where society, technology and banking converge. Now 58, Rangaswami works as chief data officer at a major financial institution, having previously been chief scientist and chief information officer at a number of global institutions. He is Adjunct Professor at the School of Electronics and Computer Science at the University of Southampton. In addition, he is a Fellow of the British Computer Society, a Fellow of the Royal Society of the Arts and Venture Partner at Anthemis.


pages: 421 words: 110,406

Platform Revolution: How Networked Markets Are Transforming the Economy--And How to Make Them Work for You by Sangeet Paul Choudary, Marshall W. van Alstyne, Geoffrey G. Parker

3D printing, Affordable Care Act / Obamacare, Airbnb, Alvin Roth, Amazon Mechanical Turk, Amazon Web Services, Andrei Shleifer, Apple's 1984 Super Bowl advert, autonomous vehicles, barriers to entry, big data - Walmart - Pop Tarts, bitcoin, blockchain, business cycle, business process, buy low sell high, chief data officer, Chuck Templeton: OpenTable:, clean water, cloud computing, connected car, corporate governance, crowdsourcing, data acquisition, data is the new oil, digital map, discounted cash flows, disintermediation, Edward Glaeser, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, financial innovation, Haber-Bosch Process, High speed trading, independent contractor, information asymmetry, Internet of things, inventory management, invisible hand, Jean Tirole, Jeff Bezos, jimmy wales, John Markoff, Khan Academy, Kickstarter, Lean Startup, Lyft, Marc Andreessen, market design, Metcalfe’s law, multi-sided market, Network effects, new economy, payday loans, peer-to-peer lending, Peter Thiel, pets.com, pre–internet, price mechanism, recommendation engine, RFID, Richard Stallman, ride hailing / ride sharing, Robert Metcalfe, Ronald Coase, Satoshi Nakamoto, self-driving car, shareholder value, sharing economy, side project, Silicon Valley, Skype, smart contracts, smart grid, Snapchat, software is eating the world, Steve Jobs, TaskRabbit, The Chicago School, the payments system, Tim Cook: Apple, transaction costs, Travis Kalanick, two-sided market, Uber and Lyft, Uber for X, uber lyft, winner-take-all economy, zero-sum game, Zipcar

In a provocative act designed to shed light on the issue, a young woman named Jennifer Lyn Morone has incorporated herself in order to assert an ownership interest in the data stream that she generates.35 Companies that profit from the use and sale of personal data, of course, are unlikely to find Morone’s gesture either amusing or persuasive. But the issue is not going to disappear. J. P. Rangaswami, chief data officer for Deutsche Bank, predicts: As we learn more about the value of personal and collective information, our approach to such information will mirror our natural motivations. We will learn to develop and extend these rights. The most important change will be to do with collective (sometimes, but not always, public) information.


pages: 482 words: 121,173

Tools and Weapons: The Promise and the Peril of the Digital Age by Brad Smith, Carol Ann Browne

Affordable Care Act / Obamacare, AI winter, airport security, Albert Einstein, algorithmic bias, augmented reality, autonomous vehicles, barriers to entry, Berlin Wall, Boeing 737 MAX, business process, call centre, Celtic Tiger, chief data officer, cloud computing, computer vision, corporate social responsibility, disinformation, Donald Trump, Edward Snowden, en.wikipedia.org, immigration reform, income inequality, Internet of things, invention of movable type, invention of the telephone, Jeff Bezos, Mark Zuckerberg, minimum viable product, national security letter, natural language processing, Network effects, new economy, pattern recognition, precision agriculture, race to the bottom, ransomware, Ronald Reagan, Rubik’s Cube, school vouchers, self-driving car, Shoshana Zuboff, Silicon Valley, Skype, speech recognition, Steve Ballmer, Steve Jobs, surveillance capitalism, The Rise and Fall of American Growth, Tim Cook: Apple, WikiLeaks, women in the workforce

We also need to develop data-sharing approaches that will create effective opportunities for companies, communities, and countries large and small to reap the benefits from data. In short, we need to democratize AI and the data on which it relies. So how do we create a bigger opportunity for smaller players in a world where large quantities of data matter? One person who may have the answer is Matthew Trunnell. Trunnell is the chief data officer at the Fred Hutchinson Cancer Research Center, a leading cancer research center in Seattle named for a hometown hero who pitched ten seasons for the Detroit Tigers and managed three major league baseball teams. In 1961, Fred Hutchinson took the Cincinnati Reds to the World Series. Sadly, Fred’s successful baseball career and life were cut short when he died of cancer in 1964 at the age of forty-five.5 His brother, Bill Hutchinson, was a surgeon who treated Fred’s cancer.


pages: 660 words: 141,595

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost, Tom Fawcett

Albert Einstein, Amazon Mechanical Turk, big data - Walmart - Pop Tarts, bioinformatics, business process, call centre, chief data officer, Claude Shannon: information theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, data acquisition, David Brooks, en.wikipedia.org, Erik Brynjolfsson, Gini coefficient, independent contractor, information retrieval, intangible asset, iterative process, Johann Wolfgang von Goethe, Louis Pasteur, Menlo Park, Nate Silver, Netflix Prize, new economy, p-value, pattern recognition, placebo effect, price discrimination, recommendation engine, Ronald Coase, selection bias, Silicon Valley, Skype, speech recognition, Steve Jobs, supply-chain management, text mining, The Signal and the Noise by Nate Silver, Thomas Bayes, transaction costs, WikiLeaks

” — Craig Vaughan Global Vice President at SAP “This timely book says out loud what has finally become apparent: in the modern world, Data is Business, and you can no longer think business without thinking data. Read this book and you will understand the Science behind thinking data.” — Ron Bekkerman Chief Data Officer at Carmel Ventures “A great book for business managers who lead or interact with data scientists, who wish to better understand the principles and algorithms available without the technical details of single-disciplinary books.” — Ronny Kohavi Partner Architect at Microsoft Online Services Division “Provost and Fawcett have distilled their mastery of both the art and science of real-world data analysis into an unrivalled introduction to the field.”