data is the new oil

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel


The idea is simple, although that doesn’t make it easy. The challenge is tackled by a systematic, scientific means to develop and continually improve prediction—to literally learn to predict. The solution is machine learning—computers automatically developing new knowledge and capabilities by furiously feeding on modern society’s greatest and most potent unnatural resource: data.
“Feed Me!”—Food for Thought for the Machine
Data is the new oil. —European Consumer Commissioner Meglena Kuneva
The only source of knowledge is experience. —Albert Einstein
In God we trust. All others must bring data. —William Edwards Deming (a business professor famous for work in manufacturing)
Most people couldn’t be less interested in data. It can seem like such dry, boring stuff. It’s a vast, endless regiment of recorded facts and figures, each alone as mundane as the most banal tweet, “I just bought some new sneakers!”

This is the assumption behind the leap of faith an organization takes when undertaking PA. Budgeting the staff and tools for a PA project requires this leap, knowing not what specifically will be discovered and yet trusting that something will be. Sitting on an expert panel at Predictive Analytics World, leading UK consultant Tom Khabaza put it this way: “Projects never fail due to lack of patterns.” With The Data Effect in mind, the scientist rests easy. Data is the new oil. It’s this century’s greatest possession and often considered an organization’s most important strategic asset. Several thought leaders have dubbed it as such—“the new oil”—including European Consumer Commissioner Meglena Kuneva, who also calls it “the new currency of the digital world.” It’s not a hyperbole. In 2012, Apple Inc. overtook Exxon Mobil Corporation, the world’s largest oil company, as the most valuable publicly traded company in the world.

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement by Eric Redmond, Jim R. Wilson


From Jim: First, I have to thank my family; Ruthy, your boundless patience and encouragement have been heartwarming. Emma and Jimmy, you’re two smart cookies, and your daddy loves you always. Also a special thanks to all the unsung heroes who monitor IRC, message boards, mailing lists, and bug systems ready to help anyone who needs you. Your dedication to open source keeps these projects kicking.
Copyright © 2012, The Pragmatic Bookshelf.
Preface
It has been said that data is the new oil. If this is so, then databases are the fields, the refineries, the drills, and the pumps. Data is stored in databases, and if you’re interested in tapping into it, then coming to grips with the modern equipment is a great start. Databases are tools; they are the means to an end. Each database has its own story and its own way of looking at the world. The more you understand them, the better you will be at harnessing the latent power in the ever-growing corpus of data at your disposal.

Not-So-Good For: Because of the high degree of interconnectedness between nodes, graph databases are generally not suitable for network partitioning. Spidering the graph quickly means you can’t afford network hops to other database nodes, so graph databases don’t scale out well. It’s likely that if you use a graph database, it’ll be one piece of a larger system, with the bulk of the data stored elsewhere and only the relationships maintained in the graph.
9.2 Making a Choice
As we said at the beginning, data is the new oil. We sit upon a vast ocean of data, yet until it’s refined into information, it’s unusable (and with a more crude comparison, there’s a lot of money in data these days). The ease of collecting and ultimately storing, mining, and refining the data out there starts with the database you choose. Deciding which database to choose is often more complex than merely considering which genre maps best to a given domain’s data.


Data-Ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else by Steve Lohr


“Without the technology to analyze the data, it’s useless,” Zhou notes. “Now, it’s getting to be valuable.” In September 2014, Zhou left IBM to start her own company. The idea, she says, is inspired by the work she did at IBM, and researchers there will continue to pursue the underlying technologies she developed in service of corporations. But Zhou has her eye on the consumer market. If data is the new oil, she says, then we are all data wells, and potentially valuable ones. The data-infused profiles of a person’s traits and values, Zhou says, should be exploited by the individual as a kind of currency in exchange for truly personalized products, services, and advice from businesses, with tailored pricing as well. Even a prototype was months away when we spoke just after she departed from IBM, but her ambition is to help alter the terms of trade in digital commerce.


Future Crimes: Everything Is Connected, Everyone Is Vulnerable and What We Can Do About It by Marc Goodman


CHAPTER 3: MOORE’S OUTLAWS
The World of Exponentials The Crime Singularity Control the Code, Control the World
CHAPTER 4: YOU’RE NOT THE CUSTOMER, YOU’RE THE PRODUCT
Our Growing Digital World—What They Never Told You The Social Network and Its Inventory—You You’re Leaking—How They Do It The Most Expensive Things in Life Are Free Terms and Conditions Apply (Against You) Mobile Me Pilfering Your Data? There’s an App for That Location, Location, Location
CHAPTER 5: THE SURVEILLANCE ECONOMY
You Thought Hackers Were Bad? Meet the Data Brokers Analyzing You But I’ve Got Nothing to Hide Privacy Risks and Other Unpleasant Surprises Opening Pandora’s Virtual Box Knowledge Is Power, Code Is King, and Orwell Was Right
CHAPTER 6: BIG DATA, BIG RISK
Data Is the New Oil Bad Stewards, Good Victims, or Both? Data Brokers Are Poor Stewards of Your Data Too Social Networking Ills Illicit Data: The Lifeblood of Identity Theft Stalkers, Bullies, and Exes—Oh My! Online Threats to Minors Haters Gonna Hate Burglary 2.0 Targeted Scams and Targeted Killings Counterintelligence Implications of Leaked Government Data So No Online Profile Is Better, Right? The Spy Who Liked Me
CHAPTER 7: I.T.

LeT simply processed the data the public was leaking and leveraged them in real time to kill more people and outmaneuver authorities. That was terrorism in the digital age circa 2008. What might terrorists do with the technologies available today? What will they do with the technologies of tomorrow? The lesson of Mumbai is that exponential change applies not just for good but for evil as well.
Data Is the New Oil
Data is constantly being generated by everything around us. Every digital process, sensor, mobile phone, GPS device, car engine, medical lab test, credit card transaction, hotel door lock, report card, and social media exchange produces data. Smart phones are turning human beings into human sensors, generating vast sums of information about us. As a result, children born today will live their entire lives in the shadow of a massive digital footprint, with some 92 percent of infants already having an online presence.


Data Scientists at Work by Sebastian Gutierrez


In the interests of full disclosure, I fall into the camp of those who believe that data science is truly an emerging academic discipline and that data scientists as such have proper roles in organizations. Moreover, I believe that each of the subjects I interviewed for this book is indeed a data scientist—and, after having spent time with all of them, I couldn’t be more excited about the future of data science.
Though some of them are wary of the hype that the field is attracting, all sixteen of these data scientists believe in the power of the work they are doing as well as the methods.
1 Michael Palmer, “Data Is the New Oil,” ANA Marketing Maestros blog, November 3, 2006.
2 Susan Lund et al., “Game Changers: Five Opportunities for US Growth and Renewal,” McKinsey Global Institute Report, July 2013. americas/us_game_changers.


Dragnet Nation: A Quest for Privacy, Security, and Freedom in a World of Relentless Surveillance by Julia Angwin


Tracking is so crucial to the industry that in 2013 Randall Rothenberg, the president of the Interactive Advertising Bureau, said that if the industry lost its ability to track people, “billions of dollars in Internet advertising and hundreds of thousands of jobs dependent on it would disappear.” Meglena Kuneva, a member of the European Commission, summed it up best in 2009 when she said: “Personal data is the new oil of the Internet and the new currency of the digital world.”
* * *
If you were to build a taxonomy of trackers it would look something like this:
GOVERNMENT
• Incidental collectors. Agencies that collect data in their normal course of business, such as state motor vehicle registries and the IRS, but are not directly in the data business.
• Investigators. Agencies that collect data about suspects as part of law enforcement investigations, such as the FBI and local police


Platform Revolution: How Networked Markets Are Transforming the Economy--And How to Make Them Work for You by Sangeet Paul Choudary, Marshall W. van Alstyne, Geoffrey G. Parker


As of 2011, there were more than three thousand games on Facebook, collectively weakening Zynga’s individual bargaining power.20 The startup’s response may be to sell, to fight back through multihoming, or to expand into other business arenas. Zynga, for example, now multihomes on Tencent’s QQ social network and on the Apple and Google mobile platforms, as well as offering its own cloud service.
HOW PLATFORMS COMPETE (3): LEVERAGING THE VALUE OF DATA
One of the clichés of the Internet economy is the saying “Data is the new oil”—and like most clichés, it contains a lot of truth. Data can be a source of enormous value to platform businesses, and well-run firms are using data to shore up their competitive positions in a wide variety of ways. Platform businesses can use data to improve their competitive performance in two general ways—tactically and strategically. An example of tactical data use is in the performance of A/B testing, to optimize particular tools or features of the platform.


The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos


The same dynamic happens in any market where there’s lots of choice and lots of data. The race is on, and whoever learns fastest wins. It doesn’t stop with understanding customers better: companies can apply machine learning to every aspect of their operations, provided data is available, and data is pouring in from computers, communication devices, and ever-cheaper and more ubiquitous sensors. “Data is the new oil” is a popular refrain, and as with oil, refining it is big business. IBM, as well plugged into the corporate world as anyone, has organized its growth strategy around providing analytics to companies. Businesses look at data as a strategic asset: What data do I have that my competitors don’t? How can I take advantage of it? What data do my competitors have that I don’t? In the same way that a bank without databases can’t compete with a bank that has them, a company without machine learning can’t keep up with one that uses it.


Connectography: Mapping the Future of Global Civilization by Parag Khanna


Whether these governments seek to monitor, filter, or protect digital flows, the geographic (and legal) location of servers, cables, routers, and data centers now matters as much as the geography of oil pipelines. The differences are crucial, however. Internet data can be replicated infinitely and exist in multiple places at the same time. Additionally, it can be rerouted or smuggled “in” to its destination, while the receiver has the ability to come “out” as well to access it. If data is the new oil, it is certainly much more slippery. It is true that the Internet is no longer a truly borderless, parallel universe. Even Twitter, the world’s most free and unfiltered medium of one-to-many expression, preemptively restricts content banned in various countries, while Google Maps loads tailored maps approved by national authorities based on the user’s server location. Yet even if software or data services have to be customized to national restrictions such as after the EU’s 2015 decision invalidating the “Safe Harbor” agreement with the United States, these represent only partial frictions, not blockages.