bioinformatics

99 results


Data Mining: Concepts and Techniques by Jiawei Han, Micheline Kamber, Jian Pei

backpropagation, bioinformatics, business intelligence, business process, Claude Shannon: information theory, cloud computing, computer vision, correlation coefficient, cyber-physical system, database schema, discrete time, disinformation, distributed generation, finite state, industrial research laboratory, information retrieval, information security, iterative process, knowledge worker, linked data, machine readable, natural language processing, Netflix Prize, Occam's razor, pattern recognition, performance metric, phenotype, power law, random walk, recommendation engine, RFID, search costs, semantic web, seminal paper, sentiment analysis, sparse data, speech recognition, statistical model, stochastic process, supply-chain management, text mining, thinkpad, Thomas Bayes, web application

For example, we may find a group of genes that express themselves similarly, which is highly interesting in bioinformatics, such as in finding pathways. ■ When analyzing in the sample/condition dimension, we treat each sample/condition as an object and treat the genes as attributes. In this way, we may find patterns of samples/conditions, or cluster samples/conditions into groups. For example, we may find the differences in gene expression by comparing a group of tumor samples and nontumor samples. Gene expression matrices are popular in bioinformatics research and development. For example, an important task is to classify a new gene using the expression data of the gene and that of other genes in known classes.
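For a concrete sense of the two analysis directions described above, here is a minimal sketch (assuming scikit-learn is available; the matrix values and groupings are invented purely for illustration) that clusters a toy expression matrix once along the gene dimension and once along the sample dimension:

```python
# Minimal sketch: clustering a toy gene-expression matrix along either dimension.
# Assumes scikit-learn; all values and labels are illustrative, not real data.
import numpy as np
from sklearn.cluster import KMeans

# Rows = genes, columns = samples/conditions (illustrative expression levels).
expr = np.array([
    [8.1, 7.9, 8.3, 1.2, 1.0],   # gene A
    [7.8, 8.2, 8.0, 0.9, 1.1],   # gene B (co-expressed with A)
    [0.5, 0.7, 0.4, 6.9, 7.2],   # gene C
    [0.6, 0.4, 0.6, 7.1, 6.8],   # gene D (co-expressed with C)
])

# Gene dimension: each gene (row) is an object -> groups of co-expressed genes.
gene_clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(expr)

# Sample/condition dimension: each sample (column) is an object -> groups of
# samples, e.g. tumor vs. nontumor expression profiles.
sample_clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(expr.T)

print("gene clusters:  ", gene_clusters)    # e.g. [0 0 1 1]
print("sample clusters:", sample_clusters)  # e.g. [0 0 0 1 1]
```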

Every enterprise benefits from collecting and analyzing its data: Hospitals can spot trends and anomalies in their patient records, search engines can do better ranking and ad placement, and environmental and public health agencies can spot patterns and abnormalities in their data. The list continues, with cybersecurity and computer network intrusion detection; monitoring of the energy consumption of household appliances; pattern analysis in bioinformatics and pharmaceutical data; financial and business intelligence data; spotting trends in blogs, Twitter, and many more. Storage is inexpensive and getting cheaper, as are data sensors. Thus, collecting and storing data is easier than ever before. The problem then becomes how to analyze the data.

Web mining can help us learn about the distribution of information on the WWW in general, characterize and classify web pages, and uncover web dynamics and the association and other relationships among different web pages, users, communities, and web-based activities. It is important to keep in mind that, in many applications, multiple types of data are present. For example, in web mining, there often exist text data and multimedia data (e.g., pictures and videos) on web pages, graph data like web graphs, and map data on some web sites. In bioinformatics, genomic sequences, biological networks, and 3-D spatial structures of genomes may coexist for certain biological objects. Mining multiple data sources of complex data often leads to fruitful findings due to the mutual enhancement and consolidation of such multiple sources. On the other hand, it is also challenging because of the difficulties in data cleaning and data integration, as well as the complex interactions among the multiple sources of such data.


pages: 362 words: 104,308

Forty Signs of Rain by Kim Stanley Robinson

bioinformatics, business intelligence, double helix, Dr. Strangelove, experimental subject, Intergovernmental Panel on Climate Change (IPCC), Kim Stanley Robinson, phenotype, precautionary principle, prisoner's dilemma, Ronald Reagan, social intelligence, stem cell, the scientific method, zero-sum game

It was not a matter of her being warm and fuzzy, as you might expect from the usual characterizations of feminine thought—on the contrary, Anna’s scientific work (she still often coauthored papers in statistics, despite her bureaucratic load) often displayed a finicky perfectionism that made her a very meticulous scientist, a first-rate statistician—smart, quick, competent in a range of fields and really excellent in more than one. As good a scientist as one could find for the rather odd job of running the Bioinformatics Division at NSF, good almost to the point of exaggeration—too precise, too interrogatory—it kept her from pursuing a course of action with drive. Then again, at NSF maybe that was an advantage. In any case she was so intense about it. A kind of Puritan of science, rational to an extreme. And yet of course at the same time that was all such a front, as with the early Puritans; the hyperrational coexisted in her with all the emotional openness, intensity, and variability that was the American female interactional paradigm and social role.

Anna had been watching him, and now she said, “I suppose it is a bit of a rat race.” “Well, no more than anywhere else. In fact if I were home it’d probably be worse.” They laughed. “And you have your journal work too.” “That’s right.” Frank waved at the piles of typescripts: three stacks for Review of Bioinformatics, two for The Journal of Sociobiology. “Always behind. Luckily the other editors are better at keeping up.” Anna nodded. Editing a journal was a privilege and an honor, even though usually unpaid—indeed, one often had to continue to subscribe to a journal just to get copies of what one had edited.

Frank scrolled down the pages of the application with practiced speed. Yann Pierzinski, Ph.D. in biomath, Caltech. Still doing postdoc work with his thesis advisor there, a man Frank had come to consider a bit of a credit hog, if not worse. It was interesting, then, that Pierzinski had gone down to Torrey Pines to work on a temporary contract, for a bioinformatics researcher whom Frank didn’t know. Perhaps that had been a bid to escape the advisor. But now he was back. Frank dug into the substantive part of the proposal. The algorithm set was one Pierzinski had been working on even back in his dissertation. Chemical mechanics of protein creation as a sort of natural algorithm, in effect.


pages: 239 words: 45,926

As the Future Catches You: How Genomics & Other Forces Are Changing Your Work, Health & Wealth by Juan Enriquez

Albert Einstein, AOL-Time Warner, Apollo 13, Berlin Wall, bioinformatics, borderless world, British Empire, Buckminster Fuller, business cycle, creative destruction, digital divide, double helix, Ford Model T, global village, Gregor Mendel, half of the world's population has never made a phone call, Helicobacter pylori, Howard Rheingold, Jeff Bezos, Joseph Schumpeter, Kevin Kelly, knowledge economy, more computing power than Apollo, Neal Stephenson, new economy, personalized medicine, purchasing power parity, Ray Kurzweil, Richard Feynman, Robert Metcalfe, Search for Extraterrestrial Intelligence, SETI@home, Silicon Valley, spice trade, stem cell, the new new thing, yottabyte

The machines and technology coming out of the digital and genetic revolutions may allow people to leverage their mental capacity a thousand … A million … Or a trillionfold. Biology is now driven by applied math … statistics … computer science … robotics … The world’s best programmers are increasingly gravitating toward biology … You will be hearing a lot about two new fields in the coming months … Bioinformatics and Biocomputing. You rarely see bioinformaticians … They are too valuable to companies and universities. Things are moving too fast … And they are too passionate about what they do … To spend a lot of time giving speeches and interviews. But if you go into the bowels of Harvard Medical School … And are able to find the genetics department inside the Warren Alpert Building … (A significant test of intelligence in and of itself … Start by finding the staircase inspired by the double helix … and go past the bathrooms marked XX and XY …) There you can find a small den where George Church hangs out, surrounded by computers.

This is ground zero for a wonderful commune of engineers, physicists, molecular biologists, and physicians …3 And some of the world’s smartest graduate students … Who are trying to make sense of the 100 terabytes of data that come out of gene labs yearly … A task equivalent to trying to sort and use a million new encyclopedias … every year.4 You can’t build enough “wet” labs (labs full of beakers, cells, chemicals, refrigerators) to process and investigate all the opportunities this scale of data generates. The only way for Church & Co. to succeed … Is to force biology to divide … Into theoretical and applied disciplines. Which is why he is one of the founders of bioinformatics … A new discipline that attempts to predict what biologists will find … When they carry out wet-lab experiments in a few months, years, or decades. In a sense, this mirrors Craig Venter’s efforts at The Institute for Genomic Research and Celera. Celera and Church’s labs are information centers … not traditional labs … And a few smart people are going to be able to do … A lot of biology … Very quickly.

Countries, regions, governments, and companies that assume they are … And will remain … Dominant … Soon lose their competitive edge. (Particularly those whose leadership ignores or disparages emerging technologies … Remember those old saws: The sun never sets on the British Empire … Vive La France! … All roads lead to Rome … China, the Middle Kingdom.) Which is one of the reasons bioinformatics is so important … And why you should pay attention. What we are seeing is just the beginning of the digital-genomics convergence. When you think of a DNA molecule and its ability to … Carry our complete life code within each of our cells … Accurately copy the code … Billions of times per day … Read and execute life’s functions … Transmit this information across generations … It becomes clear that … The world’s most powerful and compact coding and information-processing system … is a genome.


pages: 133 words: 42,254

Big Data Analytics: Turning Big Data Into Big Money by Frank J. Ohlhorst

algorithmic trading, bioinformatics, business intelligence, business logic, business process, call centre, cloud computing, create, read, update, delete, data acquisition, data science, DevOps, extractivism, fault tolerance, information security, Large Hadron Collider, linked data, machine readable, natural language processing, Network effects, pattern recognition, performance metric, personalized medicine, RFID, sentiment analysis, six sigma, smart meter, statistical model, supply-chain management, warehouse automation, Watson beat the top human players on Jeopardy!, web application

Much of the disruption is fed by improved instrument and sensor technology; for instance, the Large Synoptic Survey Telescope has a 3.2-gigapixel camera and generates over 6 petabytes of image data per year. It is the platform of Big Data that is making such lofty goals attainable. The validation of Big Data analytics can be illustrated by advances in science. The biomedical corporation Bioinformatics recently announced that it has reduced the time it takes to sequence a genome from years to days, and it has also reduced the cost, so it will be feasible to sequence an individual’s genome for $1,000, paving the way for improved diagnostics and personalized medicine. The financial sector has seen how Big Data and its associated analytics can have a disruptive impact on business.

Big Data has transformed astronomy from a field in which taking pictures of the sky was a large part of the job to one in which the pictures are all in a database already and the astronomer’s task is to find interesting objects and phenomena in the database. Transformation is taking place in the biological arena as well. There is now a well-established tradition of depositing scientific data into a public repository and of creating public databases for use by other scientists. In fact, there is an entire discipline of bioinformatics that is largely devoted to the maintenance and analysis of such data. As technology advances, particularly with the advent of next-generation sequencing, the size and number of available experimental data sets are increasing exponentially. Big Data has the potential to revolutionize more than just research; the analytics process has started to transform education as well.

The data preparation challenge even extends to analysis that uses only a single data set. Here there is still the issue of suitable database design, further complicated by the many alternative ways in which to store the information. Particular database designs may have certain advantages over others for analytical purposes. A case in point is the variety in the structure of bioinformatics databases, in which information on substantially similar entities, such as genes, is inherently different but is represented with the same data elements. Examples like these clearly indicate that database design is an artistic endeavor that has to be carefully executed in the enterprise context by professionals.
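As a hedged illustration of how much that design choice matters, the sketch below (using Python's built-in sqlite3; every table name, column, and value is hypothetical and not drawn from any real bioinformatics database) stores the same gene information in two alternative layouts and shows how differently the same analytical question has to be phrased:

```python
# Illustrative only: two alternative ways to store the "same" gene information,
# each with different consequences for analytical queries.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Design 1: one wide row per gene (simple to query, rigid when attributes vary).
cur.execute("CREATE TABLE gene_wide (gene_id TEXT PRIMARY KEY, symbol TEXT, "
            "chromosome TEXT, length_bp INTEGER)")
cur.execute("INSERT INTO gene_wide VALUES ('G1', 'TP53', '17', 19149)")

# Design 2: one row per (gene, attribute) pair (flexible, but analysis needs joins/pivots).
cur.execute("CREATE TABLE gene_attr (gene_id TEXT, attribute TEXT, value TEXT)")
cur.executemany("INSERT INTO gene_attr VALUES (?, ?, ?)", [
    ("G1", "symbol", "TP53"),
    ("G1", "chromosome", "17"),
    ("G1", "length_bp", "19149"),
])

# The same analytical question looks very different in the two designs:
print(cur.execute("SELECT symbol FROM gene_wide WHERE chromosome = '17'").fetchall())
print(cur.execute(
    "SELECT a.value FROM gene_attr a JOIN gene_attr b ON a.gene_id = b.gene_id "
    "WHERE a.attribute = 'symbol' AND b.attribute = 'chromosome' AND b.value = '17'"
).fetchall())
```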


pages: 565 words: 151,129

The Zero Marginal Cost Society: The Internet of Things, the Collaborative Commons, and the Eclipse of Capitalism by Jeremy Rifkin

3D printing, active measures, additive manufacturing, Airbnb, autonomous vehicles, back-to-the-land, benefit corporation, big-box store, bike sharing, bioinformatics, bitcoin, business logic, business process, Chris Urmson, circular economy, clean tech, clean water, cloud computing, collaborative consumption, collaborative economy, commons-based peer production, Community Supported Agriculture, Computer Numeric Control, computer vision, crowdsourcing, demographic transition, distributed generation, DIY culture, driverless car, Eben Moglen, electricity market, en.wikipedia.org, Frederick Winslow Taylor, Free Software Foundation, Garrett Hardin, general purpose technology, global supply chain, global village, Hacker Conference 1984, Hacker Ethic, industrial robot, informal economy, information security, Intergovernmental Panel on Climate Change (IPCC), intermodal, Internet of things, invisible hand, Isaac Newton, James Watt: steam engine, job automation, John Elkington, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, Julian Assange, Kickstarter, knowledge worker, longitudinal study, low interest rates, machine translation, Mahatma Gandhi, manufacturing employment, Mark Zuckerberg, market design, mass immigration, means of production, meta-analysis, Michael Milken, mirror neurons, natural language processing, new economy, New Urbanism, nuclear winter, Occupy movement, off grid, off-the-grid, oil shale / tar sands, pattern recognition, peer-to-peer, peer-to-peer lending, personalized medicine, phenotype, planetary scale, price discrimination, profit motive, QR code, RAND corporation, randomized controlled trial, Ray Kurzweil, rewilding, RFID, Richard Stallman, risk/return, Robert Solow, Rochdale Principles, Ronald Coase, scientific management, search inside the book, self-driving car, shareholder value, sharing economy, Silicon Valley, Skype, smart cities, smart grid, smart meter, social web, software as a service, spectrum auction, Steve Jobs, Stewart Brand, the built environment, the Cathedral and the Bazaar, the long tail, The Nature of the Firm, The Structural Transformation of the Public Sphere, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Thomas L Friedman, too big to fail, Tragedy of the Commons, transaction costs, urban planning, vertical integration, warehouse automation, Watson beat the top human players on Jeopardy!, web application, Whole Earth Catalog, Whole Earth Review, WikiLeaks, working poor, Yochai Benkler, zero-sum game, Zipcar

Reducing the cost of electricity in the management of data centers goes hand in hand with cutting the cost of storing data, an ever larger part of the data-management process. And the sheer volume of data is mushrooming faster than the capacity of hard drives to save it. Researchers are just beginning to experiment with a new way of storing data that could eventually drop the marginal cost to near zero. In January 2013 scientists at the European Bioinformatics Institute in Cambridge, England, announced a revolutionary new method of storing massive electronic data by embedding it in synthetic DNA. Two researchers, Nick Goldman and Ewan Birney, took five computer files—which included an MP3 recording of Martin Luther King Jr.’s “I Have a Dream” speech, a paper by James Watson and Francis Crick describing the structure of DNA, and all of Shakespeare’s sonnets and plays—and converted the ones and zeros of digital information into the letters that make up the alphabet of the DNA code.
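The actual Goldman–Birney encoding used a base-3 code with rules to avoid repeated bases plus addressing and error-checking information; the toy sketch below only illustrates the basic idea of mapping ones and zeros onto DNA letters, using a simplified two-bits-per-base scheme:

```python
# Simplified illustration of turning ones and zeros into DNA letters.
# The real Goldman-Birney scheme is more elaborate; this is a toy version of the idea.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {b: bits for bits, b in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    bits = "".join(BASE_TO_BITS[b] for b in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"I have a dream"
strand = encode(message)
print(strand)                      # 'CAGCAGAA...' - four bases per byte
assert decode(strand) == message   # round-trips losslessly
```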

Researchers add that DNA information can be preserved for centuries, as long as it is kept in a dark, cool environment.65 At this early stage of development, the cost of reading the code is high and the time it takes to decode information is substantial. Researchers, however, are reasonably confident that an exponential rate of change in bioinformatics will drive the marginal cost to near zero over the next several decades. A near zero marginal cost communication/energy infrastructure for the Collaborative Age is now within sight. The technology needed to make it happen is already being deployed. At present, it’s all about scaling up and building out.

Its network of thousands of scientists and plant breeders is continually searching for heirloom and wild seeds, growing them out to increase seed stock, and ferrying samples to the vault for long-term storage.32 In 2010, the trust launched a global program to locate, catalog, and preserve the wild relatives of the 22 major food crops humanity relies on for survival. The intensification of genetic-Commons advocacy comes at a time when new IT and computing technology is speeding up genetic research. The new field of bioinformatics has fundamentally altered the nature of biological research just as IT, computing, and Internet technology did in the fields of renewable-energy generation and 3D printing. According to research compiled by the National Human Genome Research Institute, gene-sequencing costs are plummeting at a rate that exceeds the exponential curves of Moore’s Law in computing power.33 Dr.


pages: 287 words: 86,919

Protocol: how control exists after decentralization by Alexander R. Galloway

Ada Lovelace, airport security, Alvin Toffler, Berlin Wall, bioinformatics, Bretton Woods, Charles Babbage, computer age, Computer Lib, Craig Reynolds: boids flock, Dennis Ritchie, digital nomad, discovery of DNA, disinformation, Donald Davies, double helix, Douglas Engelbart, Douglas Engelbart, easy for humans, difficult for computers, Fall of the Berlin Wall, Free Software Foundation, Grace Hopper, Hacker Ethic, Hans Moravec, informal economy, John Conway, John Markoff, John Perry Barlow, Ken Thompson, Kevin Kelly, Kickstarter, late capitalism, Lewis Mumford, linear programming, macro virus, Marshall McLuhan, means of production, Menlo Park, moral panic, mutually assured destruction, Norbert Wiener, old-boy network, OSI model, packet switching, Panopticon Jeremy Bentham, phenotype, post-industrial society, profit motive, QWERTY keyboard, RAND corporation, Ray Kurzweil, Reflections on Trusting Trust, RFC: Request For Comment, Richard Stallman, semantic web, SETI@home, stem cell, Steve Crocker, Steven Levy, Stewart Brand, Ted Nelson, telerobotics, The future is already here, the market place, theory of mind, urban planning, Vannevar Bush, Whole Earth Review, working poor, Yochai Benkler

Isomorphic Biopolitics As a final comment, it is worthwhile to note that the concept of “protocol” is related to a biopolitical production, a production of the possibility for experience in control societies. It is in this sense that Protocol is doubly materialist—in the sense of networked bodies inscribed by informatics, and in the sense of this bio-informatic network producing the conditions of experience. The biopolitical dimension of protocol is one of the parts of this book that opens onto future challenges. As the biological and life sciences become more and more integrated with computer and networking technology, the familiar line between the body and technology, between biologies and machines, begins to undergo a set of transformations.

Individual subjects are not only civil subjects, but also medical subjects for a medicine increasingly influenced by genetic science. The ongoing research and clinical trials in gene therapy, regenerative medicine, and genetic diagnostics reiterate the notion of the biomedical subject as being in some way amenable to a database. In addition to this bio-informatic encapsulation of individual and collective bodies, the transactions and economies between bodies are also being affected. Research into stem cells has ushered in a new era of molecular bodies that not only are self-generating like a reservoir (a new type of tissue banking), but that also create a tissue economy of potential biologies (lab-grown tissues and organs).

If layering is dependent upon portability, then portability is in turn enabled by the existence of ontology standards. These are some of the sites that Protocol opens up concerning the possible relations between information and biological networks. While the concept of biopolitics is often used at its most general level, Protocol asks us to respecify biopolitics in the age of biotechnology and bioinformatics. Thus one site of future engagement is in the zones where info-tech and bio-tech intersect. The “wet” biological body has not simply been superseded by “dry” computer code, just as the wet body no longer accounts for the virtual body. Biotechnologies of all sorts demonstrate this to us—in vivo tissue engineering, ethnic genome projects, gene-finding software, unregulated genetically modified foods, portable DNA diagnostics kits, and distributed proteomic computing.


pages: 361 words: 86,921

The End of Medicine: How Silicon Valley (And Naked Mice) Will Reboot Your Doctor by Andy Kessler

airport security, Andy Kessler, Bear Stearns, bioinformatics, Buckminster Fuller, call centre, Dean Kamen, digital divide, El Camino Real, employer provided health coverage, full employment, George Gilder, global rebalancing, Law of Accelerating Returns, low earth orbit, Metcalfe’s law, moral hazard, Network effects, off-the-grid, pattern recognition, personalized medicine, phenotype, Ray Kurzweil, Richard Feynman, Sand Hill Road, Silicon Valley, stem cell, Steve Jurvetson, vertical integration

These small jets can’t handle a volcanic eruption, can they? I decided to ask a question about something I might understand the answer to. “Don, you keep talking about this bioinformatics thing. Is it just some Oracle database? Is there something special about it?” Don leaned forward. “Nothing is easy in this world. We need a database of all known proteins plus a standardized way to store information from new proteins as they are characterized. The National Cancer Institute has something called caBIG, the cancer Biomedical Informatics Grid. It tries to define things from clinical trial data to genomics—even biospecimens.” “Okay.” “We’ll be part of it.

So we try to standardize simple things, like calibrating machines, like creating a real database of proteins and on and on. “To get biomarkers to be real, you have to have both specificity and sensitivity. Picking up on just one protein and not missing it. But there are millions of proteins—we probably know about 5,000 of them. I’m funding a bioinformatics platform to hold all this protein info. It’s amazing no one has done this yet. I’ve got a bunch of ex-Microsofters writing code.” “Why care about all those proteins? Don’t only a few work?” I asked. “Sure, but which ones? Turns out that if we use two markers—CA-125 and something else—maybe we can get the effectiveness to 0.95.

Index entries (excerpt): … Bialystock, Max; Billion-Dollar Molecule; bioinformatics; biological markers (biomarkers); biology: academic vs. applied, analogue nature of, digital technology for, laws of, molecular; bioluminescence imagers; biopsies; Bio-Rad classification book; biotechnology …


pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić

Albert Einstein, algorithmic bias, backpropagation, bioinformatics, business cycle, business intelligence, business process, butter production in bangladesh, combinatorial explosion, computer vision, conceptual framework, correlation coefficient, correlation does not imply causation, data acquisition, discrete time, El Camino Real, fault tolerance, finite state, Gini coefficient, information retrieval, Internet Archive, inventory management, iterative process, knowledge worker, linked data, loose coupling, Menlo Park, natural language processing, Netflix Prize, NP-complete, PageRank, pattern recognition, peer-to-peer, phenotype, random walk, RFID, semantic web, speech recognition, statistical model, Telecommunications Act of 1996, telemarketer, text mining, traveling salesman, web application

In contrast to the (global) model structure, a temporal pattern is a local model that makes a specific statement about a few data samples in time. Spikes, for example, are patterns in a real-valued time series that may be of interest. Similarly, in symbolic sequences, regular expressions represent well-defined patterns. In bioinformatics, genes are known to appear as local patterns interspersed between chunks of noncoding DNA. Matching and discovery of such patterns are very useful in many applications, not only in bioinformatics. Due to their readily interpretable structure, patterns play a particularly dominant role in data mining. There have been many techniques used to model global or local temporal events. We will introduce only some of the most popular modeling techniques.
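As a small illustration of a regular expression serving as a well-defined local pattern in a symbolic sequence, the sketch below scans a made-up DNA string for a deliberately simplified "gene-like" motif (start codon, a run of non-stop codons, then a stop codon); it is not a realistic gene finder:

```python
# Minimal sketch: a regular expression as a local pattern in a symbolic sequence.
import re

sequence = "CCGTAATGGCTGAAACCTAAGGTTATGCCGCGATAGTT"  # invented example string

# ATG, then one or more codons that are not stop codons, then a stop codon.
open_reading_frame = re.compile(r"ATG(?:(?!TAA|TAG|TGA)[ACGT]{3})+(?:TAA|TAG|TGA)")

for match in open_reading_frame.finditer(sequence):
    print(match.start(), match.group())
# 5 ATGGCTGAAACCTAA
# 24 ATGCCGCGATAG
```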

It paints a picture of the state-of-the-art techniques that can boost the capabilities of many existing data-mining tools and gives the novel developments of feature selection that have emerged in recent years, including causal feature selection and Relief. The book contains real-world case studies from a variety of areas, including text classification, web mining, and bioinformatics. Saul, L. K., et al., Spectral Methods for Dimensionality Reduction, in Semisupervised Learning, B. Schölkopf, O. Chapelle and A. Zien eds., MIT Press, Cambridge, MA, 2005. Spectral methods have recently emerged as a powerful tool for nonlinear dimensionality reduction and manifold learning.

The question is when to use the linear kernel as a first choice. If the number of features is large, one may not need to map the data to a higher-dimensional space; experiments have shown that in this case a nonlinear mapping does not improve SVM performance. Using the linear kernel is good enough, and C is the only tuning parameter. Many microarray data sets in bioinformatics and collections of electronic documents for classification are examples of this data set type. When the number of features is smaller and the number of samples larger, an SVM can successfully map the data to higher-dimensional spaces using nonlinear kernels. One of the methods for finding optimal parameter values for an SVM is a grid search.
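A minimal sketch of that heuristic on synthetic data rather than a real microarray set (assuming scikit-learn; the data shape and any resulting scores are illustrative only):

```python
# Sketch: when features vastly outnumber samples (as in microarray or text data),
# a linear kernel with C as the only tuning parameter is often competitive with RBF.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# 60 samples, 2000 features: a "wide" data set similar in shape to microarray data.
X, y = make_classification(n_samples=60, n_features=2000, n_informative=40,
                           random_state=0)

linear = SVC(kernel="linear", C=1.0)
rbf = SVC(kernel="rbf", C=1.0, gamma="scale")

print("linear:", cross_val_score(linear, X, y, cv=5).mean())
print("rbf:   ", cross_val_score(rbf, X, y, cv=5).mean())
```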


pages: 256 words: 67,563

Explaining Humans: What Science Can Teach Us About Life, Love and Relationships by Camilla Pang

autism spectrum disorder, backpropagation, bioinformatics, Brownian motion, correlation does not imply causation, data science, deep learning, driverless car, frictionless, job automation, John Nash: game theory, John von Neumann, Kickstarter, Nash equilibrium, neurotypical, phenotype, random walk, self-driving car, stem cell, Stephen Hawking

How to learn from your mistakes Deep learning, feedback loops and human memory 11. How to be polite Game theory, complex systems and etiquette Afterword Acknowledgements Index About the Author Dr Camilla Pang holds a PhD in Biochemistry from University College London and is a Postdoctoral Scientist specialising in Translational Bioinformatics. At the age of eight, Camilla was diagnosed with Autistic Spectrum Disorder (ASD), and with ADHD at the age of 26. Her career and studies have been heavily influenced by her diagnosis and she is driven by her passion for understanding humans, our behaviours and how we work. To my mother Sonia, father Peter and sister Lydia Introduction It was five years into my life on Earth that I started to think I’d landed in the wrong place.

If the rules are (mostly) unwritten, and no one can agree who sets them, then what can we do to avoid the nightmare scenario of a major etiquette breach? Being someone who is rather fond of a rulebook, I decided that the only way was to write my own. If no one would tell me what the laws of etiquette were, I would have to work them out for myself. In doing so, relying on techniques from computer modelling, game theory and my own field of bioinformatics, I have learned that a rulebook is perhaps the wrong way to think about etiquette. Because the rules are one thing, and they do exist, but they are not the only variable. It’s also about how they are tweaked, interpreted and applied in discrete situations. Individual behaviour is as important as collective habits, and the two influence each other in an unfolding symbiosis that you can never fully predict.

What makes it OK for my sister to mock my Frida Kahlo-esque unibrow, but not (I can promise you) for me to point out that her painted-on brows are reminiscent of Super Mario? We need a method for matching behaviour to context and filling in the gaps between our knowledge and ignorance of new situations. That is where homology, which we use to model the similarities between proteins, comes into its own. Homology is a core technique of bioinformatics, my field of study, where it is used to fill in the gaps in data sets we are still exploring, inferring from related cases. There will always be some missing data, but we can overcome this by using what we know about equivalent situations to inform what we don’t about this one. For instance, if you are trying to develop a new drug treatment for a particular form of cancer, and you have found a suitable protein to target, what you need to establish is its structure – the thing you will bind your treatment on to.
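A toy sketch of that gap-filling idea, using nothing but Python's standard library; the sequences and functional labels are invented, and real homology searches use proper alignment tools rather than simple string similarity:

```python
# Toy version of the homology idea: when something about a new case is unknown,
# borrow it from the most similar case we do know about.
from difflib import SequenceMatcher

known = {
    "MKTAYIAKQR": "DNA-binding",        # invented sequences and annotations
    "GLSDGEWQLV": "oxygen transport",
    "MKTAYIVKQR": "DNA-binding",
}

def infer_function(new_sequence: str) -> str:
    # Pick the known sequence most similar to the new one and reuse its annotation.
    best = max(known, key=lambda seq: SequenceMatcher(None, seq, new_sequence).ratio())
    return known[best]

print(infer_function("MKTAYIAKQK"))   # closest to the DNA-binding examples
```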


pages: 381 words: 78,467

100 Plus: How the Coming Age of Longevity Will Change Everything, From Careers and Relationships to Family And by Sonia Arrison

23andMe, 8-hour work day, Abraham Maslow, Albert Einstein, Anne Wojcicki, artificial general intelligence, attribution theory, Bill Joy: nanobots, bioinformatics, caloric restriction, caloric restriction, Clayton Christensen, dark matter, disruptive innovation, East Village, en.wikipedia.org, epigenetics, Frank Gehry, Googley, income per capita, indoor plumbing, Jeff Bezos, Johann Wolfgang von Goethe, Kickstarter, Larry Ellison, Law of Accelerating Returns, life extension, Nick Bostrom, personalized medicine, Peter Thiel, placebo effect, post scarcity, precautionary principle, radical life extension, Ray Kurzweil, rolodex, Silicon Valley, Silicon Valley billionaire, Simon Kuznets, Singularitarianism, smart grid, speech recognition, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, Steven Levy, sugar pill, synthetic biology, Thomas Malthus, upwardly mobile, World Values Survey, X Prize

SU’s mission is practical: “to assemble, educate and inspire leaders who strive to understand and facilitate the development of exponentially advancing technologies in order to address humanity’s grand challenges.”20 The academic tracks are geared toward understanding how fast-moving technologies can work together, and more than half of them have a direct impact on the field of longevity research. These tracks include AI and robotics; nanotechnology, networks, and computing systems; biotechnology and bioinformatics; medicine and neuroscience; and futures studies and forecasting.21 SU is a place where mavens speak to those who are superfocused on changing the world for the better. It is no surprise, then, that it also functions as an institutional “connector”—the third component needed to successfully spread a game-changing meme.

If the source code of humans can be identified, then it is not that much of a leap to think about re-engineering it. Suddenly, biology became a field that computer geeks could attempt to tackle, which not only resulted in smart biohackers forming do-it-yourself biology clubs, but also increased the pace of advances in biology. Bioinformatics is moving at the speed of Moore’s Law and sometimes faster. To the extent that wealthy technology moguls influence public opinion and hackers seem cool, the context for the longevity meme is sizzling hot. In a Wired magazine interview in April 2010, Bill Gates, America’s richest man, told reporter Steven Levy that if he were a teenager today, “he’d be hacking biology.”57 Gates elaborated, saying, “Creating artificial life with DNA synthesis, that’s sort of the equivalent of machine-language programming.”

Policy makers, activists, journalists, educators, investors, philanthropists, analysts, entrepreneurs, and a whole host of others need to come together to fight for their lives. We now know that aging is plastic and that humanity’s time horizons are not set in stone. Larry Ellison, Bill Gates, Peter Thiel, Jeff Bezos, Larry Page, Sergey Brin, and Paul Allen have all recognized the wealth of opportunity in the bioinformatics revolution, but this is not enough. Other heroes must come forward—perhaps there is even one reading this sentence right now. The goal is more healthy time, which, as we have seen throughout this book, will lead to greater wealth and prospects for happiness. A longer health span means more time to enjoy the wonders of life, including relationships with family and friends, career building, knowledge seeking, adventure, and exploration.


pages: 350 words: 96,803

Our Posthuman Future: Consequences of the Biotechnology Revolution by Francis Fukuyama

Albert Einstein, Asilomar, assortative mating, Berlin Wall, bioinformatics, caloric restriction, caloric restriction, classic study, Columbine, cotton gin, demographic transition, digital divide, Fall of the Berlin Wall, Flynn Effect, Francis Fukuyama: the end of history, impulse control, life extension, Menlo Park, meta-analysis, out of africa, Peter Singer: altruism, phenotype, precautionary principle, presumed consent, Ray Kurzweil, Recombinant DNA, Scientific racism, selective serotonin reuptake inhibitor (SSRI), sexual politics, stem cell, Steven Pinker, Stuart Kauffman, The Bell Curve by Richard Herrnstein and Charles Murray, Turing test, twin studies

The Human Genome Project would not have been possible without parallel advances in the information technology required to record, catalog, search, and analyze the billions of bases making up human DNA. The merger of biology and information technology has led to the emergence of a new field, known as bioinformatics.3 What will be possible in the future will depend heavily on the ability of computers to interpret the mind-boggling amounts of data generated by genomics and proteomics and to build reliable models of phenomena such as protein folding. The simple identification of genes in the genome does not mean that anyone knows what it is they do.

Norton, 1994); Kathryn Brown, “The Human Genome Business Today,” Scientific American 283 (July 2000): 50–55; and Kevin Davies, Cracking the Genome: Inside the Race to Unlock Human DNA (New York: Free Press, 2001). 2 Carol Ezzell, “Beyond the Human Genome,” Scientific American 283, no. 1 (July 2000): 64–69. 3 Ken Howard, “The Bioinformatics Gold Rush,” Scientific American 283, no. 1 (July 2000): 58–63. 4 Interview with Stuart A. Kauffman, “Forget In Vitro—Now It’s ‘In Silico,’” Scientific American 283, no. 1 (July 2000): 62–63. 5 Gina Kolata, “Genetic Defects Detected in Embryos Just Days Old,” The New York Times, September 24, 1992, p.

Coppin. The Politics of Purity: Harvey Washington Wiley and the Origins of Federal Food Policy. Ann Arbor, Mich.: University of Michigan Press, 1999. Hirschi, Travis, and Michael Gottfredson. A General Theory of Crime. Stanford, Calif.: Stanford University Press, 1990. Howard, Ken. “The Bioinformatics Gold Rush.” Scientific American 283, no. 1 (July 2000): 58–63. Hrdy, Sarah B., and Glenn Hausfater. Infanticide: Comparative and Evolutionary Perspectives. New York: Aldine Publishing, 1984. Hubbard, Ruth. The Politics of Women’s Biology. New Brunswick, N.J.: Rutgers University Press, 1990.


Beautiful Data: The Stories Behind Elegant Data Solutions by Toby Segaran, Jeff Hammerbacher

23andMe, airport security, Amazon Mechanical Turk, bioinformatics, Black Swan, business intelligence, card file, cloud computing, computer vision, correlation coefficient, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, DARPA: Urban Challenge, data acquisition, data science, database schema, double helix, en.wikipedia.org, epigenetics, fault tolerance, Firefox, Gregor Mendel, Hans Rosling, housing crisis, information retrieval, lake wobegon effect, Large Hadron Collider, longitudinal study, machine readable, machine translation, Mars Rover, natural language processing, openstreetmap, Paradox of Choice, power law, prediction markets, profit motive, semantic web, sentiment analysis, Simon Singh, social bookmarking, social graph, SPARQL, sparse data, speech recognition, statistical model, supply-chain management, systematic bias, TED Talk, text mining, the long tail, Vernor Vinge, web application

In the financial services domain, large data stores of past market activity are built to serve as the proving ground for complex new models developed by the Data Scientists of their domain, known as Quants. Outside of industry, I’ve found that grad students in many scientific domains are playing the role of the Data Scientist. One of our hires for the Facebook Data team came from a bioinformatics lab where he was building data pipelines and performing offline data analysis of a similar kind. The well-known Large Hadron Collider at CERN generates reams of data that are collected and pored over by graduate students looking for breakthroughs. Recent books such as Davenport and Harris’s Competing on Analytics (Harvard Business School Press, 2007), Baker’s The Numerati (Houghton Mifflin Harcourt, 2008), and Ayres’s Super Crunchers (Bantam, 2008) have emphasized the critical role of the Data Scientist across industries in enabling an organization to improve over time based on the information it collects.

DNA As a Data Source To a programming language, DNA is simply a string: char(3*10^9) human_genome; The full genomic information for man consists of 3 billion characters and is easily handled in memory by even the most inefficient home-brewed language. However, the process of determining the exact order of these 3 billion bases requires a significant effort spanning chemistry, bioinformatics, laboratory procedures, and a lot of spinning disks. The Human Genome Project aimed, for the first time, to sequence every one of these characters. A number of large, high-throughput institutes from around the world put academic competition aside and set about a task that would last 13 years and consume billions of dollars.

Although unsuitable for analysis, this data is useful should any run require a manual review to identify imaging problems or artifacts (oil, poor DNA clustering, and even fingerprints aren’t uncommon). Once the sequencing data is available, it is stored in two formats in a high-performance Oracle database. While production systems make good use of databases, bioinformatics tools tend to continue to work against flat files on a physical filesystem. To be sure that we cater to all tastes, the vast swaths of sequence information available in this sequence archive are also presented to Sanger’s internal compute farms via a Fuse user-space filesystem. This approach scales surprisingly well.


pages: 285 words: 78,180

Life at the Speed of Light: From the Double Helix to the Dawn of Digital Life by J. Craig Venter

Albert Einstein, Alfred Russel Wallace, Apollo 11, Asilomar, Barry Marshall: ulcers, bioinformatics, borderless world, Brownian motion, clean water, Computing Machinery and Intelligence, discovery of DNA, double helix, dual-use technology, epigenetics, experimental subject, global pandemic, Gregor Mendel, Helicobacter pylori, Isaac Newton, Islamic Golden Age, John von Neumann, Louis Pasteur, Mars Rover, Mikhail Gorbachev, phenotype, precautionary principle, Recombinant DNA, Richard Feynman, stem cell, Stuart Kauffman, synthetic biology, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Turing machine

Within a computer it would be possible to explore the functions of proteins, protein–protein interactions, protein–DNA interactions, regulation of gene expression, and other features of cellular metabolism. In other words, a virtual cell could provide a new perspective on both the software and hardware of life. In the spring of 1996 Tomita and his students at the Laboratory for Bioinformatics at Keio started investigating the molecular biology of Mycoplasma genitalium (which we had sequenced in 1995) and by the end of that year had established the E-Cell Project. The Japanese team had constructed a model of a hypothetical cell with only 127 genes, which were sufficient for transcription, translation, and energy production.

Currently Novartis and other vaccine companies rely on the World Health Organization to identify and distribute the seed viruses. To speed up the process we are using a method called “reverse vaccinology,” which was first applied to the development of a meningococcal vaccine by Rino Rappuoli, now at Novartis. The basic idea is that the entire pathogenic genome of an influenza virus can be screened using bioinformatic approaches to identify and analyze its genes. Next, particular genes are selected for attributes that would make good vaccine targets, such as outer-membrane proteins. Those proteins then undergo normal testing for immune responses. My team has sequenced genes representing the diversity of influenza viruses that have been encountered since 2005.

“Natural selection as the process of accumulating genetic information in adaptive evolution.” Genetical Research 2 (1961): pp. 127–40. 7. Sydney Brenner. “Life’s code script.” Nature 482 (February 23, 2012): p. 461. 8. W. J. Kress and D. L. Erickson. “DNA barcodes: Genes, genomics, and bioinformatics.” Proceedings of the National Academy of Sciences 105, no. 8 (2008): pp. 2761–62. 9. Lulu Qian and Erik Winfree. “Scaling up digital circuit computation with DNA strand displacement cascades.” Science 332, no. 6034 (June 3, 2011): pp. 1196–201. 10. George M. Church, Yuan Gao, and Sriram Kosuri.


pages: 405 words: 117,219

In Our Own Image: Savior or Destroyer? The History and Future of Artificial Intelligence by George Zarkadakis

3D printing, Ada Lovelace, agricultural Revolution, Airbnb, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, animal electricity, anthropic principle, Asperger Syndrome, autonomous vehicles, barriers to entry, battle of ideas, Berlin Wall, bioinformatics, Bletchley Park, British Empire, business process, carbon-based life, cellular automata, Charles Babbage, Claude Shannon: information theory, combinatorial explosion, complexity theory, Computing Machinery and Intelligence, continuous integration, Conway's Game of Life, cosmological principle, dark matter, data science, deep learning, DeepMind, dematerialisation, double helix, Douglas Hofstadter, driverless car, Edward Snowden, epigenetics, Flash crash, Google Glasses, Gödel, Escher, Bach, Hans Moravec, income inequality, index card, industrial robot, intentional community, Internet of things, invention of agriculture, invention of the steam engine, invisible hand, Isaac Newton, Jacquard loom, Jacques de Vaucanson, James Watt: steam engine, job automation, John von Neumann, Joseph-Marie Jacquard, Kickstarter, liberal capitalism, lifelogging, machine translation, millennium bug, mirror neurons, Moravec's paradox, natural language processing, Nick Bostrom, Norbert Wiener, off grid, On the Economy of Machinery and Manufactures, packet switching, pattern recognition, Paul Erdős, Plato's cave, post-industrial society, power law, precautionary principle, prediction markets, Ray Kurzweil, Recombinant DNA, Rodney Brooks, Second Machine Age, self-driving car, seminal paper, Silicon Valley, social intelligence, speech recognition, stem cell, Stephen Hawking, Steven Pinker, Strategic Defense Initiative, strong AI, Stuart Kauffman, synthetic biology, systems thinking, technological singularity, The Coming Technological Singularity, The Future of Employment, the scientific method, theory of mind, Turing complete, Turing machine, Turing test, Tyler Cowen, Tyler Cowen: Great Stagnation, Vernor Vinge, Von Neumann architecture, Watson beat the top human players on Jeopardy!, Y2K

This dualistic software–hardware paradigm is applied across many fields, including life itself. Cells are the ‘computers’ that run a ‘program’ called the genetic code, or genome. The ‘code’ is written on the DNA. Cutting-edge research in biology does not take place in vitro in a wet lab, but in silico in a computer. Bioinformatics – the accumulation, tagging, storing, manipulation and mining of digital biological data – is the present, and future, of biology research. The computer metaphor for life is reinforced by its apparently successful application to real problems. Many disruptive new technologies in molecular biology – for instance ‘DNA printing’ – function on the basis of digital information.

Norbert Wiener’s cybernetic dream is slowly becoming a reality: the more information we have about systems, the more control we can exercise over them with the help of our computers. Big data are our newfound economic bounty. The big data economy In 2010, I took a contract as External Relations Officer at the European Bioinformatics Institute (EBI) at Hinxton, Cambridge. The Institute is part of the intergovernmental European Molecular Biology Laboratory, and its core mission is to provide an infrastructure for the storage and manipulation of biological data. This is the data that researchers in the life sciences produce every day, including information about the genes of humans and of other species, chemical molecules that might provide the basis for new therapies, proteins, and also about research findings in general.

As someone who facilitated communications between the Institute and potential government funders across Europe, I had first-hand experience of the importance that governments placed on biological data. Almost everyone understood the potential for driving innovation through this data, and was ready to support the expansion of Europe’s bioinformatics infrastructure, even as Europe was going through the Great Recession. The message was simple and clear: whoever owned the data owned the future. Governments and scientists are not the only ones to have jumped on the bandwagon of big data. The advent of social media and Google Search has transformed the marketing operations of almost every business in the world, big and small.


Pearls of Functional Algorithm Design by Richard Bird

bioinformatics, data science, functional programming, Kickstarter, Menlo Park, sorting algorithm

Final remarks The origins of the maximum segment sum problem go back to about 1975, and its history is described in one of Bentley’s (1987) programming pearls. For a derivation using invariant assertions, see Gries (1990); for an algebraic approach, see Bird (1989). The problem refuses to go away, and variations are still an active topic for algorithm designers because of potential applications in data-mining and bioinformatics; see Mu (2008) for recent results. The interest in the non-segment problem is what it tells us about any maximum marking problem in which the marking criterion can be formulated as a regular expression. For instance, it is immediate that there is an O(nk) algorithm for computing the maximum at-least-length-k segment problem because F∗TⁿF∗ (n ≥ k) can be recognised by a k-state automaton.
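For readers who want the problem itself pinned down, here is a Python sketch of the standard linear-time maximum segment sum algorithm (often attributed to Kadane); it is only a reference point for the problem, not the functional derivation the book develops:

```python
# Linear-time maximum segment sum; the empty segment is allowed, so the answer is >= 0.
def max_segment_sum(xs):
    best = current = 0
    for x in xs:
        current = max(0, current + x)   # best sum of a segment ending here
        best = max(best, current)       # best sum seen anywhere so far
    return best

print(max_segment_sum([31, -41, 59, 26, -53, 58, 97, -93, -23, 84]))  # 187
```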

The function sorttails is needed as a preliminary step in the Burrows–Wheeler algorithm for data compression, a problem we will take up in the following pearl. The problem of sorting the suffixes of a string has been treated extensively in the literature because it has other applications in string matching and bioinformatics; a good source is Gusfield (1997). This pearl was rewritten a number of times. Initially we started out with the idea of computing perm, a permutation that sorts a list. But perm is too specific in the way it treats duplicates: there is more than one permutation that sorts a list containing duplicate elements.
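As a rough sketch of the two ideas mentioned here, the following naive Python code sorts the suffixes of a string and builds the Burrows–Wheeler transform from sorted rotations; real implementations, and the pearls themselves, use far more efficient constructions:

```python
# Naive suffix sorting and Burrows-Wheeler transform (illustration only).
def sorted_suffixes(s):
    return sorted(s[i:] for i in range(len(s)))

def burrows_wheeler(s, end="$"):
    t = s + end                                   # sentinel marks the end of the string
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(rot[-1] for rot in rotations)  # last column of the sorted rotations

print(sorted_suffixes("banana"))   # ['a', 'ana', 'anana', 'banana', 'na', 'nana']
print(burrows_wheeler("banana"))   # 'annb$aa'
```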

Index entries (excerpt): … binary search, 7, 10, 14, 15, 19, 54; binomial trees, 178; bioinformatics, 77, 90; Boolean satisfiability, 155; borders of a list, 103; bottom-up algorithm, 41; boustrophedon product, 245, 251, 260; breadth-first search, 136, 137, 178; Bulldozer algorithm, 196; bzip2, 101; data compression, 91, 198; data mining, 77; maximum segment sum, 73 …


Dinosaurs Rediscovered by Michael J. Benton

All science is either physics or stamp collecting, Bayesian statistics, biofilm, bioinformatics, classic study, David Attenborough, Ernest Rutherford, Ford Model T, germ theory of disease, Isaac Newton, lateral thinking, North Sea oil, nuclear winter, ocean acidification

These fights might seem inconsequential, but we are considering the fundamentals of how to document the wonders of biodiversity, and we are also addressing origins. Documenting biodiversity and origins is big science now – indeed, it forms part of the modern techniques termed, rather forbiddingly, phylogenomics and bioinformatics. Phylogenomics is the new discipline of establishing evolutionary trees from molecular data. Bioinformatics is the field of managing large data sets in the life sciences and number-crunching those data to produce information on the genetic basis of disease, adaptations, and cell function, and has applications fundamental to medicine and agriculture.

Index entries (excerpt): Bayesian statistical methods 273, 275; biodiversity, documenting 52; bioinformatics 52.


ucd-csi-2011-02 by Unknown

bioinformatics, en.wikipedia.org, pattern recognition, The Wisdom of Crowds

Thus the work is more directed at the problem of Wikipedia vandalism than the issue of authoritativeness that is the subject of this paper.

3 Extracting and Comparing Network Motif Profiles

The idea of characterizing networks in terms of network motif profiles is well established and has had a considerable impact in bioinformatics [10]. Our objective is to characterize Wikipedia pages in terms of network motif profiles and then examine whether or not different pages have characteristic network motif profiles. The datasets we considered were entries in the English language Wikipedia on famous sociologists and footballers in the English Premiership (see Table 1).
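
As a rough illustration of the motif-profile idea described above (a generic sketch, not the paper's own pipeline), the snippet below computes a normalized triad census for a small directed link graph with networkx; the page names and links are invented for the example.

```python
# A minimal sketch of a network motif profile: the 16-type triad census
# of a directed graph, normalized so that profiles of different pages can
# be compared. Page names and links here are invented placeholders.
import networkx as nx

def triad_profile(edges):
    """Return the normalized triad census of a directed link graph."""
    g = nx.DiGraph(edges)
    census = nx.triadic_census(g)          # counts of the 16 triad types
    total = sum(census.values()) or 1
    return {triad: count / total for triad, count in census.items()}

# Hypothetical link neighbourhoods of two Wikipedia-style pages.
page_a = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
page_b = [("X", "Y"), ("X", "Z"), ("X", "W"), ("Y", "X")]

for name, edges in [("page_a", page_a), ("page_b", page_b)]:
    profile = triad_profile(edges)
    # Show only the triad types that actually occur in this graph.
    present = {t: round(p, 3) for t, p in profile.items() if p > 0}
    print(name, present)
```

Comparing the two normalized censuses is one simple way to ask whether different classes of pages have characteristic motif profiles.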


pages: 588 words: 131,025

The Patient Will See You Now: The Future of Medicine Is in Your Hands by Eric Topol

23andMe, 3D printing, Affordable Care Act / Obamacare, Anne Wojcicki, Atul Gawande, augmented reality, Big Tech, bioinformatics, call centre, Clayton Christensen, clean water, cloud computing, commoditize, computer vision, conceptual framework, connected car, correlation does not imply causation, creative destruction, crowdsourcing, dark matter, data acquisition, data science, deep learning, digital divide, disintermediation, disruptive innovation, don't be evil, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Erik Brynjolfsson, Firefox, gamification, global village, Google Glasses, Google X / Alphabet X, Ignaz Semmelweis: hand washing, information asymmetry, interchangeable parts, Internet of things, Isaac Newton, it's over 9,000, job automation, Julian Assange, Kevin Kelly, license plate recognition, lifelogging, Lyft, Mark Zuckerberg, Marshall McLuhan, meta-analysis, microbiome, Nate Silver, natural language processing, Network effects, Nicholas Carr, obamacare, pattern recognition, personalized medicine, phenotype, placebo effect, quantum cryptography, RAND corporation, randomized controlled trial, Salesforce, Second Machine Age, self-driving car, Silicon Valley, Skype, smart cities, Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia, Snapchat, social graph, speech recognition, stealth mode startup, Steve Jobs, synthetic biology, the scientific method, The Signal and the Noise by Nate Silver, The Wealth of Nations by Adam Smith, traumatic brain injury, Turing test, Uber for X, uber lyft, Watson beat the top human players on Jeopardy!, WikiLeaks, X Prize

Indeed, the state of California, which has the largest prenatal screening program in the world, with more than four hundred thousand expectant mothers assessed annually, already provides these tests to all pregnant women who have increased risk.26 Of course, we could also sequence the fetus’s entire genome instead of just doing the simpler screens. While that is not a commercially available test, and there are substantial bioinformatic challenges that lie ahead before it could be scalable, the anticipatory bioethical issues that this engenders are considerable.27 We are a long way off for determining what would constitute acceptable genomic criteria for early termination of pregnancy, since this not only relies on accurately determining a key genomic variant linked to a serious illness, but also understanding whether this condition would actually manifest.

Now it is possible to use sequencing to unravel the molecular diagnosis of an unknown condition, and the chances for success are enhanced when there is DNA from the mother and father, or other relatives, to use for anchoring and comparative sequencing analysis. At several centers around the country, the success rate for making the diagnosis ranges between 25 percent and 50 percent. It requires considerable genome bioinformatic expertise, for a trio of individuals will generate around 750 billion data points (six billion letters per sequence, three people, each done forty times to assure accuracy). Of course, just making the diagnosis is not the same as coming up with an effective treatment or a cure. But there have been some striking anecdotal examples of children whose lives were saved or had dramatic improvement.
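
As a quick sanity check on the figure quoted above (a back-of-the-envelope sketch assuming exactly six billion letters per genome, three people, and forty-fold coverage, as stated in the passage):

```python
# Back-of-the-envelope check of the data volume quoted for trio sequencing:
# ~6 billion bases per genome, 3 individuals, ~40x coverage for accuracy.
bases_per_genome = 6_000_000_000
people = 3
coverage = 40

total_bases = bases_per_genome * people * coverage
print(f"{total_bases:.2e} sequenced bases")  # ~7.2e11, i.e. on the order of
                                             # the ~750 billion data points cited
```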

The most far-reaching component of the molecular stethoscope appears to be cell-free RNA, which can potentially be used to monitor any organ of the body.82 Previously that was unthinkable in a healthy person. How could one possibly conceive of doing a brain or liver biopsy in someone as part of a normal checkup? Using high-throughput sequencing of cell-free RNA in the blood, and sophisticated bioinformatic methods to analyze this data, Stephen Quake and his colleagues at Stanford were able to show it is possible to follow the gene expression from each of the body’s organs from a simple blood sample. And that is changing all the time in each of us. This is an ideal case for deep learning to determine what these dynamic genomic signatures mean, to determine what can be done to change the natural history of a disease in the making, and to develop the path for prevention.


pages: 834 words: 180,700

The Architecture of Open Source Applications by Amy Brown, Greg Wilson

8-hour work day, anti-pattern, bioinformatics, business logic, c2.com, cloud computing, cognitive load, collaborative editing, combinatorial explosion, computer vision, continuous integration, Conway's law, create, read, update, delete, David Heinemeier Hansson, Debian, domain-specific language, Donald Knuth, en.wikipedia.org, fault tolerance, finite state, Firefox, Free Software Foundation, friendly fire, functional programming, Guido van Rossum, Ken Thompson, linked data, load shedding, locality of reference, loose coupling, Mars Rover, MITM: man-in-the-middle, MVC pattern, One Laptop per Child (OLPC), peer-to-peer, Perl 6, premature optimization, recommendation engine, revision control, Ruby on Rails, side project, Skype, slashdot, social web, speech recognition, the scientific method, The Wisdom of Crowds, web application, WebSocket

Amy Brown (editorial): Amy has a bachelor's degree in Mathematics from the University of Waterloo, and worked in the software industry for ten years. She now writes and edits books, sometimes about software. She lives in Toronto and has two children and a very old cat. C. Titus Brown (Continuous Integration): Titus has worked in evolutionary modeling, physical meteorology, developmental biology, genomics, and bioinformatics. He is now an Assistant Professor at Michigan State University, where he has expanded his interests into several new areas, including reproducibility and maintainability of scientific software. He is also a member of the Python Software Foundation, and blogs at http://ivory.idyll.org. Roy Bryant (Snowflock): In 20 years as a software architect and CTO, Roy designed systems including Electronics Workbench (now National Instruments' Multisim) and the Linkwalker Data Pipeline, which won Microsoft's worldwide Winning Customer Award for High-Performance Computing in 2006.

Rosangela Canino-Koning (Continuous Integration): After 13 years of slogging in the software industry trenches, Rosangela returned to university to pursue a Ph.D. in Computer Science and Evolutionary Biology at Michigan State University. In her copious spare time, she likes to read, hike, travel, and hack on open source bioinformatics software. She blogs at http://www.voidptr.net. Francesco Cesarini (Riak): Francesco Cesarini has used Erlang on a daily basis since 1995, having worked in various turnkey projects at Ericsson, including the OTP R1 release. He is the founder of Erlang Solutions and co-author of O'Reilly's Erlang Programming.

Returning to distributed systems and HDFS, Rob found many familiar problems, but all of the numbers had two or three more zeros. James Crook (Audacity): James is a contract software developer based in Dublin, Ireland. Currently he is working on tools for electronics design, though in a previous life he developed bioinformatics software. He has many audacious plans for Audacity, and he hopes some, at least, will see the light of day. Chris Davis (Graphite): Chris is a software consultant and Google engineer who has been designing and building scalable monitoring and automation tools for over 12 years. Chris originally wrote Graphite in 2006 and has led the open source project ever since.


The New Harvest: Agricultural Innovation in Africa by Calestous Juma

agricultural Revolution, Albert Einstein, barriers to entry, bioinformatics, business climate, carbon footprint, clean water, colonial rule, conceptual framework, creative destruction, CRISPR, double helix, electricity market, energy security, energy transition, export processing zone, global value chain, high-speed rail, impact investing, income per capita, industrial cluster, informal economy, Intergovernmental Panel on Climate Change (IPCC), Joseph Schumpeter, knowledge economy, land tenure, M-Pesa, microcredit, mobile money, non-tariff barriers, off grid, out of africa, precautionary principle, precision agriculture, Recombinant DNA, rolling blackouts, search costs, Second Machine Age, self-driving car, Silicon Valley, sovereign wealth fund, structural adjustment programs, supply-chain management, synthetic biology, systems thinking, total factor productivity, undersea cable

These include rice, corn, mosquito, chicken, cattle, and dozens of plant, animal, and human pathogens. The challenge facing Africa is building capacity in bioinformatics to understand the location and functions of genes. It is through the annotation of genomes that scientists can understand the role of genes and their potential contributions to agriculture, medicine, environmental management, and other fields. Bioinformatics could do for Africa what computer software did for India. The field would also give African science a new purpose and help to integrate the region into the global knowledge ecology.

Index entries (excerpt): biodiversity, 73, 77–78, 255–56, 259; bioinformatics, 82; biopolymers, 39, 56–58; biosafety, 79–80, 82.


pages: 571 words: 105,054

Advances in Financial Machine Learning by Marcos Lopez de Prado

algorithmic trading, Amazon Web Services, asset allocation, backtesting, behavioural economics, bioinformatics, Brownian motion, business process, Claude Shannon: information theory, cloud computing, complexity theory, correlation coefficient, correlation does not imply causation, data science, diversification, diversified portfolio, en.wikipedia.org, financial engineering, fixed income, Flash crash, G4S, Higgs boson, implied volatility, information asymmetry, latency arbitrage, margin call, market fragmentation, market microstructure, martingale, NP-complete, P = NP, p-value, paper trading, pattern recognition, performance metric, profit maximization, quantitative trading / quantitative finance, RAND corporation, random walk, risk free rate, risk-adjusted returns, risk/return, selection bias, Sharpe ratio, short selling, Silicon Valley, smart cities, smart meter, statistical arbitrage, statistical model, stochastic process, survivorship bias, transaction costs, traveling salesman

Geurts (2013): “Understanding variable importances in forests of randomized trees.” Proceedings of the 26th International Conference on Neural Information Processing Systems, pp. 431–439.
Strobl, C., A. Boulesteix, A. Zeileis, and T. Hothorn (2007): “Bias in random forest variable importance measures: Illustrations, sources and a solution.” BMC Bioinformatics, Vol. 8, No. 25, pp. 1–11.
White, A. and W. Liu (1994): “Technical note: Bias in information-based measures in decision tree induction.” Machine Learning, Vol. 15, No. 3, pp. 321–329.

Note 1 http://blog.datadive.net/selecting-good-features-part-iii-random-forests/.

CHAPTER 9 Hyper-Parameter Tuning with Cross-Validation

9.1 Motivation

Hyper-parameter tuning is an essential step in fitting an ML algorithm.
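
For readers who want to see the mechanics, here is a minimal, generic sketch of hyper-parameter tuning with k-fold cross-validation in scikit-learn on synthetic data; it is not the purged and embargoed cross-validation the book itself develops for overlapping financial labels, and the parameter grid and scoring choice are illustrative assumptions.

```python
# Generic illustration of hyper-parameter tuning with cross-validation
# (plain k-fold on synthetic data; not the purged/embargoed schemes the
# book recommends for overlapping financial labels).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=KFold(n_splits=5, shuffle=False),   # no shuffling, as a stand-in for ordered data
    scoring="neg_log_loss",
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```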

Beyond the basic library for organizing user data into files, the HDF Group also provides a suite of tools and specialization of HDF5 for different applications. For example, HDF5 includes a performance profiling tool. NASA has a specialization of HDF5, named HDF5-EOS, for data from their Earth-Observing System (EOS); and the next-generation DNA sequence community has produced a specialization named BioHDF for their bioinformatics data. HDF5 provides an efficient way for accessing the storage systems on HPC platform. In tests, we have demonstrated that using HDF5 to store stock markets data significantly speeds up the analysis operations. This is largely due to its efficient compression/decompression algorithms that minimize network traffic and I/O operations, which brings us to our next point. 22.5.3 In Situ Processing Over the last few decades, CPU performance has roughly doubled every 18 months (Moore's law), while disk performance has been increasing less than 5% a year.
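
To make the storage point concrete, below is a minimal h5py sketch that writes an array of tick-like values to an HDF5 dataset with gzip compression and reads back a slice; the file name, dataset name, and data are invented for the example.

```python
# Minimal sketch of storing and slicing an array in HDF5 with compression,
# using the h5py library. File name, dataset name, and data are invented.
import numpy as np
import h5py

prices = np.random.default_rng(0).normal(100.0, 1.0, size=(1_000_000, 4))

with h5py.File("ticks.h5", "w") as f:
    f.create_dataset("prices", data=prices, compression="gzip", chunks=True)

with h5py.File("ticks.h5", "r") as f:
    first_rows = f["prices"][:5]        # only the requested slice is read
    print(f["prices"].shape, first_rows.shape)
```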

In economics, the same data-driven research activities have led to the wildly popular behavioral economics (Camerer and Loewenstein [2011]). Much of the recent advances in data-driven research are based on machine learning applications (Qiu et al. [2016], Rudin and Wagstaff [2014]). Their successes in a wide variety of fields, such as planetary science and bioinformatics, have generated considerable interest among researchers from diverse domains. In the rest of this section, we describe a few examples applying advanced data analysis techniques to various fields, where many of these use cases originated in the CIFT project. 22.6.1 Supernova Hunting In astronomy, the determination of many important facts such as the expansion speed of the universe, is performed by measuring the light from exploding type Ia supernovae (Bloom et al. [2012]).


pages: 400 words: 94,847

Reinventing Discovery: The New Era of Networked Science by Michael Nielsen

Albert Einstein, augmented reality, barriers to entry, bioinformatics, Cass Sunstein, Climategate, Climatic Research Unit, conceptual framework, dark matter, discovery of DNA, Donald Knuth, double helix, Douglas Engelbart, Douglas Engelbart, Easter island, en.wikipedia.org, Erik Brynjolfsson, fault tolerance, Fellow of the Royal Society, Firefox, Free Software Foundation, Freestyle chess, Galaxy Zoo, Higgs boson, Internet Archive, invisible hand, Jane Jacobs, Jaron Lanier, Johannes Kepler, Kevin Kelly, Large Hadron Collider, machine readable, machine translation, Magellanic Cloud, means of production, medical residency, Nicholas Carr, P = NP, P vs NP, publish or perish, Richard Feynman, Richard Stallman, selection bias, semantic web, Silicon Valley, Silicon Valley startup, Simon Singh, Skype, slashdot, social intelligence, social web, statistical model, Stephen Hawking, Stewart Brand, subscription business, tacit knowledge, Ted Nelson, the Cathedral and the Bazaar, The Death and Life of Great American Cities, The Nature of the Firm, The Wisdom of Crowds, University of East Anglia, Vannevar Bush, Vernor Vinge, Wayback Machine, Yochai Benkler

An overview of work on the Allen Brain Atlas may be found in Jonah Lehrer’s excellent article [120]. Most of the facts I relate are from that article. The paper announcing the atlas of gene expression in the mouse brain is [121]. Overviews of some of the progress and challenges in mapping the human connectome may be found in [119] and [125]. p 108: Bioinformatics and cheminformatics are now well-established fields, with a significant literature, and I won’t attempt to single out any particular reference for special mention. Astroinformatics has emerged more recently. See especially [24] for a manifesto on the need for astroinformatics. p 113: A report on the 2005 Playchess.com freestyle chess tournament may be found at [37], with follow-up commentary on the winners at [39].

Index entries (excerpt): Bermuda Agreement, 7, 108, 190, 192, 222; bioinformatics, 108; biology: data-driven intelligence in, 116–19; data web for, 121–22; open source, 48.

Index entries (excerpt, continued): computer code: in bioinformatics, 108; citation of, 196, 204–5; information commons in, 57–59; sharing, 87, 183, 193, 204–5.


pages: 523 words: 148,929

Physics of the Future: How Science Will Shape Human Destiny and Our Daily Lives by the Year 2100 by Michio Kaku

agricultural Revolution, AI winter, Albert Einstein, Alvin Toffler, Apollo 11, Asilomar, augmented reality, Bill Joy: nanobots, bioinformatics, blue-collar work, British Empire, Brownian motion, caloric restriction, caloric restriction, cloud computing, Colonization of Mars, DARPA: Urban Challenge, data science, delayed gratification, digital divide, double helix, Douglas Hofstadter, driverless car, en.wikipedia.org, Ford Model T, friendly AI, Gödel, Escher, Bach, Hans Moravec, hydrogen economy, I think there is a world market for maybe five computers, industrial robot, Intergovernmental Panel on Climate Change (IPCC), invention of movable type, invention of the telescope, Isaac Newton, John Markoff, John von Neumann, Large Hadron Collider, life extension, Louis Pasteur, Mahatma Gandhi, Mars Rover, Mars Society, mass immigration, megacity, Mitch Kapor, Murray Gell-Mann, Neil Armstrong, new economy, Nick Bostrom, oil shale / tar sands, optical character recognition, pattern recognition, planetary scale, postindustrial economy, Ray Kurzweil, refrigerator car, Richard Feynman, Rodney Brooks, Ronald Reagan, Search for Extraterrestrial Intelligence, Silicon Valley, Simon Singh, social intelligence, SpaceShipOne, speech recognition, stem cell, Stephen Hawking, Steve Jobs, synthetic biology, telepresence, The future is already here, The Wealth of Nations by Adam Smith, Thomas L Friedman, Thomas Malthus, trade route, Turing machine, uranium enrichment, Vernor Vinge, Virgin Galactic, Wall-E, Walter Mischel, Whole Earth Review, world market for maybe five computers, X Prize

I imagine in the near future, many people will have the same strange feeling I did, holding the blueprint of their bodies in their hands and reading the intimate secrets, including dangerous diseases, lurking in the genome and the ancient migration patterns of their ancestors. But for scientists, this is opening an entirely new branch of science, called bioinformatics, or using computers to rapidly scan and analyze the genome of thousands of organisms. For example, by inserting the genomes of several hundred individuals suffering from a certain disease into a computer, one might be able to calculate the precise location of the damaged DNA. In fact, some of the world’s most powerful computers are involved in bioinformatics, analyzing millions of genes found in plants and animals for certain key genes. This could even revolutionize TV detective shows like CSI.

Index entries (excerpt): Binnig, Gerd; Bioinformatics; Biotechnology (see Medicine/biotechnology).

Index entries (excerpt, continued): Computers: bioinformatics; brain simulations; DNA computers; quantum computers.


pages: 560 words: 158,238

Fifty Degrees Below by Kim Stanley Robinson

airport security, bioinformatics, bread and circuses, Burning Man, carbon credits, carbon tax, clean water, DeepMind, Donner party, full employment, Intergovernmental Panel on Climate Change (IPCC), invisible hand, iterative process, Kim Stanley Robinson, means of production, minimum wage unemployment, North Sea oil, off-the-grid, Ralph Waldo Emerson, Richard Feynman, statistical model, Stephen Hawking, the scientific method

And then Francesca Taolini, who had arranged for Yann’s hire by a company she consulted for, in the same way Frank had hoped to. Did she suspect that Frank had been after Yann? Did she know how powerful Yann’s algorithm might be? He googled her. Turned out, among many interesting things, that she was helping to chair a conference at MIT coming soon, on bioinformatics and the environment. Just the kind of event Frank might attend. NSF even had a group going already, he saw, to talk about the new federal institutes. Meet with her first, then go to Atlanta to meet with Yann—would that make his stock in the virtual market rise, triggering more intense surveillance?

So at work Anna spent her time trying to concentrate, over a persistent underlying turmoil of worry about her younger son. Work was absorbing, as always, and there was more to do than there was time to do it in, as always. And so it provided its partial refuge. But it was harder to dive in, harder to stay under the surface in the deep sea of bioinformatics. Even the content of the work reminded her, on some subliminal level, that health was a state of dynamic balance almost inconceivably complex, a matter of juggling a thousand balls while unicycling on a tightrope over the abyss—in a gale—at night—such that any life was an astonishing miracle, brief and tenuous.

Take a problem, break it down into parts (analyze), quantify whatever parts you could, see if what you learned suggested anything about causes and effects; then see if this suggested anything about long-term plans, and tangible things to do. She did not believe in revolution of any kind, and only trusted the mass application of the scientific method to get any real-world results. “One step at a time,” she would say to her team in bioinformatics, or Nick’s math group at school, or the National Science Board; and she hoped that as long as chaos did not erupt worldwide, one step at a time would eventually get them to some tolerable state. Of course there were all the hysterical operatics of “history” to distract people from this method and its incremental successes.


pages: 623 words: 448,848

Food Allergy: Adverse Reactions to Foods and Food Additives by Dean D. Metcalfe

active measures, Albert Einstein, autism spectrum disorder, bioinformatics, classic study, confounding variable, epigenetics, Helicobacter pylori, hygiene hypothesis, impulse control, life extension, longitudinal study, meta-analysis, mouse model, pattern recognition, phenotype, placebo effect, randomized controlled trial, Recombinant DNA, selection bias, statistical model, stem cell, twin studies, two and twenty

In silico methods for evaluating human allergenicity to novel proteins. Bioinformatics Workshop Meeting Report, February 23–24, 2005. Toxicol Sci 2005;88:307–10. 74 Ladics GS, Bannon GA, Silvanovich A, Cressman, RF. Comparison of conventional FASTA identity searches with the 80 amino acid sliding window FASTA search for the elucidation of potential identities to known allergens. Mol Nutr Food Res 2007;51:985–998. 75 Bannon G, Ogawa T. Evaluation of available IgE-binding epitope data and its utility in bioinformatics. Mol Nutr Food Res 2006;50:638–44. 76 Hileman RE, Silvanovich A, Goodman RE, et al. Bioinformatic methods for allergenicity assessment using a comprehensive allergen database.

Food allergen protein families

Based on their shared amino acid sequences and conserved three-dimensional structures, proteins can be classified into families using various bioinformatics tools which form the basis of several protein family databases, one of which is Pfam [8]. Over the past 10 years or so there has been an explosion in the numbers of well characterized allergens, which have been sequenced and are being collected into a number of databases to facilitate bioinformatic analysis [9]. We have undertaken this analysis for both plant [1] and animal food allergens [10] along with pollen allergens [2]. They show similar distributions with the majority of allergens in each group falling into just 3–12 families with a tail of between 14 and 23 families comprising between 1 and 3 allergens each.

However, Aalberse [72] has noted that proteins sharing less than 50% identity across the full length of the protein sequence are unlikely to be cross-reactive, and immunological cross-reactivity may not occur unless the proteins share at least 70% identity. Recent published work has led to the harmonization of the methods used for bioinformatic searches and a better understanding of the data generated [73,74] from such studies. An additional bioinformatics approach can be taken by searching for 100% identity matches along short sequences contained in the query sequence as they are compared to sequences in a database. These regions of short amino acid sequence homologies are intended to represent the smallest sequence that could function as an IgE-binding epitope [75].
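
A minimal sketch of such a short-window identity screen is shown below; the eight-residue window length, the toy allergen database, and the query sequence are illustrative assumptions rather than any regulatory standard.

```python
# Sketch of a short-window exact-identity screen: slide a fixed-length
# window along the query protein and flag any window that occurs verbatim
# in a database of known allergen sequences. The 8-residue window and the
# sequences below are illustrative assumptions.
WINDOW = 8

def short_identity_hits(query, allergen_db, window=WINDOW):
    hits = []
    for i in range(len(query) - window + 1):
        peptide = query[i:i + window]
        for name, seq in allergen_db.items():
            if peptide in seq:
                hits.append((i, peptide, name))
    return hits

allergen_db = {"toy_allergen_1": "MKLVAAGITQEWKKLVNSTR"}   # invented sequence
query = "GITQEWKKLVNAAAPL"                                  # invented sequence

for position, peptide, allergen in short_identity_hits(query, allergen_db):
    print(f"window at {position}: {peptide} matches {allergen}")
```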


pages: 678 words: 216,204

The Wealth of Networks: How Social Production Transforms Markets and Freedom by Yochai Benkler

affirmative action, AOL-Time Warner, barriers to entry, bioinformatics, Brownian motion, business logic, call centre, Cass Sunstein, centre right, clean water, commoditize, commons-based peer production, dark matter, desegregation, digital divide, East Village, Eben Moglen, fear of failure, Firefox, Free Software Foundation, game design, George Gilder, hiring and firing, Howard Rheingold, informal economy, information asymmetry, information security, invention of radio, Isaac Newton, iterative process, Jean Tirole, jimmy wales, John Markoff, John Perry Barlow, Kenneth Arrow, Lewis Mumford, longitudinal study, machine readable, Mahbub ul Haq, market bubble, market clearing, Marshall McLuhan, Mitch Kapor, New Journalism, optical character recognition, pattern recognition, peer-to-peer, power law, precautionary principle, pre–internet, price discrimination, profit maximization, profit motive, public intellectual, radical decentralization, random walk, Recombinant DNA, recommendation engine, regulatory arbitrage, rent-seeking, RFID, Richard Stallman, Ronald Coase, scientific management, search costs, Search for Extraterrestrial Intelligence, SETI@home, shareholder value, Silicon Valley, Skype, slashdot, social software, software patent, spectrum auction, subscription business, tacit knowledge, technological determinism, technoutopianism, The Fortune at the Bottom of the Pyramid, the long tail, The Nature of the Firm, the strength of weak ties, Timothy McVeigh, transaction costs, vertical integration, Vilfredo Pareto, work culture , Yochai Benkler

As more of the process of discovering potential drug leads can be done by modeling and computational analysis, more can be organized for peer production. The relevant model here is open bioinformatics. Bioinformatics generally is the practice of pursuing solutions to biological questions using mathematics and information technology. Open bioinformatics is a movement within bioinformatics aimed at developing the tools in an open-source model, and in providing access to the tools and the outputs on a free and open basis. Projects like these include the Ensembl Genome Browser, operated by the European Bioinformatics Institute and the Sanger Centre, or the National Center for Biotechnology Information (NCBI), both of which use computer databases to provide access to data and to run various searches on combinations, patterns, and so forth, in the data.


pages: 424 words: 114,905

Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again by Eric Topol

"World Economic Forum" Davos, 23andMe, Affordable Care Act / Obamacare, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic bias, AlphaGo, Apollo 11, artificial general intelligence, augmented reality, autism spectrum disorder, autonomous vehicles, backpropagation, Big Tech, bioinformatics, blockchain, Cambridge Analytica, cloud computing, cognitive bias, Colonization of Mars, computer age, computer vision, Computing Machinery and Intelligence, conceptual framework, creative destruction, CRISPR, crowdsourcing, Daniel Kahneman / Amos Tversky, dark matter, data science, David Brooks, deep learning, DeepMind, Demis Hassabis, digital twin, driverless car, Elon Musk, en.wikipedia.org, epigenetics, Erik Brynjolfsson, fake news, fault tolerance, gamification, general purpose technology, Geoffrey Hinton, George Santayana, Google Glasses, ImageNet competition, Jeff Bezos, job automation, job satisfaction, Joi Ito, machine translation, Mark Zuckerberg, medical residency, meta-analysis, microbiome, move 37, natural language processing, new economy, Nicholas Carr, Nick Bostrom, nudge unit, OpenAI, opioid epidemic / opioid crisis, pattern recognition, performance metric, personalized medicine, phenotype, placebo effect, post-truth, randomized controlled trial, recommendation engine, Rubik’s Cube, Sam Altman, self-driving car, Silicon Valley, Skinner box, speech recognition, Stephen Hawking, techlash, TED Talk, text mining, the scientific method, Tim Cook: Apple, traumatic brain injury, trolley problem, War on Poverty, Watson beat the top human players on Jeopardy!, working-age population

In many leading medical schools throughout the country, there’s an “arms race” for Adam 1s and academic achievement, as Jonathan Stock at Yale University School of Medicine aptly points out.61 We need to be nurturing the Adam 2s, which is something that is all too often an area of neglect in medical education. There are many other critical elements that need to be part of the medical school curriculum. Future doctors need a far better understanding of data science, including bioinformatics, biocomputing, probabilistic thinking, and the guts of deep learning neural networks. Much of their efforts in patient care will be supported by algorithms, and they need to understand all the liabilities, to recognize bias, errors, false output, and dissociation from common sense. Likewise, the importance of putting the patient’s values and preferences first in any human-machine collaboration cannot be emphasized enough.

JAMA, 2017. 318(22): pp. 2199–2210. 52. Golden, J. A., “Deep Learning Algorithms for Detection of Lymph Node Metastases from Breast Cancer: Helping Artificial Intelligence Be Seen.” JAMA, 2017. 318(22): pp. 2184–2186. 53. Yang, S. J., et al., “Assessing Microscope Image Focus Quality with Deep Learning.” BMC Bioinformatics, 2018. 19(1): p. 77. 54. Wang et al., Deep Learning for Identifying Metastatic Breast Cancer. 55. Wong, D., and S. Yip, “Machine Learning Classifies Cancer.” Nature, 2018. 555(7697): pp. 446–447; Capper, D., et al., “DNA Methylation-Based Classification of Central Nervous System Tumours.” Nature, 2018. 555(7697): pp. 469–474. 56.

., et al., “Intelligent Image-Activated Cell Sorting.” Cell, 2018. 175(1): pp. 266–276 e13. 72. Weigert, M., et al., Content-Aware Image Restoration: Pushing the Limits of Fluorescence Microscopy, bioRxiv. 2017; Yang, S. J., et al., “Assessing Microscope Image Focus Quality with Deep Learning.” BMC Bioinformatics, 2018. 19(1): p. 77. 73. Ouyang, W., et al., “Deep Learning Massively Accelerates Super-Resolution Localization Microscopy.” Nat Biotechnol, 2018. 36(5): pp. 460–468. 74. Stumpe, M., “An Augmented Reality Microscope for Realtime Automated Detection of Cancer,” Google AI Blog. 2018. 75. Wise, J., “These Robots Are Learning to Conduct Their Own Science Experiments,” Bloomberg. 2018. 76.


Exploring Everyday Things with R and Ruby by Sau Sheong Chang

Alfred Russel Wallace, bioinformatics, business process, butterfly effect, cloud computing, Craig Reynolds: boids flock, data science, Debian, duck typing, Edward Lorenz: Chaos theory, Gini coefficient, income inequality, invisible hand, p-value, price stability, Ruby on Rails, Skype, statistical model, stem cell, Stephen Hawking, text mining, The Wealth of Nations by Adam Smith, We are the 99%, web application, wikimedia commons

CRAN is hosted by the R Foundation (the same organization that is developing R) and contains 3,646 packages as of this writing. CRAN is also mirrored in many sites worldwide. Another public repository is Bioconductor (http://www.bioconductor.org), an open source project that provides tools for bioinformatics and is primarily R-based. While the packages in Bioconductor are focused on bioinformatics, it doesn’t mean that they can’t be used for other domains. As of this writing, there are 516 packages in Bioconductor. Finally, there is R-Forge (http://r-forge.r-project.org), a collaborative software development application for R. It is based on FusionForge, a fork from GForge (on which RubyForge was based), which in turn was forked from the original software that was used to build SourceForge.


The Data Journalism Handbook by Jonathan Gray, Lucy Chambers, Liliana Bounegru

Amazon Web Services, barriers to entry, bioinformatics, business intelligence, carbon footprint, citizen journalism, correlation does not imply causation, crowdsourcing, data science, David Heinemeier Hansson, eurozone crisis, fail fast, Firefox, Florence Nightingale: pie chart, game design, Google Earth, Hans Rosling, high-speed rail, information asymmetry, Internet Archive, John Snow's cholera map, Julian Assange, linked data, machine readable, moral hazard, MVC pattern, New Journalism, openstreetmap, Ronald Reagan, Ruby on Rails, Silicon Valley, social graph, Solyndra, SPARQL, text mining, Wayback Machine, web application, WikiLeaks

Election financing (Helsingin Sanomat)

2. Brainstorm for ideas

The participants of HS Open 2 came up with twenty different prototypes about what to do with the data. You can find all the prototypes on our website (text in Finnish). A bioinformatics researcher called Janne Peltola noted that campaign funding data looked like the gene data they research, in terms of containing many interdependencies. In bioinformatics there is an open source tool called Cytoscape that is used to map these interdependencies. So we ran the data through Cytoscape, and got a very interesting prototype.

3. Implement the idea on paper and on the Web

The law on campaign funding states that elected members of parliament must declare their funding two months after the elections.


pages: 295 words: 66,912

Walled Culture: How Big Content Uses Technology and the Law to Lock Down Culture and Keep Creators Poor by Glyn Moody

Aaron Swartz, Big Tech, bioinformatics, Brewster Kahle, connected car, COVID-19, disinformation, Donald Knuth, en.wikipedia.org, full text search, intangible asset, Internet Archive, Internet of things, jimmy wales, Kevin Kelly, Kickstarter, non-fungible token, Open Library, optical character recognition, p-value, peer-to-peer, place-making, quantitative trading / quantitative finance, rent-seeking, text mining, the market place, TikTok, transaction costs, WikiLeaks

In 1997, Wired magazine published his in-depth feature about Linux and Linus Torvalds, the first mainstream article to describe the then-new world of free software. Moody’s full-length book on the topic, Rebel Code: Linux and the Open Source Revolution, appeared in 2001. His book Digital Code of Life: How Bioinformatics is Revolutionizing Science, Medicine, and Business, about the new field of bioinformatics, was published in 2004. In addition, Moody has written nearly 2,000 posts for Techdirt, and over 400 articles for Ars Technica. More recently, his writing has focussed on digital rights and privacy. Numerous posts about copyright, another area of particular interest, have appeared on the Copybuzz and Walled Culture blogs.


Mastering Structured Data on the Semantic Web: From HTML5 Microdata to Linked Open Data by Leslie Sikos

AGPL, Amazon Web Services, bioinformatics, business process, cloud computing, create, read, update, delete, Debian, en.wikipedia.org, fault tolerance, Firefox, Google Chrome, Google Earth, information retrieval, Infrastructure as a Service, Internet of things, linked data, machine readable, machine translation, natural language processing, openstreetmap, optical character recognition, platform as a service, search engine result page, semantic web, Silicon Valley, social graph, software as a service, SPARQL, text mining, Watson beat the top human players on Jeopardy!, web application, Wikidata, wikimedia commons, Wikivoyage

TopQuadrant (2015) TopBraid Composer Maestro Edition. www.topquadrant.com/tools/ide-topbraid-composer-maestro-edition/. Accessed 31 March 2015. 13. The Apache Software Foundation (2015) Apache Stanbol. http://stanbol.apache.org. Accessed 31 March 2015. 14. Fluent Editor. www.cognitum.eu/semantics/FluentEditor/. Accessed 15 April 2015. 15. The European Bioinformatics Institute (2015) ZOOMA. www.ebi.ac.uk/fgpt/zooma/. Accessed 31 March 2015. 16. Harispe, S. (2014) Semantic Measures Library & ToolKit. www.semantic-measures-library.org. Accessed 29 March 2015. 17. Motik, B., Shearer, R., Glimm, B., Stoilos, G., Horrocks, I. (2013) HermiT OWL Reasoner. http://hermit-reasoner.com.

The Toolkit features an AML text editor and a visual editor, an AML validator, and provides mapping and testing views for AML.

Semantic Automated Discovery and Integration (SADI)

Semantic Automated Discovery and Integration (SADI) is a lightweight set of Semantic Web Service design patterns (https://code.google.com/p/sadi/). It was primarily designed for scientific service publication and is especially useful in bioinformatics. Powered by web standards, SADI implements Semantic Web technologies to consume and produce RDF instances of OWL-DL classes, where input and output class URIs resolve to an OWL document through HTTP GET. SADI supports RDF/XML and Notation3 serializations. The SADI design patterns provide automatic discovery of appropriate services, based on user needs, and can automatically chain these services into complex analytical workflows.
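
The snippet below is not the SADI framework itself, only a minimal rdflib sketch of the data shapes such a service exchanges: an input resource typed as an instance of an OWL class, and an output graph describing the same resource; all URIs, class names, and property names are invented placeholders.

```python
# Minimal rdflib sketch of the data shapes a SADI-style service exchanges:
# the client sends an RDF instance of the service's input OWL class, and
# the service returns RDF typing the same resource as the output class.
# All namespace URIs and property names here are invented examples.
from rdflib import Graph, Namespace, URIRef, Literal, RDF

EX = Namespace("http://example.org/vocab#")        # hypothetical vocabulary
protein = URIRef("http://example.org/data/P12345")  # hypothetical resource

request = Graph()
request.add((protein, RDF.type, EX.InputProtein))
request.add((protein, EX.sequence, Literal("MKLVAAGITQEW")))

# A service would consume `request` and return something like this:
response = Graph()
response.add((protein, RDF.type, EX.AnnotatedProtein))
response.add((protein, EX.predictedFamily, Literal("toy-family-1")))

print(response.serialize(format="turtle"))
```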


pages: 292 words: 85,151

Exponential Organizations: Why New Organizations Are Ten Times Better, Faster, and Cheaper Than Yours (And What to Do About It) by Salim Ismail, Yuri van Geest

23andMe, 3D printing, Airbnb, Amazon Mechanical Turk, Amazon Web Services, anti-fragile, augmented reality, autonomous vehicles, Baxter: Rethink Robotics, behavioural economics, Ben Horowitz, bike sharing, bioinformatics, bitcoin, Black Swan, blockchain, Blue Ocean Strategy, book value, Burning Man, business intelligence, business process, call centre, chief data officer, Chris Wanstrath, circular economy, Clayton Christensen, clean water, cloud computing, cognitive bias, collaborative consumption, collaborative economy, commoditize, corporate social responsibility, cross-subsidies, crowdsourcing, cryptocurrency, dark matter, data science, Dean Kamen, deep learning, DeepMind, dematerialisation, discounted cash flows, disruptive innovation, distributed ledger, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Ethereum, ethereum blockchain, fail fast, game design, gamification, Google Glasses, Google Hangouts, Google X / Alphabet X, gravity well, hiring and firing, holacracy, Hyperloop, industrial robot, Innovator's Dilemma, intangible asset, Internet of things, Iridium satellite, Isaac Newton, Jeff Bezos, Joi Ito, Kevin Kelly, Kickstarter, knowledge worker, Kodak vs Instagram, Law of Accelerating Returns, Lean Startup, life extension, lifelogging, loose coupling, loss aversion, low earth orbit, Lyft, Marc Andreessen, Mark Zuckerberg, market design, Max Levchin, means of production, Michael Milken, minimum viable product, natural language processing, Netflix Prize, NetJets, Network effects, new economy, Oculus Rift, offshore financial centre, PageRank, pattern recognition, Paul Graham, paypal mafia, peer-to-peer, peer-to-peer model, Peter H. Diamandis: Planetary Resources, Peter Thiel, Planet Labs, prediction markets, profit motive, publish or perish, radical decentralization, Ray Kurzweil, recommendation engine, RFID, ride hailing / ride sharing, risk tolerance, Ronald Coase, Rutger Bregman, Salesforce, Second Machine Age, self-driving car, sharing economy, Silicon Valley, skunkworks, Skype, smart contracts, Snapchat, social software, software is eating the world, SpaceShipOne, speech recognition, stealth mode startup, Stephen Hawking, Steve Jobs, Steve Jurvetson, subscription business, supply-chain management, synthetic biology, TaskRabbit, TED Talk, telepresence, telepresence robot, the long tail, Tony Hsieh, transaction costs, Travis Kalanick, Tyler Cowen, Tyler Cowen: Great Stagnation, uber lyft, urban planning, Virgin Galactic, WikiLeaks, winner-take-all economy, X Prize, Y Combinator, zero-sum game

Third, once that doubling pattern starts, it doesn’t stop. We use current computers to design faster computers, which then build faster computers, and so on. Finally, several key technologies are now information-enabled and following the same trajectory. Those technologies include artificial intelligence (AI), robotics, biotech and bioinformatics, medicine, neuroscience, data science, 3D printing, nanotechnology and even aspects of energy. Never in human history have we seen so many technologies moving at such a pace. And now that we are information-enabling everything around us, the effects of Kurzweil’s Law of Accelerating Returns are sure to be profound.

What was particularly interesting was the fact that none of the winners had prior experience with natural language processing (NLP). Nonetheless, they beat the experts, many of them with decades of experience in NLP under their belts. This can’t help but impact the current status quo. Raymond McCauley, Biotechnology & Bioinformatics Chair at Singularity University, has noticed that “When people want a biotech job in Silicon Valley, they hide their PhDs to avoid being seen as a narrow specialist.” So, if experts are suspect, where should we turn instead? As we’ve already noted, everything is measurable. And the newest profession making those measurements is the data scientist.


pages: 608 words: 150,324

Life's Greatest Secret: The Race to Crack the Genetic Code by Matthew Cobb

a long time ago in a galaxy far, far away, Anthropocene, anti-communist, Asilomar, Asilomar Conference on Recombinant DNA, Benoit Mandelbrot, Berlin Wall, bioinformatics, Claude Shannon: information theory, conceptual framework, Copley Medal, CRISPR, dark matter, discovery of DNA, double helix, Drosophila, epigenetics, factory automation, From Mathematics to the Technologies of Life and Death, Gregor Mendel, heat death of the universe, James Watt: steam engine, John von Neumann, Kickstarter, Large Hadron Collider, military-industrial complex, New Journalism, Norbert Wiener, phenotype, post-materialism, Recombinant DNA, Stephen Hawking, synthetic biology

Often the only basis for identifying the function of a gene is that its DNA sequence is similar to that of a gene in a different organism whose function has been demonstrated. This has led to a new discipline called genomics, which involves obtaining genomes and understanding their nature and evolution. It includes a new set of techniques, collectively called bioinformatics, which combine computing and population genetics to make inferences about the patterns of evolution and enable us to determine which genes have a common origin or function. Training biologists in the techniques of computer science will be an important part of twenty-first-century scientific education.
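Not from the book: as a toy illustration of the kind of sequence comparison such inferences rest on, the sketch below computes the percent identity of two pre-aligned DNA sequences. Real pipelines use alignment algorithms and statistical models of evolution; this is only the simplest possible stand-in.

    def percent_identity(seq_a, seq_b):
        """Fraction of matching positions between two equal-length, pre-aligned
        DNA sequences; a crude proxy for the similarity scores used to infer
        that two genes share a common origin or function."""
        if len(seq_a) != len(seq_b):
            raise ValueError("sequences must be aligned to the same length")
        matches = sum(a == b for a, b in zip(seq_a, seq_b))
        return matches / len(seq_a)

    # Toy example: a gene fragment and a slightly diverged copy in another organism.
    print(percent_identity("ATGGCGTACGTT", "ATGGCTTACGTA"))  # 0.833...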

., ‘Exonic transcription factor binding directs codon choice and affects protein evolution’, Science, vol. 342, 2013, pp. 1367–72. Stern, K. G., ‘Nucleoproteins and gene structure’, Yale Journal of Biology and Medicine, vol. 19, 1947, pp. 937–49. Stevens, H., Life Out of Sequence: A Data-Driven History of Bioinformatics, London, University of Chicago Press, 2013. Strasser, B. J., ‘A world in one dimension: Linus Pauling, Francis Crick and the Central Dogma of molecular biology’, History and Philosophy of the Life Sciences, vol. 28, 2006, pp. 491–512. Stretton, A. O. W., ‘The first sequence: Fred Sanger and insulin’, Genetics, vol. 162, 2002, pp. 527–32.

[Back-of-book index excerpt, flattened during extraction: alphabetical entries from Avery, Oswald through Caldwell, P., including "bioinformatics 238".]


pages: 584 words: 149,387

Essential Scrum: A Practical Guide to the Most Popular Agile Process by Kenneth S. Rubin

bioinformatics, business cycle, business intelligence, business logic, business process, continuous integration, corporate governance, fail fast, hiring and firing, index card, inventory management, iterative process, Kanban, Lean Startup, loose coupling, minimum viable product, performance metric, shareholder value, six sigma, tacit knowledge, Y2K, you are the product

In 1988 he was fortunate to join ParcPlace Systems, a start-up company formed as a Xerox PARC spin-off, whose charter was to bring object-oriented technology out of the research labs and release it to the world. As a Smalltalk development consultant with many different organizations in the late 1980s and throughout the 1990s, Kenny was an early adopter of agile practices. His first use of Scrum was in 2000 for developing bioinformatics software. In the course of his career, Kenny has held many roles, including successful stints as a Scrum product owner, ScrumMaster, and member of development teams. In addition, he has held numerous executive management roles: CEO, COO, VP of Engineering, VP of Product Management, and VP of Professional Services.

His multifaceted background gives Kenny the ability to understand (and explain) Scrum and its implications equally well from multiple perspectives: from the development team to the executive board. Chapter 1. Introduction On June 21, 2000, I was employed as Executive Vice President at Genomica, a bioinformatics company in Boulder, Colorado. I remember the date because my son Asher was born at one o’clock that morning. His birth was a good start to the day. Asher was actually born on his predicted due date (in the United States this happens about 5% of the time). So we (really my wife, Jenine) had finished our nine-month “project” on schedule.

This need for rapid exploration and feedback did not mesh well with the detailed, up-front planning we had been doing. We also wanted to avoid big up-front architecture design. A previous attempt to create a next generation of Genomica’s core product had seen the organization spend almost one year doing architecture-only work to create a grand, unified bioinformatics platform. When the first real scientist-facing application was put on top of that architecture, and we finally validated design decisions made many months earlier, it took 42 seconds to tab from one field on the screen to the next field. If you think a typical user is impatient, imagine a molecular biologist with a Ph.D. having to wait 42 seconds!


Algorithms Unlocked by Thomas H. Cormen

bioinformatics, Donald Knuth, knapsack problem, NP-complete, optical character recognition, P = NP, Silicon Valley, sorting algorithm, traveling salesman

The size of a clique is the number of vertices it contains. As you might imagine, cliques play a role in social network theory. Modeling each individual as a vertex and relationships between individuals as undirected edges, a clique represents a group of individuals all of whom have relationships with each other. Cliques also have applications in bioinformatics, engineering, and chemistry. The clique problem takes two inputs, a graph G and a positive integer k, and asks whether G has a clique of size k. For example, the graph on the next page has a clique of size 4, shown with heavily shaded vertices, and no other clique of size 4 or greater.
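A small illustrative sketch (not from the book, which works in pseudocode): a brute-force check for a clique of size k. The check is exponential in the worst case, which is exactly why the problem counts as hard.

    from itertools import combinations

    def has_clique(graph, k):
        """Brute-force check whether an undirected graph (dict: vertex -> set
        of neighbours) contains a clique of size k."""
        for group in combinations(list(graph), k):
            # A clique requires every pair in the group to be joined by an edge.
            if all(v in graph[u] for u, v in combinations(group, 2)):
                return True
        return False

    # Toy graph: vertices 1-4 form a clique of size 4; vertex 5 hangs off it.
    g = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4, 5}, 4: {1, 2, 3}, 5: {3}}
    print(has_clique(g, 4))  # True
    print(has_clique(g, 5))  # False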

The size of a vertex cover is the number of vertices it contains. As in the clique problem, the vertex-cover problem takes as input an undirected graph G and a positive integer m. It asks whether G has a vertex cover of size m. Like the clique problem, the vertex-cover problem has applications in bioinformatics. In another application, you have a building with hallways and cameras that can scan up to 360 degrees located at the intersections of hallways, and you want to know whether m cameras will allow you to see all the hallways. Here, edges model hallways and vertices model intersections. In yet another application, finding vertex covers helps in designing strategies to foil worm attacks on computer networks.
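A matching sketch for the vertex-cover problem, again brute force and not from the book: try every set of m vertices (camera placements) and check that it touches every edge (hallway).

    from itertools import combinations

    def has_vertex_cover(edges, m):
        """Brute-force check whether some set of m vertices touches every edge.
        edges is a collection of (u, v) pairs for an undirected graph."""
        vertices = {v for e in edges for v in e}
        for cover in combinations(vertices, m):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return True
        return False

    # Hallway-and-camera toy example: intersections are vertices, hallways edges.
    hallways = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
    print(has_vertex_cover(hallways, 2))  # True: cameras at b and c (or b and d)
    print(has_vertex_cover(hallways, 1))  # False: no single intersection sees all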


pages: 323 words: 92,135

Running Money by Andy Kessler

Alan Greenspan, Andy Kessler, Apple II, bioinformatics, Bob Noyce, British Empire, business intelligence, buy and hold, buy low sell high, call centre, Charles Babbage, Corn Laws, cotton gin, Douglas Engelbart, Fairchild Semiconductor, family office, flying shuttle, full employment, General Magic , George Gilder, happiness index / gross national happiness, interest rate swap, invisible hand, James Hargreaves, James Watt: steam engine, joint-stock company, joint-stock limited liability company, junk bonds, knowledge worker, Leonard Kleinrock, Long Term Capital Management, mail merge, Marc Andreessen, margin call, market bubble, Mary Meeker, Maui Hawaii, Menlo Park, Metcalfe’s law, Michael Milken, Mitch Kapor, Network effects, packet switching, pattern recognition, pets.com, railway mania, risk tolerance, Robert Metcalfe, Sand Hill Road, Silicon Valley, South China Sea, spinning jenny, Steve Jobs, Steve Wozniak, Suez canal 1869, Toyota Production System, TSMC, UUNET, zero-sum game

In an era of relatively stable currencies, the modern-day investor has to dig, early and often and everywhere. I’d still rather dig than get whacked by a runaway yen-carry trade. Another cycle is coming. The drivers of it are still unclear. Likely suspects are things like wireless data, on-command computing, nanotechnology, bioinformatics, genomic sorting—who the hell knows what it will be. But this is what I do. Looking for the next barrier, the next piece of technology, the next waterfall and the next great, long-term investment. Sounds quaint. I’ve come a long way from tripping across Homer Simpson dolls trying to raise money in Hong Kong.

[Back-of-book index excerpt, flattened during extraction: alphabetical entries from Andreessen, Marc through Brunel, I., including "bioinformatics, 296".]


pages: 315 words: 89,861

The Simulation Hypothesis by Rizwan Virk

3D printing, Albert Einstein, AlphaGo, Apple II, artificial general intelligence, augmented reality, Benoit Mandelbrot, bioinformatics, butterfly effect, Colossal Cave Adventure, Computing Machinery and Intelligence, DeepMind, discovery of DNA, Dmitri Mendeleev, Elon Musk, en.wikipedia.org, Ernest Rutherford, game design, Google Glasses, Isaac Newton, John von Neumann, Kickstarter, mandelbrot fractal, Marc Andreessen, Minecraft, natural language processing, Nick Bostrom, OpenAI, Pierre-Simon Laplace, Plato's cave, quantum cryptography, quantum entanglement, Ralph Waldo Emerson, Ray Kurzweil, Richard Feynman, Schrödinger's Cat, Search for Extraterrestrial Intelligence, Silicon Valley, Stephen Hawking, Steve Jobs, Steve Wozniak, technological singularity, TED Talk, time dilation, Turing test, Vernor Vinge, Zeno's paradox

Within computer science, video games and entertainment have played a unique role in driving the development of both hardware and software. Examples include the development of GPUs (graphics processing units) for optimized rendering, CGI (computer-generated effects), and CAD (computer-aided design), as well as artificial intelligence and bioinformatics. The most recent incarnation of fully immersive entertainment technology is virtual reality (VR). Despite wondering about the simulation hypothesis for many years, it wasn’t until VR and AI reached their current level of sophistication that I could see a clear path to how we might develop all-encompassing simulations like the one depicted in The Matrix, which led me to write this book.

Within computer science and AI, biological processes have shown that they can be utilized to get much smarter and more unique results—most of today’s machine learning is based on the conditioning of neural networks, which are based on biological algorithms. While there is still some way to go, the burgeoning field of bioinformatics and modeling of biological processes has made information and computation an integral part of the organic world! Most importantly, the physical world, which was thought of in classical physics as a set of physical objects moving in continuous paths around the heavens, has been updated. As quantum physics reveals that there is no such thing as a physical object, that most objects consist of empty space and electrons, we start to get into metaphysical questions about what is real in the world.


pages: 326 words: 88,968

The Science and Technology of Growing Young: An Insider's Guide to the Breakthroughs That Will Dramatically Extend Our Lifespan . . . And What You Can Do Right Now by Sergey Young

23andMe, 3D printing, Albert Einstein, artificial general intelligence, augmented reality, basic income, Big Tech, bioinformatics, Biosphere 2, brain emulation, caloric restriction, caloric restriction, Charles Lindbergh, classic study, clean water, cloud computing, cognitive bias, computer vision, coronavirus, COVID-19, CRISPR, deep learning, digital twin, diversified portfolio, Doomsday Clock, double helix, Easter island, Elon Musk, en.wikipedia.org, epigenetics, European colonialism, game design, Gavin Belson, George Floyd, global pandemic, hockey-stick growth, impulse control, Internet of things, late capitalism, Law of Accelerating Returns, life extension, lockdown, Lyft, Mark Zuckerberg, meta-analysis, microbiome, microdosing, moral hazard, mouse model, natural language processing, personalized medicine, plant based meat, precision agriculture, radical life extension, Ralph Waldo Emerson, Ray Kurzweil, Richard Feynman, ride hailing / ride sharing, Ronald Reagan, self-driving car, seminal paper, Silicon Valley, stem cell, Steve Jobs, tech billionaire, TED Talk, uber lyft, ultra-processed food, universal basic income, Virgin Galactic, Vision Fund, X Prize

Among the field of impatient researchers working on this problem was a German geneticist and biostatistician, UCLA professor Dr. Steve Horvath. From the time he was a teenager, Horvath dreamed of extending the human healthspan. For decades, however, his academic and professional interests took him down different paths—through mathematics and bioinformatics. By the time he came back to aging, Horvath had developed a different perspective than other biologists and researchers—he had become accustomed to looking at algorithms more than organisms. So Horvath aimed to combine the two perspectives and set about finding what data patterns could be associated with aging.
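The excerpt stays at the level of "data patterns"; the published Horvath clock is, at heart, a penalized (elastic-net) regression that predicts age from DNA-methylation levels at selected CpG sites. Below is a toy sketch of that idea on synthetic data using scikit-learn; the data, site counts, and parameters are illustrative only, not the published model.

    import numpy as np
    from sklearn.linear_model import ElasticNetCV
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Synthetic data: 500 samples, 2,000 CpG sites with methylation in [0, 1].
    # A small subset of sites drifts with age; the rest are noise.
    n_samples, n_sites, n_informative = 500, 2000, 50
    age = rng.uniform(20, 90, n_samples)
    X = rng.uniform(0, 1, (n_samples, n_sites))
    X[:, :n_informative] = (0.2 + 0.006 * age[:, None]
                            + rng.normal(0, 0.03, (n_samples, n_informative)))

    X_train, X_test, y_train, y_test = train_test_split(X, age, random_state=0)

    # Elastic-net regression picks out the age-associated sites and weights them.
    clock = ElasticNetCV(l1_ratio=0.5, cv=5).fit(X_train, y_train)
    pred = clock.predict(X_test)
    print("median absolute error (years):", np.median(np.abs(pred - y_test)))
    print("CpG sites with non-zero weight:", np.count_nonzero(clock.coef_))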

And yet, four months after she received her first dose of Opdivo, McKeown’s cancer was in full remission. Moores is part of a federally funded clinical study in the United States called I-PREDICT. The cancer patients in this trial all previously underwent conventional treatments in vain. The I-PREDICT team of radiologists, oncologists, geneticists, pharmacologists, and bioinformatics experts pool knowledge from each of their fields to arrive at drug combinations that are precisely tuned to the individual patient’s genetic presentation. Of seventy-three patients who were treated using this pharmacogenetic approach, those who received more precise treatments matched to their genomic alterations fared twice as well as those who did not.2 This is precision medicine (PM), sometimes called personalized medicine or predictive medicine, and it is about to completely transform every single aspect of health care.


pages: 599 words: 98,564

The Mutant Project: Inside the Global Race to Genetically Modify Humans by Eben Kirksey

23andMe, Abraham Maslow, Affordable Care Act / Obamacare, Albert Einstein, Bernie Sanders, bioinformatics, bitcoin, Black Lives Matter, blockchain, Buckminster Fuller, clean water, coronavirus, COVID-19, CRISPR, cryptocurrency, data acquisition, deep learning, Deng Xiaoping, Donald Trump, double helix, epigenetics, Ethereum, ethereum blockchain, experimental subject, fake news, gentrification, George Floyd, Jeff Bezos, lockdown, Mark Zuckerberg, megacity, microdosing, moral panic, move fast and break things, personalized medicine, phenotype, placebo effect, randomized controlled trial, Recombinant DNA, Shenzhen special economic zone , Shenzhen was a fishing village, Silicon Valley, Silicon Valley billionaire, Skype, special economic zone, statistical model, stem cell, surveillance capitalism, tech billionaire, technological determinism, upwardly mobile, urban planning, young professional

As the DNA sequencing data from the twins trickled back into the laboratory, it fell primarily on the shoulders of one person to make sense of the code. A star undergraduate student in Dr. He’s bioinformatics course, whom I will call Goran, was hired into the laboratory straight after graduation. The young computer whiz spent long hours hunched over his keyboard in a tiny office on the SUSTech campus he shared with a junior bioinformatics technician and the contact person for patients in the study. His lab mates thought of Goran as wise beyond his years. When Ryan Ferrell joined the team, the young technician slid his computer over so that the pair could share a desk.


pages: 648 words: 108,814

Solr 1.4 Enterprise Search Server by David Smiley, Eric Pugh

Amazon Web Services, bioinformatics, cloud computing, continuous integration, database schema, domain-specific language, en.wikipedia.org, fault tolerance, Firefox, information retrieval, Ruby on Rails, SQL injection, Wayback Machine, web application, Y Combinator

WebMynd is one of the largest installations of Solr, indexing up to two million HTML documents per day, and making heavy use of Solr's multicore features to enable a partially active index. Jerome Eteve holds a BSc in physics, maths and computing and an MSc in IT and bioinformatics from the University of Lille (France). After starting his career in the field of bioinformatics, where he worked as a biological data management and analysis consultant, he's now a senior web developer with interests ranging from database level issues to user experience online. He's passionate about open source technologies, search engines, and web application architecture.


The Deep Learning Revolution (The MIT Press) by Terrence J. Sejnowski

AI winter, Albert Einstein, algorithmic bias, algorithmic trading, AlphaGo, Amazon Web Services, Any sufficiently advanced technology is indistinguishable from magic, augmented reality, autonomous vehicles, backpropagation, Baxter: Rethink Robotics, behavioural economics, bioinformatics, cellular automata, Claude Shannon: information theory, cloud computing, complexity theory, computer vision, conceptual framework, constrained optimization, Conway's Game of Life, correlation does not imply causation, crowdsourcing, Danny Hillis, data science, deep learning, DeepMind, delayed gratification, Demis Hassabis, Dennis Ritchie, discovery of DNA, Donald Trump, Douglas Engelbart, driverless car, Drosophila, Elon Musk, en.wikipedia.org, epigenetics, Flynn Effect, Frank Gehry, future of work, Geoffrey Hinton, Google Glasses, Google X / Alphabet X, Guggenheim Bilbao, Gödel, Escher, Bach, haute couture, Henri Poincaré, I think there is a world market for maybe five computers, industrial robot, informal economy, Internet of things, Isaac Newton, Jim Simons, John Conway, John Markoff, John von Neumann, language acquisition, Large Hadron Collider, machine readable, Mark Zuckerberg, Minecraft, natural language processing, Neil Armstrong, Netflix Prize, Norbert Wiener, OpenAI, orbital mechanics / astrodynamics, PageRank, pattern recognition, pneumatic tube, prediction markets, randomized controlled trial, Recombinant DNA, recommendation engine, Renaissance Technologies, Rodney Brooks, self-driving car, Silicon Valley, Silicon Valley startup, Socratic dialogue, speech recognition, statistical model, Stephen Hawking, Stuart Kauffman, theory of mind, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, traveling salesman, Turing machine, Von Neumann architecture, Watson beat the top human players on Jeopardy!, world market for maybe five computers, X Prize, Yogi Berra

The training set was 3D structures determined by x-ray crystallography. To our surprise, the secondary structure predictions for new proteins were far better than the best methods based on biophysics.10 This landmark study was the first application of machine learning to molecular sequences, a field that is now called bioinformatics.

Another network that learned how to form the past tense of English verbs became a cause célèbre in the world of cognitive psychology as the rule-based old guard battled it out with the avant-garde PDP Group.11 The regular way to form the past tense of an English verb is to add the suffix “ed,” as in forming “trained” from “train.”
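The predictor described here read a sliding window of amino acids and output the secondary-structure class (helix, sheet, or coil) of the central residue. Below is a toy sketch of that setup with scikit-learn on synthetic data; the window size, encoding, and random labels are illustrative only, not the original architecture or its crystallography-derived training set.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(1)
    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
    WINDOW = 13   # residues seen per prediction; the central residue is labelled
    CLASSES = 3   # helix, sheet, coil

    def one_hot(window):
        """Encode a window of residues as a flat 0/1 vector, one block per position."""
        vec = np.zeros(WINDOW * len(AMINO_ACIDS))
        for i, aa in enumerate(window):
            vec[i * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
        return vec

    # Synthetic windows with arbitrary labels, purely to show the input/output
    # shapes; the real training data were residues labelled from x-ray structures.
    windows = ["".join(rng.choice(list(AMINO_ACIDS), WINDOW)) for _ in range(2000)]
    X = np.array([one_hot(w) for w in windows])
    y = rng.integers(0, CLASSES, len(windows))

    net = MLPClassifier(hidden_layer_sizes=(40,), max_iter=300).fit(X, y)
    print(net.predict([one_hot("ALAKELVHSTWMR")]))  # class of the central residue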

[Back-of-book index excerpt, flattened during extraction: alphabetical entries from Bell through Brain, including "Bioinformatics, 116".]


pages: 764 words: 261,694

The Elements of Statistical Learning (Springer Series in Statistics) by Trevor Hastie, Robert Tibshirani, Jerome Friedman

algorithmic bias, backpropagation, Bayesian statistics, bioinformatics, computer age, conceptual framework, correlation coefficient, data science, G4S, Geoffrey Hinton, greed is good, higher-order functions, linear programming, p-value, pattern recognition, random walk, selection bias, sparse data, speech recognition, statistical model, stochastic process, The Wisdom of Crowds

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics.

With the advent of computers and the information age, statistical problems have exploded both in size and complexity. Challenges in the areas of data storage, organization and searching have led to the new field of “data mining”; statistical and computational problems in biology and medicine have created “bioinformatics.” Vast amounts of data are being generated in many fields, and the statistician’s job is to make sense of it all: to extract important patterns and trends, and understand “what the data says.” We call this learning from data. The challenges in learning from data have led to a revolution in the statistical sciences.

Estimation of sparse Markov networks using modified logistic regression and the lasso, submitted. Hoerl, A. E. and Kennard, R. (1970). Ridge regression: biased estimation for nonorthogonal problems, Technometrics 12: 55–67. Hothorn, T. and Bühlmann, P. (2006). Model-based boosting in high dimensions, Bioinformatics 22(22): 2828–2829. Huber, P. (1964). Robust estimation of a location parameter, Annals of Mathematical Statistics 53: 73–101. Huber, P. (1985). Projection pursuit, Annals of Statistics 13: 435–475. Hunter, D. and Lange, K. (2004). A tutorial on MM algorithms, The American Statistician 58(1): 30–37.


pages: 137 words: 36,231

Information: A Very Short Introduction by Luciano Floridi

agricultural Revolution, Albert Einstein, bioinformatics, Bletchley Park, carbon footprint, Claude Shannon: information theory, Computing Machinery and Intelligence, conceptual framework, digital divide, disinformation, double helix, Douglas Engelbart, Douglas Engelbart, George Akerlof, Gordon Gekko, Gregor Mendel, industrial robot, information asymmetry, intangible asset, Internet of things, invention of writing, John Nash: game theory, John von Neumann, Laplace demon, machine translation, moral hazard, Nash equilibrium, Nelson Mandela, Norbert Wiener, Pareto efficiency, phenotype, Pierre-Simon Laplace, prisoner's dilemma, RAND corporation, RFID, Thomas Bayes, Turing machine, Vilfredo Pareto

Consider the following examples: medical information is information about medical facts (attributive use), not information that has curative properties; digital information is not information about something digital, but information that is in itself of digital nature (predicative use); and military information can be both information about something military (attributive) and of military nature in itself (predicative). When talking about biological or genetic information, the attributive sense is common and uncontroversial. In bioinformatics, for example, a database may contain medical records and genealogical or genetic data about a whole population. Nobody disagrees about the existence of this kind of biological or genetic information. It is the predicative sense that is more contentious. Are biological or genetic processes or elements intrinsically informational in themselves?


pages: 444 words: 117,770

The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma by Mustafa Suleyman

"World Economic Forum" Davos, 23andMe, 3D printing, active measures, Ada Lovelace, additive manufacturing, agricultural Revolution, AI winter, air gap, Airbnb, Alan Greenspan, algorithmic bias, Alignment Problem, AlphaGo, Alvin Toffler, Amazon Web Services, Anthropocene, artificial general intelligence, Asilomar, Asilomar Conference on Recombinant DNA, ASML, autonomous vehicles, backpropagation, barriers to entry, basic income, benefit corporation, Big Tech, biodiversity loss, bioinformatics, Bletchley Park, Blitzscaling, Boston Dynamics, business process, business process outsourcing, call centre, Capital in the Twenty-First Century by Thomas Piketty, ChatGPT, choice architecture, circular economy, classic study, clean tech, cloud computing, commoditize, computer vision, coronavirus, corporate governance, correlation does not imply causation, COVID-19, creative destruction, CRISPR, critical race theory, crowdsourcing, cryptocurrency, cuban missile crisis, data science, decarbonisation, deep learning, deepfake, DeepMind, deindustrialization, dematerialisation, Demis Hassabis, disinformation, drone strike, drop ship, dual-use technology, Easter island, Edward Snowden, effective altruism, energy transition, epigenetics, Erik Brynjolfsson, Ernest Rutherford, Extinction Rebellion, facts on the ground, failed state, Fairchild Semiconductor, fear of failure, flying shuttle, Ford Model T, future of work, general purpose technology, Geoffrey Hinton, global pandemic, GPT-3, GPT-4, hallucination problem, hive mind, hype cycle, Intergovernmental Panel on Climate Change (IPCC), Internet Archive, Internet of things, invention of the wheel, job automation, John Maynard Keynes: technological unemployment, John von Neumann, Joi Ito, Joseph Schumpeter, Kickstarter, lab leak, large language model, Law of Accelerating Returns, Lewis Mumford, license plate recognition, lockdown, machine readable, Marc Andreessen, meta-analysis, microcredit, move 37, Mustafa Suleyman, mutually assured destruction, new economy, Nick Bostrom, Nikolai Kondratiev, off grid, OpenAI, paperclip maximiser, personalized medicine, Peter Thiel, planetary scale, plutocrats, precautionary principle, profit motive, prompt engineering, QAnon, quantum entanglement, ransomware, Ray Kurzweil, Recombinant DNA, Richard Feynman, Robert Gordon, Ronald Reagan, Sam Altman, Sand Hill Road, satellite internet, Silicon Valley, smart cities, South China Sea, space junk, SpaceX Starlink, stealth mode startup, stem cell, Stephen Fry, Steven Levy, strong AI, synthetic biology, tacit knowledge, tail risk, techlash, techno-determinism, technoutopianism, Ted Kaczynski, the long tail, The Rise and Fall of American Growth, Thomas Malthus, TikTok, TSMC, Turing test, Tyler Cowen, Tyler Cowen: Great Stagnation, universal basic income, uranium enrichment, warehouse robotics, William MacAskill, working-age population, world market for maybe five computers, zero day

CAR T-cell therapies engineer bespoke immune response white blood cells to attack cancers; genetic editing looks set to cure hereditary heart conditions. Thanks to lifesaving treatments like vaccines, we are already accustomed to the idea of intervening in our biology to help us fight disease. The field of systems biology aims to understand the “larger picture” of a cell, tissue, or organism by using bioinformatics and computational biology to see how the organism works holistically; such efforts could be the foundation for a new era of personalized medicine. Before long the idea of being treated in a generic way will seem positively medieval; everything, from the kind of care we receive to the medicines we are offered, will be precisely tailored to our DNA and specific biomarkers.

More than a million researchers accessed the tool within eighteen months of launch, including virtually all the world’s leading biology labs, addressing questions from antibiotic resistance to the treatment of rare diseases to the origins of life itself. Previous experiments had delivered the structure of about 190,000 proteins to the European Bioinformatics Institute’s database, about 0.1 percent of known proteins in existence. DeepMind uploaded some 200 million structures in one go, representing almost all known proteins. Whereas once it might have taken researchers weeks or months to determine a protein’s shape and function, that process can now begin in a matter of seconds.


Scikit-Learn Cookbook by Trent Hauck

bioinformatics, book value, computer vision, data science, information retrieval, p-value

We'll walk through the various univariate feature selection methods:

>>> from sklearn import datasets
>>> X, y = datasets.make_regression(1000, 10000)

Now that we have the data, we will compare the features that are included with the various methods. This is actually a very common situation when you're dealing with text analysis or some areas of bioinformatics.

How to do it...

First, we need to import the feature_selection module:

>>> from sklearn import feature_selection
>>> f, p = feature_selection.f_regression(X, y)

Here, f is the F-score associated with each linear model fit using just one of the features. We can then compare these features and, based on this comparison, cull features. p is the p-value associated with that F-value.
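A natural follow-up the excerpt stops short of, sketched here with the same scikit-learn API: use the returned p-values, or a built-in selector, to actually cull the columns.

    import numpy as np
    from sklearn import datasets, feature_selection

    X, y = datasets.make_regression(1000, 10000)
    f, p = feature_selection.f_regression(X, y)

    # Keep columns whose univariate p-value clears an (arbitrary) threshold...
    X_small = X[:, p < 0.05]

    # ...or let SelectPercentile do the same bookkeeping for us.
    selector = feature_selection.SelectPercentile(feature_selection.f_regression,
                                                  percentile=5)
    X_top5 = selector.fit_transform(X, y)
    print(X_small.shape, X_top5.shape)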


pages: 481 words: 121,669

The Invisible Web: Uncovering Information Sources Search Engines Can't See by Gary Price, Chris Sherman, Danny Sullivan

AltaVista, American Society of Civil Engineers: Report Card, Bill Atkinson, bioinformatics, Brewster Kahle, business intelligence, dark matter, Donald Davies, Douglas Engelbart, Douglas Engelbart, full text search, HyperCard, hypertext link, information retrieval, Internet Archive, it's over 9,000, joint-stock company, knowledge worker, machine readable, machine translation, natural language processing, pre–internet, profit motive, Project Xanadu, publish or perish, search engine result page, side project, Silicon Valley, speech recognition, stealth mode startup, Ted Nelson, Vannevar Bush, web application

FishBase is a relational database with fish information to cater to different professionals such as research scientists, fisheries managers, zoologists, and many more. FishBase on the Web contains practically all fish species known to science.”
Search Form URL: http://www.fishbase.org/search.cfm

GeneCards
http://bioinformatics.weizmann.ac.il
“GeneCards is a database of human genes, their products, and their involvement in diseases. It offers concise information about the functions of all human genes that have an approved symbol, as well as selected others [gene listing].”
Search Form URL: http://bioinformatics.weizmann.ac.il/cards/

Integrated Taxonomic Information System (Biological Names)
http://www.itis.usda.gov/plantproj/itis/index.html
“The Integrated Taxonomic Information System (ITIS) is a partnership of U.S., Canadian, and Mexican agencies, other organizations, and taxonomic specialists cooperating on the development of an online, scientifically credible, list of biological names focusing on the biota of North America.”


pages: 199 words: 47,154

Gnuplot Cookbook by Lee Phillips

bioinformatics, computer vision, functional programming, general-purpose programming language, pattern recognition, statistical model, web application

I am grateful to the users of my gnuplot web pages for their interest, questions, and suggestions over the years, and to my family for their patience and support.

About the Reviewers

Andreas Bernauer is a Software Engineer at Active Group in Germany. He graduated from Eberhard Karls Universität Tübingen, Germany, with a degree in Bioinformatics and received a Master of Science degree in Genetics from the University of Connecticut, USA. In 2011, he earned a doctorate in Computer Engineering from Eberhard Karls Universität Tübingen. Andreas has more than 10 years of professional experience in software engineering. He implemented the server-side scripting engine in the Scheme-based SUnet web server and hosted the Learning-Classifier-System workshops in Tübingen.


pages: 303 words: 67,891

Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms: Proceedings of the Agi Workshop 2006 by Ben Goertzel, Pei Wang

AI winter, artificial general intelligence, backpropagation, bioinformatics, brain emulation, classic study, combinatorial explosion, complexity theory, computer vision, Computing Machinery and Intelligence, conceptual framework, correlation coefficient, epigenetics, friendly AI, functional programming, G4S, higher-order functions, information retrieval, Isaac Newton, Jeff Hawkins, John Conway, Loebner Prize, Menlo Park, natural language processing, Nick Bostrom, Occam's razor, p-value, pattern recognition, performance metric, precautionary principle, Ray Kurzweil, Rodney Brooks, semantic web, statistical model, strong AI, theory of mind, traveling salesman, Turing machine, Turing test, Von Neumann architecture, Y2K

On the contrary, we believe that most of the current AI research works make little direct contribution to AGI, though these works have value for many other reasons. Previously we have mentioned “machine learning” as an example. One of us (Goertzel) has published extensively about applications of machine learning algorithms to bioinformatics. This is a valid, and highly important sort of research – but it doesn’t have much to do with achieving general intelligence. There is no reason to believe that “intelligence” is simply a toolbox, containing mostly unconnected tools. Since the current AI “tools” have been built according to very different theoretical considerations, to implement them as modules in a big system will not necessarily make them work together, correctly and efficiently.

Unlike most contemporary AI projects, it is specifically oriented towards artificial general intelligence (AGI), rather than being restricted by design to one narrow domain or range of cognitive functions. The NAIE integrates aspects of prior AI projects and approaches, including symbolic, neural-network, evolutionary programming and reinforcement learning. The existing codebase is being applied in bioinformatics, NLP and other domains. To save space, some of the discussion in this paper will assume a basic familiarity with NAIE structures such as Atoms, Nodes, Links, ImplicationLinks and so forth, all of which are described in previous references and in other papers in this volume. 1.2. Cognitive Development in Simulated Androids Jean Piaget, in his classic studies of developmental psychology [8] conceived of child development as falling into four stages, each roughly identified with an age group: infantile, preoperational, concrete operational, and formal.


pages: 761 words: 231,902

The Singularity Is Near: When Humans Transcend Biology by Ray Kurzweil

additive manufacturing, AI winter, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, anthropic principle, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, backpropagation, Benoit Mandelbrot, Bill Joy: nanobots, bioinformatics, brain emulation, Brewster Kahle, Brownian motion, business cycle, business intelligence, c2.com, call centre, carbon-based life, cellular automata, Charles Babbage, Claude Shannon: information theory, complexity theory, conceptual framework, Conway's Game of Life, coronavirus, cosmological constant, cosmological principle, cuban missile crisis, data acquisition, Dava Sobel, David Brooks, Dean Kamen, digital divide, disintermediation, double helix, Douglas Hofstadter, en.wikipedia.org, epigenetics, factory automation, friendly AI, functional programming, George Gilder, Gödel, Escher, Bach, Hans Moravec, hype cycle, informal economy, information retrieval, information security, invention of the telephone, invention of the telescope, invention of writing, iterative process, Jaron Lanier, Jeff Bezos, job automation, job satisfaction, John von Neumann, Kevin Kelly, Law of Accelerating Returns, life extension, lifelogging, linked data, Loebner Prize, Louis Pasteur, mandelbrot fractal, Marshall McLuhan, Mikhail Gorbachev, Mitch Kapor, mouse model, Murray Gell-Mann, mutually assured destruction, natural language processing, Network effects, new economy, Nick Bostrom, Norbert Wiener, oil shale / tar sands, optical character recognition, PalmPilot, pattern recognition, phenotype, power law, precautionary principle, premature optimization, punch-card reader, quantum cryptography, quantum entanglement, radical life extension, randomized controlled trial, Ray Kurzweil, remote working, reversible computing, Richard Feynman, Robert Metcalfe, Rodney Brooks, scientific worldview, Search for Extraterrestrial Intelligence, selection bias, semantic web, seminal paper, Silicon Valley, Singularitarianism, speech recognition, statistical model, stem cell, Stephen Hawking, Stewart Brand, strong AI, Stuart Kauffman, superintelligent machines, technological singularity, Ted Kaczynski, telepresence, The Coming Technological Singularity, Thomas Bayes, transaction costs, Turing machine, Turing test, two and twenty, Vernor Vinge, Y2K, Yogi Berra

Kurzweil Technologies is working with UT to develop pattern recognition-based analysis from either "Holter" monitoring (twenty-four-hour recordings) or "Event" monitoring (thirty days or more).
190. Kristen Philipkoski, "A Map That Maps Gene Functions," Wired News, May 28, 2002, http://www.wired.com/news/medtech/0,1286,52723,00.html.
191. Jennifer Ouellette, "Bioinformatics Moves into the Mainstream," The Industrial Physicist (October–November 2003), http://www.sciencemasters.com/bioinformatics.pdf.
192. Port, Arndt, and Carey, "Smart Tools."
193. "Protein Patterns in Blood May Predict Prostate Cancer Diagnosis," National Cancer Institute, October 15, 2002, http://www.nci.nih.gov/newscenter/ProstateProteomics, reporting on Emanuel F.

DARPA's Information Processing Technology Office's project in this vein is called LifeLog, http://www.darpa.mil/ipto/Programs/lifelog; see also Noah Shachtman, "A Spy Machine of DARPA's Dreams," Wired News, May 20, 2003, http://www.wired.com/news/business/0,1367,58909,00.html; Gordon Bell's project (for Microsoft) is MyLifeBits, http://research.microsoft.com/research/barc/MediaPresence/MyLifeBits.aspx; for the Long Now Foundation, see http://longnow.org.
44. Bergeron is assistant professor of anesthesiology at Harvard Medical School and the author of such books as Bioinformatics Computing, Biotech Industry: A Global, Economic, and Financing Overview, and The Wireless Web and Healthcare.
45. The Long Now Foundation is developing one possible solution: the Rosetta Disk, which will contain extensive archives of text in languages that may be lost in the far future. They plan to use a unique storage technology based on a two-inch nickel disk that can store up to 350,000 pages per disk, with an estimated life expectancy of 2,000 to 10,000 years.


pages: 271 words: 52,814

Blockchain: Blueprint for a New Economy by Melanie Swan

23andMe, Airbnb, altcoin, Amazon Web Services, asset allocation, banking crisis, basic income, bioinformatics, bitcoin, blockchain, capital controls, cellular automata, central bank independence, clean water, cloud computing, collaborative editing, Conway's Game of Life, crowdsourcing, cryptocurrency, data science, digital divide, disintermediation, Dogecoin, Edward Snowden, en.wikipedia.org, Ethereum, ethereum blockchain, fault tolerance, fiat currency, financial innovation, Firefox, friendly AI, Hernando de Soto, information security, intangible asset, Internet Archive, Internet of things, Khan Academy, Kickstarter, Large Hadron Collider, lifelogging, litecoin, Lyft, M-Pesa, microbiome, Neal Stephenson, Network effects, new economy, operational security, peer-to-peer, peer-to-peer lending, peer-to-peer model, personalized medicine, post scarcity, power law, prediction markets, QR code, ride hailing / ride sharing, Satoshi Nakamoto, Search for Extraterrestrial Intelligence, SETI@home, sharing economy, Skype, smart cities, smart contracts, smart grid, Snow Crash, software as a service, synthetic biology, technological singularity, the long tail, Turing complete, uber lyft, unbanked and underbanked, underbanked, Vitalik Buterin, Wayback Machine, web application, WikiLeaks

“Primecoin: The Cryptocurrency Whose Mining Is Actually Useful.” Bitcoin Magazine, July 8, 2013. http://bitcoinmagazine.com/5635/primecoin-the-cryptocurrency-whose-mining-is-actually-useful/.
127 Myers, D.S., A.L. Bazinet, and M.P. Cummings. “Expanding the Reach of Grid Computing: Combining Globus- and BOINC-Based Systems.” Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, February 6, 2007 (Draft). http://lattice.umiacs.umd.edu/latticefiles/publications/lattice/myers_bazinet_cummings.pdf.
128 Clenfield, J. and P. Alpeyev. “The Other Bitcoin Power Struggle.” Bloomberg Businessweek, April 24, 2014. http://www.businessweek.com/articles/2014-04-24/bitcoin-miners-seek-cheap-electricity-to-eke-out-a-profit.
129 Gimein, M.


pages: 573 words: 157,767

From Bacteria to Bach and Back: The Evolution of Minds by Daniel C. Dennett

Ada Lovelace, adjacent possible, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, AlphaGo, Andrew Wiles, Bayesian statistics, bioinformatics, bitcoin, Bletchley Park, Build a better mousetrap, Claude Shannon: information theory, computer age, computer vision, Computing Machinery and Intelligence, CRISPR, deep learning, disinformation, double entry bookkeeping, double helix, Douglas Hofstadter, Elon Musk, epigenetics, experimental subject, Fermat's Last Theorem, Gödel, Escher, Bach, Higgs boson, information asymmetry, information retrieval, invention of writing, Isaac Newton, iterative process, John von Neumann, language acquisition, megaproject, Menlo Park, Murray Gell-Mann, Necker cube, Norbert Wiener, pattern recognition, phenotype, Richard Feynman, Rodney Brooks, self-driving car, social intelligence, sorting algorithm, speech recognition, Stephen Hawking, Steven Pinker, strong AI, Stuart Kauffman, TED Talk, The Wealth of Nations by Adam Smith, theory of mind, Thomas Bayes, trickle-down economics, Turing machine, Turing test, Watson beat the top human players on Jeopardy!, Y2K

Empirical work in both areas has made enough progress in recent decades to encourage further inquiry, taking on board the default (and tentative) assumption that the “trees” of existing lineages we can trace back eventually have single trunks. Phylogenetic diagrams, or cladograms, such as the Great Tree of Life (which appears as figure 9.1) showing all the species, or more limited trees of descent in particular lineages, are getting clearer and clearer as bio-informatics research on the accumulation of differences in DNA sequences plug the gaps and correct the mistakes of earlier anatomical and physiological sleuthing.45 Glossogenetic trees, lineages of languages (figure 9.2), are also popular thinking tools, laying out the relations of descent among language families (and individual words) over many centuries.

Texts of Homer’s Iliad and Odyssey, for instance, were known to descend by copying from texts descended from texts descended from texts going back to their oral ancestors in Homeric times. Philologists and paleographers had been reconstructing lineages of languages and manuscripts (e.g., the various extant copies of Plato’s Dialogues) since the Renaissance, and some of the latest bio-informatic techniques used today to determine relationships between genomes are themselves refined descendants of techniques developed to trace patterns of errors (mutations) in ancient texts. As Darwin noted, “The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously the same” (1871, p. 59).


pages: 201 words: 63,192

Graph Databases by Ian Robinson, Jim Webber, Emil Eifrem

Amazon Web Services, anti-pattern, bioinformatics, business logic, commoditize, corporate governance, create, read, update, delete, data acquisition, en.wikipedia.org, fault tolerance, linked data, loose coupling, Network effects, recommendation engine, semantic web, sentiment analysis, social graph, software as a service, SPARQL, the strength of weak ties, web application

The social network example helps illustrate how different technologies deal with connected data, but is it a valid use case? Do we really need to find such remote “friends?” But substitute social networks for any other domain, and you’ll see we experience similar performance, modeling and maintenance benefits. Whether music or data center management, bio-informatics or football statistics, network sensors or time-series of trades, graphs provide powerful insight into our data. Let’s look, then, at another contemporary application of graphs: recommending products based on a user’s purchase history and the histories of their friends, neighbours, and other people like them.


The Rise of Yeast: How the Sugar Fungus Shaped Civilisation by Nicholas P. Money

agricultural Revolution, bioinformatics, CRISPR, double helix, flex fuel, Google Earth, Gregor Mendel, Louis Pasteur, microbiome

Carpology is the branch of botany concerned with the study of fruits and seeds. According to modern terminology, which specifies that fruits and seeds are produced by plants, we say that the Tulasne brothers described the fruit bodies and spores of fungi.
27. N. P. Money, Fungal Biology 117, 463–5 (2013).
28. H. Nilsson et al., Evolutionary Bioinformatics Online 4, 193–201 (2008); R. Blaalid et al., Molecular Ecology Resources 13, 218–24 (2013).
29. Kurtzman, Fell, and Boekhout (n. 13).
30. W. T. Starmer and M.-A. Lachance, Yeast Ecology, in C. P. Kurtzman, J. W. Fell, and T. Boekhout, The Yeasts: A Taxonomic Study, 5th edition (Amsterdam: Springer, 2011), 88–107.
31.


pages: 552 words: 168,518

MacroWikinomics: Rebooting Business and the World by Don Tapscott, Anthony D. Williams

"World Economic Forum" Davos, accounting loophole / creative accounting, airport security, Andrew Keen, augmented reality, Ayatollah Khomeini, barriers to entry, Ben Horowitz, bioinformatics, blood diamond, Bretton Woods, business climate, business process, buy and hold, car-free, carbon footprint, carbon tax, Charles Lindbergh, citizen journalism, Clayton Christensen, clean water, Climategate, Climatic Research Unit, cloud computing, collaborative editing, collapse of Lehman Brothers, collateralized debt obligation, colonial rule, commoditize, corporate governance, corporate social responsibility, creative destruction, crowdsourcing, death of newspapers, demographic transition, digital capitalism, digital divide, disruptive innovation, distributed generation, do well by doing good, don't be evil, en.wikipedia.org, energy security, energy transition, Evgeny Morozov, Exxon Valdez, failed state, fault tolerance, financial innovation, Galaxy Zoo, game design, global village, Google Earth, Hans Rosling, hive mind, Home mortgage interest deduction, information asymmetry, interchangeable parts, Internet of things, invention of movable type, Isaac Newton, James Watt: steam engine, Jaron Lanier, jimmy wales, Joseph Schumpeter, Julian Assange, Kevin Kelly, Kickstarter, knowledge economy, knowledge worker, machine readable, Marc Andreessen, Marshall McLuhan, mass immigration, medical bankruptcy, megacity, military-industrial complex, mortgage tax deduction, Netflix Prize, new economy, Nicholas Carr, ocean acidification, off-the-grid, oil shock, old-boy network, online collectivism, open borders, open economy, pattern recognition, peer-to-peer lending, personalized medicine, radical decentralization, Ray Kurzweil, RFID, ride hailing / ride sharing, Ronald Reagan, Rubik’s Cube, scientific mainstream, shareholder value, Silicon Valley, Skype, smart grid, smart meter, social graph, social web, software patent, Steve Jobs, synthetic biology, systems thinking, text mining, the long tail, the scientific method, The Wisdom of Crowds, transaction costs, transfer pricing, University of East Anglia, urban sprawl, value at risk, WikiLeaks, X Prize, Yochai Benkler, young professional, Zipcar

“We’re changing by orders of magnitude the sampling ability we have for the oceans,” says Benoît Pirenne, associate director of Neptune Canada.8 To cope with the flood of data, researchers using Neptune’s Oceans 2.0 platform can tag everything from images to data feeds to video streams from undersea cameras, identifying sightings of little-known organisms or examples of rare phenomena. Wikis provide a shared space for group learning, discussion, and collaboration, while a Facebook-like social networking application helps connect researchers working on similar problems. Meanwhile, over at the European Bioinformatics Institute, scientists are using Web services to revolutionize the way they extract and interpret data from different sources, and to create entirely new data services. Imagine, for example, you wanted to find out everything there is to know about a species, from its taxonomy and genetic sequence to its geographical distribution.

Now imagine you had the power to weave together all the latest data on that species from all of the world’s biological databases with just one click. It’s not far-fetched. That power is here, today. Projects like these have inspired researchers in many fields to emulate the changes that are already sweeping disciplines such as bioinformatics and high-energy physics. Having said that, there will be some difficult adjustments and issues such as privacy and national security to confront along the way. “We’re going from a data poor to a data rich world,” says Smarr. “And there’s a lag whenever an exponential change like this transforms the impossible into the routine.”
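The kind of one-click species lookup described here is typically done against public web services. Below is a minimal sketch, assuming Python with the requests library and GBIF's public REST API; the /v1/species/match endpoint and the example species name are illustrative choices, not details taken from the excerpt.

```python
# Illustrative sketch: pulling one species record from a public biodiversity
# web service (GBIF's REST API). Endpoint and species name are assumptions.
import requests

def match_species(name):
    """Ask GBIF to resolve a scientific name to a backbone-taxonomy record."""
    resp = requests.get(
        "https://api.gbif.org/v1/species/match",
        params={"name": name},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    record = match_species("Puma concolor")
    # Print a few fields such services typically return.
    for key in ("scientificName", "kingdom", "family", "usageKey"):
        print(key, "->", record.get(key))
```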


pages: 245 words: 64,288

Robots Will Steal Your Job, But That's OK: How to Survive the Economic Collapse and Be Happy by Pistono, Federico

3D printing, Albert Einstein, autonomous vehicles, bioinformatics, Buckminster Fuller, cloud computing, computer vision, correlation does not imply causation, en.wikipedia.org, epigenetics, Erik Brynjolfsson, Firefox, future of work, gamification, George Santayana, global village, Google Chrome, happiness index / gross national happiness, hedonic treadmill, illegal immigration, income inequality, information retrieval, Internet of things, invention of the printing press, Jeff Hawkins, jimmy wales, job automation, John Markoff, Kevin Kelly, Khan Academy, Kickstarter, Kiva Systems, knowledge worker, labor-force participation, Lao Tzu, Law of Accelerating Returns, life extension, Loebner Prize, longitudinal study, means of production, Narrative Science, natural language processing, new economy, Occupy movement, patent troll, pattern recognition, peak oil, post scarcity, QR code, quantum entanglement, race to the bottom, Ray Kurzweil, recommendation engine, RFID, Rodney Brooks, selection bias, self-driving car, seminal paper, slashdot, smart cities, software as a service, software is eating the world, speech recognition, Steven Pinker, strong AI, synthetic biology, technological singularity, TED Talk, Turing test, Vernor Vinge, warehouse automation, warehouse robotics, women in the workforce

The more companies automate, because of the need to increase their productivity, the more jobs will be lost, forever. The future of work and innovation is not in the past that we know, but in the unfamiliar territory of a future that is yet to come. New and exciting fields are emerging every day. Synthetic biology, neurocomputation, 3D printing, contour crafting, molecular engineering, bioinformatics, life extension, robotics, quantum computing, artificial intelligence, machine learning: these rapidly evolving frontiers are just the beginning of a new, amazing era of our species, one that will bring about the greatest transformation of all time. A transformation that will make the industrial revolution look like an event of minor importance.


Big Data at Work: Dispelling the Myths, Uncovering the Opportunities by Thomas H. Davenport

Automated Insights, autonomous vehicles, bioinformatics, business intelligence, business process, call centre, chief data officer, cloud computing, commoditize, data acquisition, data science, disruptive innovation, Edward Snowden, Erik Brynjolfsson, intermodal, Internet of things, Jeff Bezos, knowledge worker, lifelogging, Mark Zuckerberg, move fast and break things, Narrative Science, natural language processing, Netflix Prize, New Journalism, recommendation engine, RFID, self-driving car, sentiment analysis, Silicon Valley, smart grid, smart meter, social graph, sorting algorithm, statistical model, Tesla Model S, text mining, Thomas Davenport, three-martini lunch

The testing firm Kaplan uses its big data to begin advising customers on effective learning and test-preparation strategies. Novartis focuses on big data—the health-care industry calls it informatics—to develop new drugs. Its CEO, Joe Jimenez, commented in an interview, “If you think about the amounts of data that are now available, bioinformatics capability is becoming very important, as is the ability to mine that data and really understand, for example, the specific mutations that are leading to certain types of cancers.”7 These companies’ big data efforts are directly focused on products, services, and customers. This has important implications, of course, for the organizational locus of big data and the processes and pace of new product development.


pages: 239 words: 70,206

Data-Ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else by Steve Lohr

"World Economic Forum" Davos, 23andMe, Abraham Maslow, Affordable Care Act / Obamacare, Albert Einstein, Alvin Toffler, Bear Stearns, behavioural economics, big data - Walmart - Pop Tarts, bioinformatics, business cycle, business intelligence, call centre, Carl Icahn, classic study, cloud computing, computer age, conceptual framework, Credit Default Swap, crowdsourcing, Daniel Kahneman / Amos Tversky, Danny Hillis, data is the new oil, data science, David Brooks, driverless car, East Village, Edward Snowden, Emanuel Derman, Erik Brynjolfsson, everywhere but in the productivity statistics, financial engineering, Frederick Winslow Taylor, Future Shock, Google Glasses, Ida Tarbell, impulse control, income inequality, indoor plumbing, industrial robot, informal economy, Internet of things, invention of writing, Johannes Kepler, John Markoff, John von Neumann, lifelogging, machine translation, Mark Zuckerberg, market bubble, meta-analysis, money market fund, natural language processing, obamacare, pattern recognition, payday loans, personalized medicine, planned obsolescence, precision agriculture, pre–internet, Productivity paradox, RAND corporation, rising living standards, Robert Gordon, Robert Solow, Salesforce, scientific management, Second Machine Age, self-driving car, Silicon Valley, Silicon Valley startup, SimCity, six sigma, skunkworks, speech recognition, statistical model, Steve Jobs, Steven Levy, The Design of Experiments, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Tony Fadell, unbanked and underbanked, underbanked, Von Neumann architecture, Watson beat the top human players on Jeopardy!, yottabyte

He had agreed to give a talk in Seattle at a conference hosted by Sage Bionetworks, a nonprofit organization dedicated to accelerating the sharing of data for biological research. Hammerbacher knew the two medical researchers who had founded the nonprofit, Stephen Friend and Eric Schadt. He had talked to them about how they might use big-data software to cope with the data explosion in bioinformatics and genomics. But the preparation for the speech forced him to really think about biology and technology, reading up and talking to people. The more Hammerbacher looked into it, the more intriguing the subject looked. Biological research, he says, could go the way of finance with its closed, proprietary systems and data being hoarded rather than shared.


pages: 245 words: 71,886

Spike: The Virus vs The People - The Inside Story by Jeremy Farrar, Anjana Ahuja

"World Economic Forum" Davos, bioinformatics, Black Monday: stock market crash in 1987, Boris Johnson, Brexit referendum, contact tracing, coronavirus, COVID-19, crowdsourcing, dark matter, data science, DeepMind, Demis Hassabis, disinformation, Dominic Cummings, Donald Trump, double helix, dual-use technology, Future Shock, game design, global pandemic, Kickstarter, lab leak, lockdown, machine translation, nudge unit, open economy, pattern recognition, precautionary principle, side project, social distancing, the scientific method, Tim Cook: Apple, zoonotic diseases

The wheels of COG-UK started turning on 4 March 2020, when Sharon sent a one-line email to five close contacts: Cambridge colleague Julian Parkhill, a former head of Pathogen Genomics at the Wellcome Trust Sanger Institute (dedicated to genome sequencing); Judith Breuer, a virologist at University College London; Nick Loman, an expert in microbial genomics and bioinformatics at Birmingham University; David Aanensen, a genomic surveillance specialist at the Big Data Institute at Oxford University; and Richard Myers, at Public Health England.

I wonder if you could call me on my mobile this afternoon, 2pm onwards. Sharon

By then, Sharon had joined SAGE and was being bombarded, as we all were, with hundreds of emails a day.


HBase: The Definitive Guide by Lars George

Alignment Problem, Amazon Web Services, bioinformatics, create, read, update, delete, Debian, distributed revision control, domain-specific language, en.wikipedia.org, fail fast, fault tolerance, Firefox, FOSDEM, functional programming, Google Earth, information security, Kickstarter, place-making, revision control, smart grid, sparse data, web application

If we were to take 140 bytes per message, as used by Twitter, it would total more than 17 TB every month. Even before the transition to HBase, the existing system had to handle more than 25 TB a month.[12] In addition, less web-oriented companies from across all major industries are collecting an ever-increasing amount of data. For example:

Financial: such as data generated by stock tickers
Bioinformatics: such as the Global Biodiversity Information Facility (http://www.gbif.org/)
Smart grid: such as the OpenPDC (http://openpdc.codeplex.com/) project
Sales: such as the data generated by point-of-sale (POS) or stock/inventory systems
Genomics: such as the Crossbow (http://bowtie-bio.sourceforge.net/crossbow/index.shtml) project
Cellular services, military, environmental: which all collect a tremendous amount of data as well

Storing petabytes of data efficiently so that updates and retrieval are still performed well is no easy feat.
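As a rough check on the volume figure quoted above, here is a back-of-envelope sketch. The 140 bytes per message comes from the excerpt; the monthly message count is an assumption (about 135 billion messages a month, chosen because it reproduces the quoted total), so the result is indicative only.

```python
# Back-of-envelope check of the "more than 17 TB every month" figure.
# ASSUMPTION: roughly 135 billion messages per month (not stated in the
# excerpt above); 140 bytes per message as in the text.
MESSAGES_PER_MONTH = 135_000_000_000
BYTES_PER_MESSAGE = 140

total_bytes = MESSAGES_PER_MONTH * BYTES_PER_MESSAGE
tib = total_bytes / 2**40          # tebibytes (binary terabytes)
tb = total_bytes / 10**12          # decimal terabytes

print(f"{tib:.1f} TiB per month")  # ~17.2 TiB
print(f"{tb:.1f} TB per month")    # ~18.9 TB
```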

A abort() method, HBaseAdmin class, Basic Operations Abortable interface, Basic Operations Accept header, switching REST formats, Supported formats, JSON (application/json), Protocol Buffer (application/x-protobuf) access control, Introduction to Coprocessors, HBase Versus Bigtable Bigtable column families for, HBase Versus Bigtable coprocessors for, Introduction to Coprocessors ACID properties, The Problem with Relational Database Systems add() method, Bytes class, The Bytes Class add() method, Put class, Single Puts addColumn() method, Get class, Single Gets addColumn() method, HBaseAdmin class, Schema Operations addColumn() method, Increment class, Multiple Counters addColumn() method, Scan class, Introduction addFamily() method, Get class, Single Gets addFamily() method, HTableDescriptor class, Table Properties addFamily() method, Scan class, Introduction, Client API: Best Practices add_peer command, HBase Shell, Replication alter command, HBase Shell, Data definition Amazon, The Dawn of Big Data, S3, S3 data requirements of, The Dawn of Big Data S3 (Simple Storage Service), S3, S3 Apache Avro, Introduction to REST, Thrift, and Avro (see Avro) Apache binary release for HBase, Apache Binary Release, Apache Binary Release Apache HBase, Quick-Start Guide (see HBase) Apache Hive, Hive (see Hive) Apache Lucene, Search Integration, Search Integration Apache Maven, Building the Examples (see Maven) Apache Pig, Pig (see Pig) Apache Solr, Search Integration Apache Whirr, deployment using, Apache Whirr, Apache Whirr Apache ZooKeeper, Implementation (see ZooKeeper) API, Native Java (see client API) append feature, for durability, Durability append() method, HLog class, HLog Class architecture, storage, Storage (see storage architecture) assign command, HBase Shell, Tools assign() method, HBaseAdmin class, Cluster Operations AssignmentManager class, The Region Life Cycle AsyncHBase client, Other Clients atomic read-modify-write, Dimensions, Tables, Rows, Columns, and Cells, Storage API, General Notes, Atomic compare-and-set, Atomic compare-and-set, Atomic compare-and-delete, Atomic compare-and-delete, Row Locks, WALEdit Class compare-and-delete operations, Atomic compare-and-delete, Atomic compare-and-delete compare-and-set, for put operations, Atomic compare-and-set, Atomic compare-and-set per-row basis for, Tables, Rows, Columns, and Cells, Storage API, General Notes row locks for, Row Locks for WAL edits, WALEdit Class auto-sharding, Auto-Sharding, Auto-Sharding Avro, Introduction to REST, Thrift, and Avro, Introduction to REST, Thrift, and Avro, Avro, Avro, Operation, Installation, Operation, Operation, Operation, Operation, Advanced Schemas documentation for, Operation installing, Installation port used by, Operation schema compilers for, Avro schema used by, Advanced Schemas starting server for, Operation stopping, Operation B B+ trees, B+ Trees, B+ Trees backup masters, adding, Adding a local backup master, Adding a backup master, Adding a backup master balancer, Load Balancing, Load Balancing, Node Decommissioning balancer command, HBase Shell, Tools, Load Balancing balancer() method, HBaseAdmin class, Cluster Operations, Load Balancing balanceSwitch() method, HBaseAdmin class, Cluster Operations, Load Balancing balance_switch command, HBase Shell, Tools, Load Balancing, Node Decommissioning base64 command, XML (text/xml) Base64 encoding, with REST, XML (text/xml), JSON (application/json) BaseEndpointCoprocessor class, The BaseEndpointCoprocessor class, The BaseEndpointCoprocessor class 
BaseMasterObserver class, The BaseMasterObserver class, The BaseMasterObserver class BaseRegionObserver class, The BaseRegionObserver class, The BaseRegionObserver class Batch class, The CoprocessorProtocol interface, The BaseEndpointCoprocessor class batch clients, Batch Clients batch operations, Batch Operations, Batch Operations, Caching Versus Batching, Caching Versus Batching, Custom Filters for scans, Caching Versus Batching, Caching Versus Batching, Custom Filters on tables, Batch Operations, Batch Operations batch() method, HTable class, Batch Operations, Batch Operations, Introduction to Counters Bigtable storage architecture, Backdrop, Summary, Nomenclature, HBase Versus Bigtable, HBase Versus Bigtable “Bigtable: A Distributed Storage System for Structured Data” (paper, by Google), Preface, Backdrop bin directory, Apache Binary Release BinaryComparator class, Comparators BinaryPrefixComparator class, Comparators binarySearch() method, Bytes class, The Bytes Class bioinformatics, data requirements of, The Dawn of Big Data BitComparator class, Comparators block cache, Single Gets, Introduction, Column Families, Column Families, Bloom Filters, Region Server Metrics, Client API: Best Practices, Configuration Bloom filters affecting, Bloom Filters controlling use of, Single Gets, Introduction, Client API: Best Practices enabling and disabling, Column Families metrics for, Region Server Metrics settings for, Configuration block replication, MapReduce Locality, MapReduce Locality blocks, Column Families, HFile Format, HFile Format, HFile Format, HFile Format compressing, HFile Format size of, Column Families, HFile Format Bloom filters, Column Families, Bloom Filters, Bloom Filters bypass() method, ObserverContext class, The ObserverContext class Bytes class, Single Puts, Single Gets, The Bytes Class, The Bytes Class C caching, Caching Versus Batching, Caching Versus Batching, Caching Versus Batching, The HTable Utility Methods, Client API: Best Practices, HBase Configuration Properties (see also block cache; Memcached) regions, The HTable Utility Methods for scan operations, Caching Versus Batching, Caching Versus Batching, Client API: Best Practices, HBase Configuration Properties Cacti server, JMXToolkit on, JMX Remote API call() method, Batch class, The CoprocessorProtocol interface CAP (consistency, availability, and partition tolerance) theorem, Nonrelational Database Systems, Not-Only SQL or NoSQL?


pages: 284 words: 79,265

The Half-Life of Facts: Why Everything We Know Has an Expiration Date by Samuel Arbesman

Albert Einstein, Alfred Russel Wallace, Amazon Mechanical Turk, Andrew Wiles, Apollo 11, bioinformatics, British Empire, Cesare Marchetti: Marchetti’s constant, Charles Babbage, Chelsea Manning, Clayton Christensen, cognitive bias, cognitive dissonance, conceptual framework, data science, David Brooks, demographic transition, double entry bookkeeping, double helix, Galaxy Zoo, Gregor Mendel, guest worker program, Gödel, Escher, Bach, Ignaz Semmelweis: hand washing, index fund, invention of movable type, Isaac Newton, John Harrison: Longitude, Kevin Kelly, language acquisition, Large Hadron Collider, life extension, Marc Andreessen, meta-analysis, Milgram experiment, National Debt Clock, Nicholas Carr, P = NP, p-value, Paul Erdős, Pluto: dwarf planet, power law, publication bias, randomized controlled trial, Richard Feynman, Rodney Brooks, scientific worldview, SimCity, social contagion, social graph, social web, systematic bias, text mining, the long tail, the scientific method, the strength of weak ties, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, Tyler Cowen, Tyler Cowen: Great Stagnation

Nature Reviews Drug Discovery 5, no. 8 (August 2006): 689–702. 112 software designed to find undiscovered patterns: See TRIZ, a method of invention and discovery. For example, here: www.aitriz.org. 112 computerized systems devoted to drug repurposing: Sanseau, Philippe, and Jacob Koehler. “Editorial: Computational Methods for Drug Repurposing.” Briefings in Bioinformatics 12, no. 4 (July 1, 2011): 301–2. 112 can generate new and interesting: Darden, Lindley. “Recent Work in Computational Scientific Discovery.” In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society (1997) 161–66. 113 names a novel, computationally created: See TheoryMine: http://theorymine.co.uk. 116 A Cornell professor of earth and atmospheric sciences: Cisne, John L.


The Pattern Seekers: How Autism Drives Human Invention by Simon Baron-Cohen

23andMe, agricultural Revolution, airport security, Albert Einstein, Apollo 11, Asperger Syndrome, assortative mating, autism spectrum disorder, bioinformatics, coronavirus, corporate social responsibility, correlation does not imply causation, COVID-19, David Attenborough, discovery of penicillin, Elon Musk, en.wikipedia.org, Fellow of the Royal Society, Greta Thunberg, intentional community, invention of agriculture, Isaac Newton, James Watt: steam engine, Jim Simons, lateral thinking, longitudinal study, Menlo Park, meta-analysis, neurotypical, out of africa, pattern recognition, phenotype, Rubik’s Cube, Silicon Valley, six sigma, Skype, social intelligence, Stephen Hawking, Steven Levy, Steven Pinker, systems thinking, theory of mind, twin studies, zero-sum game

This paper is included in a thematic collection of articles, Philosophical Transactions of the Royal Society of London: Series B, ed. U. Frith and C. Hayes, 367(1599, 2012), 1471–2970. Hayes challenges the idea of abrupt cognitive change between humans and our ancestors, in favor of incremental changes. 22. See S. López et al. (2015), “Human dispersal out of Africa: A lasting debate,” Evolutionary Bioinformatics 11, 57–68; and N. Conard (2008), “A critical view of the evidence for a Southern African origin of behavioural modernity,” South African Archaeological Society Goodwin Series 10, 175–178. 23. Reports at the time dated the bone flute to be at least 35,000 years old, and Nicholas Conard wrote in an email to New York Times reporter John Noble Wilford that it was more like 40,000 years old.


pages: 232 words: 72,483

Immortality, Inc. by Chip Walter

23andMe, Airbnb, Albert Einstein, Arthur D. Levinson, bioinformatics, Buckminster Fuller, cloud computing, CRISPR, data science, disintermediation, double helix, Elon Musk, Isaac Newton, Jeff Bezos, Larry Ellison, Law of Accelerating Returns, life extension, Menlo Park, microbiome, mouse model, pattern recognition, Peter Thiel, phenotype, radical life extension, Ray Kurzweil, Recombinant DNA, Rodney Brooks, self-driving car, Silicon Valley, Silicon Valley startup, Snapchat, South China Sea, SpaceShipOne, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, TED Talk, Thomas Bayes, zero day

he would ask. “No,” she would reply. “Why not?” “Because it was wicked hard to study and nobody is going to tackle it. They wouldn’t know where to begin.” Well, that was just too delicious a problem. So de Grey forsook his former job and took up gerontology while handling software development and bioinformatics at the Cambridge genetics lab where Adelaide and her students worked. Over the next several years he would harangue Adelaide for information, pore over textbooks and journals, pester biologists with every kind of question, and show up at conferences to interrogate anyone he could find. Despite becoming a gerontologist, de Grey didn’t care much for others in the field.


pages: 741 words: 199,502

Human Diversity: The Biology of Gender, Race, and Class by Charles Murray

23andMe, affirmative action, Albert Einstein, Alfred Russel Wallace, Asperger Syndrome, assortative mating, autism spectrum disorder, basic income, behavioural economics, bioinformatics, Cass Sunstein, correlation coefficient, CRISPR, Daniel Kahneman / Amos Tversky, dark triade / dark tetrad, domesticated silver fox, double helix, Drosophila, emotional labour, epigenetics, equal pay for equal work, European colonialism, feminist movement, glass ceiling, Gregor Mendel, Gunnar Myrdal, income inequality, Kenneth Arrow, labor-force participation, longitudinal study, meritocracy, meta-analysis, nudge theory, out of africa, p-value, phenotype, public intellectual, publication bias, quantitative hedge fund, randomized controlled trial, Recombinant DNA, replication crisis, Richard Thaler, risk tolerance, school vouchers, Scientific racism, selective serotonin reuptake inhibitor (SSRI), Silicon Valley, Skinner box, social intelligence, Social Justice Warrior, statistical model, Steven Pinker, The Bell Curve by Richard Herrnstein and Charles Murray, the scientific method, The Wealth of Nations by Adam Smith, theory of mind, Thomas Kuhn: the structure of scientific revolutions, twin studies, universal basic income, working-age population

For the visually similar figure in chapter 7, the unit of analysis was the individual and the cell entries were measures of genetic distance: Wright's fixation index, FST.

9: The Landscape of Ancestral Population Differences

1. Responsibility for the GWAS Catalog was subsequently shared with the European Bioinformatics Institute (EBI). The GWAS Catalog is downloadable free of charge at its website, ebi.ac.uk/gwas. The level of statistical significance required for entry in the GWAS Catalog is p < 1.0×10⁻⁵, which is more inclusive than the standard for statistical significance in the published literature (p < 1.0×10⁻⁸).

LoParo, Devon, and Irwin Waldman. 2014. “Twins’ Rearing Environment Similarity and Childhood Externalizing Disorders: A Test of the Equal Environments Assumption.” Behavior Genetics 44 (6): 606–13. Lopez, Saioa, Lucy van Dorp, and Garrett Hallenthal. 2016. “Human Dispersal out of Africa: A Lasting Debate.” Evolutionary Bioinformatics 11 (S2): 57–68. Low, Bobbi S. 2015. Why Sex Matters: A Darwinian Look at Human Behavior. Princeton, NJ: Princeton University Press. Lubinski, David, and Camilla P. Benbow. 2006. “Study of Mathematically Precocious Youth After 35 Years: Uncovering Antecedents for the Development of Math-Science Expertise.”


pages: 313 words: 84,312

We-Think: Mass Innovation, Not Mass Production by Charles Leadbeater

1960s counterculture, Andrew Keen, barriers to entry, bioinformatics, c2.com, call centre, citizen journalism, clean water, cloud computing, complexity theory, congestion charging, death of newspapers, Debian, digital divide, digital Maoism, disruptive innovation, double helix, Douglas Engelbart, Edward Lloyd's coffeehouse, folksonomy, frictionless, frictionless market, future of work, game design, Garrett Hardin, Google Earth, Google X / Alphabet X, Hacker Ethic, Herbert Marcuse, Hernando de Soto, hive mind, Howard Rheingold, interchangeable parts, Isaac Newton, James Watt: steam engine, Jane Jacobs, Jaron Lanier, Jean Tirole, jimmy wales, Johannes Kepler, John Markoff, John von Neumann, Joi Ito, Kevin Kelly, knowledge economy, knowledge worker, lateral thinking, lone genius, M-Pesa, Mark Shuttleworth, Mark Zuckerberg, Marshall McLuhan, Menlo Park, microcredit, Mitch Kapor, new economy, Nicholas Carr, online collectivism, Paradox of Choice, planetary scale, post scarcity, public intellectual, Recombinant DNA, Richard Stallman, Shoshana Zuboff, Silicon Valley, slashdot, social web, software patent, Steven Levy, Stewart Brand, supply-chain management, synthetic biology, the Cathedral and the Bazaar, The Death and Life of Great American Cities, the long tail, the market place, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Thomas Kuhn: the structure of scientific revolutions, Tragedy of the Commons, Whole Earth Catalog, work culture , Yochai Benkler, Zipcar

They will bring with them the web’s culture of lateral, semi-structured free association. This new organisational landscape is taking shape all around us. Scientific research is becoming ever more a question of organising a vast number of pebbles. Young scientists, especially in emerging fields like bioinformatics, draw on hundreds of data banks; use electronic lab notebooks to record and then share their results daily, often through blogs and wikis; work in multi-disciplinary teams threaded around the world and organised by social networks; and publish their results, including open source versions of the software used in their experiments and their raw data, in open access online journals.


pages: 314 words: 94,600

Business Metadata: Capturing Enterprise Knowledge by William H. Inmon, Bonnie K. O'Neil, Lowell Fryman

affirmative action, bioinformatics, business cycle, business intelligence, business process, call centre, carbon-based life, continuous integration, corporate governance, create, read, update, delete, database schema, en.wikipedia.org, folksonomy, informal economy, knowledge economy, knowledge worker, semantic web, tacit knowledge, The Wisdom of Crowds, web application

The terms were stored in an 11179 registry, and the registry metadata was mapped to UML structures from the Class Diagram. The solution includes three main layers:

✦ Layer 1: Enterprise Vocabulary Services: DL (description logics) and ontology, thesaurus
✦ Layer 2: CADSR: Metadata Registry, consisting of Common Data Elements
✦ Layer 3: Cancer Bioinformatics Objects, using UML Domain Models

The NCI Thesaurus contains over 48,000 concepts. Although its emphasis is on machine understandability, NCI has managed to translate description logic somewhat into English. Linking concepts together is accomplished through roles, which are also concepts themselves.


pages: 426 words: 83,128

The Journey of Humanity: The Origins of Wealth and Inequality by Oded Galor

agricultural Revolution, Alfred Russel Wallace, Andrei Shleifer, Apollo 11, Berlin Wall, bioinformatics, colonial rule, Columbian Exchange, conceptual framework, COVID-19, creative destruction, Daniel Kahneman / Amos Tversky, David Ricardo: comparative advantage, deindustrialization, demographic dividend, demographic transition, Donald Trump, double entry bookkeeping, Easter island, European colonialism, Fall of the Berlin Wall, Francisco Pizarro, general purpose technology, germ theory of disease, income per capita, intermodal, invention of agriculture, invention of movable type, invention of the printing press, invention of the telegraph, James Hargreaves, James Watt: steam engine, Joseph-Marie Jacquard, Kenneth Arrow, longitudinal study, loss aversion, Louis Pasteur, means of production, out of africa, phenotype, rent-seeking, rising living standards, Robert Solow, Scramble for Africa, The Death and Life of Great American Cities, The Rise and Fall of American Growth, The Wealth of Nations by Adam Smith, Thomas Malthus, Walter Mischel, Washington Consensus, wikimedia commons, women in the workforce, working-age population, World Values Survey

Lipset, Seymour Martin, ‘Some social requisites of democracy: Economic development and political legitimacy’, American Political Science Review 53, no. 1 (1959): 69–105. Litina, Anastasia, ‘Natural land productivity, cooperation and comparative development’, Journal of Economic Growth 21, no. 4 (2016): 351–408. López, Saioa, Lucy Van Dorp and Garrett Hellenthal, ‘Human dispersal out of Africa: A lasting debate’, Evolutionary Bioinformatics 11 (2015): EBO-S33489. Lucas, Adrienne M., ‘The impact of malaria eradication on fertility’, Economic Development and Cultural Change 61, no. 3 (2013): 607–31. Lucas, Adrienne M., ‘Malaria eradication and educational attainment: evidence from Paraguay and Sri Lanka’, American Economic Journal: Applied Economics 2, no. 2 (2010): 46–71.


pages: 354 words: 91,875

The Willpower Instinct: How Self-Control Works, Why It Matters, and What You Can Doto Get More of It by Kelly McGonigal

banking crisis, behavioural economics, bioinformatics, Cass Sunstein, choice architecture, cognitive bias, delayed gratification, Dunning–Kruger effect, Easter island, game design, impulse control, lifelogging, loss aversion, low interest rates, meta-analysis, mirror neurons, PalmPilot, phenotype, Richard Thaler, social contagion, Stanford marshmallow experiment, Tragedy of the Commons, Walter Mischel

“Depression, Craving, and Substance Use Following a Randomized Trial of Mindfulness-Based Relapse Prevention.” Journal of Consulting and Clinical Psychology 78 (2010): 362–74. Chapter 10: Final Thoughts Page 237—“Only reasonable conclusion to a book about scientific ideas is: Draw your own conclusions”: Credit for this suggestion goes to Brian Kidd, Senior Bioinformatics Research Specialist, Institute for Infection Immunity and Transplantation, Stanford University. INDEX acceptance inner power of Adams, Claire addiction addict loses his cravings candy addict conquers sweet tooth chocoholic takes inspiration from Hershey’s Kisses dopamine’s role in drinking drug e-mail Facebook shopping smoker under social influence smoking Advisor-Teller Money Manager Intervention (ATM) Ainslie, George Air Force Academy, U.S.


pages: 286 words: 90,530

Richard Dawkins: How a Scientist Changed the Way We Think by Alan Grafen; Mark Ridley

Alfred Russel Wallace, Arthur Eddington, bioinformatics, Charles Babbage, cognitive bias, computer age, Computing Machinery and Intelligence, conceptual framework, Dava Sobel, double helix, Douglas Hofstadter, Easter island, epigenetics, Fellow of the Royal Society, Haight Ashbury, interchangeable parts, Isaac Newton, Johann Wolfgang von Goethe, John von Neumann, loose coupling, Murray Gell-Mann, Necker cube, phenotype, profit maximization, public intellectual, Ronald Reagan, Stephen Hawking, Steven Pinker, the scientific method, theory of mind, Thomas Kuhn: the structure of scientific revolutions, Yogi Berra, zero-sum game

The invention of an algorithmic biology Seth Bullock BIOLOGY and computing might not seem the most comfortable of bedfellows. It is easy to imagine nature and technology clashing as the green-welly brigade rub up awkwardly against the back-room boffins. But collaboration between the two fields has exploded in recent years, driven primarily by massive investment in the emerging field of bioinformatics charged with mapping the human genome. New algorithms and computational infrastructures have enabled research groups to collaborate effectively on a worldwide scale in building huge, exponentially growing genomic databases, to ‘mine’ these mountains of data for useful information, and to construct and manipulate innovative computational models of the genes and proteins that have been identified.


pages: 313 words: 95,077

Here Comes Everybody: The Power of Organizing Without Organizations by Clay Shirky

Andrew Keen, Andy Carvin, Berlin Wall, bike sharing, bioinformatics, Brewster Kahle, c2.com, Charles Lindbergh, commons-based peer production, crowdsourcing, digital rights, en.wikipedia.org, Free Software Foundation, Garrett Hardin, hiring and firing, hive mind, Howard Rheingold, Internet Archive, invention of agriculture, invention of movable type, invention of the printing press, invention of the telegraph, jimmy wales, John Perry Barlow, Joi Ito, Kuiper Belt, liberation theology, Mahatma Gandhi, means of production, Merlin Mann, Metcalfe’s law, Nash equilibrium, Network effects, Nicholas Carr, Picturephone, place-making, Pluto: dwarf planet, power law, prediction markets, price mechanism, prisoner's dilemma, profit motive, Richard Stallman, Robert Metcalfe, Ronald Coase, Silicon Valley, slashdot, social software, Stewart Brand, supply-chain management, the Cathedral and the Bazaar, the long tail, The Nature of the Firm, The Wealth of Nations by Adam Smith, The Wisdom of Crowds, Tragedy of the Commons, transaction costs, ultimatum game, Vilfredo Pareto, Wayback Machine, Yochai Benkler, Yogi Berra

Despite these resources and incentives, however, the solution didn’t come from China. On April 12, Genome Sciences Centre (GSC), a small Canadian lab specializing in the genetics of pathogens, published the genetic sequence of SARS. On the way, they had participated in not just one open network, but several. Almost the entire computational installation of GSC is open source; bioinformatics tools with names like BLAST, Phrap, Phred, and Consed, all running on Linux. GSC checked their work against Genbank, a public database of genetic sequences. They published their findings on their own site (run, naturally, using open source tools) and published the finished sequence to Genbank, for everyone to see.
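GenBank records like the one GSC published can be retrieved programmatically today. Below is a small sketch using Biopython's Entrez and SeqIO modules; the accession NC_004718 (a SARS coronavirus reference record) and the contact e-mail are illustrative values, not details taken from the passage.

```python
# Minimal sketch of fetching a published sequence record from GenBank
# with Biopython. Accession and e-mail address are illustrative.
from Bio import Entrez, SeqIO

Entrez.email = "you@example.org"  # NCBI asks callers to identify themselves

handle = Entrez.efetch(db="nucleotide", id="NC_004718",
                       rettype="gb", retmode="text")
record = SeqIO.read(handle, "genbank")
handle.close()

print(record.id, record.description)
print("Genome length:", len(record.seq), "bases")
```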


pages: 336 words: 93,672

The Future of the Brain: Essays by the World's Leading Neuroscientists by Gary Marcus, Jeremy Freeman

23andMe, Albert Einstein, backpropagation, bioinformatics, bitcoin, brain emulation, cloud computing, complexity theory, computer age, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, dark matter, data acquisition, data science, deep learning, Drosophila, epigenetics, Geoffrey Hinton, global pandemic, Google Glasses, ITER tokamak, iterative process, language acquisition, linked data, mouse model, optical character recognition, pattern recognition, personalized medicine, phenotype, race to the bottom, Richard Feynman, Ronald Reagan, semantic web, speech recognition, stem cell, Steven Pinker, supply-chain management, synthetic biology, tacit knowledge, traumatic brain injury, Turing machine, twin studies, web application

When geneticists began exome sequencing in earnest, they encountered an unexpected complication. It turns out that each human individual carries a surprisingly high number of potentially deleterious mutations, typically more than one hundred. These are mutations that alter or disturb protein sequences in a way that is predicted to have a damaging effect on protein function, based on bioinformatic (computer-based) analyses. Each mutation might be extremely rare in the population, or even unique to the person or family in which it is found. How do we sift out the true causal mutations, the ones that are functionally implicated in the disorder or trait we are studying, against a broader background of irrelevant genomic change?
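The sifting problem described here is, at its core, a filtering step. Below is a toy sketch in Python; the field names, the frequency threshold, and the example records are invented for illustration and do not come from any real pipeline.

```python
# Toy sketch of the sifting problem described above: keep only variants that
# are both very rare in the population and predicted to damage the protein.
# Field names and thresholds are illustrative assumptions.
variants = [
    {"gene": "GENE_A", "pop_frequency": 0.00002, "predicted_damaging": True},
    {"gene": "GENE_B", "pop_frequency": 0.01500, "predicted_damaging": True},
    {"gene": "GENE_C", "pop_frequency": 0.00001, "predicted_damaging": False},
]

MAX_FREQUENCY = 0.0001  # "extremely rare" cut-off (assumed)

candidates = [
    v for v in variants
    if v["pop_frequency"] <= MAX_FREQUENCY and v["predicted_damaging"]
]
print(candidates)  # only GENE_A survives the filter
```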


pages: 396 words: 96,049

Upgrade by Blake Crouch

bioinformatics, butterfly effect, cognitive dissonance, correlation does not imply causation, COVID-19, CRISPR, dark matter, deepfake, double helix, Douglas Hofstadter, driverless car, drone strike, glass ceiling, Google Earth, Gödel, Escher, Bach, Hyperloop, independent contractor, job automation, low earth orbit, messenger bag, mirror neurons, off grid, pattern recognition, phenotype, ride hailing / ride sharing, supervolcano, time dilation

“How big—” “Two, maybe. Likely five people.” “Any idea of—” God, I knew every question he would ask before he asked it. So much wasted time. So much inefficiency. “—who they might be?” I said, “She would need people who, as a group, could encompass biochemistry, molecular biology, genetics, and bioinformatics. Every one of them working at the height of their powers. I can’t imagine her pulling this off without a quantum-annealing or exascale processor.” I was speaking too fast. The average person speaks 100 to 130 words per minute. I was pushing 180. When had that started? I needed to slow down, stop drawing attention to my exploding intellect.


pages: 471 words: 94,519

Managing Projects With GNU Make by Robert Mecklenburg, Andrew Oram

bioinformatics, business logic, Free Software Foundation, functional programming, general-purpose programming language, Larry Wall, machine readable, Richard Stallman

(question mark), Wildcards calling functions and, Wildcards character classes, Wildcards expanding, Wildcards misuse, Wildcards pattern rules and, Rules ~ (tilde), Wildcards Windows filesystem, Cygwin and, Filesystem wordlist function, String Functions words function, String Functions X XML, Ant, XML Preprocessing build files, Ant preprocessing book makefile, XML Preprocessing

About the Author

Robert Mecklenburg began using Unix as a student in 1977 and has been programming professionally for 23 years. His make experience started in 1982 at NASA with Unix version 7. Robert received his Ph.D. in Computer Science from the University of Utah in 1991. Since then, he has worked in many fields ranging from mechanical CAD to bioinformatics, and he brings his extensive experience in C++, Java, and Lisp to bear on the problems of project management with make.

Colophon

Our look is the result of reader comments, our own experimentation, and feedback from distribution channels. Distinctive covers complement our distinctive approach to technical topics, breathing personality and life into potentially dry subjects.


pages: 313 words: 101,403

My Life as a Quant: Reflections on Physics and Finance by Emanuel Derman

Bear Stearns, Berlin Wall, bioinformatics, Black-Scholes formula, book value, Brownian motion, buy and hold, capital asset pricing model, Claude Shannon: information theory, Dennis Ritchie, Donald Knuth, Emanuel Derman, financial engineering, fixed income, Gödel, Escher, Bach, haute couture, hiring and firing, implied volatility, interest rate derivative, Jeff Bezos, John Meriwether, John von Neumann, Ken Thompson, law of one price, linked data, Long Term Capital Management, moral hazard, Murray Gell-Mann, Myron Scholes, PalmPilot, Paul Samuelson, pre–internet, proprietary trading, publish or perish, quantitative trading / quantitative finance, Sharpe ratio, statistical arbitrage, statistical model, Stephen Hawking, Steve Jobs, stochastic volatility, technology bubble, the new new thing, transaction costs, volatility smile, Y2K, yield curve, zero-coupon bond, zero-sum game

With their deep pockets, he said "they had guys spending all their time running diff RMSs files and the O'Connor code" (Diff is one of the great suite of UNIX tools that make a programmer's life easier. It compares two different files of text and finds any common strings of words in them, a simpler version of current bio-informatics programs that search for common strings of DNA in the mouse and human genome.) I have no idea whether there were in fact commonalities, but even independent people coding the same well-known algorithm might end up writing vaguely similar chunks of code. O'Connor eventually disappeared, too, absorbed into Swiss Bank, which itself subsequently merged with UBS.
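For readers who want to see the idea rather than the Unix tool itself, Python's standard difflib module does a similar job of comparing two texts and locating the blocks they share (the command-line diff reports the differences between them). A small, self-contained sketch:

```python
# Illustration in the spirit of the passage: difflib finds the blocks two
# texts have in common, much as sequence-comparison tools look for shared
# substrings. The two example strings are made up.
from difflib import SequenceMatcher

a = "the quick brown fox jumps over the lazy dog"
b = "a quick brown fox leaps over a lazy dog"

matcher = SequenceMatcher(None, a, b)
match = matcher.find_longest_match(0, len(a), 0, len(b))
print("Longest common block:", repr(a[match.a:match.a + match.size]))
print("Overall similarity ratio:", round(matcher.ratio(), 2))
```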


pages: 313 words: 34,042

Tools for Computational Finance by Rüdiger Seydel

bioinformatics, Black-Scholes formula, Brownian motion, commoditize, continuous integration, discrete time, financial engineering, implied volatility, incomplete markets, interest rate swap, linear programming, London Interbank Offered Rate, mandelbrot fractal, martingale, random walk, risk free rate, stochastic process, stochastic volatility, transaction costs, value at risk, volatility smile, Wiener process, zero-coupon bond

.: Statistics of Financial Markets: An Introduction Hurwitz, A.; Kritikos, N.: Lectures on Number Theory Frauenthal, J. C.: Mathematical Modeling in Epidemiology Huybrechts, D.: Complex Geometry: An Introduction Freitag, E.; Busam, R.: Complex Analysis Isaev, A.: Introduction to Mathematical Methods in Bioinformatics Friedman, R.: Algebraic Surfaces and Holomorphic Vector Bundles Fuks, D. B.; Rokhlin, V. A.: Beginner’s Course in Topology Fuhrmann, P. A.: A Polynomial Approach to Linear Algebra Gallot, S.; Hulin, D.; Lafontaine, J.: Riemannian Geometry Istas, J.: Mathematical Modeling for the Life Sciences Iversen, B.: Cohomology of Sheaves Jacod, J.; Protter, P.: Probability Essentials Jennings, G.


RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

Amazon Web Services, bioinformatics, business intelligence, cloud computing, database schema, fault tolerance, folksonomy, full text search, functional programming, information retrieval, Internet Archive, Internet of things, linked data, machine readable, NP-complete, peer-to-peer, performance metric, power law, random walk, recommendation engine, RFID, semantic web, Silicon Valley, social intelligence, software as a service, SPARQL, sparse data, web application

This fact is mainly due to the expansion of the Web and the load of information that can be harvested from our interactions with it, such as via personal computers, laptops, smartphones, and tablet devices. This data can be represented using various models, and for the use cases thriving on the Web (social, geographical, recommendations, bioinformatics, network management, and fraud detection, to name a few) the graph data model is a particularly relevant choice. RDF, with its W3C recommendation status and its set of companions like SPARQL, SKOS, RDFS, and OWL, plays a central role in the graph data model ecosystem. The quantity and quality of the tools, such as parsers, editors, and APIs, implemented to ease the use of RDF data attest to the strong enthusiasm surrounding this standard, as well as to the importance of managing this data appropriately. The number of academic, open-source, and commercial RDF stores presented in this book emphasizes the importance of this tool category, the diversity of possible approaches, and the complexity of designing efficient systems.
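Below is a minimal sketch of the RDF-plus-SPARQL combination the authors describe, using the Python rdflib library; the ex: vocabulary and the two triples are invented purely for illustration.

```python
# Minimal RDF + SPARQL sketch with rdflib. The tiny ex: vocabulary and the
# example triples are illustrative, not drawn from the book.
from rdflib import Graph

turtle = """
@prefix ex: <http://example.org/> .
ex:BRCA1 ex:type ex:Gene ;
         ex:label "BRCA1" .
"""

g = Graph()
g.parse(data=turtle, format="turtle")

query = """
PREFIX ex: <http://example.org/>
SELECT ?label WHERE { ?gene ex:type ex:Gene ; ex:label ?label . }
"""
for row in g.query(query):
    print(row.label)   # -> BRCA1
```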


pages: 340 words: 97,723

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity by Amy Webb

"Friedman doctrine" OR "shareholder theory", Ada Lovelace, AI winter, air gap, Airbnb, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic bias, AlphaGo, Andy Rubin, artificial general intelligence, Asilomar, autonomous vehicles, backpropagation, Bayesian statistics, behavioural economics, Bernie Sanders, Big Tech, bioinformatics, Black Lives Matter, blockchain, Bretton Woods, business intelligence, Cambridge Analytica, Cass Sunstein, Charles Babbage, Claude Shannon: information theory, cloud computing, cognitive bias, complexity theory, computer vision, Computing Machinery and Intelligence, CRISPR, cross-border payments, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, deep learning, DeepMind, Demis Hassabis, Deng Xiaoping, disinformation, distributed ledger, don't be evil, Donald Trump, Elon Musk, fail fast, fake news, Filter Bubble, Flynn Effect, Geoffrey Hinton, gig economy, Google Glasses, Grace Hopper, Gödel, Escher, Bach, Herman Kahn, high-speed rail, Inbox Zero, Internet of things, Jacques de Vaucanson, Jeff Bezos, Joan Didion, job automation, John von Neumann, knowledge worker, Lyft, machine translation, Mark Zuckerberg, Menlo Park, move fast and break things, Mustafa Suleyman, natural language processing, New Urbanism, Nick Bostrom, one-China policy, optical character recognition, packet switching, paperclip maximiser, pattern recognition, personalized medicine, RAND corporation, Ray Kurzweil, Recombinant DNA, ride hailing / ride sharing, Rodney Brooks, Rubik’s Cube, Salesforce, Sand Hill Road, Second Machine Age, self-driving car, seminal paper, SETI@home, side project, Silicon Valley, Silicon Valley startup, skunkworks, Skype, smart cities, South China Sea, sovereign wealth fund, speech recognition, Stephen Hawking, strong AI, superintelligent machines, surveillance capitalism, technological singularity, The Coming Technological Singularity, the long tail, theory of mind, Tim Cook: Apple, trade route, Turing machine, Turing test, uber lyft, Von Neumann architecture, Watson beat the top human players on Jeopardy!, zero day

Over-the-counter medications are mostly gone, too, but compounding pharmacies have seen a resurgence. That’s because AGI helped accelerate critical developments in genetic editing and precision medicine. You now consult a computational pharmacist: specially trained pharmacists who have backgrounds in bioinformatics, medicine, and pharmacology. Computational pharmacy is a medical specialty, one that works closely with a new breed of AI-GPs: general practitioners who are trained in both medicine and technology. While AGI has obviated certain medical specialists—radiologists, immunologists, allergists, cardiologists, dermatologists, endocrinologists, anesthesiologists, neurologists, and others—doctors working in those fields had plenty of time to repurpose their skills for adjacent fields.


pages: 420 words: 100,811

We Are Data: Algorithms and the Making of Our Digital Selves by John Cheney-Lippold

algorithmic bias, bioinformatics, business logic, Cass Sunstein, centre right, computer vision, critical race theory, dark matter, data science, digital capitalism, drone strike, Edward Snowden, Evgeny Morozov, Filter Bubble, Google Chrome, Google Earth, Hans Moravec, Ian Bogost, informal economy, iterative process, James Bridle, Jaron Lanier, Julian Assange, Kevin Kelly, late capitalism, Laura Poitras, lifelogging, Lyft, machine readable, machine translation, Mark Zuckerberg, Marshall McLuhan, mass incarceration, Mercator projection, meta-analysis, Nick Bostrom, Norbert Wiener, offshore financial centre, pattern recognition, price discrimination, RAND corporation, Ray Kurzweil, Richard Thaler, ride hailing / ride sharing, Rosa Parks, Silicon Valley, Silicon Valley startup, Skype, Snapchat, software studies, statistical model, Steven Levy, technological singularity, technoutopianism, the scientific method, Thomas Bayes, Toyota Production System, Turing machine, uber lyft, web application, WikiLeaks, Zimmermann PGP

Louise Amoore, “On the Emergence of a Security Risk Calculus for Our Times,” Theory, Culture & Society 28, no. 6 (2011): 27. 20. Alexander Galloway, Gaming: Essays on Algorithmic Culture (Minneapolis: University of Minnesota Press, 2006), 103. 21. Nicholas Negroponte, Being Digital (New York: Vintage, 1995), 4. 22. Eugene Thacker, “Bioinformatics and Bio-logics,” Postmodern Culture 13, no. 2 (2003): 58. 23. Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Boston: Eamon Dolan / Houghton Mifflin Harcourt, 2013). 24. Tyler Reigeluth, “Why Data Is Not Enough: Digital Traces as Control of Self and Self-Control,” Surveillance & Society 12, no. 2 (2014): 249. 25.


pages: 364 words: 99,897

The Industries of the Future by Alec Ross

"World Economic Forum" Davos, 23andMe, 3D printing, Airbnb, Alan Greenspan, algorithmic bias, algorithmic trading, AltaVista, Anne Wojcicki, autonomous vehicles, banking crisis, barriers to entry, Bernie Madoff, bioinformatics, bitcoin, Black Lives Matter, blockchain, Boston Dynamics, Brian Krebs, British Empire, business intelligence, call centre, carbon footprint, clean tech, cloud computing, collaborative consumption, connected car, corporate governance, Credit Default Swap, cryptocurrency, data science, David Brooks, DeepMind, Demis Hassabis, disintermediation, Dissolution of the Soviet Union, distributed ledger, driverless car, Edward Glaeser, Edward Snowden, en.wikipedia.org, Erik Brynjolfsson, Evgeny Morozov, fiat currency, future of work, General Motors Futurama, global supply chain, Google X / Alphabet X, Gregor Mendel, industrial robot, information security, Internet of things, invention of the printing press, Jaron Lanier, Jeff Bezos, job automation, John Markoff, Joi Ito, Kevin Roose, Kickstarter, knowledge economy, knowledge worker, lifelogging, litecoin, low interest rates, M-Pesa, machine translation, Marc Andreessen, Mark Zuckerberg, Max Levchin, Mikhail Gorbachev, military-industrial complex, mobile money, money: store of value / unit of account / medium of exchange, Nelson Mandela, new economy, off-the-grid, offshore financial centre, open economy, Parag Khanna, paypal mafia, peer-to-peer, peer-to-peer lending, personalized medicine, Peter Thiel, precision agriculture, pre–internet, RAND corporation, Ray Kurzweil, recommendation engine, ride hailing / ride sharing, Rubik’s Cube, Satoshi Nakamoto, selective serotonin reuptake inhibitor (SSRI), self-driving car, sharing economy, Silicon Valley, Silicon Valley startup, Skype, smart cities, social graph, software as a service, special economic zone, supply-chain management, supply-chain management software, technoutopianism, TED Talk, The Future of Employment, Travis Kalanick, underbanked, unit 8200, Vernor Vinge, Watson beat the top human players on Jeopardy!, women in the workforce, work culture , Y Combinator, young professional

While Seltzer makes the case that virtually every bit of our personal information is now available to those who want it, I do think there are parts of our lives that remain private and that we must fight to keep private. And I think the best way to do that is by focusing on defining rules for data retention and proper use. Most of our health information remains private, and the need for privacy will grow with the rise of genomics. John Quackenbush, a professor of computational biology and bioinformatics at Harvard, explained that “as soon as you touch genomic data, that information is fundamentally identifiable. I can erase your address and Social Security number and every other identifier, but I can’t anonymize your genome without wiping out the information that I need to analyze.” The danger of genomic information being widely available is difficult to overstate.


pages: 696 words: 111,976

SQL Hacks by Andrew Cumming, Gordon Russell

Apollo 13, bioinformatics, book value, business intelligence, business logic, business process, database schema, en.wikipedia.org, Erdős number, Firefox, full text search, Hacker Conference 1984, Hacker Ethic, leftpad, Paul Erdős, SQL injection, Stewart Brand, web application

Mimer is also taking active part in the standardization of SQL as a member of the ISO SQL-standardization committee ISO/IEC JTC1/SC32, WorkGroup 3, Database Languages. You can download free development versions of Mimer SQL from http://www.mimer.com. Troels Arvin lives with his wife and son in Copenhagen, Denmark. He went half-way through medical school before realizing that computer science was the thing to do. He has since worked in the web, bioinformatics, and telecommunications businesses. Troels is keen on database technology and maintains a slowly growing web page on how databases implement the SQL standard: http://troels.arvin.dk/db/rdbms. Acknowledgments We would like to thank our editor, Brian Jepson, for his hard work and exceptional skill; his ability to separate the wheat from the chaff was invaluable.


pages: 359 words: 110,488

Bad Blood: Secrets and Lies in a Silicon Valley Startup by John Carreyrou

Affordable Care Act / Obamacare, bioinformatics, corporate governance, Donald Trump, El Camino Real, Elon Musk, fake it until you make it, Google Chrome, John Markoff, Jony Ive, Kickstarter, Larry Ellison, Marc Andreessen, Mark Zuckerberg, Mars Rover, medical malpractice, Menlo Park, obamacare, Ponzi scheme, reality distortion field, ride hailing / ride sharing, Right to Buy, Sand Hill Road, Seymour Hersh, Sheryl Sandberg, side project, Silicon Valley, Silicon Valley startup, stealth mode startup, Steve Jobs, stock buybacks, supply-chain management, Travis Kalanick, ubercab, Wayback Machine

In the process of writing this book, I reached out to all of the key figures in the Theranos saga and offered them the opportunity to comment on any allegations concerning them. Elizabeth Holmes, as is her right, declined my interview requests and chose not to cooperate with this account. Prologue November 17, 2006 Tim Kemp had good news for his team. The former IBM executive was in charge of bioinformatics at Theranos, a startup with a cutting-edge blood-testing system. The company had just completed its first big live demonstration for a pharmaceutical company. Elizabeth Holmes, Theranos’s twenty-two-year-old founder, had flown to Switzerland and shown off the system’s capabilities to executives at Novartis, the European drug giant.


pages: 424 words: 108,768

Origins: How Earth's History Shaped Human History by Lewis Dartnell

agricultural Revolution, Anthropocene, back-to-the-land, bioinformatics, clean water, Columbian Exchange, decarbonisation, discovery of the americas, Donald Trump, Eratosthenes, financial innovation, Google Earth, Khyber Pass, Malacca Straits, megacity, meta-analysis, ocean acidification, oil shale / tar sands, out of africa, Pax Mongolica, peak oil, phenotype, rewilding, Rosa Parks, Silicon Valley, South China Sea, spice trade, Suez crisis 1956, supervolcano, trade route, transatlantic slave trade

‘Phylogeography of Asian wild rice, Oryza rufipogon, reveals multiple independent domestications of cultivated rice, Oryza sativa’, Proceedings of the National Academy of Sciences of the United States of America 103(25): 9578–83. López, S., L. van Dorp and G. Hellenthal (2015). ‘Human Dispersal Out of Africa: A Lasting Debate’, Evolutionary Bioinformatics Online 11(Suppl 2): 57–68. Lutgens, F. K. and E. J. Tarbuck (2000). The Atmosphere: An Introduction to Meteorology, 8th edition, Prentice Hall. Lyons, T. W., C. T. Reinhard and N. J. Planavsky (2014). ‘The rise of oxygen in Earth’s early ocean and atmosphere’, Nature 506: 307–15. Macalister, T. (2015).


pages: 502 words: 107,510

Natural Language Annotation for Machine Learning by James Pustejovsky, Amber Stubbs

Amazon Mechanical Turk, bioinformatics, cloud computing, computer vision, crowdsourcing, easy for humans, difficult for computers, finite state, Free Software Foundation, game design, information retrieval, iterative process, language acquisition, machine readable, machine translation, natural language processing, pattern recognition, performance metric, power law, sentiment analysis, social web, sparse data, speech recognition, statistical model, text mining

“Coupled Semi-Supervised Learning for Information Extraction.” In Proceedings of the 3rd ACM International Conference on Web Search and Data Mining (WSDM). Chomsky, Noam. 1957. Syntactic Structures. Paris: Mouton. Chuzhanova, N.A., A.J. Jones, and S. Margetts. 1998. “Feature selection for genetic sequence classification.” Bioinformatics 14(2):139–143. Culotta, Aron, Michael Wick, Robert Hall, and Andrew McCallum. 2007. “First-Order Probabilistic Models for Coreference Resolution.” In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT/NAACL).


A Brief History of Everyone Who Ever Lived by Adam Rutherford

23andMe, agricultural Revolution, Albert Einstein, Alfred Russel Wallace, autism spectrum disorder, bioinformatics, British Empire, classic study, colonial rule, dark matter, delayed gratification, demographic transition, double helix, Drosophila, epigenetics, Eyjafjallajökull, Google Earth, Gregor Mendel, Higgs boson, Isaac Newton, Kickstarter, longitudinal study, meta-analysis, out of africa, phenotype, sceptred isle, theory of mind, Thomas Malthus, twin studies

Nowadays it has become a tiresome cliché to say that a person’s passion or quintessential characteristic is ‘in their DNA’. The satirical magazine Private Eye has a whole column dedicated to this phrase flopping out of journalists’ and celebrities’ mouths. Well, Ewan Birney is a man with DNA in his DNA. These days he heads the European Bioinformatics Institute in Hinxton, just outside Cambridge, one of the great global genome powerhouses. While our contemporaries went off to Koh Samui or Goa to find themselves on their year off before going up to university, Ewan had won a place in the lab of James Watson, at Cold Spring Harbor, just at the birth of genomics, the biological science that would come to dominate all others.


pages: 524 words: 120,182

Complexity: A Guided Tour by Melanie Mitchell

Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, Albert Michelson, Alfred Russel Wallace, algorithmic management, anti-communist, Arthur Eddington, Benoit Mandelbrot, bioinformatics, cellular automata, Claude Shannon: information theory, clockwork universe, complexity theory, computer age, conceptual framework, Conway's Game of Life, dark matter, discrete time, double helix, Douglas Hofstadter, Eddington experiment, en.wikipedia.org, epigenetics, From Mathematics to the Technologies of Life and Death, Garrett Hardin, Geoffrey West, Santa Fe Institute, Gregor Mendel, Gödel, Escher, Bach, Hacker News, Hans Moravec, Henri Poincaré, invisible hand, Isaac Newton, John Conway, John von Neumann, Long Term Capital Management, mandelbrot fractal, market bubble, Menlo Park, Murray Gell-Mann, Network effects, Norbert Wiener, Norman Macrae, Paul Erdős, peer-to-peer, phenotype, Pierre-Simon Laplace, power law, Ray Kurzweil, reversible computing, scientific worldview, stem cell, Stuart Kauffman, synthetic biology, The Wealth of Nations by Adam Smith, Thomas Malthus, Tragedy of the Commons, Turing machine

The best-known applications are in the field of coding theory, which deals with both data compression and the way codes need to be structured to be reliably transmitted. Coding theory affects nearly all of our electronic communications; cell phones, computer networks, and the worldwide global positioning system are a few examples. Information theory is also central in cryptography and in the relatively new field of bioinformatics, in which entropy and other information theory measures are used to analyze patterns in gene sequences. It has also been applied to analysis of language and music and in psychology, statistical inference, and artificial intelligence, among many other fields. Although information theory was inspired by notions of entropy in thermodynamics and statistical mechanics, it is controversial whether or not information theory has had much of a reverse impact on those and other fields of physics.
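
To make the entropy idea concrete, here is a minimal Python sketch (not from the book, with invented sequences) that computes the Shannon entropy of a short DNA string, the kind of information-theoretic measure applied to gene sequences:

```python
# Shannon entropy of a sequence of symbols, in bits per symbol.
from collections import Counter
from math import log2

def shannon_entropy(sequence: str) -> float:
    """H = -sum over symbols of p * log2(p), estimated from symbol frequencies."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * log2(n / total) for n in counts.values())

# Invented example sequences, purely for illustration.
print(shannon_entropy("ACGTACGTACGT"))  # 2.0 bits: all four bases equally frequent
print(shannon_entropy("AAAAAAAAAAAT"))  # ~0.41 bits: highly repetitive, little surprise
```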


pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos

Albert Einstein, Amazon Mechanical Turk, Arthur Eddington, backpropagation, basic income, Bayesian statistics, Benoit Mandelbrot, bioinformatics, Black Swan, Brownian motion, cellular automata, Charles Babbage, Claude Shannon: information theory, combinatorial explosion, computer vision, constrained optimization, correlation does not imply causation, creative destruction, crowdsourcing, Danny Hillis, data is not the new oil, data is the new oil, data science, deep learning, DeepMind, double helix, Douglas Hofstadter, driverless car, Erik Brynjolfsson, experimental subject, Filter Bubble, future of work, Geoffrey Hinton, global village, Google Glasses, Gödel, Escher, Bach, Hans Moravec, incognito mode, information retrieval, Jeff Hawkins, job automation, John Markoff, John Snow's cholera map, John von Neumann, Joseph Schumpeter, Kevin Kelly, large language model, lone genius, machine translation, mandelbrot fractal, Mark Zuckerberg, Moneyball by Michael Lewis explains big data, Narrative Science, Nate Silver, natural language processing, Netflix Prize, Network effects, Nick Bostrom, NP-complete, off grid, P = NP, PageRank, pattern recognition, phenotype, planetary scale, power law, pre–internet, random walk, Ray Kurzweil, recommendation engine, Richard Feynman, scientific worldview, Second Machine Age, self-driving car, Silicon Valley, social intelligence, speech recognition, Stanford marshmallow experiment, statistical model, Stephen Hawking, Steven Levy, Steven Pinker, superintelligent machines, the long tail, the scientific method, The Signal and the Noise by Nate Silver, theory of mind, Thomas Bayes, transaction costs, Turing machine, Turing test, Vernor Vinge, Watson beat the top human players on Jeopardy!, white flight, yottabyte, zero-sum game

Statistical Methods for Speech Recognition,* by Fred Jelinek (MIT Press, 1997), describes their application to speech recognition. The story of HMM-style inference in communication is told in “The Viterbi algorithm: A personal history,” by David Forney (unpublished; online at arxiv.org/pdf/cs/0504020v2.pdf). Bioinformatics: The Machine Learning Approach,* by Pierre Baldi and Søren Brunak (2nd ed., MIT Press, 2001), is an introduction to the use of machine learning in biology, including HMMs. “Engineers look to Kalman filtering for guidance,” by Barry Cipra (SIAM News, 1993), is a brief introduction to Kalman filters, their history, and their applications.


pages: 437 words: 113,173

Age of Discovery: Navigating the Risks and Rewards of Our New Renaissance by Ian Goldin, Chris Kutarna

"World Economic Forum" Davos, 2013 Report for America's Infrastructure - American Society of Civil Engineers - 19 March 2013, 3D printing, Airbnb, Albert Einstein, AltaVista, Asian financial crisis, asset-backed security, autonomous vehicles, banking crisis, barriers to entry, battle of ideas, Bear Stearns, Berlin Wall, bioinformatics, bitcoin, Boeing 747, Bonfire of the Vanities, bread and circuses, carbon tax, clean water, collective bargaining, Colonization of Mars, Credit Default Swap, CRISPR, crowdsourcing, cryptocurrency, Dava Sobel, demographic dividend, Deng Xiaoping, digital divide, Doha Development Round, double helix, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, epigenetics, experimental economics, Eyjafjallajökull, failed state, Fall of the Berlin Wall, financial innovation, full employment, Galaxy Zoo, general purpose technology, Glass-Steagall Act, global pandemic, global supply chain, Higgs boson, Hyperloop, immigration reform, income inequality, indoor plumbing, industrial cluster, industrial robot, information retrieval, information security, Intergovernmental Panel on Climate Change (IPCC), intermodal, Internet of things, invention of the printing press, Isaac Newton, Islamic Golden Age, Johannes Kepler, Khan Academy, Kickstarter, Large Hadron Collider, low cost airline, low skilled workers, Lyft, Mahbub ul Haq, Malacca Straits, mass immigration, Max Levchin, megacity, Mikhail Gorbachev, moral hazard, Nelson Mandela, Network effects, New Urbanism, non-tariff barriers, Occupy movement, On the Revolutions of the Heavenly Spheres, open economy, Panamax, Paris climate accords, Pearl River Delta, personalized medicine, Peter Thiel, post-Panamax, profit motive, public intellectual, quantum cryptography, rent-seeking, reshoring, Robert Gordon, Robert Metcalfe, Search for Extraterrestrial Intelligence, Second Machine Age, self-driving car, Shenzhen was a fishing village, Silicon Valley, Silicon Valley startup, Skype, smart grid, Snapchat, special economic zone, spice trade, statistical model, Stephen Hawking, Steve Jobs, Stuxnet, synthetic biology, TED Talk, The Future of Employment, too big to fail, trade liberalization, trade route, transaction costs, transatlantic slave trade, uber lyft, undersea cable, uranium enrichment, We are the 99%, We wanted flying cars, instead we got 140 characters, working poor, working-age population, zero day

Dwyer, Terence, PhD. (2015, October 1). “The Present State of Medical Science.” Interviewed by C. Kutarna, University of Oxford. 9. National Human Genome Research Institute (1998). “Twenty Questions about DNA Sequencing (and the Answers).” NHGRI. Retrieved from community.dur.ac.uk/biosci.bizhub/Bioinformatics/twenty_questions_about_DNA.htm. 10. Rincon, Paul (2014, January 15). “Science Enters $1,000 Genome Era.” BBC News. Retrieved from www.bbc.co.uk. 11. Regalado, Antonio (2014, September 24). “Emtech: Illumina Says 228,000 Human Genomes Will Be Sequenced This Year.” MIT Technology Review.


pages: 470 words: 109,589

Apache Solr 3 Enterprise Search Server by Unknown

bioinformatics, business logic, continuous integration, database schema, en.wikipedia.org, fault tolerance, Firefox, full text search, functional programming, information retrieval, natural language processing, performance metric, platform as a service, Ruby on Rails, SQL injection, Wayback Machine, web application

Without you, I wouldn't have this wonderful open source project to be so incredibly proud to be a part of! I look forward to meeting more of you at the next LuceneRevolution or Euro Lucene conference.

About the Reviewers

Jerome Eteve holds an MSc in IT and Sciences from the University of Lille (France). After starting his career in the field of bioinformatics, where he worked as a Biological Data Management and Analysis Consultant, he's now a Senior Application Developer with interests ranging from architecture to delivering a great user experience online. He's passionate about open source technologies, search engines, and web application architecture.


pages: 523 words: 112,185

Doing Data Science: Straight Talk From the Frontline by Cathy O'Neil, Rachel Schutt

Amazon Mechanical Turk, augmented reality, Augustin-Louis Cauchy, barriers to entry, Bayesian statistics, bike sharing, bioinformatics, computer vision, confounding variable, correlation does not imply causation, crowdsourcing, data science, distributed generation, Dunning–Kruger effect, Edward Snowden, Emanuel Derman, fault tolerance, Filter Bubble, finite state, Firefox, game design, Google Glasses, index card, information retrieval, iterative process, John Harrison: Longitude, Khan Academy, Kickstarter, machine translation, Mars Rover, Nate Silver, natural language processing, Netflix Prize, p-value, pattern recognition, performance metric, personalized medicine, pull request, recommendation engine, rent-seeking, selection bias, Silicon Valley, speech recognition, statistical model, stochastic process, tacit knowledge, text mining, the scientific method, The Wisdom of Crowds, Watson beat the top human players on Jeopardy!, X Prize

What people might not know is that the “datafication” of our offline behavior has started as well, mirroring the online data collection revolution (more on this later). Put the two together, and there’s a lot to learn about our behavior and, by extension, who we are as a species. It’s not just Internet data, though—it’s finance, the medical industry, pharmaceuticals, bioinformatics, social welfare, government, education, retail, and the list goes on. There is a growing influence of data in most sectors and most industries. In some cases, the amount of data collected might be enough to be considered “big” (more on this in the next chapter); in other cases, it’s not. But it’s not only the massiveness that makes all this new data interesting (or poses challenges).


Succeeding With AI: How to Make AI Work for Your Business by Veljko Krunic

AI winter, Albert Einstein, algorithmic trading, AlphaGo, Amazon Web Services, anti-fragile, anti-pattern, artificial general intelligence, autonomous vehicles, Bayesian statistics, bioinformatics, Black Swan, Boeing 737 MAX, business process, cloud computing, commoditize, computer vision, correlation coefficient, data is the new oil, data science, deep learning, DeepMind, en.wikipedia.org, fail fast, Gini coefficient, high net worth, information retrieval, Internet of things, iterative process, job automation, Lean Startup, license plate recognition, minimum viable product, natural language processing, recommendation engine, self-driving car, sentiment analysis, Silicon Valley, six sigma, smart cities, speech recognition, statistical model, strong AI, tail risk, The Design of Experiments, the scientific method, web application, zero-sum game

Similar to many other areas that have captured the popular imagination, it’s not universally agreed what all the fields are that are a part of data science. Some of the fields that are often considered part of data science include statistics, programming, mathematics, machine learning, operational research, and others [66]. Closely related fields that are sometimes considered part of data science include bioinformatics and quantitative analysis. While AI and data science closely overlap, they aren’t identical, because AI includes fields such as robotics, which are traditionally not considered part of data science. Harris, Murphy, and Vaisman’s book [66] provides a good summary of the state of data science before the advancement of deep learning.


pages: 399 words: 118,576

Ageless: The New Science of Getting Older Without Getting Old by Andrew Steele

Alfred Russel Wallace, assortative mating, bioinformatics, caloric restriction, caloric restriction, clockwatching, coronavirus, correlation does not imply causation, COVID-19, CRISPR, dark matter, deep learning, discovery of penicillin, double helix, Easter island, epigenetics, Hans Rosling, Helicobacter pylori, life extension, lone genius, megastructure, meta-analysis, microbiome, mouse model, parabiotic, Peter Thiel, phenotype, precautionary principle, radical life extension, randomized controlled trial, Silicon Valley, stealth mode startup, stem cell, TED Talk, zero-sum game

Any errors or omissions are my own. I would like to thank the Francis Crick Institute for allowing me to continue as a visiting researcher, allowing me to retain access to the scientific literature which underpins this book, in particular to Nick Luscombe for giving a physicist a chance to work in biology, and to the whole Bioinformatics and Computational Biology Lab for helping give me the grounding without which I would not have been able to write it. I am also hugely indebted to my editors, Alexis Kirschbaum, Kristine Puopolo and Jasmine Horsey, for their faith in my writing, for finding the book you’ve just read hidden in my first draft and for making the editing process thoroughly enjoyable.


pages: 370 words: 112,809

The Equality Machine: Harnessing Digital Technology for a Brighter, More Inclusive Future by Orly Lobel

2021 United States Capitol attack, 23andMe, Ada Lovelace, affirmative action, Airbnb, airport security, Albert Einstein, algorithmic bias, Amazon Mechanical Turk, augmented reality, barriers to entry, basic income, Big Tech, bioinformatics, Black Lives Matter, Boston Dynamics, Charles Babbage, choice architecture, computer vision, Computing Machinery and Intelligence, contact tracing, coronavirus, corporate social responsibility, correlation does not imply causation, COVID-19, crowdsourcing, data science, David Attenborough, David Heinemeier Hansson, deep learning, deepfake, digital divide, digital map, Elon Musk, emotional labour, equal pay for equal work, feminist movement, Filter Bubble, game design, gender pay gap, George Floyd, gig economy, glass ceiling, global pandemic, Google Chrome, Grace Hopper, income inequality, index fund, information asymmetry, Internet of things, invisible hand, it's over 9,000, iterative process, job automation, Lao Tzu, large language model, lockdown, machine readable, machine translation, Mark Zuckerberg, market bubble, microaggression, Moneyball by Michael Lewis explains big data, natural language processing, Netflix Prize, Network effects, Northpointe / Correctional Offender Management Profiling for Alternative Sanctions, occupational segregation, old-boy network, OpenAI, openstreetmap, paperclip maximiser, pattern recognition, performance metric, personalized medicine, price discrimination, publish or perish, QR code, randomized controlled trial, remote working, risk tolerance, robot derives from the Czech word robota Czech, meaning slave, Ronald Coase, Salesforce, self-driving car, sharing economy, Sheryl Sandberg, Silicon Valley, social distancing, social intelligence, speech recognition, statistical model, stem cell, Stephen Hawking, Steve Jobs, Steve Wozniak, surveillance capitalism, tech worker, TechCrunch disrupt, The Future of Employment, TikTok, Turing test, universal basic income, Wall-E, warehouse automation, women in the workforce, work culture , you are the product

Illiberal countries have used facial recognition and other technologies to surveil minorities, to control speech, and to rapidly extract immense amounts of behavioral, biometric, and genetic information. Indeed, in many ways the race is skewed in favor of less democratic and more authoritarian countries, which can mandate disclosures of bioinformatics, for example, and do not have the same privacy safeguards in place that slow down data collection and experimentation. To be sure, the same technology can serve to support and to surveil, to learn and to manipulate, to heal and to harm, to detect and to conceal, to equalize and to exclude. The silicon curtain is the new term to describe the barriers to the transfer of technology between China and the West.


pages: 476 words: 120,892

Life on the Edge: The Coming of Age of Quantum Biology by Johnjoe McFadden, Jim Al-Khalili

agricultural Revolution, Albert Einstein, Alfred Russel Wallace, bioinformatics, Bletchley Park, complexity theory, dematerialisation, double helix, Douglas Hofstadter, Drosophila, Ernest Rutherford, Gregor Mendel, Gödel, Escher, Bach, invention of the printing press, Isaac Newton, James Watt: steam engine, Late Heavy Bombardment, Louis Pasteur, Medieval Warm Period, New Journalism, phenotype, quantum entanglement, Richard Feynman, Schrödinger's Cat, seminal paper, synthetic biology, theory of mind, traveling salesman, uranium enrichment, Zeno's paradox

Olsson, “Increased transcription levels induce higher mutation rates in a hypermutating cell line,” Journal of Immunology, vol. 166: 8 (2001), pp. 5051–7. 8 P. Cui, F. Ding, Q. Lin, L. Zhang, A. Li, Z. Zhang, S. Hu and J. Yu, “Distinct contributions of replication and transcription to mutation rate variation of human genomes,” Genomics, Proteomics and Bioinformatics, vol. 10: 1 (2012), pp. 4–10. 9 J. Cairns, J. Overbaugh and S. Millar, “The origin of mutants,” Nature, vol. 335 (1988), pp. 142–5. 10 John Cairns on Jim Watson, Cold Spring Harbor Oral History Collection. Interview available at: http://library.cshl.edu/oralhistory/interview/james-d-watson/meeting-jim-watson/watson/. 11 J.


pages: 413 words: 119,587

Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots by John Markoff

A Declaration of the Independence of Cyberspace, AI winter, airport security, Andy Rubin, Apollo 11, Apple II, artificial general intelligence, Asilomar, augmented reality, autonomous vehicles, backpropagation, basic income, Baxter: Rethink Robotics, Bill Atkinson, Bill Duvall, bioinformatics, Boston Dynamics, Brewster Kahle, Burning Man, call centre, cellular automata, Charles Babbage, Chris Urmson, Claude Shannon: information theory, Clayton Christensen, clean water, cloud computing, cognitive load, collective bargaining, computer age, Computer Lib, computer vision, crowdsourcing, Danny Hillis, DARPA: Urban Challenge, data acquisition, Dean Kamen, deep learning, DeepMind, deskilling, Do you want to sell sugared water for the rest of your life?, don't be evil, Douglas Engelbart, Douglas Engelbart, Douglas Hofstadter, Dr. Strangelove, driverless car, dual-use technology, Dynabook, Edward Snowden, Elon Musk, Erik Brynjolfsson, Evgeny Morozov, factory automation, Fairchild Semiconductor, Fillmore Auditorium, San Francisco, From Mathematics to the Technologies of Life and Death, future of work, Galaxy Zoo, General Magic , Geoffrey Hinton, Google Glasses, Google X / Alphabet X, Grace Hopper, Gunnar Myrdal, Gödel, Escher, Bach, Hacker Ethic, Hans Moravec, haute couture, Herbert Marcuse, hive mind, hype cycle, hypertext link, indoor plumbing, industrial robot, information retrieval, Internet Archive, Internet of things, invention of the wheel, Ivan Sutherland, Jacques de Vaucanson, Jaron Lanier, Jeff Bezos, Jeff Hawkins, job automation, John Conway, John Markoff, John Maynard Keynes: Economic Possibilities for our Grandchildren, John Maynard Keynes: technological unemployment, John Perry Barlow, John von Neumann, Kaizen: continuous improvement, Kevin Kelly, Kiva Systems, knowledge worker, Kodak vs Instagram, labor-force participation, loose coupling, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, medical residency, Menlo Park, military-industrial complex, Mitch Kapor, Mother of all demos, natural language processing, Neil Armstrong, new economy, Norbert Wiener, PageRank, PalmPilot, pattern recognition, Philippa Foot, pre–internet, RAND corporation, Ray Kurzweil, reality distortion field, Recombinant DNA, Richard Stallman, Robert Gordon, Robert Solow, Rodney Brooks, Sand Hill Road, Second Machine Age, self-driving car, semantic web, Seymour Hersh, shareholder value, side project, Silicon Valley, Silicon Valley startup, Singularitarianism, skunkworks, Skype, social software, speech recognition, stealth mode startup, Stephen Hawking, Steve Ballmer, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, Strategic Defense Initiative, strong AI, superintelligent machines, tech worker, technological singularity, Ted Nelson, TED Talk, telemarketer, telepresence, telepresence robot, Tenerife airport disaster, The Coming Technological Singularity, the medium is the message, Thorstein Veblen, Tony Fadell, trolley problem, Turing test, Vannevar Bush, Vernor Vinge, warehouse automation, warehouse robotics, Watson beat the top human players on Jeopardy!, We are as Gods, Whole Earth Catalog, William Shockley: the traitorous eight, zero-sum game

Immediately after he read the message, two large men burst into his office and instructed him that it was essential he immediately accompany them to an undisclosed location in Woodside, the elite community populated by Silicon Valley’s technology executives and venture capitalists. This was Page’s surprise fortieth birthday party, orchestrated by his wife, Lucy Southworth, a Stanford bioinformatics Ph.D. A crowd of 150 people in appropriate alien-themed costumes had gathered, including Google cofounder Sergey Brin, who wore a dress. In the basement of the sprawling mansion where the party was held, a robot arm grabbed small boxes one at a time and gaily tossed the souvenirs to an appreciative crowd.


pages: 561 words: 120,899

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant From Two Centuries of Controversy by Sharon Bertsch McGrayne

Abraham Wald, Alan Greenspan, Bayesian statistics, bioinformatics, Bletchley Park, British Empire, classic study, Claude Shannon: information theory, Daniel Kahneman / Amos Tversky, data science, double helix, Dr. Strangelove, driverless car, Edmond Halley, Fellow of the Royal Society, full text search, government statistician, Henri Poincaré, Higgs boson, industrial research laboratory, Isaac Newton, Johannes Kepler, John Markoff, John Nash: game theory, John von Neumann, linear programming, longitudinal study, machine readable, machine translation, meta-analysis, Nate Silver, p-value, Pierre-Simon Laplace, placebo effect, prediction markets, RAND corporation, recommendation engine, Renaissance Technologies, Richard Feynman, Richard Feynman: Challenger O-ring, Robert Mercer, Ronald Reagan, seminal paper, speech recognition, statistical model, stochastic process, Suez canal 1869, Teledyne, the long tail, Thomas Bayes, Thomas Kuhn: the structure of scientific revolutions, traveling salesman, Turing machine, Turing test, uranium enrichment, We are all Keynesians now, Yom Kippur War

Ron Howard, who had become interested in Bayes while at Harvard, was working on Bayesian networks in Stanford’s economic engineering department. A medical student, David E. Heckerman, became interested too and for his Ph.D. dissertation wrote a program to help pathologists diagnose lymph node diseases. Computerized diagnostics had been tried but abandoned decades earlier. Heckerman’s Ph.D. in bioinformatics concerned medicine, but his software won a prestigious national award in 1990 from the Association for Computing Machinery, the professional organization for computing. Two years later, Heckerman went to Microsoft to work on Bayesian networks. The Food and Drug Administration (FDA) allows the manufacturers of medical devices to use Bayes in their final applications for FDA approval.


pages: 502 words: 124,794

Nexus by Ramez Naam

artificial general intelligence, bioinformatics, Brownian motion, crowdsourcing, Golden Gate Park, Great Leap Forward, hive mind, Ken Thompson, low earth orbit, mandatory minimum, Menlo Park, pattern recognition, the scientific method, upwardly mobile, VTOL

He was occupied at the moment, but would come by in a few hours. Niran hung up the phone, smiled to himself. It would be wonderful to see Thanom again.

35 ROOTS

"I wasn't born Samantha Cataranes. I was born Sarita Catalan. I grew up in southern California, in a little town near San Diego. My parents were Roberto and Anita. They both worked in bioinformatics, had met on the job. I had a sister, Ana." Sorrow welled up from her. Tears began to flow again, silently running down the side of her face. Kade felt troubled, concerned, empathic. He stroked her hair, sent kindness. "My parents were hippies. The kind of hippies who worked in tech but went camping with the family, had singalongs with friends.


Text Analytics With Python: A Practical Real-World Approach to Gaining Actionable Insights From Your Data by Dipanjan Sarkar

bioinformatics, business intelligence, business logic, computer vision, continuous integration, data science, deep learning, Dr. Strangelove, en.wikipedia.org, functional programming, general-purpose programming language, Guido van Rossum, information retrieval, Internet of things, invention of the printing press, iterative process, language acquisition, machine readable, machine translation, natural language processing, out of africa, performance metric, premature optimization, recommendation engine, self-driving car, semantic web, sentiment analysis, speech recognition, statistical model, text mining, Turing test, web application

Topic models are also often known as probabilistic statistical models, which use specific statistical techniques including singular value decomposition and latent Dirichlet allocation to discover connected latent semantic structures in text data that yield topics and concepts. They are used extensively in text analytics and even bioinformatics. Automated document summarization is the process of using a computer program or algorithm based on statistical and ML techniques to summarize a document or corpus of documents such that we obtain a short summary that captures all the essential concepts and themes of the original document or corpus.
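
As a rough illustration of the topic-modeling technique described above, here is a minimal Python sketch (not from the book) that fits a latent Dirichlet allocation model with scikit-learn on a tiny invented corpus and prints the top words of each discovered topic:

```python
# Minimal LDA topic-modeling sketch; the corpus and topic count are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "genes proteins sequence alignment genome",
    "genome sequencing reads assembly proteins",
    "stocks market trading prices volatility",
    "market prices interest rates trading",
]

vectorizer = CountVectorizer()            # bag-of-words term counts
X = vectorizer.fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {k}: {', '.join(top)}")
```

With a corpus this small the topics are only suggestive, but on real document collections the same pipeline surfaces coherent clusters of co-occurring terms.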


pages: 445 words: 129,068

The Speed of Dark by Elizabeth Moon

bioinformatics, gravity well, hiring and firing, industrial robot, life extension, theory of mind, We are as Gods

I believe God is important and does not make mistakes. My mother used to joke about God making mistakes, but I do not think if He is God He makes mistakes. So it is not a silly question. Do I want to be healed? And of what? The only self I know is this self, the person I am now, the autistic bioinformatics specialist fencer lover of Marjory. And I believe in his only begotten son, Jesus Christ, who actually in the flesh asked that question of the man by the pool. The man who perhaps—the story does not say—had gone there because people were tired of him being sick and disabled, who perhaps had been content to lie down all day, but he got in the way.


pages: 532 words: 139,706

Googled: The End of the World as We Know It by Ken Auletta

"World Economic Forum" Davos, 23andMe, AltaVista, An Inconvenient Truth, Andy Rubin, Anne Wojcicki, AOL-Time Warner, Apple's 1984 Super Bowl advert, Ben Horowitz, bioinformatics, Burning Man, carbon footprint, citizen journalism, Clayton Christensen, cloud computing, Colonization of Mars, commoditize, company town, corporate social responsibility, creative destruction, death of newspapers, digital rights, disintermediation, don't be evil, facts on the ground, Firefox, Frank Gehry, Google Earth, hypertext link, Innovator's Dilemma, Internet Archive, invention of the telephone, Jeff Bezos, jimmy wales, John Markoff, Kevin Kelly, knowledge worker, Larry Ellison, Long Term Capital Management, Marc Andreessen, Mark Zuckerberg, Marshall McLuhan, Mary Meeker, Menlo Park, Network effects, new economy, Nicholas Carr, PageRank, Paul Buchheit, Peter Thiel, Ralph Waldo Emerson, Richard Feynman, Sand Hill Road, Saturday Night Live, semantic web, sharing economy, Sheryl Sandberg, Silicon Valley, Skype, slashdot, social graph, spectrum auction, stealth mode startup, Stephen Hawking, Steve Ballmer, Steve Jobs, strikebreaker, Susan Wojcicki, systems thinking, telemarketer, the Cathedral and the Bazaar, the long tail, the scientific method, The Wisdom of Crowds, Tipper Gore, Upton Sinclair, vertical integration, X Prize, yield management, zero-sum game

Measured by growth, it was Google’s best year, with revenues soaring 60 percent to $16.6 billion, with international revenues contributing nearly half the total, and with profits climbing to $4.2 billion. Google ended the year with 16,805 full-time employees, offices in twenty countries, and the search engine available in 117 languages. And the year had been a personally happy one for Page and Brin. Page married Lucy Southworth, a former model who earned her Ph.D. in bioinformatics in January 2009 from Stanford; they married seven months after Brin wed Anne Wojcicki. But Sheryl Sandberg was worried. She had held a ranking job in the Clinton administration before joining Google in 2001, where she supervised all online sales for AdWords and AdSense, and was regularly hailed by Fortune magazine as one of the fifty most powerful female executives in America.


pages: 398 words: 31,161

Gnuplot in Action: Understanding Data With Graphs by Philipp Janert

bioinformatics, business intelligence, Debian, general-purpose programming language, iterative process, mandelbrot fractal, pattern recognition, power law, random walk, Richard Stallman, six sigma, sparse data, survivorship bias

Then, the project had to be:

- Free and open source
- Available for the Linux platform
- Active and mature
- Available as a standalone product and allowing interactive use (this requirement eliminates libraries and graphics command languages)
- Reasonably general purpose (this eliminates specialized tools for molecular modeling, bioinformatics, high-energy physics, and so on)
- Comparable to or going beyond gnuplot in at least some respects

Math and statistics programming environments: R

The R language and environment (www.r-project.org) are in many ways the de facto standard for statistical computing and graphics using open source tools.


pages: 458 words: 135,206

CTOs at Work by Scott Donaldson, Stanley Siegel, Gary Donaldson

Amazon Web Services, Andy Carvin, bioinformatics, business intelligence, business process, call centre, centre right, cloud computing, computer vision, connected car, crowdsourcing, data acquisition, distributed generation, do what you love, domain-specific language, functional programming, glass ceiling, Hacker News, hype cycle, Neil Armstrong, orbital mechanics / astrodynamics, pattern recognition, Pluto: dwarf planet, QR code, Richard Feynman, Ruby on Rails, Salesforce, shareholder value, Silicon Valley, Skype, smart grid, smart meter, software patent, systems thinking, thinkpad, web application, zero day, zero-sum game

With a teammate we developed a brand-new type of biological sensor that we called “TIGER” (Threat ID through Genetic Evaluation of Risk). That technology won The Wall Street Journal “gold” Technology Innovation Award in 2009 for the best invention of the year. It relies on a combination of advanced biotech hardware with groundbreaking bio-informatics techniques that were based on our radar signal processing expertise. Information from a sensor like that can feed into our epidemiology and disease tracking work. That's an example of a sensor at the front end through information flow at the back end. In the cyber security domain, our subsidiary, CloudShield, has a very special piece of hardware that enables real-time, deep packet inspection of network traffic at network line speeds, and that allows you to find cyber threats embedded in the traffic.


pages: 486 words: 132,784

Inventors at Work: The Minds and Motivation Behind Modern Inventions by Brett Stern

Apple II, augmented reality, autonomous vehicles, bioinformatics, Build a better mousetrap, business process, cloud computing, computer vision, cyber-physical system, distributed generation, driverless car, game design, Grace Hopper, human-factors engineering, Richard Feynman, Silicon Valley, skunkworks, Skype, smart transportation, speech recognition, statistical model, stealth mode startup, Steve Jobs, Steve Wozniak, the market place, value engineering, Yogi Berra

Dougherty: Oftentimes, inventors who are prosecuting their application pro se are unaware that they may ask the examiner for assistance in drafting allowable claims if there is allowable subject matter in the written disclosure. The examiner’s function is to allow valid patents. So, they will help the inventor come to an allowable subject matter if it exists in the application. Stern: Which technologies or fields exhibit high-growth trends in terms of patents? Calvert: One area that is going to be big is bioinformatics, which is biology and computer software working together. Dougherty: Medical device art is a high-growth area, too. People are living longer and they’re seeking to reduce costs for an enhanced life. Devices are getting smaller. Nanotechnology is already enabling medical devices, for example, that can travel through your bloodstream, collecting and reporting medical data in real time.


pages: 660 words: 141,595

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking by Foster Provost, Tom Fawcett

Albert Einstein, Amazon Mechanical Turk, Apollo 13, big data - Walmart - Pop Tarts, bioinformatics, business process, call centre, chief data officer, Claude Shannon: information theory, computer vision, conceptual framework, correlation does not imply causation, crowdsourcing, data acquisition, data science, David Brooks, en.wikipedia.org, Erik Brynjolfsson, Gini coefficient, Helicobacter pylori, independent contractor, information retrieval, intangible asset, iterative process, Johann Wolfgang von Goethe, Louis Pasteur, Menlo Park, Nate Silver, Netflix Prize, new economy, p-value, pattern recognition, placebo effect, price discrimination, recommendation engine, Ronald Coase, selection bias, Silicon Valley, Skype, SoftBank, speech recognition, Steve Jobs, supply-chain management, systems thinking, Teledyne, text mining, the long tail, The Signal and the Noise by Nate Silver, Thomas Bayes, transaction costs, WikiLeaks

Neural networks for credit scoring. In Goonatilake, S., & Treleaven, P. (Eds.), Intelligent Systems for Finance and Business, pp. 61–69. John Wiley and Sons Ltd., West Sussex, England. Letunic, & Bork (2006). Interactive tree of life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics, 23 (1). Lin, J.-H., & Vitter, J. S. (1994). A theory for memory-based learning. Machine Learning, 17, 143–167. Lloyd, S. P. (1982). Least square quantization in PCM. IEEE Transactions on Information Theory, 28 (2), 129–137. MacKay, D. (2003). Information Theory, Inference and Learning Algorithms, Chapter 20.


pages: 476 words: 148,895

Cooked: A Natural History of Transformation by Michael Pollan

biofilm, bioinformatics, Columbian Exchange, correlation does not imply causation, creative destruction, dematerialisation, Drosophila, energy security, Gary Taubes, Helicobacter pylori, Hernando de Soto, hygiene hypothesis, Kickstarter, Louis Pasteur, Mason jar, microbiome, off-the-grid, peak oil, pneumatic tube, Ralph Waldo Emerson, Steven Pinker, women in the workforce

European Molecular Biology Organization, Vol. 7, No. 10, 2006. Bravo, Javier A., et al. “Ingestion of Lactobacillus Strain Regulates Emotional Behavior and Central GABA Receptor Expression in a Mouse Via the Vagus Nerve.” www.pnas.org/cgi/doi/10.1073/pnas.1102999108. Desiere, Frank, et al. “Bioinformatics and Data Knowledge: The New Frontiers for Nutrition and Food.” Trends in Food Science & Technology 12 (2002): 215–29. Douwes, J., et al. “Farm Exposure in Utero May Protect Against Asthma.” European Respiratory Journal 32 (2008): 603–11. Ege, M.J., et al. Parsifal study team. “Prenatal Farm Exposure Is Related to the Expression of Receptors of the Innate Immunity and to Atopic Sensitization in School-Age Children.”


pages: 339 words: 57,031

From Counterculture to Cyberculture: Stewart Brand, the Whole Earth Network, and the Rise of Digital Utopianism by Fred Turner

"World Economic Forum" Davos, 1960s counterculture, A Declaration of the Independence of Cyberspace, Alan Greenspan, Alvin Toffler, Apple's 1984 Super Bowl advert, back-to-the-land, Bill Atkinson, bioinformatics, Biosphere 2, book value, Buckminster Fuller, business cycle, Californian Ideology, classic study, Claude Shannon: information theory, complexity theory, computer age, Computer Lib, conceptual framework, Danny Hillis, dematerialisation, distributed generation, Douglas Engelbart, Douglas Engelbart, Dr. Strangelove, Dynabook, Electric Kool-Aid Acid Test, Fairchild Semiconductor, Ford Model T, From Mathematics to the Technologies of Life and Death, future of work, Future Shock, game design, George Gilder, global village, Golden Gate Park, Hacker Conference 1984, Hacker Ethic, Haight Ashbury, Herbert Marcuse, Herman Kahn, hive mind, Howard Rheingold, informal economy, intentional community, invisible hand, Ivan Sutherland, Jaron Lanier, John Gilmore, John Markoff, John Perry Barlow, John von Neumann, Kevin Kelly, knowledge economy, knowledge worker, Lewis Mumford, market bubble, Marshall McLuhan, mass immigration, means of production, Menlo Park, military-industrial complex, Mitch Kapor, Mondo 2000, Mother of all demos, new economy, Norbert Wiener, peer-to-peer, post-industrial society, postindustrial economy, Productivity paradox, QWERTY keyboard, Ralph Waldo Emerson, RAND corporation, reality distortion field, Richard Stallman, Robert Shiller, Ronald Reagan, Shoshana Zuboff, Silicon Valley, Silicon Valley ideology, South of Market, San Francisco, Steve Jobs, Steve Wozniak, Steven Levy, Stewart Brand, systems thinking, technoutopianism, Ted Nelson, Telecommunications Act of 1996, The Hackers Conference, the strength of weak ties, theory of mind, urban renewal, Vannevar Bush, We are as Gods, Whole Earth Catalog, Whole Earth Review, Yom Kippur War

Like the scientists and technicians of the Rad Lab and Los Alamos in World War II, the contributors to the first Artificial Life Conference quickly established an intellectual trading zone. Specialists in robotics presented papers on questions of cultural evolution; computer scientists used new algorithms to model seemingly biological patterns of growth; bioinformatics specialists applied what they believed to be principles of natural ecologies to the development of social structures. For these scientists, as formerly for members of the Rad Lab and the cold war research institutes that followed it, systems theory served as a contact language and computers served as key supports for a systems orientation toward interdisciplinary work.


pages: 772 words: 150,109

As Gods: A Moral History of the Genetic Age by Matthew Cobb

"World Economic Forum" Davos, Apollo 11, Asilomar, bioinformatics, Black Lives Matter, Build a better mousetrap, clean water, coronavirus, COVID-19, CRISPR, cryptocurrency, cuban missile crisis, double helix, Dr. Strangelove, Drosophila, Electric Kool-Aid Acid Test, Fellow of the Royal Society, Food sovereignty, global pandemic, Gordon Gekko, greed is good, Higgs boson, lab leak, mega-rich, military-industrial complex, Nelson Mandela, offshore financial centre, out of africa, planetary scale, precautionary principle, profit motive, Project Plowshare, QR code, Ralph Waldo Emerson, Recombinant DNA, Richard Feynman, Ronald Reagan, Scientific racism, Silicon Valley, Skype, stem cell, Steve Jobs, Steve Wozniak, Steven Pinker, Stewart Brand, synthetic biology, tacit knowledge, Thomas Kuhn: the structure of scientific revolutions, Wayback Machine, We are as Gods, Whole Earth Catalog

The data point to a natural spillover event such as we have seen in the past and will almost certainly see again.112 The long, painstaking research that was required before the bat origin of SARS was identified explains why there was no immediate agreement on which animal species was the original host of SARS-CoV-2 – such things take a long time even in the absence of a global pandemic.113 One possible solution to concerns about identifying manipulated pathogens, and indeed a potential resolution to some of the more outlandish speculation about the origin of SARS-CoV-2, may lie in the use of genetic engineering forensics – complex bioinformatic analyses – to determine whether an organism involved in a disease outbreak has been genetically modified and, if so, to infer its likely origin. This work is in its infancy, but a network of laboratories, RefBio, has recently been set up under the auspices of the United Nations to gather sequence data from future events.114 ✴ Throughout the half-century history of genetic engineering there have been persistent concerns that the apparent simplicity of the methods involved might enable terrorists or biohackers to replicate experiments with potentially disastrous results.


pages: 855 words: 178,507

The Information: A History, a Theory, a Flood by James Gleick

Ada Lovelace, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, Albert Einstein, AltaVista, bank run, bioinformatics, Bletchley Park, Brownian motion, butterfly effect, Charles Babbage, citation needed, classic study, Claude Shannon: information theory, clockwork universe, computer age, Computing Machinery and Intelligence, conceptual framework, crowdsourcing, death of newspapers, discovery of DNA, Donald Knuth, double helix, Douglas Hofstadter, en.wikipedia.org, Eratosthenes, Fellow of the Royal Society, Gregor Mendel, Gödel, Escher, Bach, Henri Poincaré, Honoré de Balzac, index card, informal economy, information retrieval, invention of the printing press, invention of writing, Isaac Newton, Jacquard loom, Jaron Lanier, jimmy wales, Johannes Kepler, John von Neumann, Joseph-Marie Jacquard, Lewis Mumford, lifelogging, Louis Daguerre, machine translation, Marshall McLuhan, Menlo Park, microbiome, Milgram experiment, Network effects, New Journalism, Norbert Wiener, Norman Macrae, On the Economy of Machinery and Manufactures, PageRank, pattern recognition, phenotype, Pierre-Simon Laplace, pre–internet, quantum cryptography, Ralph Waldo Emerson, RAND corporation, reversible computing, Richard Feynman, Rubik’s Cube, Simon Singh, Socratic dialogue, Stephen Hawking, Steven Pinker, stochastic process, talking drums, the High Line, The Wisdom of Crowds, transcontinental railway, Turing machine, Turing test, women in the workforce, yottabyte

The “jumping the shark” entry in Wikipedia advised in 2009, “See also: jumping the couch; nuking the fridge.” Is this science? In his 1983 column, Hofstadter proposed the obvious memetic label for such a discipline: memetics. The study of memes has attracted researchers from fields as far apart as computer science and microbiology. In bioinformatics, chain letters are an object of study. They are memes; they have evolutionary histories. The very purpose of a chain letter is replication; whatever else a chain letter may say, it embodies one message: Copy me. One student of chain-letter evolution, Daniel W. VanArsdale, listed many variants, in chain letters and even earlier texts: “Make seven copies of it exactly as it is written” [1902]; “Copy this in full and send to nine friends” [1923]; “And if any man shall take away from the words of the book of this prophecy, God shall take away his part out of the book of life” [Revelation 22:19].♦ Chain letters flourished with the help of a new nineteenth-century technology: “carbonic paper,” sandwiched between sheets of writing paper in stacks.


pages: 504 words: 89,238

Natural language processing with Python by Steven Bird, Ewan Klein, Edward Loper

bioinformatics, business intelligence, business logic, Computing Machinery and Intelligence, conceptual framework, Donald Knuth, duck typing, elephant in my pajamas, en.wikipedia.org, finite state, Firefox, functional programming, Guido van Rossum, higher-order functions, information retrieval, language acquisition, lolcat, machine translation, Menlo Park, natural language processing, P = NP, search inside the book, sparse data, speech recognition, statistical model, text mining, Turing test, W. E. B. Du Bois

[Heim and Kratzer, 1998] Irene Heim and Angelika Kratzer. Semantics in Generative Grammar. Blackwell, 1998. [Hirschman et al., 2005] Lynette Hirschman, Alexander Yeh, Christian Blaschke, and Alfonso Valencia. Overview of BioCreAtIvE: critical assessment of information extraction for biology. BMC Bioinformatics, 6, May 2005. Supplement 1. [Hodges, 1977] Wilfred Hodges. Logic. Penguin Books, Harmondsworth, 1977. [Huddleston and Pullum, 2002] Rodney D. Huddleston and Geoffrey K. Pullum. The Cambridge Grammar of the English Language. Cambridge University Press, 2002. [Hunt and Thomas, 2000] Andrew Hunt and David Thomas.


pages: 574 words: 164,509

Superintelligence: Paths, Dangers, Strategies by Nick Bostrom

agricultural Revolution, AI winter, Albert Einstein, algorithmic trading, anthropic principle, Anthropocene, anti-communist, artificial general intelligence, autism spectrum disorder, autonomous vehicles, backpropagation, barriers to entry, Bayesian statistics, bioinformatics, brain emulation, cloud computing, combinatorial explosion, computer vision, Computing Machinery and Intelligence, cosmological constant, dark matter, DARPA: Urban Challenge, data acquisition, delayed gratification, Demis Hassabis, demographic transition, different worldview, Donald Knuth, Douglas Hofstadter, driverless car, Drosophila, Elon Musk, en.wikipedia.org, endogenous growth, epigenetics, fear of failure, Flash crash, Flynn Effect, friendly AI, general purpose technology, Geoffrey Hinton, Gödel, Escher, Bach, hallucination problem, Hans Moravec, income inequality, industrial robot, informal economy, information retrieval, interchangeable parts, iterative process, job automation, John Markoff, John von Neumann, knowledge worker, Large Hadron Collider, longitudinal study, machine translation, megaproject, Menlo Park, meta-analysis, mutually assured destruction, Nash equilibrium, Netflix Prize, new economy, Nick Bostrom, Norbert Wiener, NP-complete, nuclear winter, operational security, optical character recognition, paperclip maximiser, pattern recognition, performance metric, phenotype, prediction markets, price stability, principal–agent problem, race to the bottom, random walk, Ray Kurzweil, recommendation engine, reversible computing, search costs, social graph, speech recognition, Stanislav Petrov, statistical model, stem cell, Stephen Hawking, Strategic Defense Initiative, strong AI, superintelligent machines, supervolcano, synthetic biology, technological singularity, technoutopianism, The Coming Technological Singularity, The Nature of the Firm, Thomas Kuhn: the structure of scientific revolutions, time dilation, Tragedy of the Commons, transaction costs, trolley problem, Turing machine, Vernor Vinge, WarGames: Global Thermonuclear War, Watson beat the top human players on Jeopardy!, World Values Survey, zero-sum game

Advances in Monte Carlo approximation techniques, for example, are directly applied in computer vision, robotics, and computational genetics. Another advantage is that it lets researchers from different disciplines more easily pool their findings. Graphical models and Bayesian statistics have become a shared focus of research in many fields, including machine learning, statistical physics, bioinformatics, combinatorial optimization, and communication theory.35 A fair amount of the recent progress in machine learning has resulted from incorporating formal results originally derived in other academic fields. (Machine learning applications have also benefitted enormously from faster computers and greater availability of large data sets


pages: 741 words: 164,057

Editing Humanity: The CRISPR Revolution and the New Era of Genome Editing by Kevin Davies

23andMe, Airbnb, Anne Wojcicki, Apple's 1984 Super Bowl advert, Asilomar, bioinformatics, California gold rush, clean water, coronavirus, COVID-19, CRISPR, crowdsourcing, discovery of DNA, disinformation, Doomsday Clock, double helix, Downton Abbey, Drosophila, Edward Jenner, Elon Musk, epigenetics, fake news, Gregor Mendel, Hacker News, high-speed rail, hype cycle, imposter syndrome, Isaac Newton, John von Neumann, Kickstarter, life extension, Mark Zuckerberg, microbiome, Mikhail Gorbachev, mouse model, Neil Armstrong, New Journalism, ocean acidification, off-the-grid, personalized medicine, Peter Thiel, phenotype, QWERTY keyboard, radical life extension, RAND corporation, Recombinant DNA, rolodex, scientific mainstream, Scientific racism, seminal paper, Shenzhen was a fishing village, side project, Silicon Valley, Silicon Valley billionaire, Skype, social distancing, stem cell, Stephen Hawking, Steve Jobs, Steven Pinker, Stewart Brand, synthetic biology, TED Talk, the long tail, Thomas Kuhn: the structure of scientific revolutions, Thomas Malthus, traumatic brain injury, warehouse automation

In 2002, Eugene Koonin, a Russian expat computational biologist at the National Center for Biotechnology Information at the NIH, and his colleague Kira Makarova, described a series of bacterial genes they suspected to be part of a DNA repair system.10 What they didn’t realize was that these genes were sitting adjacent to the CRISPR array and—as we shall soon see—play an essential role in the function of CRISPR and gene editing. * * * After a few years working in Oxford, Mojica returned to Alicante in 1997 to set up his own group. With little funding, Mojica tried to do some very cheap experiments, “even though I had no idea about bioinformatics.” The nagging question was the origin of the spacer DNA, the sequences interspersed between the repeats. “The easiest thing is to look at the databases and expect that something comes out, but we didn’t get anything—until 2003.” By now, the DNA databases were bursting with bacterial and archaea genomes, many of which carried versions of these repeats.


pages: 821 words: 178,631

The Rust Programming Language by Steve Klabnik, Carol Nichols

anti-pattern, billion-dollar mistake, bioinformatics, business logic, business process, cryptocurrency, data science, DevOps, duck typing, Firefox, functional programming, Internet of things, iterative process, pull request, reproducible builds, Ruby on Rails, type inference

Through efforts such as this book, the Rust teams want to make systems concepts more accessible to more people, especially those new to programming.

Companies

Hundreds of companies, large and small, use Rust in production for a variety of tasks. Those tasks include command line tools, web services, DevOps tooling, embedded devices, audio and video analysis and transcoding, cryptocurrencies, bioinformatics, search engines, Internet of Things applications, machine learning, and even major parts of the Firefox web browser.

Open Source Developers

Rust is for people who want to build the Rust programming language, community, developer tools, and libraries. We’d love to have you contribute to the Rust language.


pages: 1,331 words: 183,137

Programming Rust: Fast, Safe Systems Development by Jim Blandy, Jason Orendorff

bioinformatics, bitcoin, Donald Knuth, duck typing, Elon Musk, Firefox, fizzbuzz, functional programming, mandelbrot fractal, Morris worm, MVC pattern, natural language processing, reproducible builds, side project, sorting algorithm, speech recognition, Turing test, type inference, WebSocket

We’ll also cover a wide range of topics that come up naturally as your project grows, including how to document and test Rust code, how to silence unwanted compiler warnings, how to use Cargo to manage project dependencies and versioning, how to publish open source libraries on crates.io, and more.

Crates

Rust programs are made of crates. Each crate is a Rust project: all the source code for a single library or executable, plus any associated tests, examples, tools, configuration, and other junk. For your fern simulator, you might use third-party libraries for 3D graphics, bioinformatics, parallel computation, and so on. These libraries are distributed as crates (see Figure 8-1).

Figure 8-1. A crate and its dependencies

The easiest way to see what crates are and how they work together is to use cargo build with the --verbose flag to build an existing project that has some dependencies.
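
As a rough sketch of the dependency mechanism described above (the package name and dependency below are illustrative placeholders, not taken from the book), a crate declares what it depends on in its Cargo.toml manifest:

```toml
# Hypothetical manifest for a fern-simulator crate; the dependency and
# version are placeholders for whatever third-party crates the project needs.
[package]
name = "fern_sim"
version = "0.1.0"
edition = "2021"

[dependencies]
rand = "0.8"   # an example third-party crate fetched from crates.io
```

Running cargo build --verbose in such a project shows Cargo compiling each dependency crate, with the exact compiler invocations, before building the crate's own code.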


pages: 612 words: 187,431

The Art of UNIX Programming by Eric S. Raymond

A Pattern Language, Albert Einstein, Apple Newton, barriers to entry, bioinformatics, Boeing 747, Clayton Christensen, combinatorial explosion, commoditize, Compatible Time-Sharing System, correlation coefficient, David Brooks, Debian, Dennis Ritchie, domain-specific language, don't repeat yourself, Donald Knuth, end-to-end encryption, Everything should be made as simple as possible, facts on the ground, finite state, Free Software Foundation, general-purpose programming language, George Santayana, history of Unix, Innovator's Dilemma, job automation, Ken Thompson, Larry Wall, level 1 cache, machine readable, macro virus, Multics, MVC pattern, Neal Stephenson, no silver bullet, OSI model, pattern recognition, Paul Graham, peer-to-peer, premature optimization, pre–internet, publish or perish, revision control, RFC: Request For Comment, Richard Stallman, Robert Metcalfe, Steven Levy, the Cathedral and the Bazaar, transaction costs, Turing complete, Valgrind, wage slave, web application

XHTML, the latest version of HTML, is also an XML application described by a DTD, which explains the family resemblance between XHTML and DocBook tags. The XHTML toolchain consists of Web browsers that can format HTML as flat ASCII, together with any of a number of ad-hoc HTML-to-print utilities. Many other XML DTDs are maintained to help people exchange structured information in fields as diverse as bioinformatics and banking. You can look at a list of repositories to get some idea of the variety available.

The DocBook Toolchain

Normally, what you'll do to make XHTML from your DocBook sources is use the xmlto(1) front end. Your commands will look like this:

bash$ xmlto xhtml foo.xml
bash$ ls *.html
ar01s02.html ar01s03.html ar01s04.html index.html

In this example, you converted an XML-DocBook document named foo.xml with three top-level sections into an index page and two parts.


pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, business logic, combinatorial explosion, data science, database schema, Debian, domain-specific language, en.wikipedia.org, exponential backoff, fallacies of distributed computing, fault tolerance, full text search, functional programming, Grace Hopper, information retrieval, Internet Archive, Kickstarter, Large Hadron Collider, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, sparse data, web application

This is a good example where both SQL and MapReduce are required for solving the end user problem and something that is possible to achieve easily with Hive.

Data analysis

Hive and Hadoop can be easily used for training and scoring for data analysis applications. These data analysis applications can span multiple domains such as popular websites, bioinformatics companies, and oil exploration companies. A typical example of such an application in the online ad network industry would be the prediction of what features of an ad make it more likely to be noticed by the user. The training phase typically would involve identifying the response metric and the predictive features.


pages: 1,201 words: 233,519

Coders at Work by Peter Seibel

Ada Lovelace, Bill Atkinson, bioinformatics, Bletchley Park, Charles Babbage, cloud computing, Compatible Time-Sharing System, Conway's Game of Life, Dennis Ritchie, domain-specific language, don't repeat yourself, Donald Knuth, fallacies of distributed computing, fault tolerance, Fermat's Last Theorem, Firefox, Free Software Foundation, functional programming, George Gilder, glass ceiling, Guido van Rossum, history of Unix, HyperCard, industrial research laboratory, information retrieval, Ken Thompson, L Peter Deutsch, Larry Wall, loose coupling, Marc Andreessen, Menlo Park, Metcalfe's law, Multics, no silver bullet, Perl 6, premature optimization, publish or perish, random walk, revision control, Richard Stallman, rolodex, Ruby on Rails, Saturday Night Live, side project, slashdot, speech recognition, systems thinking, the scientific method, Therac-25, Turing complete, Turing machine, Turing test, type inference, Valgrind, web application

But we have to be willing to try and take advantage of that, but also take advantage of the integration of systems and the fact that data's coming from everywhere. It's no longer encapsulated with the program, the code. We're seeing now, I think, vast amounts of data, which is accessible. And it's numeric data as well as the informational kinds of data, and will be stored all over the globe, especially if you're working in some of the bioinformatics kind of stuff. And we have to be able to create a platform, probably composed of a lot of parts, which is going to enable those things to come together—computational capability that is probably quite different than we have now. And we also need to, sooner or later, address usability and integrity of these systems.


pages: 798 words: 240,182

The Transhumanist Reader by Max More, Natasha Vita-More

"World Economic Forum" Davos, 23andMe, Any sufficiently advanced technology is indistinguishable from magic, artificial general intelligence, augmented reality, Bill Joy: nanobots, bioinformatics, brain emulation, Buckminster Fuller, cellular automata, clean water, cloud computing, cognitive bias, cognitive dissonance, combinatorial explosion, Computing Machinery and Intelligence, conceptual framework, Conway's Game of Life, cosmological principle, data acquisition, discovery of DNA, Douglas Engelbart, Drosophila, en.wikipedia.org, endogenous growth, experimental subject, Extropian, fault tolerance, Flynn Effect, Francis Fukuyama: the end of history, Frank Gehry, friendly AI, Future Shock, game design, germ theory of disease, Hans Moravec, hypertext link, impulse control, index fund, John von Neumann, joint-stock company, Kevin Kelly, Law of Accelerating Returns, life extension, lifelogging, Louis Pasteur, Menlo Park, meta-analysis, moral hazard, Network effects, Nick Bostrom, Norbert Wiener, pattern recognition, Pepto Bismol, phenotype, positional goods, power law, precautionary principle, prediction markets, presumed consent, Project Xanadu, public intellectual, radical life extension, Ray Kurzweil, reversible computing, RFID, Ronald Reagan, scientific worldview, silicon-based life, Singularitarianism, social intelligence, stem cell, stochastic process, superintelligent machines, supply-chain management, supply-chain management software, synthetic biology, systems thinking, technological determinism, technological singularity, Ted Nelson, telepresence, telepresence robot, telerobotics, the built environment, The Coming Technological Singularity, the scientific method, The Wisdom of Crowds, transaction costs, Turing machine, Turing test, Upton Sinclair, Vernor Vinge, Von Neumann architecture, VTOL, Whole Earth Review, women in the workforce, zero-sum game

Consequently, there is an unbridgeable gap which would-be enhancers cannot ethically cross. This view rests on a rather static picture of what it will be possible for future genetic enhancers to know and test beforehand. Any genetic enhancement techniques will first be extensively tested and perfected in animal models. Second, a vastly expanded bioinformatics enterprise will become crucial to understanding the ramifications of proposed genetic interventions (National Resource Center for Cell Analysis). As scientific understanding improves, the risk versus benefit calculations of various prospective genetic enhancements of embryos will shift. The arc of scientific discovery and technological progress strongly suggests that it will happen in the next few decades.


pages: 903 words: 235,753

The Stack: On Software and Sovereignty by Benjamin H. Bratton

1960s counterculture, 3D printing, 4chan, Ada Lovelace, Adam Curtis, additive manufacturing, airport security, Alan Turing: On Computable Numbers, with an Application to the Entscheidungsproblem, algorithmic trading, Amazon Mechanical Turk, Amazon Robotics, Amazon Web Services, Andy Rubin, Anthropocene, augmented reality, autonomous vehicles, basic income, Benevolent Dictator For Life (BDFL), Berlin Wall, bioinformatics, Biosphere 2, bitcoin, blockchain, Buckminster Fuller, Burning Man, call centre, capitalist realism, carbon credits, carbon footprint, carbon tax, carbon-based life, Cass Sunstein, Celebration, Florida, Charles Babbage, charter city, clean water, cloud computing, company town, congestion pricing, connected car, Conway's law, corporate governance, crowdsourcing, cryptocurrency, dark matter, David Graeber, deglobalization, dematerialisation, digital capitalism, digital divide, disintermediation, distributed generation, don't be evil, Douglas Engelbart, Douglas Engelbart, driverless car, Edward Snowden, Elon Musk, en.wikipedia.org, Eratosthenes, Ethereum, ethereum blockchain, Evgeny Morozov, facts on the ground, Flash crash, Frank Gehry, Frederick Winslow Taylor, fulfillment center, functional programming, future of work, Georg Cantor, gig economy, global supply chain, Google Earth, Google Glasses, Guggenheim Bilbao, High speed trading, high-speed rail, Hyperloop, Ian Bogost, illegal immigration, industrial robot, information retrieval, Intergovernmental Panel on Climate Change (IPCC), intermodal, Internet of things, invisible hand, Jacob Appelbaum, James Bridle, Jaron Lanier, Joan Didion, John Markoff, John Perry Barlow, Joi Ito, Jony Ive, Julian Assange, Khan Academy, Kim Stanley Robinson, Kiva Systems, Laura Poitras, liberal capitalism, lifelogging, linked data, lolcat, Mark Zuckerberg, market fundamentalism, Marshall McLuhan, Masdar, McMansion, means of production, megacity, megaproject, megastructure, Menlo Park, Minecraft, MITM: man-in-the-middle, Monroe Doctrine, Neal Stephenson, Network effects, new economy, Nick Bostrom, ocean acidification, off-the-grid, offshore financial centre, oil shale / tar sands, Oklahoma City bombing, OSI model, packet switching, PageRank, pattern recognition, peak oil, peer-to-peer, performance metric, personalized medicine, Peter Eisenman, Peter Thiel, phenotype, Philip Mirowski, Pierre-Simon Laplace, place-making, planetary scale, pneumatic tube, post-Fordism, precautionary principle, RAND corporation, recommendation engine, reserve currency, rewilding, RFID, Robert Bork, Sand Hill Road, scientific management, self-driving car, semantic web, sharing economy, Silicon Valley, Silicon Valley ideology, skeuomorphism, Slavoj Žižek, smart cities, smart grid, smart meter, Snow Crash, social graph, software studies, South China Sea, sovereign wealth fund, special economic zone, spectrum auction, Startup school, statistical arbitrage, Steve Jobs, Steven Levy, Stewart Brand, Stuxnet, Superbowl ad, supply-chain management, supply-chain management software, synthetic biology, TaskRabbit, technological determinism, TED Talk, the built environment, The Chicago School, the long tail, the scientific method, Torches of Freedom, transaction costs, Turing complete, Turing machine, Turing test, undersea cable, universal basic income, urban planning, Vernor Vinge, vertical integration, warehouse automation, warehouse robotics, Washington Consensus, web application, Westphalian system, WikiLeaks, working poor, Y Combinator, yottabyte

This also relates to what Heidegger once called our “confrontation with planetary technology” (an encounter that he never managed to actually make and which most Heideggerians manage to endlessly defer, or “differ”).15 That encounter should be motivated by an invested interest in several “planetary technologies” working at various scales of matter, and based on, in many respects, what cheap supercomputing, broadband networking, and isomorphic data management methodologies make possible for research and application. These include—but are by no means limited to—geology (e.g., geochemistry, geophysics, oceanography, glaciology), earth sciences (e.g., focusing on the atmosphere, lithosphere, biosphere, hydrosphere), as well as the various programs of biotechnology (e.g., bioinformatics, synthetic biology, cell therapy), of nanotechnology (e.g., materials, machines, medicines), of economics (e.g., modeling price, output cycles, disincentivized externalities), of neuroscience (e.g., behavioral, cognitive, clinical), and of astronomy (e.g., astrobiology, extragalactic imaging, cosmology).


pages: 1,373 words: 300,577

The Quest: Energy, Security, and the Remaking of the Modern World by Daniel Yergin

"Hurricane Katrina" Superdome, "World Economic Forum" Davos, accelerated depreciation, addicted to oil, Alan Greenspan, Albert Einstein, An Inconvenient Truth, Asian financial crisis, Ayatollah Khomeini, banking crisis, Berlin Wall, bioinformatics, book value, borderless world, BRICs, business climate, California energy crisis, carbon credits, carbon footprint, carbon tax, Carl Icahn, Carmen Reinhart, clean tech, Climategate, Climatic Research Unit, colonial rule, Colonization of Mars, corporate governance, cuban missile crisis, data acquisition, decarbonisation, Deng Xiaoping, Dissolution of the Soviet Union, diversification, diversified portfolio, electricity market, Elon Musk, energy security, energy transition, Exxon Valdez, facts on the ground, Fall of the Berlin Wall, fear of failure, financial innovation, flex fuel, Ford Model T, geopolitical risk, global supply chain, global village, Great Leap Forward, Greenspan put, high net worth, high-speed rail, hydraulic fracturing, income inequality, index fund, informal economy, interchangeable parts, Intergovernmental Panel on Climate Change (IPCC), It's morning again in America, James Watt: steam engine, John Deuss, John von Neumann, Kenneth Rogoff, life extension, Long Term Capital Management, Malacca Straits, market design, means of production, megacity, megaproject, Menlo Park, Mikhail Gorbachev, military-industrial complex, Mohammed Bouazizi, mutually assured destruction, new economy, no-fly zone, Norman Macrae, North Sea oil, nuclear winter, off grid, oil rush, oil shale / tar sands, oil shock, oil-for-food scandal, Paul Samuelson, peak oil, Piper Alpha, price mechanism, purchasing power parity, rent-seeking, rising living standards, Robert Metcalfe, Robert Shiller, Robert Solow, rolling blackouts, Ronald Coase, Ronald Reagan, Sand Hill Road, Savings and loan crisis, seminal paper, shareholder value, Shenzhen special economic zone , Silicon Valley, Silicon Valley billionaire, Silicon Valley startup, smart grid, smart meter, South China Sea, sovereign wealth fund, special economic zone, Stuxnet, Suez crisis 1956, technology bubble, the built environment, The Nature of the Firm, the new new thing, trade route, transaction costs, unemployed young men, University of East Anglia, uranium enrichment, vertical integration, William Langewiesche, Yom Kippur War

We did not know that DNA was the genetic material until 1946. The Green Revolution in the late 1960s was an example of beginning to apply modern biology to plant improvement.”19 Many of the people working in this field are applying the know-how that emerged from the sequencing of the human genome. Calling on the new fields of bioinformatics and computational biology, and using what is called high-throughput experimentation, they seek to identify specific genes and their functions. The aim is to speed up the process of evolution, selecting for characteristics that will make such tall grasses as miscanthus and switchgrass effective energy crops that can grow in marginal lands that would not be cultivated for food.


pages: 1,199 words: 332,563

Golden Holocaust: Origins of the Cigarette Catastrophe and the Case for Abolition by Robert N. Proctor

"RICO laws" OR "Racketeer Influenced and Corrupt Organizations", bioinformatics, carbon footprint, clean water, corporate social responsibility, Deng Xiaoping, desegregation, disinformation, Dr. Strangelove, facts on the ground, friendly fire, germ theory of disease, global pandemic, index card, Indoor air pollution, information retrieval, invention of gunpowder, John Snow's cholera map, language of flowers, life extension, New Journalism, optical character recognition, pink-collar, Ponzi scheme, Potemkin village, precautionary principle, publication bias, Ralph Nader, Ronald Reagan, selection bias, speech recognition, stem cell, telemarketer, Thomas Kuhn: the structure of scientific revolutions, Triangle Shirtwaist Factory, Upton Sinclair, vertical integration, Yogi Berra

MCV faculty also helped undermine public health advocacy: in 1990 James Kilpatrick from biostatistics, working also as a consultant for the Tobacco Institute, wrote to the editor of the New York Times criticizing Stanton Glantz and William Parmley’s demonstration of thirty-five thousand U.S. cardiovascular deaths per annum from exposure to secondhand smoke.49 Glantz by this time was commonly ridiculed by the industry, which even organized skits (to practice courtroom scenarios) in which health advocates were given thinly disguised names: Glantz was “Ata Glance” or “Stanton Glass, professional anti-smoker”; Alan Blum was “Alan Glum” representing “Doctors Ought to Kvetch” or “Doctors Opposed to People Exhaling Smoke” (DOPES); Richard Daynard was “Richard Blowhard” from the “Product Liability Education Alliance,” and so forth.50 VCU continues even today to have close research relationships with Philip Morris, covering topics as diverse as pharmacogenomics, bioinformatics, and behavioral genetics.51

SYMBIOSIS

It would be a mistake to characterize this interpenetration of tobacco and academia as merely a “conflict of interest”; the relationship has been far more symbiotic. We are really talking about a confluence of interests, and sometimes even a virtual identity of interests.