# NP-complete

32 results back to index

pages: 236 words: 50,763

The Golden Ticket: P, NP, and the Search for the Impossible by Lance Fortnow

Beyond Karp After Karp’s paper, an industry in computer science arose showing the NP-completeness of a large variety of search problems. Over the next several years, many professors and grad students took a number of known search problems (as well as some new ones) and showed them to be NP-complete. A classic 1979 book* lists over three hundred major NP-complete problems. NP-complete problems continually arise in computer science, physics, biology, economics, and many other areas that reach that pinnacle of hardness. A Google Scholar search on “NP-Complete” returns over 138,000 scientific articles from 1972 to 2011, nearly 10,000 in 2011 alone. We can’t list all of them here, but let me give you a flavor of some of them. Dominating Set Are there fifty people in Frenemy such that everyone else is friends with at least one of them? NP-complete. Partition into Triangles The dorm rooms of the Frenemy Institute can each hold three students.

Now a computer search will start to take significant time, and 100 × 100 games can defeat today’s fastest machines. Large Sudoku games are NP-complete. Think you are good at Sudoku? If you can reliably solve large puzzles, then you can solve satisfiability, traveling salesman problems, and thousands of other NP-complete problems. Sudoku is hardly the only NP-complete solitaire game. Consider the Minesweeper game built into Microsoft Windows. Figure 4-5. Minesweeper. Each number in a square tells how many mines are within one square (horizontally, vertically, or diagonally) from that square. Click on a square and you will reveal that number. Set a flag if you think there are bombs there. Click on a bomb and you lose the game. Determining whether one can win a large Minesweeper game is also NP-complete. Here are the remaining bombs in the above game. Figure 4-6.

Knuth got many write-in suggestions, some straightforward, like “intractable” and “obstinate,” and others a bit more tongue-in-cheek, like “hard-boiled” (in honor of Cook) and “hard-ass” (“hard as satisfiability”). The winning write-in vote was for “NP-complete,” which was proposed by several people at Bell Labs in New Jersey after considerable discussion among the researchers there. The word “complete” comes from mathematical logic, where a set of facts is complete if it can explain all the true statements in some logical system. Analogously, “NP-complete” means those problems in NP powerful enough that they can be used to solve any other problem in NP. Knuth was not entirely happy with this choice but was willing to live with it. He truly wanted a single English world that captured the intuitive meaning of hard search problems, a term for the masses. In a 1974 wrap-up of his survey, Knuth wrote “NP-complete actually smacks of being a little too technical for a mass audience, but it’s not so bad as to be unusable.”

Algorithms Unlocked by Thomas H. Cormen

—is the question that has perplexed computer scientists for all these years. We often call it the “P D NP‹ problem.” The NP-complete problems are the “hardest” in NP. Informally, a problem is NP-complete if it satisfies two conditions: (1) it’s in NP and (2) if a polynomial-time algorithm exists for the problem, then there is a way to convert every problem in NP into this problem in such a way as to solve them all in polynomial time. If a polynomial-time algorithm exists for any NP-complete problem—that is, if any NP-complete problem is in P—then P D NP. Because NP-complete problems are the hardest in NP, if it turns out that any problem in NP is not polynomial-time solvable, then none of the NP-complete problems are. A problem is NP-hard if it satisfies the second condition for NP-completeness but may or may not be in NP. Here’s a handy list of the pertinent definitions: P: problems solvable in polynomial time, i.e., we can solve the problem in time polynomial in the size of the input to the problem.

To take the frustration to a whole new level, here’s the most amazing fact: if there were an algorithm that ran in O.nc / time for any of these problems, where c is a constant, then there would be an algorithm that ran in O.nc / time for all of these problems. We call these problems NP-complete. An algorithm that runs in time O.nc / on an input of size n, where c is a constant, is a polynomial-time algorithm, so called because nc with some coefficient would be the most significant term in the running time. We know of no polynomial-time algorithm for any NP-complete problem, but nobody has proven that it’s impossible to solve some NP-complete problem in polynomial time. And there’s even more frustration: many NP-complete problems are almost the same as problems that we know how to solve in polynomial time. Just a small tweak separates them. For example, recall from Chapter 6 that the Bellman-Ford algorithm finds shortest paths from a single source in a directed graph, even if the graph has negative-weight edges, in ‚.nm/ time, where the graph has n vertices and m edges.

But if we ask whether a connected, undirected graph has a hamiltonian cycle, that’s NP-complete. Notice that the question is not “what is the order of vertices on a hamiltonian cycle in this graph?” but just the more basic “yes or no: is it possible to construct a hamiltonian cycle on this graph?” NP-complete problems come up surprisingly often, which is why I’m including material on them in this book. If you are trying to find a polynomial-time algorithm for a problem that turns out to be NPcomplete, you are likely to be in for a big helping of disappointment. (But see the section on perspective, pages 208–211.) The concept of NP-complete problems has been around since the early 1970s, and people were trying to solve problems that turned out to be NP-complete (such as the traveling-salesman problem) well before then.

pages: 396 words: 117,149

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World by Pedro Domingos

Because of NP-completeness, all it takes to answer it is to prove that one NP-complete problem is efficiently solvable (or not). NP is not the hardest class of problems in computer science, but it’s arguably the hardest “realistic” class: if you can’t even check a problem’s solution before the universe ends, what’s the point of trying to solve it? Humans are good at solving NP problems approximately, and conversely, problems that we find interesting (like Tetris) often have an “NP-ness” about them. One definition of artificial intelligence is that it consists of finding heuristic solutions to NP-complete problems. Often, we do this by reducing them to satisfiability, the canonical NP-complete problem: Can a given logical formula ever be true, or is it self-contradictory? If we invent a learner that can learn to solve satisfiability, it has a good claim to being the Master Algorithm.

Figuring out how proteins fold into their characteristic shapes; reconstructing the evolutionary history of a set of species from their DNA; proving theorems in propositional logic; detecting arbitrage opportunities in markets with transaction costs; inferring a three-dimensional shape from two-dimensional views; compressing data on a disk; forming a stable coalition in politics; modeling turbulence in sheared flows; finding the safest portfolio of investments with a given return, the shortest route to visit a set of cities, the best layout of components on a microchip, the best placement of sensors in an ecosystem, or the lowest energy state of a spin glass; scheduling flights, classes, and factory jobs; optimizing resource allocation, urban traffic flow, social welfare, and (most important) your Tetris score: these are all NP-complete problems, meaning that if you can efficiently solve one of them you can efficiently solve all problems in the class NP, including each other. Who would have guessed that all these problems, superficially so different, are really the same? But if they are, it makes sense that one algorithm could learn to solve all of them (or, more precisely, all efficiently solvable instances). P and NP are the two most important classes of problems in computer science. (The names are not very mnemonic, unfortunately.) A problem is in P if we can solve it efficiently, and it’s in NP if we can efficiently check its solution. The famous P = NP question is whether every efficiently checkable problem is also efficiently solvable. Because of NP-completeness, all it takes to answer it is to prove that one NP-complete problem is efficiently solvable (or not).

If there’s a limit to what Bayes can learn, we haven’t found it yet. The argument from computer science When I was a senior in college, I wasted a summer playing Tetris, a highly addictive video game where variously shaped pieces fall from above and which you try to pack as closely together as you can; the game is over when the pile of pieces reaches the top of the screen. Little did I know that this was my introduction to NP-completeness, the most important problem in theoretical computer science. Turns out that, far from an idle pursuit, mastering Tetris—really mastering it—is one of the most useful things you could ever do. If you can solve Tetris, you can solve thousands of the hardest and most important problems in science, technology, and management—all in one fell swoop. That’s because at heart they are all the same problem.

pages: 523 words: 143,139

Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian, Tom Griffiths

The intractable scheduling problems that Eugene Lawler encountered in chapter 5 fall into this category. An NP-hard problem that is itself in NP is known as “NP-complete.” See Karp, “Reducibility Among Combinatorial Problems,” for the classic result showing that a version of the traveling salesman problem is NP-complete, and Fortnow, The Golden Ticket: P, NP, and the Search for the Impossible, for an accessible introduction to P and NP. most computer scientists believe that there aren’t any: In a 2002 survey of one hundred leading theoretical computer scientists, sixty-one thought P ≠ NP and only nine thought P = NP (Gasarch, “The P =? NP Poll”). While proving P = NP could be done by exhibiting a polynomial-time algorithm for an NP-complete problem, proving P ≠ NP requires making complex arguments about the limits of polynomial-time algorithms, and there wasn’t much agreement among the people surveyed about exactly what kind of mathematics will be needed to solve this problem.

What’s more, many other optimization problems: This includes versions of vertex cover and set cover—two problems identified as belonging to NP in Karp, “Reducibility Among Combinatorial Problems,” where twenty-one problems were famously shown to be in this set. By the end of the 1970s, computer scientists had identified some three hundred NP-complete problems (Garey and Johnson, Computers and Intractability), and the list has grown significantly since then. These include some problems that are very familiar to humans. In 2003, Sudoku was shown to be NP-complete (Yato and Seta, “Complexity and Completeness”), as was maximizing the number of cleared rows in Tetris, even with perfect knowledge of future pieces (Demaine, Hohenberger, and Liben-Nowell, “Tetris Is Hard, Even to Approximate”). In 2012, determining whether there exists a path to the end of the level in platformer games like Super Mario Brothers was officially added to the list (Aloupis, Demaine, and Guo, “Classic Nintendo Games are (NP-) Hard”).

(With n = 3, this involves figuring out the path a knife would have to travel to cut three sets of points in half; if those sets of points correspond to two pieces of bread and a piece of ham, the result is a perfectly bisected sandwich.) Finding Nash equilibria is actually PPAD-complete, meaning that if there were an efficient algorithm for solving it then all other problems in the class could also be solved efficiently (including making the world’s neatest sandwiches). But being PPAD-complete is not quite so bad as being NP-complete. P, the class of efficiently solvable problems, could be equal to PPAD without being equal to NP. As of this writing the jury is still out: it’s theoretically possible that somebody could devise an efficient algorithm for finding Nash equilibria, but most experts aren’t holding their breath. “much of its credibility as a prediction”: Christos Papadimitriou, “The Complexity of Finding Nash Equilibria,” in Nisan et al., Algorithmic Game Theory.

Applied Cryptography: Protocols, Algorithms, and Source Code in C by Bruce Schneier

If any DNA survives this screening, it is examined to find the sequence of roads that it represents: This is the solution to the directed Hamiltonian path problem. By definition, an instance of any NP-complete problem can be transformed, in polynomial time, into an instance of any other NP-complete problem, and therefore into an instance of the directed Hamiltonian path problem. Since the 1970s, cryptologists have been trying to use NP-complete problems for public-key cryptography. While the instance that Adleman solved was very modest (seven cities on his map, a problem that can be solved by inspection in a few minutes), the technique is in its infancy and has no forbidding obstacles keeping it from being extended to larger problems. Thus, arguments about the security of cryptographic protocols based on NP-complete problems, arguments that heretofore have begun, “Suppose an adversary has a million processors, each of which can perform a million tests each second,” may soon have to be replaced with, “Suppose an adversary has a thousand fermentation vats, each 20,000 liters in capacity.”

Steven Cook [365] proved that the Satisfiability problem (given a propositional Boolean formula, is there a way to assign truth values to the variables that makes the formula true?) is NP-complete . This means that, if Satisfiability is solvable in polynomial time, then P = NP. Conversely, if any problem in NP can be proven not to have a deterministic polynomial-time algorithm, the proof will show that Satisfiability does not have a deterministic polynomial-time algorithm either. No problem is harder than Satisfiability in NP. Since Cook’s seminal paper was published, a huge number of problems have been shown to be equivalent to Satisfiability; hundreds are listed in [600], and some examples follow. By equivalent, I mean that these problems are also NP-complete; they are in NP and also as hard as any problem in NP . If their solvability in deterministic polynomial time were resolved, the P versus NP question would be solved.

To access the contents, click the chapter and section titles. Applied Cryptography, Second Edition: Protocols, Algorthms, and Source Code in C (cloth) Go! Keyword Brief Full Advanced Search Search Tips (Publisher: John Wiley & Sons, Inc.) Author(s): Bruce Schneier ISBN: 0471128457 Publication Date: 01/01/96 Search this book: Go! Previous Table of Contents Next ----------- NP-Complete Problems Michael Garey and David Johnson compiled a list of over 300 NP-complete problems [600]. Here are just a few of them: — Traveling Salesman Problem. A traveling salesman has to visit n different cities using only one tank of gas (there is a maximum distance he can travel). Is there a route that allows him to visit each city exactly once on that single tank of gas? (This is a generalization of the Hamiltonian Cycle problem—see Section 5.1.) — Three-Way Marriage Problem.

pages: 71 words: 14,237

21 Recipes for Mining Twitter by Matthew A. Russell

Example 1-29 illustrates an approach for taking a NetworkX graph and discovering the cliques in it. Example 1-29. Analyzing friendship cliques (see http://github.com/ptwobrussell/Recipes-for-Mining -Twitter/blob/master/recipe__clique_analysis.py) # -*- coding: utf-8 -*import sys import json import networkx as nx 50 | The Recipes G = sys.argv[1] g = nx.read_gpickle(G) # # # # Finding cliques is a hard problem, so this could take a while for large graphs. See http://en.wikipedia.org/wiki/NP-complete and http://en.wikipedia.org/wiki/Clique_problem cliques = [c for c in nx.find_cliques(g)] num_cliques = len(cliques) clique_sizes = [len(c) for c in cliques] max_clique_size = max(clique_sizes) avg_clique_size = sum(clique_sizes) / num_cliques max_cliques = [c for c in cliques if len(c) == max_clique_size] num_max_cliques = len(max_cliques) max_clique_sets = [set(c) for c in max_cliques] people_in_every_max_clique = list(reduce(lambda x, y: x.intersection(y), max_clique_sets)) print print print print print print print print print print 'Num 'Avg 'Max 'Num cliques:', num_cliques clique size:', avg_clique_size clique size:', max_clique_size max cliques:', num_max_cliques 'People in all max cliques:' json.dumps(people_in_every_max_clique, indent=4) 'Max cliques:' json.dumps(max_cliques, indent=4) For purposes of illustration, Mining the Social Web (O’Reilly) included an analysis conducted in mid-2010 that determined the following statistics for Tim O’Reilly’s ~700 friendships: Num Avg Max Num Num cliques: 762573 clique size: 14 clique size: 26 max cliques: 6 people in every max clique: 20 Some of the more interesting insight from the analysis was that there are six different cliques of size 26 in Tim O’Reilly’s friendships, which means that those six variations of 26 people all “know” one another to the point that they were at least interested in receiving each other’s status updates in their tweet stream.

As an example of practical application, if you wanted to get Tim’s attention at a conference or Maker event, but can’t seem to track him down, you might start looking for one of 1.18 Analyzing Friendship Cliques | 51 these other people since there’s more than a reasonable likelihood that they’re closely connected with him. As a more technical aside, be advised that finding cliques is a hard problem in both a figurative sense, but also in the computational sense that it’s a problem that belongs to a class of problems that are known as NP-Complete. In layman’s terms, being NPComplete means that the combinatorics of the problem, whether it be finding cliques or otherwise, grow explosively as the size of the input for the problem grows. Practically speaking, this means that for a very large graph, you either need to wait a very long time to get an exact answer to the problem of finding cliques, or you need to be willing to settle for an approximate solution.

pages: 391 words: 71,600

Hit Refresh: The Quest to Rediscover Microsoft's Soul and Imagine a Better Future for Everyone by Satya Nadella, Greg Shaw, Jill Tracie Nichols

Graph coloring is part of computational complexity theory in which you must assign labels, traditionally called colors, to elements of a graph within certain constraints. Think of it this way: Imagine coloring the U.S. map so that no state sharing a common border receives the same color. What is the minimal number of colors you would need to accomplish this task? My master’s thesis was about developing the best heuristics to accomplish complex graph coloring in nondeterministic polynomial time, or NP-complete. In other words, how can I solve a problem that has limitless possibilities in a way that is fast and good but not always optimal? Do we solve this as best we can right now, or work forever for the best solution? Theoretical computer science really grabbed me because it showed the limits to what today’s computers can do. It led me to become fascinated by mathematicians and computer scientists John Von Neumann and Alan Turing, and by quantum computing, which I will write about later as we look ahead to artificial intelligence and machine learning.

Anu and I had decided to marry when I made a trip back to India just before joining Microsoft. I had known Anu all of my life. Her dad and my father had joined the IAS together and we were family friends. In fact, Anu’s dad and I shared a passion for talking endlessly about cricket, something we continue to this day. He had played for his school and college, captaining both teams. When exactly I fell in love with Anu is what computer scientists would call an NP complete question. I can come up with many times and places but there is no one answer. In other words, it’s complex. Our families were close. Our social circles were the same. As kids we had played together. We overlapped in school and college. Our beloved family dog came from Anu’s family dog’s litter. But once I moved to the United States, I lost touch with her. When I went back to India for a visit, we saw each other again.

See also smartphones; and specific products mobility, 42, 43, 54, 58, 70, 73, 88, 108, 216, 225 Mojang, 106–8 Moore, Gordon, 161 Moore’s Law, 140, 161 motion-sensing, 145 Motorola, 72 Mount Rainier, 19 mouse, 142 movable type, 152 Mulally, Alan, 64 multiculturalism, 19 multinational corporations, role of, 12, 235–39 Mundie, Craig, 30, 163 Musk, Elon, 203 Muslims, 19 Myerson, Terry, 3, 82, 105, 109 Myhrvold, Nathan, 30 Nadella, Anu, 7–8, 14, 27, 30–33, 41, 86, 92–93, 114 Nadella, father, 16–18, 20–21, 36, 56 Nadella, mother, 16–18, 20, 22, 86, 114, 181 nano-machines, 228 Nanyuki, Kenya, 99 Napoleonic Wars, 188 Narayen, Shantanu, 20, 136 National Aeronautics and Space Administration (NASA), 144, 146 National Football League (NFL), 10–11 National ICT for Development Policy (Malawi), 216 national security, 174–75, 228 National Security Agency (NSA), 172, 174–75, 181 Native Americans, 156 natural language, 151 Nehru, Jawaharlal, 16 Nepal, 44 Netflix, 30, 126 Netherland (O’Neill), 18 .Net, 58 networking, 45, 49, 213 Neuborne, Burt, 190 New Yorker, 233 New York Times, 24, 80, 176, 195, 209 New Zealand, 78 Nichols, Jill Tracie, 66–67, 95 Nietzsche, Friedrich, 79, 147 Nilekani, Nandan, 222 Nokia, 64, 71–73, 106, 133 nondeterministic polynomial time (NP-complete), 25 North, Douglass, 184 North Dakota, 47 North Korea, 169–70 Novell, 26 Numoto, Takeshi, 59 Obama, Barack, 3, 81, 175–76, 211–12, 214 Obama, Michelle, 211 Oculus Rift, 125, 144 Office, 2, 47, 53–54, 59, 68, 73, 81, 85, 123–25, 137, 203 Office 365, 44–45, 61, 62, 85, 123–25, 152, 233 OneDrive, 121 O’Neill, Joseph, 18 One Microsoft, 102, 108 OneNote, 103, 121 OneWeek growth hack, 103–5 online services, 46–47, 51 open-source, 62, 102 opportunity, 79, 238 Oracle, 3, 26, 81 Orlando nightclub massacre, 117 Osmania University, 36 Outlook, 104, 121 Ozzie, Ray, 46, 52–53 Parthasarathy, Sanjay, 115–16 Partnership on AI, 200 partnerships, 76, 78, 120–38 passion, 92, 94, 100, 103, 242 Pataudi, Tiger, 37 pattern recognition, 54, 150, 152, 201 pay equity, 113–14 PC Revolution, 1, 28, 45, 71, 89, 108, 139, 213 decline of, 66, 70, 79 perception, 150, 152, 154 pharmacies, 218, 223 Phillips, James, 58 PhotoShop, 136 photosynthesis, 160 Pichai, Sundar, 131 pivot tables, 143 Pixar, 13 platform shifts, 76, 142 point-of-sales devices, 128–29 Poland, 223 policymaking, 82, 182, 189–92, 214, 223–28, 235 poverty, 99, 237 Power BI, 121 Powerpoint, 121 Practo, 222 predictive power, 42, 88 Predix platform, 127 printing press, 152 PRISM, 172–73 Prism Skylabs, 153 privacy, 170, 172–80, 186–91, 193–94, 202, 205, 224, 230, 238 probabilistic decision making, 54 productivity, 76, 79, 88, 124, 126, 226–28 product launches, 98–100 public health, 237 public-private partnerships, 225–26 public sector, 222, 228, 237–38 Qatar, 225 Qi Lu, Dr., 51–52 Qualcomm, 3, 80, 131–32 quantum computing, 11, 110, 140–42, 159–67, 209, 212, 239 qubits, 160–61, 164, 166–67 logical, 167 topological, 162, 166 radiation monitoring, 44 railroads, 215 Ramakrishnan, Raghu, 58–59 Ranji Trophy, 38, 40 Rashid, Rick, 30 Reagan, Ronald, 181 Real Madrid, 71 Red Dog, 52–53, 58 Red Hat, 125 Reform Government Surveillance alliance, 174 refugees, 218 regulatory systems, 130, 186–90, 215, 227–28 relational algebra, 26 respect, 135, 202 retailers, 128–29, 153 rice production, 44 Rilke, Rainer Maria, 12 Rise and Fall of American Growth (Gordon), 234 risk-taking, 111, 220 River Runs Through It, A (Maclean), 56 Roberts, John, 185–86 robotics, 13, 145, 149–50, 202–4, 208–9, 228, 231–32, 239 Rochester Institute of Technology, 146 Rocky movies, 44–45 Rogan, Seth, 169 Rolling Stones, 98 Rolls-Royce, 127 Romer, Paul, 229 root-cause analysis, 61 RSA-2048 encryption, 162 Rubinstein, Ira, 32–33 run times, 203 rural areas, connecting, 99 Russia, 172 Russinovich, Mark, 58 Rwanda Vision 2020, 216 safety, 153, 172, 176–80, 182, 185, 188, 194, 228 sales conference, 86–95, 100, 118 Salesforce, 121 Samsung, 132–34 San Bernardino attacks, 177, 179, 189 Sanders, Bernie, 230 Sanskrit, 16, 181 Saudi Arabia, 225 scale, 50, 161 Schiller, Phil, 124 Schmidt, Eric, 26 Scott, Kevin, 82 search and seizure, 185–86 Search Checkpoint #1, 51 search engines, 46–52.

pages: 416 words: 112,268

Human Compatible: Artificial Intelligence and the Problem of Control by Stuart Russell

The first algorithm for theorem-proving in first-order logic worked by reducing first-order sentences to (very large numbers of) propositional sentences: Martin Davis and Hilary Putnam, “A computing procedure for quantification theory,” Journal of the ACM 7 (1960): 201–15. 3. An improved algorithm for propositional inference: Martin Davis, George Logemann, and Donald Loveland, “A machine program for theorem-proving,” Communications of the ACM 5 (1962): 394–97. 4. The satisfiability problem—deciding whether a collection of sentences is true in some world—is NP-complete. The reasoning problem—deciding whether a sentence follows from the known sentences—is co-NP-complete, a class that is thought to be harder than NP-complete problems. 5. There are two exceptions to this rule: no repetition (a stone may not be played that returns the board to a situation that existed previously) and no suicide (a stone may not be placed such that it would immediately be captured—for example, if it is already surrounded). 6. The work that introduced first-order logic as we understand it today (Begriffsschrift means “concept writing”): Gottlob Frege, Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens (Halle, 1879).

Suppose that LoopChecker exists. Now write a program Q that calls LoopChecker as a subroutine, with Q itself and X as inputs, and then does the opposite of what LoopChecker(Q,X) predicts. So, if LoopChecker says that Q halts, Q doesn’t halt, and vice versa. Thus, the assumption that LoopChecker exists leads to a contradiction, so LoopChecker cannot exist. 40. I say “appear” because, as yet, the claim that the class of NP-complete problems requires superpolynomial time (usually referred to as P ≠ NP) is still an unproven conjecture. After almost fifty years of research, however, nearly all mathematicians and computer scientists are convinced the claim is true. 41. Lovelace’s writings on computation appear mainly in her notes attached to her translation of an Italian engineer’s commentary on Babbage’s engine: L. F. Menabrea, “Sketch of the Analytical Engine invented by Charles Babbage,” trans.

Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Geron

Solutions to these exercises are available in Appendix A. 1 Graphviz is an open source graph visualization software package, available at http://www.graphviz.org/. 2 P is the set of problems that can be solved in polynomial time. NP is the set of problems whose solutions can be verified in polynomial time. An NP-Hard problem is a problem to which any NP problem can be reduced in polynomial time. An NP-Complete problem is both NP and NP-Hard. A major open mathematical question is whether or not P = NP. If P ≠ NP (which seems likely), then no polynomial algorithm will ever be found for any NP-Complete problem (except perhaps on a quantum computer). 3 log2 is the binary logarithm. It is equal to log2(m) = log(m) / log(2). 4 A reduction of entropy is often called an information gain. 5 See Sebastian Raschka’s interesting analysis for more details. 6 It randomly selects the set of features to evaluate at each node.

Warning As you can see, the CART algorithm is a greedy algorithm: it greedily searches for an optimum split at the top level, then repeats the process at each level. It does not check whether or not the split will lead to the lowest possible impurity several levels down. A greedy algorithm often produces a reasonably good solution, but it is not guaranteed to be the optimal solution. Unfortunately, finding the optimal tree is known to be an NP-Complete problem:2 it requires O(exp(m)) time, making the problem intractable even for fairly small training sets. This is why we must settle for a “reasonably good” solution. Computational Complexity Making predictions requires traversing the Decision Tree from the root to a leaf. Decision Trees are generally approximately balanced, so traversing the Decision Tree requires going through roughly O(log2(m)) nodes.3 Since each node only requires checking the value of one feature, the overall prediction complexity is just O(log2(m)), independent of the number of features.

Crypto: How the Code Rebels Beat the Government Saving Privacy in the Digital Age by Steven Levy

Knuth reminded them of an interesting mathematical phenomenon: while it is child’s play to multiply a pair of prime numbers, reversing the process—a task known as factoring—is an assignment that could confound the devil himself. Could this phenomenon be the basis for a devilishly challenging one-way function? Though Diffie and Hellman did not choose to pursue this idea, others would. Another alternative involved computational complexity, and Diffie pored over a book on the subject, particularly a chapter on what was known as NP-complete functions. The class of NP-complete problems, Diffie later wrote, are “problems thought not to be solvable in polynomial time on any deterministic computer.” This meant that they were so hard that you could set your Macintosh, or even your Cray supercomputer (if you were the NSA), to work on the problem and when you checked the results a few trillion years later, you wouldn’t even be in the general neighborhood of solving it.

It would be exhilarating to work with the two people in the world who best understood the problem. “I was basically isolated until I met Whit and Marty,” he says. “I was ready to keep banging away until I got some response, but there was no one else who was interested in pursuing this.” Merkle arrived at Stanford convinced that his most promising idea revolved around a scheme built around finding trapdoor one-way functions involving the NP-complete problem. The system was built around a mathematical problem known as knapsacks. To understand his scheme, picture, naturally, a knapsack. “The idea is to put things into this knapsack, to exactly fill it to the brim without going either over or under,” he says. Diffie would later describe the problem as that of a shipping clerk faced with a collection of packages of various sizes and shapes who had to find the absolute best way to stuff the packages in the mailbag.

., ref-1, ref-2, ref-3 Mosaic, ref-1 “Multiuser Cryptographic Techniques” (Diffie and Hellman), ref-1, ref-2, ref-3 Murray, Patty, ref-1 Myhrvold, Nathan, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 Mykotronx, ref-1, ref-2, ref-3 National Bureau of Standards (NBS), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 National Institute of Standards and Technology (NIST), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11 National Research Council (NRC), ref-1, ref-2 National Science Foundation (NSF), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 National Security Access Field (NSAF), ref-1 National Security Agency (NSA), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16, ref-17, ref-18, ref-19, ref-20, ref-21, ref-22, ref-23, ref-24, ref-25, ref-26, ref-27, ref-28, ref-29, ref-30, ref-31, ref-32, ref-33 Clipper and, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11 DES and, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9 Diffie and, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10 dual roles of, ref-1 GCHQ and, ref-1, ref-2 Kahn and, ref-1, ref-2 Lotus Notes and, ref-1, ref-2, ref-3, ref-4 NRC report and, ref-1 NSAF and, ref-1 Project Overtake of, ref-1 security failures at, ref-1 Snuffle and, ref-1 National Security Decision Directive, ref-1 Nelson, Mike, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 Netscape, ref-1, ref-2, ref-3 Neukom, Bill, ref-1 “New Directions in Cryptography” (Diffie and Hellman), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 New York Times, ref-1, ref-2 Nicolai, Carl, ref-1, ref-2 nonrepudiation feature, ref-1 nonsecret encryption, ref-1 Notes, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8 NP-complete functions, ref-1, ref-2 O’Brien, Bart, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 Odom, William E., ref-1 Omura, Jim, ref-1, ref-2 one-time pads, ref-1, ref-2, ref-3 one-way functions, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11 factoring, ref-1, ref-2, ref-3 knapsacks, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 trapdoor, ref-1, ref-2, ref-3, ref-4 Ozzie, Ray, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 paranoia levels, ref-1 Parker, Donn, ref-1, ref-2 passwords, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 Pasta, John R., ref-1 Patel, Marilyn, ref-1 patents, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16, ref-17, ref-18, ref-19, ref-20, ref-21, ref-22 Patterson, Nick, ref-1, ref-2, ref-3, ref-4 Penet, ref-1, ref-2, ref-3, ref-4, ref-5 Phasorphone, ref-1 Podesta, John, ref-1, ref-2 Poe, Edgar Allan, ref-1, ref-2, ref-3 pornography, ref-1 Press, Frank, ref-1 Pretty Good Privacy (PGP), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12 breaking of, ref-1, ref-2 Prime, Geoffrey, ref-1 prime numbers, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 Project C43, ref-1, ref-2 Project Overtake, ref-1 public key cryptography, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16, ref-17, ref-18, ref-19, ref-20, ref-21, ref-22, ref-23, ref-24, ref-25, ref-26 certification of, ref-1 escrow and, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10 nonsecret encryption and, ref-1 Secure Sockets Layer, ref-1, ref-2, ref-3 security failures and, ref-1 Public Key Partners (PKP), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 Puzzle Palace, The (Bamford), ref-1, ref-2 quantum computers, ref-1 Random Number Generator (RNG), ref-1 Ray, Charles, ref-1 RC-2, ref-1, ref-2, ref-3 RC-4, ref-1, ref-2, ref-3, ref-4 Reagan, Ronald, ref-1, ref-2 Reeds, Jim, ref-1 remailers (anonymous servers), ref-1, ref-2, ref-3, ref-4, ref-5 Penet, ref-1, ref-2, ref-3, ref-4, ref-5 Richardson, Elliot, ref-1 Ritner, Peter, ref-1, ref-2 Rivest, Ron, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16, ref-17, ref-18, ref-19, ref-20 Rizzo, Paul, ref-1 Roberts, Larry, ref-1 Robinson, William B., ref-1, ref-2 Rohrbacher, Dana, ref-1 Rosenblum, Howard, ref-1, ref-2 Rotenberg, Marc, ref-1, ref-2 RSA, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16, ref-17, ref-18, ref-19, ref-20, ref-21, ref-22, ref-23, ref-24, ref-25, ref-26, ref-27, ref-28, ref-29, ref-30, ref-31 Netscape and, ref-1 patents for, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11 personal computers and, ref-1, ref-2 RSA Data Security, Inc., ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16 conference of, ref-1, ref-2, ref-3, ref-4 Sacco, Luigi, ref-1 Safire, William, ref-1 S-boxes (substitution boxes), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 Schiller, Jeff, ref-1 Schneier, Bruce, ref-1, ref-2, ref-3 Schnorr, Claus, ref-1, ref-2 Schroeppel, Richard, ref-1, ref-2, ref-3, ref-4 Schwartz, John J., ref-1 Science, ref-1, ref-2, ref-3, ref-4 Scientific American, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8 Scientologists, ref-1, ref-2, ref-3 search warrants, ref-1, ref-2, ref-3, ref-4 secret sharing, ref-1 Secure Sockets Layer (SSL), ref-1, ref-2, ref-3 Security and Freedom through Encryption (SAFE) bill, ref-1, ref-2, ref-3 Security Dynamics, ref-1 Senate Bill ref-1, ref-2, ref-3, ref-4 servers, ref-1, ref-2 anonymous, see remailers Sessions, William, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 Shamir, Adi, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13, ref-14, ref-15, ref-16, ref-17 Shannon, Claude, ref-1, ref-2, ref-3, ref-4, ref-5 shareware, ref-1, ref-2 signals intelligence, ref-1 signatures, digital, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11, ref-12, ref-13 blind, ref-1, ref-2 DSA, ref-1, ref-2, ref-3 Silver, Roland, ref-1, ref-2, ref-3, ref-4 Simmons, Gus, ref-1 Simons, Jim, ref-1, ref-2 Skipjack, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11 Snefru, ref-1 Snow, Brian, ref-1 Snuffle, ref-1 stream ciphers, ref-1 Studeman, William O., ref-1, ref-2 substitution boxes (S-boxes), ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 SWIFT, ref-1 T Attack (differential cryptanalysis), ref-1, ref-2, ref-3 telephones: cellular, ref-1 security devices for, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10 Tempest technology, ref-1 Tenet, George, ref-1 Tessera, ref-1 threshold scheme, ref-1 Time, ref-1 time-sharing, ref-1, ref-2 toll payments, ref-1 trapdoors, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9 knapsacks, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 one-way function, ref-1, ref-2, ref-3, ref-4 Senate bill and, ref-1, ref-2, ref-3 Tritter, Alan, ref-1, ref-2, ref-3, ref-4 Tuchman, Walter, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10 univectors, ref-1, ref-2 Usenet, ref-1, ref-2 vector space, ref-1 VeriSign, ref-1 Very Large Scale Integration (VLSI), ref-1 ViaCrypt, ref-1 virtual private networks, ref-1 Wagner, Dave, ref-1 Walker, Steve, ref-1 Wall Street Journal, ref-1, ref-2 Warren, Jim, ref-1, ref-2 Washington Post, ref-1 web of trust, ref-1 Weingarten, Fred, ref-1 Weldon, Curt, ref-1 Williamson, Malcolm, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6 Windows, ref-1 wiretapping, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7 Wise, William, ref-1 World Wide Web, ref-1, ref-2 browsers for, ref-1, ref-2, ref-3 Wormser, Dave, ref-1 Wylie, Shawn, ref-1 Xerox Corporation, ref-1, ref-2 xor operations, ref-1 Zero Knowledge, ref-1 zero-knowledge proofs of identity, ref-1 Zimmermann, Kacie, ref-1 Zimmermann, Phil, ref-1, ref-2, ref-3, ref-4, ref-5, ref-6, ref-7, ref-8, ref-9, ref-10, ref-11 contents acknowledgments preface the loner the standard public key prime time selling crypto patents and keys crypto anarchy the clipper chip slouching toward crypto epilogue: the open secret notes bibliography glossary index VIKING Published by the Penguin Group Penguin Putnam Inc., 375 Hudson Street, New York, New York 10014, U.S.A.

pages: 1,331 words: 163,200

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron

Solutions to these exercises are available in Appendix A. 1 Graphviz is an open source graph visualization software package, available at http://www.graphviz.org/. 2 P is the set of problems that can be solved in polynomial time. NP is the set of problems whose solutions can be verified in polynomial time. An NP-Hard problem is a problem to which any NP problem can be reduced in polynomial time. An NP-Complete problem is both NP and NP-Hard. A major open mathematical question is whether or not P = NP. If P ≠ NP (which seems likely), then no polynomial algorithm will ever be found for any NP-Complete problem (except perhaps on a quantum computer). 3 log2 is the binary logarithm. It is equal to log2(m) = log(m) / log(2). 4 A reduction of entropy is often called an information gain. 5 See Sebastian Raschka’s interesting analysis for more details. 6 It randomly selects the set of features to evaluate at each node.

Warning As you can see, the CART algorithm is a greedy algorithm: it greedily searches for an optimum split at the top level, then repeats the process at each level. It does not check whether or not the split will lead to the lowest possible impurity several levels down. A greedy algorithm often produces a reasonably good solution, but it is not guaranteed to be the optimal solution. Unfortunately, finding the optimal tree is known to be an NP-Complete problem:2 it requires O(exp(m)) time, making the problem intractable even for fairly small training sets. This is why we must settle for a “reasonably good” solution. Computational Complexity Making predictions requires traversing the Decision Tree from the root to a leaf. Decision Trees are generally approximately balanced, so traversing the Decision Tree requires going through roughly O(log2(m)) nodes.3 Since each node only requires checking the value of one feature, the overall prediction complexity is just O(log2(m)), independent of the number of features.

pages: 205 words: 20,452

Data Mining in Time Series Databases by Mark Last, Abraham Kandel, Horst Bunke

Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Cambridge University Press. Median Strings: A Review 191 14. Guyon, Schomaker, L., Plamondon, R., Liberman, M. and Janet, S. (1994). UNIPEN Project on On-Line Exchange and Recognizer Benchmarks. Proc. of 12th Int. Conf. on Pattern Recognition, pp. 29–33. 15. de la Higuera, C. and Casacuberta, F. (2000). Topology of Strings: Median String is NP-Complete. Theoretical Computer Science, 230(1/2), 39–48. 16. Jiang, X., Schiﬀmann, L., and Bunke, H. (2000). Computation of Median Shapes. Proc. of 4th. Asian Conf. on Computer Vision, pp. 300–305, Taipei. 17. Jiang, X., Abegglen, K., and Bunke, H. (2001). Genetic Computation of Generalized Median Strings. (Submitted for publication.) 18. Jiang, X., Abegglen, K., Bunke, H., and Csirik, J. (2002). Dynamic Computation of Generalized Median Strings, Pattern Analysis and Applications.

Sanchez, G., Llados, J., and Tombre, K. (2002). A Mean String Algorithm to Compute the Average Among a Set of 2D Shapes. Pattern Recognition Letters, 23(1–3), 203–213. 34. Sankoﬀ, D. and Kruskal, J.B. (1983). (eds.). Time Warps, String Edits, and Macromolecules: The Theory and Practice of String Comparison, AddisonWesley. 35. Sim, J.S. and Park, K. (2001). The Consensus String Problem for a Metric is NP-Complete. Journal of Discrete Algorithms, 2(1). 36. Stephen, G.A. (1994). String Searching Algorithms, World Scientiﬁc. 37. Subramanyan, K. and Dean, D. (2000). A Procedure to Average 3D Anatomical Structures. Medical Image Analysis, 4(4), 317–334. 38. Wagner, R.A. and Fischer, M.J. (1974). The String-to-String Correction Problem. Journal of the ACM, 21(1), 168–173. 39. Wang, L. and Jiang, Y. (1994). On the Complexity of Multiple Sequence Alignment.

pages: 931 words: 79,142

Concepts, Techniques, and Models of Computer Programming by Peter Van-Roy, Seif Haridi

Other problems are expensive for more fundamental reasons. For example, NP-complete problems. These problems are in the class NP, i.e., it is easy to check a solution, if you are given a candidate.14 But ﬁnding a solution may be much harder. A simple example is the circuit satisﬁability problem. Given a combinational digital circuit that consists of And, Or, and Not gates, does there exist a set of input values that makes the output 1? This problem is NP-complete [47]. An NP-complete problem is a special kind of NP problem with the property that if you can solve one in polynomial time, then you can solve all in polynomial time. Many computer scientists have tried over several decades to ﬁnd polynomial-time solutions to NP-complete problems, 13. Kurzweil claims that the growth rate is increasing and will lead to a “singularity” somewhere near the middle of the 21st century. 14.

pages: 120 words: 17,504

Data Structures & Algorithms Interview Questions You'll Most Likely Be Asked by Vibrant Publishers

Give examples of problems solved using greedy algorithms. Answer: A greedy algorithm is any algorithm that makes the local optimal choice at each stage with the hope of finding the global optimum. A classical problem which can be solved using a greedy strategy is the traveling salesman problem. Another problems that can be solved using greedy algorithms are the graph coloring problem and all the NP-complete problems. 164: Which are the pillars of a greedy algorithm? Answer: In general, greedy algorithms have five pillars: a) a candidate set, from which a solution is created b) a selection function, which chooses the best candidate to be added to the solution c) a feasibility function, that is used to determine if a candidate can be used to contribute to a solution d) an objective function, which assigns a value to a solution, or a partial solution e) a solution function, which will indicate when we have discovered a complete solution 165: What is a backtracking algorithm?

pages: 828 words: 205,338

Write Great Code, Volume 2 by Randall Hyde

For this reason, the optimization process is usually a case of compromise management, where you make tradeoffs and sacrifice certain subgoals (for example, running certain sections of the code a little slower) in order to create a reasonable result (for example, creating a program that doesn’t consume too much memory). 5.4.4.2 Optimization’s Effect on Compile Time You might think that it’s possible to choose a single goal (for example, highest possible performance) and optimize strictly for that. However, the compiler must also be capable of producing an executable result in a reasonable amount of time. The optimization process is an example of what complexity theory calls an NP-complete problem. These are problems that are, as far as we know, intractable. That is, a guaranteed correct result cannot be produced (for example, an optimal version of a program) without computing all possibilities and choosing the best result from those possibilities. Unfortunately, the time generally required to solve an NP-complete problem increases exponentially with the size of the input, which in the case of compiler optimization means roughly the number of lines of source code. This means that in the worst case, producing a truly optimal program would take longer than it was worth.

Toast by Stross, Charles

I was elbow-deep in an eviscerated PC, performing open heart surgery on a diseased network card, when the news about the traveling salesman theorem came in. Over on the other side of the office John’s terminal beeped, notification of incoming mail. A moment later my own workstation bonged. “Hey, Geoff! Get a load of this!” I carried on screwing the card back into its chassis. John is not a priority interrupt. “Someone’s come up with a proof that NP-complete problems lie in P! There’s a posting in comp.risks saying they’ve used it to find an O*(n2 ) solution to the traveling salesman problem, and it scales! Looks like April First has come early this year, doesn’t it?” I dropped the PC’s lid on the floor hastily and sat down at my workstation. Another cubed-sphere hypothesis, another flame war in the math newsgroups—or something more serious? “When did it arrive?”

Burgundy has gained a small reputation as an information buffer—a side-effect of its monarch’s paranoid attitude towards encroachment by other deities. It exports confidentiality, delay-line buffers, fine wines, and dissidents. In other words, it’s the last place a futures trader specializing in spread option trades that leverage developments in the artificial intelligence market—someone like me—would normally be seen dead in. On the other hand . . . As an accountant specializing in Monte Carlo solutions to NP-complete low-knowledge systems I am naturally of interest to the Queen, who made me an offer I would have considered long and hard at the best of times; in my circumstances back on Dordogne—exposed to risk, to say the least—I could hardly consider refusing her invitation. Many technologies are banned from the system, in an attempt to reduce the vulnerability of the monarchy to algorithmic complexity attacks.

pages: 541 words: 109,698

Mining the Social Web: Finding Needles in the Social Haystack by Matthew A. Russell

A maximal clique, on the other hand, is one that is not a subgraph of another clique. The maximum clique is also a maximal clique in that it isn’t a subgraph of any other clique. However, various other maximal cliques often exist in graphs and need not necessarily share any nodes with the maximum clique. Figure 4-2, for example, illustrates a maximum clique of size 4, but there are several other maximal cliques of size 3 in the graph as well. Finding cliques is an NP-complete problem (implying an exponential runtime), but NetworkX does provide the find_cliques method, which delivers a solid implementation that takes care of the heavy lifting. Just be advised that it might take a long time to run as graphs get beyond a reasonably small size. Figure 4-2. An example graph containing a maximum clique of size 4 Example 4-11 could be modified in any number of ways to compute interesting things, so long as you’ve first harvested the necessary friendship information.

Assuming you’ve first constructed a graph as demonstrated in the previous section, it shows you how to use some of the NetworkX APIs for finding and analyzing cliques. Example 4-11. Using NetworkX to find cliques in graphs (friends_followers__clique_analysis.py) # -*- coding: utf-8 -*- import sys import json import networkx as nx G = sys.argv[1] g = nx.read_gpickle(G) # Finding cliques is a hard problem, so this could # take awhile for large graphs. # See http://en.wikipedia.org/wiki/NP-complete and # http://en.wikipedia.org/wiki/Clique_problem cliques = [c for c in nx.find_cliques(g)] num_cliques = len(cliques) clique_sizes = [len(c) for c in cliques] max_clique_size = max(clique_sizes) avg_clique_size = sum(clique_sizes) / num_cliques max_cliques = [c for c in cliques if len(c) == max_clique_size] num_max_cliques = len(max_cliques) max_clique_sets = [set(c) for c in max_cliques] people_in_every_max_clique = list(reduce(lambda x, y: x.intersection(y), max_clique_sets)) print 'Num cliques:', num_cliques print 'Avg clique size:', avg_clique_size print 'Max clique size:', max_clique_size print 'Num max cliques:', num_max_cliques print print 'People in all max cliques:' print json.dumps(people_in_every_max_clique, indent=4) print print 'Max cliques:' print json.dumps(max_cliques, indent=4) Sample output from the script follows for Tim O’Reilly’s 600+ friendships and reveals some interesting insights.

Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman

Interestingly, the technique for doing this search on a large graph involves finding large frequent itemsets, as was discussed in Chapter 6. 10.3.1Finding Cliques Our first thought about how we could find sets of nodes with many edges between them is to start by finding a large clique (a set of nodes with edges between any two of them). However, that task is not easy. Not only is finding maximal cliques NP-complete, but it is among the hardest of the NP-complete problems in the sense that even approximating the maximal clique is hard. Further, it is possible to have a set of nodes with almost all edges between them, and yet have only relatively small cliques. EXAMPLE 10.10Suppose our graph has nodes numbered 1, 2, . . . , n and there is an edge between two nodes i and j unless i and j have the same remainder when divided by k.

., 226 Near-neighbor search, see Locality-sensitive hashing Nearest neighbor, 17, 419, 447, 455 Negative border, 218 Negative example, 423 Neighbor, 357 Neighborhood, 367, 375 Neighborhood profile, 367 Netflix challenge, 2, 295, 321 Network, see Social network Neural net, 419 Newman, M.E.J., 382 Newspaper articles, 109, 286, 296 Non-Euclidean distance, 238, see also Cosine distance, see also Edit distance, see also Hamming distance, see also Jaccard distance Non-Euclidean space, 252, 254 Norm, 87, 88 Normal distribution, 244 Normalization, 305, 307, 318 Normalized cut, 344 NP-complete problem, 338 Numerical feature, 416, 455 O’Callaghan, L., 265 Off-line algorithm, 270 Olston, C., 67 Omiecinski, E., 226 On-line advertising, see Advertising On-line algorithm, 16, 270 On-line learning, 421 On-line retailer, 194, 268, 292, 294 Open directory, 174, 422 OR-construction, 95 Orr, G.B., 458 Orthogonal vectors, 231, 389 Orthonormal matrix, 397, 402 Orthonormal vectors, 389, 393 Out-component, 159 Outlier, 230 Output, 54 Overfitting, 303, 319, 419, 420, 432, 456 Overlapping Communities, 350 Overture, 276 Own pages, 178 Paepcke, A., 122 Page, L., 154, 190 PageRank, 3, 15, 28, 29, 40, 154, 156, 168 Pairs, see Frequent pairs Palmer, C.R., 383 Pan, J.

pages: 678 words: 159,840

The Debian Administrator's Handbook, Debian Wheezy From Discovery to Mastery by Raphaal Hertzog, Roland Mas

With packages all depending upon each other, it is sometimes necessary to migrate a large number of packages simultaneously, which is impossible when some are uploading updates regularly. On the other hand, the script identifying the families of related packages works hard to create them (this would be an NP-complete problem, for which, fortunately, we know some good heuristics). This is why we can manually interact with and guide this script by suggesting groups of packages, or imposing the inclusion of certain packages in a group, even if this temporarily breaks some dependencies. This functionality is accessible to the Release Managers and their assistants. Recall that an NP-complete problem is of an exponential algorithmic complexity according to the size of the data, here being the length of the code (the number of figures) and the elements involved. The only way to resolve it is frequently to examine all possible configurations, which could require enormous means.

pages: 245 words: 12,162

In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation by William J. Cook

Lenstra et al., eds. History of Mathematical Programming—A Collection of Personal Reminiscences. NorthHolland. 32–54. [11] Flood, M. 1954. Operations research and logistics. Proceedings of the First Ordnance Conference on Operations Research. Office of Ordnance Research, Durham, North Carolina. 3–32. [12] Garey, M. R., D. S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, California. [13] Gomory, R. E. 1966. The traveling salesman problem. Proceedings of the IBM Scientific Computing Symposium on Combinatorial Problems. IBM, White Plains, New York. 93–121. [14] Grötschel, M., O. Holland. 1991. Solution of large-scale symmetric travelling salesman problems. Mathematical Programming 51, 141–202. 224 Bibliography [15] Held, M., R. M. Karp. 1962.

The Art of Computer Programming by Donald Ervin Knuth

., bs are any addition chains, then J(]T djcubj) < r + 5 + ]T aj — 1 for any (r + 1) x E + 1) matrix of nonnegative integers dj. 41. [SICOMP 10 A981), 638-646.] The stated formula can be proved whenever A > 9m2. Since this is a polynomial in m, and since the problem of finding a minimum vertex cover is NP-hard (see Section 7.9), the problem of computing l(ni,..., nm) is NP-complete. [It is unknown whether or not the problem of computing l(n) is NP- complete. But it seems plausible that an optimum chain for, say, X^fcTcI ^fc+i2Afe2 would entail an optimum chain for {ni,... ,nm}, when A is sufficiently large.] SECTION 4.6.4 1. Set y «— x2, then compute ((... (u2n+iy + U2n-i)y + • • ¦ )y + U\)x. 2. Replacing x in B) by the polynomial x + xo leads to the following procedure: Gl. Do step G2 for k = n, n — 1, ..., 0 (in this order), and stop.

He has extended the results to multivariate polynomials in SICOMP 9 A980), 225-229. The tensor rank of an arbitrary m x n x 2 tensor in a suitably large field has been determined by Joseph Ja'Ja', SICOMP 8 A979), 443-462; JACM 27 A980), 822-830. See also his interesting discussion of commutative bilinear forms in SICOMP 9 A980), 713-728. However, the problem of computing the tensor rank of an arbitrary nxnx n tensor over any finite field is NP-complete [J. Hastad, Journal of Algorithms 11 A990), 644-654]. 4.6.4 EVALUATION OF POLYNOMIALS 515 For further reading. In this section we have barely scratched the surface of a very large subject in which many beautiful theories are emerging. Considerably more comprehensive treatments can be found in the books Computational Com- Complexity of Algebraic and Numeric Problems by A. Borodin and I. Munro (New York: American Elsevier, 1975); Polynomial and Matrix Computations 1 by D.

Normal distribution, 56, 122, 384, 565. tail of, 139. variations, 132, 139. Normal evaluation schemes, 506, 709-710. Normal numbers, 177. Normalization of divisors, 272-273, 282-283. Normalization of floating point numbers, 215-217, 227-228, 238, 248-249, 254, 616. Normand, Jean-Marie, 29. Norton, Graham Hilton, 372, 673. Norton, Karl Kenneth, 383. Norton, Victor Thane, Jr., 607. Notations, index to, 730-734. Nozaki, Akihiro (ff*§Hg3A), 524. NP-complete problems, 499, 514, 585, 698. Null space of a matrix, 443-444, 456, 659-660, 681. Number field sieve, 403. Number fields, 331, 333, 345, 403, 674. Number sentences, 605. Number system: A language for representing numbers. balanced binary, 213. balanced decimal, 211. balanced mixed-radix, 103, 293, 631. balanced ternary, 207-208, 209, 227, 283, 353. binary (radix 2), 195, 198-206, 209-213, 419, 461, 483. combinatorial, 209. complex, 205-206, 209-210, 292. decimal (= denary, radix ten), 197—199, 210, 320-326, 374. duodecimal (radix twelve), 199-200. factorial, 66, 209.

Algorithms in C++ Part 5: Graph Algorithms by Robert Sedgewick

Does the presence of cycles in a flow network make more difficult the task of computing a maxflow? We have seen many examples of digraph-processing problems that are much more difficult to solve in the presence of cycles. Perhaps the most prominent example is the shortest-paths problem in weighted digraphs with edge weights that could be negative (see Section 21.7), which is simple to solve in linear time if there are no cycles but NP-complete if cycles are allowed. But the maxflow problem, remarkably, is no easier for acyclic networks. Property 22.16 The maxflow problem for acyclic networks is equivalent to the standard maxflow problem. Proof: Again, we need only to show that the standard problem reduces to the acyclic problem. Given any network with V vertices and E edges, we construct a network with 2V +2 vertices and E +3V edges that is not just acyclic but has a simple structure.

Hell, “On the history of the minimum spanning tree problem,” Annals of the History of Computing, 7 (1985). D. B. Johnson, “Efficient shortest path algorithms,” Journal of the ACM, 24 (1977). D. E. Knuth, The Art of Computer Programming. Volume 1:Fundamental Algorithms, third edition, Addison-Wesley, 1997. J. R. Kruskal Jr., “On the shortest spanning subtree of a graph and the traveling salesman problem,” Proceedings AMS, 7, 1 (1956). K. Mehlhorn, Data Structures and Algorithms 2:NP-Completeness and Graph Algorithms, Springer-Verlag, 1984. C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall, 1982. R. C. Prim, “Shortest connection networks and some generalizations,” Bell System Technical Journal, 36 (1957). R. E. Tarjan, “Depth-first search and linear graph algorithms,” SIAM Journal on Computing, 1, 2 (1972). R. E. Tarjan, Data Structures and Network Algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1983.

A Natural History of Beer by Rob DeSalle

This is a tough question, because if we choose an outgroup that is too far away (milk, say), the root of the tree will become random and meaningless, and if we choose an outgroup from within the beers (barleywine, for instance), we will run the risk of artificially rooting the tree next to it. As a compromise, we tried rooting our tree with two relatively closely related beverages, gruit and wine. Next, as already mentioned, came the sheer computational difficulty that is involved in evaluating 10100 different trees. Computing the solutions for so many trees poses what mathematicians and computer scientists call an NP-complete problem: even though we know there is a finite solution to the problem, we don’t have the computing power to find it, so we must seek another road. In other words, with this many trees to examine we need to use shortcut methods that eliminate large swaths of those trees that will definitely not be part of the solution. One shortcut method of rooting the tree would be simply to interpose a root between the two larger groups of beers (say, lagers and ales) into which all the other beers are thought to fall.

pages: 571 words: 105,054

Snippet 21.3 Evaluating all trajectories Note that this procedure selects an globally optimal trajectory without relying on convex optimization. A solution will be found even if the covariance matrices are ill-conditioned, transaction cost functions are non-continuous, etc. The price we pay for this generality is that calculating the solution is extremely computationally intensive. Indeed, evaluating all trajectories is similar to the traveling-salesman problem. Digital computers are inadequate for this sort of NP-complete or NP-hard problems; however, quantum computers have the advantage of evaluating multiple solutions at once, thanks to the property of linear superposition. The approach presented in this chapter set the foundation for Rosenberg et al. [2016], which solved the optimal trading trajectory problem using a quantum annealer. The same logic can be applied to a wide range on financial problems involving path dependency, such as a trading trajectory.

RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

All these DL names define a given language with a set of constructors, such as conjunctions, disjunctions, negations, quantifiers, cardinalities, etc. These languages are also characterized by complexities for a set of selected reasoning problems. For example, with DLs like ALC (considered the Storage and Indexing of RDF Data first DL of interest due to its set of constructors) or SHIQ, the combined complexities of CQ answering are respectively Exptime-complete and 2-ExpTime-complete and NP-complete for data complexity. Besides these high complexities, some lightweight DLs have been identified with polynomial complexity for CQ answering. The DL-Lite family is one of them and it motivated the creation of the OWL2QL recommendation. The basic DL-Lite logic provides key conceptual modeling constructs like concept inclusion and disjointness, as well as domain and range constraints; other DL-Lite logics support functionality constraints, property inclusion and disjointness, and data types.

pages: 404 words: 113,514

Atrocity Archives by Stross, Charles

Anyway, the theorem has been rediscovered periodically ever since; it has also been suppressed efficiently, if a little bit less violently, because nobody wants it out in the open where Joe Random Cypherpunk can smear it across the Internet. The theorem is a hack on discrete number theory that simultaneously disproves the Church-Turing hypothesis (wave if you understood that) and worse, permits NP-complete problems to be converted into P-complete ones. This has several consequences, starting with screwing over most cryptography algorithms--translation: all your bank account are belong to us--and ending with the ability to computationally generate a Dho-Nha geometry curve in real time. This latter item is just slightly less dangerous than allowing nerds with laptops to wave a magic wand and turn them into hydrogen bombs at will.

Entangled Life: How Fungi Make Our Worlds, Change Our Minds & Shape Our Futures by Merlin Sheldrake

Valles-Colomer M, Falony G, Darzi Y, Tigchelaar EF, Wang J, Tito RY, Schiweck C, Kurilshikov A, Joossens M, Wijmenga C, et al. 2019. The neuroactive potential of the human gut microbiota in quality of life and depression. Nature Microbiology: 623–32. van Delft FC, Ipolitti G, Nicolau DV, Perumal A, Kašpar O, Kheireddine S, Wachsmann-Hogiu S, Nicolau DV. 2018. Something has to give: scaling combinatorial computing by biological agents exploring physical networks encoding NP-complete problems. Journal of the Royal Society Interface Focus 8: 20180034. van der Heijden MG. 2016. Underground networking. Science 352: 290–91. van der Heijden MG, Bardgett RD, Straalen NM. 2008. The unseen majority: soil microbes as drivers of plant diversity and productivity in terrestrial ecosystems. Ecology Letters 11: 296–310. van der Heijden MG, Dombrowski N, Schlaeppi K. 2017. Continuum of root-fungal symbioses for plant nutrition.

pages: 1,737 words: 491,616

Rationality: From AI to Zombies by Eliezer Yudkowsky

If you limit yourself to simple Boolean propositions of the form ((A or B or C) and (B or ¬C or D) and (D or ¬A or ¬C)…), conjunctions of disjunctions of three variables, then this is a very famous problem called 3-SAT, which is one of the first problems ever to be proven NP-complete. So just because you don’t see a contradiction in the Zombie World at first glance, it doesn’t mean that no contradiction is there. It’s like not seeing a contradiction in the Riemann Hypothesis at first glance. From conceptual possibility (“I don’t see a problem”) to logical possibility, in the full technical sense, is a very great leap. It’s easy to make it an NP-complete leap, and with first-order theories you can make it arbitrarily hard to compute even for finite questions. And it’s logical possibility of the Zombie World, not conceptual possibility, that is needed to suppose that a logically omniscient mind could know the positions of all the atoms in the universe, and yet need to be told as an additional non-entailed fact that we have inner listeners.

Chalmers, The Conscious Mind. 222 Zombie Responses I’m a bit tired today, having stayed up until 3 a.m. writing yesterday’s >6,000-word essay on zombies, so today I’ll just reply to Richard, and tie up a loose end I spotted the next day. (A) Richard Chappell writes: A terminological note (to avoid unnecessary confusion): what you call “conceivable,” others of us would merely call “apparently conceivable.” The gap between “I don’t see a contradiction yet” and “this is logically possible” is so huge (it’s NP-complete even in some simple-seeming cases) that you really should have two different words. As the zombie argument is boosted to the extent that this huge gap can be swept under the rug of minor terminological differences, I really think it would be a good idea to say “conceivable” versus “logically possible” or maybe even have a still more visible distinction. I can’t choose professional terminology that has already been established, but in a case like this, I might seriously refuse to use it.

The Art of Computer Programming: Sorting and Searching by Donald Ervin Knuth

[N—t — 1:N — t], where t = mn + n + 1. a) Describe all inputs (xq, x\ ,..., xn) that are not sorted by such a network, in terms of the behavior of the special subnetwork. b) Given a set of clauses such as (yi V y^ V y-i) A (y2 V y3 V y^) A ..., explain how to construct a special subnetwork such that Fig. 61 sorts all inputs if and only if the clauses are unsatisfiable. [Hence the task of deciding whether a comparator sequence forms a sorting network is co-NP-complete, in the sense of Section 7.9.] 3.3.4 NETWORKS FOR SORTING 243 53. [30] (Periodic sorting networks.) The following two 16-networks illustrate general recursive constructions of t-level networks for n = 2* in the case t = 4: I tx (a) (b) If we number the input lines from 0 to 2* — 1, the Zth level in case (a) has comparators [i:j] where i mod 2t+1~l < 2t~l and j = i © Bt+1~l - 1); there are t2t~~1 comparators altogether, as in the bitonic merge.

Nikitin, Andrei Ivanovich (Hhkhthh, AHflpeH HBaHOBH^i), 351. Nitty-gritty, 317-343, 357-379. Nodine, Mark Howard, 698. Non-messing-up theorem, 90, 238, 669. Nondeterministic adversary, 219. Norlund, Niels Erik, 638. Normal deviate, 69. Normal distribution, approximately, 45, 650. Norwegian language, 520. Noshita, Kohei (IFTS ?), 213, 218. Notations, index to, 752-756. Novelli, Jean-Christophe, 614. NOW-Sort, 390. NP-complete problems, 242. Nsort, 390. Null permutation, 25, 36. Number-crunching computers, 175, 381, 389-390. Numerical instability, 41. Nyberg, Christopher, 390. Oberhettinger, Fritz, 131. Oblivious algorithms, 219-220. O'Connor, Daniel J., 225. Octrees, 565. INDEX AND GLOSSARY 771 j Odd-even merge, 223-226, 228, 230, J(l 243, 244. Odd-even transposition sort, 240. Odd permutations, 19, 196. Odell, Margaret K., 394.

pages: 574 words: 164,509

Superintelligence: Paths, Dangers, Strategies by Nick Bostrom

is a televised game show with trivia questions about history, literature, sports, geography, pop culture, science, and other topics. Questions are presented in the form of clues, and often involve wordplay. Poker Varied Computer poker players remain slightly below the best humans for full-ring Texas hold ‘em but perform at a superhuman level in some poker variants.52 FreeCell Superhuman Heuristics evolved using genetic algorithms produce a solver for the solitaire game FreeCell (which in its generalized form is NP-complete) that is able to beat high-ranking human players.53 Go Very strong amateur level As of 2012, the Zen series of go-playing programs has reached rank 6 dan in fast games (the level of a very strong amateur player), using Monte Carlo tree search and machine learning techniques.54 Go-playing programs have been improving at a rate of about 1 dan/year in recent years. If this rate of improvement continues, they might beat the human world champion in about a decade.

pages: 721 words: 197,134

Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzić

For different tests, even for a different order of their application, different trees will be generated. Ideally, we would like to choose a test at each stage of sample-set splitting so that the final tree is small. Since we are looking for a compact decision tree that is consistent with the training set, why not explore all possible trees and select the simplest? Unfortunately, the problem of finding the smallest decision tree consistent with a training data set is NP-complete. Enumeration and analysis of all possible trees will cause a combinatorial explosion for any real-world problem. For example, for a small database with five attributes and only 20 training examples, the possible number of decision trees is greater than 106, depending on the number of different values for every attribute. Therefore, most decision tree-construction methods are non-backtracking, greedy algorithms.

pages: 496 words: 174,084

Masterminds of Programming: Conversations With the Creators of Major Programming Languages by Federico Biancuzzi, Shane Warden

Brian maintained a log of major decisions we made as we designed the language. I found his log invaluable. Computer Science What constitutes research in computer science? Al: This is a wonderful question, and one that does not have a well-defined answer. I think computer science research has broadened enormously in its scope. We still have the deep, unsolved, quintessential questions of computer science: how do we prove that a problem like factoring or an NP-complete problem is actually hard; how do we model complex systems like a human cell or the human brain; how can we construct scalable, trustworthy systems; how can programmers build arbitrarily reliable software; how can we make software with human-friendly characteristics like emotion or intelligence; how far can we extend Moore’s Law? Today, the scale and scope of computing has exploded. We are trying to organize and access all of the world’s information, and computers and computation are affecting all areas of daily life.