What is an NP-complete problem? Why is it such an important topic in computer science?
Honestly, Wikipedia might be the best place to look for an answer to this.
If NP = P, then we can solve very hard problems much faster than we thought we could before. If we solve only one NP-Complete problem in P (polynomial) time, then it can be applied to all other problems in the NP-Complete category.
It's a class of problems where we must simulate every possibility to be sure we have the optimal solution.
There are a lot of good heuristics for some NP-Complete problems, but they are only an educated guess at best.
an NP problem is one where a computer algorithm that verifies a solution can be created in polynomial time.
an NP-Complete problem is NP, but also if you can solve it in polynomial time (called P) then all NP problems are P.
So get crackin'.
NP stands for Non-deterministic Polynomial time.
This means that the problem can be solved in Polynomial time using a Non-deterministic Turing machine (like a regular Turing machine but also including a non-deterministic "choice" function). Basically, a solution has to be testable in poly time. If that's the case, and a known NP problem can be solved using the given problem with modified input (an NP problem can be reduced to the given problem) then the problem is NP complete.
The main thing to take away from an NP-complete problem is that it cannot be solved in polynomial time in any known way. NP-Hard/NP-Complete is a way of showing that certain classes of problems are not solvable in realistic time.
Edit: As others have noted, there are often approximation solutions for NP-Complete problems. In this case, the approximation solution usually gives a approximation bound using special notation which tells us how close the approximation is.
NP-Complete means something very specific and you have to be careful or you will get the definition wrong. First, an NP problem is a yes/no problem such that
- There is polynomial-time proof for every instance of the problem with a "yes" answer that the answer is "yes", or (equivalently)
- There exists a polynomial-time algorithm (possibly using random variables) that has a non-zero probability of answering "yes" if the answer to an instance of the problem is "yes" and will say "no" 100% of the time if the answer is "no." In other words, the algorithm must have a false-negative rate less than 100% and no false positives.
A problem X is NP-Complete if
- X is in NP, and
- For any problem Y in NP, there is a "reduction" from Y to X: a polynomial-time algorithm that transforms any instance of Y into an instance of X such that the answer to the Y-instance is "yes" if and only if the answer X-instance is "yes".
If X is NP-complete and a deterministic, polynomial-time algorithm exists that can solve all instances of X correctly (0% false-positives, 0% false-negatives), then any problem in NP can be solved in deterministic-polynomial-time (by reduction to X).
So far, nobody has come up with such a deterministic polynomial-time algorithm, but nobody has proven one doesn't exist (there's a million bucks for anyone who can do either: the is the P = NP problem). That doesn't mean that you can't solve a particular instance of an NP-Complete (or NP-Hard) problem. It just means you can't have something that will work reliably on all instances of a problem the same way you could reliably sort a list of integers. You might very well be able to come up with an algorithm that will work very well on all practical instances of a NP-Hard problem.
Its a class of problems. The class P consists of those problems that are solvanle in polynomial time. For example, they could be solved in O(n^k) for some constant k, where n is the size of the input. Simply put, you can write a program that will run in reasonable time.
The class NP consists of those problems that are verifiable in polynomial time. That is, if we are given a potential solution then we could check if the given solution is correct in polynomial time.
Some examples are the Boolean Satisfiability (SAT) problem, or the Hamiltonian-cycle problem. There are many problems that are known to be in the class NP. NP-complete means the problem is at least as hard as any problem in NP.
It is important to computer science because it has been proven that any problem in NP can be transformed into another problem in NP-complete. That means that a solution to any one NP-complete problem is a solution to all NP problems.
Many algorithms in security depends on the fact that no known solutions exist for NP hard problems. It would definitely have a significant impact on computing if a solution were found.
If you're looking for an example of an NP-complete problem then I suggest you take a look at 3-SAT.
The basic premise is you have an expression in conjunctive normal form, which is a way of saying you have a series of expressions joined by ORs that all must be true:
(a or b) and (b or !c) and (d or !e or f) ...
The 3-SAT problem is to find a solution that will satisfy the expression where each of the OR-expressions has exactly 3 booleans to match:
(a or !b or !c) and (!a or b or !d) and (b or !c or d) ...
A solution to this one might be (a=T, b=T, c=F, d=F). However, no algorithm has been discovered that will solve this problem in the general case in polynomial time. What this means is that the best way to solve this problem is to do essentially a brute force guess-and-check and try different combinations until you find one that works.
What's special about the 3-SAT problem is that ANY NP-complete problem can be reduced to a 3-SAT problem. This means that if you can find a polynomial-time algorithm to solve this problem then you get $1,000,000, not to mention the respect and admiration of computer scientists and mathematicians around the world.
The definitions for NP complete problems above is correct, but I thought I might wax lyrical about their philosophical importance as nobody has addressed that issue yet.
Almost all complex problems you'll come up against will be NP Complete. There's something very fundamental about this class, and which just seems to be computationally different from easily solvable problems. They sort of have their own flavour, and it's not so hard to recognise them. This basically means that any moderately complex algorithm is impossible for you to solve exactly -- scheduling, optimising, packing, covering etc.
But not all is lost if a problem you'll encounter is NP Complete. There is a vast and very technical field where people study approximation algorithms, which will give you guarantees for being close to the solution of an NP complete problem. Some of these are incredibly strong guarantees -- for example, for 3sat, you can get a 7/8 guarantee through a really obvious algorithm. Even better, in reality, there are some very strong heuristics, which excel at giving great answers (but no guarantees!) for these problems.
Note that two very famous problems -- graph isomorphism and factoring -- are not known to be P or NP.
We need to separate algorithms and problems. We write algorithms to solve problems, and they scale in a certain way. Although this is a simplification, let's label an algorithm with a 'P' if the scaling is good enough, and 'NP' if it isn't.
It's helpful to know things about the problems we're trying to solve, rather than the algorithms we use to solve them. So we'll say that all the problems which have a well-scaling algorithm are "in P". And the ones which have a poor-scaling algorithm are "in NP".
That means that lots of simple problems are "in NP" too, because we can write bad algorithms to solve easy problems. It would be good to know which problems in NP are the really tricky ones, but we don't just want to say "it's the ones we haven't found a good algorithm for". After all, I could come up with a problem (call it X) that I think needs a super-amazing algorithm. I tell the world that the best algorithm I could come up with to solve X scales badly, and so I think that X is a really tough problem. But tomorrow, maybe somebody cleverer than me invents an algorithm which solves X and is in P. So this isn't a very good definition of hard problems.
All the same, there are lots of problems in NP that nobody knows a good algorithm for. So if I could prove that X is a certain sort of problem: one where a good algorithm to solve X could also be used, in some roundabout way, to give a good algorithm for every other problem in NP. Well now people might be a bit more convinced that X is a genuinely tricky problem. And in this case we call X NP-Complete.
What is NP?
NP is the set of all decision problems (question with yes-or-no answer) for which the 'yes'-answers can be verified in polynomial time (O(n^k) where n is the problem size, and k is a constant) by a deterministic Turing machine. Polynomial time is sometimes used as the definition of fast or quickly.
What is P?
P is the set of all decision problems which can be solved in polynomial time by a deterministic Turing machine. Since it can solve in polynomial time, it can also be verified in polynomial time. Therefore P is a subset of NP.
What is NP-Complete?
A problem x that is in NP is also in NP-Complete if and only if every other problem in NP can be quickly (ie. in polynomial time) transformed into x. In other words:
- x is in NP, and
- Every problem in NP is reducible to x
So what makes NP-Complete so interesting is that if any one of the NP-Complete problems was to be solved quickly then all NP problems can be solved quickly. Also see What’s “P=NP?”, and why is it such a famous question?
What is NP-Hard?
NP-Hard are problems that are at least as hard as the hardest problems in NP. Note that NP-Complete problems are also NP-hard. However not all NP-hard problems are NP (or even a decision problem), despite having 'NP' as a prefix. That is the NP in NP-hard does not mean 'non-deterministic polynomial time'. Yes this is confusing but its usage is entrenched and unlikely to change.
There is a very good arsdigita lecture on discrete mathematics that explains what an NP-complete problem is.
The first 50 minutes are mainly on boolean algebra. So jump right to the beginning of minute 53 if you are only interested in the concepts of P, NP, NP-completeness, the boolean satisfiability problem and reduction.