Recently I have been getting a lot of visits (well what I consider a lot) to this blog because of the Knapsack Problem. Since this seems to be so popular, and people don’t want to go to Wikipedia, I suppose I will dedicate at least one post to the topic.
Suppose that you have a collection of objects $1, 2, \dots, n$, with respective weights $w_1, \dots, w_n$ and values $v_1, \dots, v_n$. Your task is to find a subset of these items with maximum total value. However, there is one condition: the items you choose must fit in a bag that can only carry a maximum weight of $W$ before breaking.
For those of you that are more mathematically inclined, and want to see the short version of the problem, here it is:

$$(P)\qquad \max \sum_{i=1}^{n} v_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le W,\quad x_i \in \{0,1\},$$

where $x_i = 1$ means that you choose to take object $i$, and $x_i = 0$ means that you do not.
I will present two different approaches to solving this problem. Both these approaches can be applied to a wide range of problems, so the reasoning behind the algorithms is more important than the particular cases that follow.
In the solutions it is assumed that $v_i, w_i > 0$ for each $i$, and $0 \le x \le 1$ means that $0 \le x_i \le 1$ for $i = 1, \dots, n$.
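Before looking at the two real algorithms, it may help to see the problem stated as code. Here is a naive brute-force sketch in Python (the item data is made up for illustration); it checks every assignment $x \in \{0,1\}^n$, so it is only usable for very small $n$:

```python
from itertools import product

def knapsack_brute_force(values, weights, W):
    """Try every assignment x in {0,1}^n and keep the best feasible one."""
    n = len(values)
    best_value, best_x = 0, (0,) * n
    for x in product((0, 1), repeat=n):
        weight = sum(w * xi for w, xi in zip(weights, x))
        value = sum(v * xi for v, xi in zip(values, x))
        if weight <= W and value > best_value:
            best_value, best_x = value, x
    return best_value, best_x

# Hypothetical example data: values, weights, capacity.
print(knapsack_brute_force([60, 100, 120], [10, 20, 30], 50))  # (220, (0, 1, 1))
```

This is exactly the formulation above with the maximization done by exhaustive search, nothing more.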
Branch and Bound
In this approach, instead of solving the problem as it is presented in $(P)$, we consider the relaxation

$$(R)\qquad \max \sum_{i=1}^{n} v_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le W,\quad 0 \le x_i \le 1.$$
You should notice that any feasible solution (one that satisfies the constraints) to $(P)$ is also a feasible solution to $(R)$. The reason we consider the relaxation is that there might be a way to quickly obtain an answer.
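For the knapsack relaxation there is indeed such a way: sort the objects by the ratio $v_i / w_i$ and fill the bag greedily, taking a fraction of at most one object at the end. A minimal Python sketch (the function name is my own):

```python
def solve_relaxation(values, weights, W):
    """Greedy solution of the relaxed knapsack: at most one x_i is fractional."""
    n = len(values)
    # Consider objects in decreasing value-to-weight ratio.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    x = [0.0] * n
    remaining = W
    total = 0.0
    for i in order:
        if weights[i] <= remaining:
            x[i] = 1.0                      # take object i whole
            remaining -= weights[i]
            total += values[i]
        else:
            x[i] = remaining / weights[i]   # take a fraction of object i
            total += values[i] * x[i]
            break
    return total, x

total, x = solve_relaxation([60, 100, 120], [10, 20, 30], 50)
print(total)  # 240.0, with x = [1.0, 1.0, 2/3]
```

Note that at most one entry of $x$ ends up strictly between $0$ and $1$, which is exactly the variable we branch on below.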
So suppose that we find $x^*$ to be the optimal solution to $(R)$. If it turns out that $x^* \in \{0,1\}^n$, we also have that $x^*$ is optimal for $(P)$. Well, this is not interesting at all, so without loss of generality suppose that $0 < x_k^* < 1$ for some index $k$. So even though $x^*$ is optimal for $(R)$ it is not feasible for $(P)$. To fix this small oversight we split $(R)$ into two subcases:

$$(R_0)\qquad \text{the relaxation with the added constraint } x_k = 0,$$
$$(R_1)\qquad \text{the relaxation with the added constraint } x_k = 1.$$
Now the optimal solution of each of these problems will have $x_k \in \{0,1\}$. Furthermore, if we let $z_0$ and $z_1$ be the optimal values of $(R_0)$ and $(R_1)$, it is clear that $z_0 \le z^*$ and $z_1 \le z^*$, where $z^*$ is the optimal value of $(R)$. Now suppose we have found $x^0$ to be the optimal solution to $(R_0)$ and $x^1$ to $(R_1)$, and that $z_1 \le z_0$.
If $x^0$ satisfies the constraints of $(P)$ then we are finished (since $z_0 \ge z_1$, no solution from the other branch can do better). If not, we then split $(R_0)$ into two problems, similar to how we split $(R)$. These new problems give optimal values $z_{00}$ and $z_{01}$. We then see which of $z_{00}, z_{01}, z_1$ is largest, and check whether its optimal point satisfies the constraints of $(P)$. If it does not, we just split it into two problems, solve those problems, and compare our list of optimal values and points.
I’ll leave it to you to show that this process will not only terminate, but also find the optimal solution to $(P)$.
The reason for the Branch in the name of this approach is obvious: if our optimal solution to a relaxation is not feasible for $(P)$, we split (branch) it into two subproblems.
The reason for Bound is as follows. Suppose that partway through the process we have a set of optimal values $z_1, \dots, z_m$ and optimal points $x^1, \dots, x^m$ for the current subproblems. Then the solution to $(P)$ has optimal value of at most $\max_j z_j$. However, if we must split subproblem $j$ because $x^j$ does not satisfy the necessary constraints, then the optimal values from its two subproblems will be at most $z_j$. So we therefore always have a bound on the possible optimal value.
The Branch and Bound approach rests on the assumption that you are able to solve the relaxed problem. Usually it is assumed that solving the relaxed problem is not only possible, but can be done rather quickly.
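Putting the pieces together, here is one way the whole procedure might look in Python. This is a sketch under my own design choices (a best-first search on a heap of subproblems, with the greedy relaxation as the bound), not the only way to organize it:

```python
import heapq
from itertools import count

def fractional_bound(values, weights, W, fixed):
    """Greedy bound for the relaxation with some variables fixed to 0 or 1.

    Returns (bound, k): k is the fractional variable to branch on,
    or None if the greedy solution is already integral."""
    total, remaining = 0.0, W
    for i, val in fixed.items():
        if val == 1:
            total += values[i]
            remaining -= weights[i]
    if remaining < 0:
        return float("-inf"), None        # this fixing is infeasible
    free = [i for i in range(len(values)) if i not in fixed]
    free.sort(key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if weights[i] <= remaining:
            total += values[i]            # take object i whole
            remaining -= weights[i]
        else:
            # take a fraction of object i; this is where we branch
            return total + values[i] * remaining / weights[i], i
    return total, None

def knapsack_branch_and_bound(values, weights, W):
    best = 0
    tie = count()                         # tiebreaker so dicts never get compared
    b0, _ = fractional_bound(values, weights, W, {})
    heap = [(-b0, next(tie), {})]         # max-heap via negated bounds
    while heap:
        neg_bound, _, fixed = heapq.heappop(heap)
        if -neg_bound <= best:
            continue                      # bounded out: cannot beat best so far
        bound, k = fractional_bound(values, weights, W, fixed)
        if k is None:
            best = max(best, int(round(bound)))   # greedy solution was integral
        else:
            for val in (0, 1):            # branch: x_k = 0 and x_k = 1
                child = dict(fixed)
                child[k] = val
                b, _ = fractional_bound(values, weights, W, child)
                if b > best:
                    heapq.heappush(heap, (-b, next(tie), child))
    return best

print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # 220
```

The heap always pops the subproblem with the largest bound first, so the moment the popped subproblem's greedy solution is integral, it is optimal.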
Dynamic Programming

I won’t go into too much (if any) detail about what dynamic programming is. However, I will say this; the main idea is that the optimal solution for the first $i$ objects can be built from the optimal solution for the first $i-1$ objects, kind of like induction if you think about it.
In our case we will let $m(i, j)$ be the maximum value we can achieve by only using objects $1, \dots, i$ with a maximum weight of $j$. Thus our desired answer is $m(n, W)$.
For the sake of simplicity we will define $m(0, j) = 0$ for all $j$. Now how do we determine the value of $m(i, j)$? Well, there are two cases.
In the first case, $w_i > j$: it is not possible to add object $i$ without exceeding our weight constraint. Thus $m(i, j) = m(i-1, j)$.
In the second case, $w_i \le j$: we have two choices. We can either take object $i$ or we can leave it behind.
If we leave it behind then $m(i, j) = m(i-1, j)$.
However, if we choose to take it we must remember that the weight left over for the first $i-1$ objects is equal to $j - w_i$. Therefore, if object $i$ is taken we have that $m(i, j) = v_i + m(i-1, j - w_i)$.
In this case

$$m(i, j) = \max\{\, m(i-1, j),\; v_i + m(i-1, j - w_i) \,\}.$$
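The recurrence above translates almost line for line into a table-filling program; a minimal Python sketch:

```python
def knapsack_dp(values, weights, W):
    """m[i][j] = maximum value using only objects 1..i with weight limit j."""
    n = len(values)
    m = [[0] * (W + 1) for _ in range(n + 1)]   # m[0][j] = 0 for all j
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]    # object i (1-indexed in the text)
        for j in range(W + 1):
            if w > j:
                m[i][j] = m[i - 1][j]           # object i cannot fit
            else:
                m[i][j] = max(m[i - 1][j],               # leave object i behind
                              v + m[i - 1][j - w])       # take object i
    return m[n][W]

print(knapsack_dp([60, 100, 120], [10, 20, 30], 50))  # 220
```

The table has $(n+1)(W+1)$ entries and each is filled in constant time, so the running time is $O(nW)$.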
As you can see, the dynamic programming solution is easier to program, but I feel that it is also nice to know the branch and bound approach. After reading the above solutions you should be able to create your own code quite easily. For the dynamic programming approach I suggest using arrays, since recursion can be quite expensive. As for the branch and bound solution, I suggest using some sort of heap to store the information about the subproblems that you will create.
Here are some other problems that can be thought of as variants of the Knapsack problem, even though at first glance they might not appear to be. All of these problems can be solved by adjusting the above solutions to the original Knapsack problem, although the adjusted solutions might not be the most efficient.
Given an integer $K$ and a collection of integers $a_1, \dots, a_n$, does there exist a subset $S \subseteq \{1, \dots, n\}$ such that $\sum_{i \in S} a_i = K$?
Case 1: $K > 0$ and $a_i > 0$ for all $i$.
In this case our optimization problem can be written as

$$\max \sum_{i=1}^{n} a_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} a_i x_i \le K,\quad x_i \in \{0,1\}.$$
So if the answer to this problem is $K$, then such a subset does exist. Otherwise, the answer is no.
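In code, Case 1 is just a knapsack whose values equal its weights; a minimal sketch (using a one-dimensional version of the dynamic programming table):

```python
def subset_sum_positive(a, K):
    """Case 1: all a_i > 0 and K > 0. Solve the knapsack with
    values = weights = a and capacity K; a subset sums to K
    iff the optimal value equals K."""
    best = [0] * (K + 1)                    # best[j] = max achievable sum <= j
    for ai in a:
        for j in range(K, ai - 1, -1):      # descend so each a_i is used once
            best[j] = max(best[j], ai + best[j - ai])
    return best[K] == K

print(subset_sum_positive([3, 34, 4, 12, 5, 2], 9))   # True: 4 + 5 = 9
```

The backwards inner loop is what keeps each $a_i$ from being used more than once.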
Case 2: if $K < 0$ and $a_i < 0$ for all $i$, then just multiply everything by $-1$ and solve the problem in the above form.
Case 3: the $a_i$ have mixed signs. The problem can now be written as (although it is no longer a linear program)

$$\min \left| \sum_{i=1}^{n} a_i x_i - K \right| \quad \text{subject to} \quad x_i \in \{0,1\},\ \text{not all } x_i = 0.$$
If the answer is $0$ then there is a subset that sums to $K$; otherwise there is not.
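Rather than attacking the optimization form directly, one simple (though not necessarily most efficient) way to handle mixed signs is to track every sum reachable by a nonempty subset; a sketch:

```python
def subset_sum_general(a, K):
    """Case 3: a_i may be positive or negative. Track every sum
    reachable by a nonempty subset; O(n * number_of_distinct_sums)."""
    reachable = set()
    for ai in a:
        # Either ai alone, or ai added to any previously reachable sum.
        reachable |= {ai} | {s + ai for s in reachable}
    return K in reachable

print(subset_sum_general([7, -3, 2, 8], 4))   # True: 7 + (-3) = 4
```

Checking membership of $K$ in the reachable set is equivalent to asking whether the minimum above is $0$.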
Suppose that there are $c_i$ copies of item $i$. Then we can either treat each copy as unique, or instead solve

$$\max \sum_{i=1}^{n} v_i x_i \quad \text{subject to} \quad \sum_{i=1}^{n} w_i x_i \le W,\quad x_i \in \{0, 1, \dots, c_i\}.$$
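The second option only needs a small change to the dynamic programming recurrence: instead of a take/leave choice, try every feasible count of item $i$. A sketch (a one-dimensional table, updated one item at a time):

```python
def bounded_knapsack(values, weights, counts, W):
    """m[j] = best value with weight limit j, item i used at most counts[i] times."""
    m = [0] * (W + 1)
    for v, w, c in zip(values, weights, counts):
        new = m[:]                        # best values including this item
        for j in range(W + 1):
            for k in range(1, c + 1):     # take k copies of this item
                if k * w > j:
                    break                 # k copies no longer fit
                new[j] = max(new[j], k * v + m[j - k * w])
        m = new
    return m[W]

print(bounded_knapsack([60, 100], [10, 20], [2, 3], 50))  # 260: one of item 1, two of item 2
```

Treating each copy as unique gives the same answer with the original 0-1 solver, at the cost of a larger instance.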