Jacek Dzikowski
Illinois Institute of Technology
e-mail: dzikjac@iit.edu
Abstract:
The goal of this report was to implement and compare four different 0/1 Knapsack algorithms (brute force,
backtracking, best-first branch and bound, and dynamic programming) in terms of their performance on various
sets of data. All algorithms were first programmed in an object-oriented programming language (C++) and then
carefully tested with specially generated data sets. The results gathered during these tests served as a basis for
further analysis, discussion and conclusions concerning the performance characteristics of all four algorithms.
The final discussion leads to the conclusion that the backtracking method seems to be the best under certain
conditions, while brute force is, as expected, evidently the least efficient algorithm.
JACEK DZIKOWSKI
www.iit.edu/~dzikjac e-mail: dzikjac@iit.edu
Introduction
The 0/1 Knapsack problem is an optimisation problem. The idea is to select an optimal (most valuable) subset of
the set of all items (each characterised by its value and weight), given a maximum weight limit. Solutions to
this problem are widely used, for example, in loading and delivery applications.
There are several existing approaches to this problem. Obviously, and it will be shown later, all of
them lead to the same result, but the methods used are different. In this report the focus will be on the following
four 0/1 Knapsack algorithms:
- Brute Force – find the optimal result by trying all existing item combinations,
- Backtracking – use a State Space concept – find the optimal solution by traversing the State Space tree and
backtracking when a node is not promising,
- Best-First Branch and Bound – use a State Space concept – expand the nodes with the highest bound first
(best-first) to find the optimal solution,
- Dynamic Programming – create and use the dynamic programming matrix to find the optimal solution.
Obviously, using a different method implies a difference in performance (and usually also in range of
application). 0/1 Knapsack algorithms are no exception. All four presented algorithmic approaches should differ
both in terms of run-time and resource consumption.
My approach to verifying this statement, and to finding out which algorithm is the best in given circumstances,
was to prepare a computer program (using an object-oriented programming language) that allows all algorithms
to be run on a particular set of data and their performance to be measured.
This program would be run in order to collect the necessary data (run-times, array sizes, queue sizes, etc.),
which would then be used to analyse the overall performance of the 0/1 Knapsack algorithms.
Due to its nature, the Brute Force algorithm is expected to perform the worst [O(2^n), where n is the number of
items]. On the other hand, the performance of the Backtracking and Best-First Branch and Bound algorithms
should be much better; however, their performance is strongly dependent upon the input data set. A similar
remark also applies to the Dynamic Programming approach, although here memory utilisation is predictable
(more specifically, the matrix size is predictable). Of course, the cost of the increased performance is memory
utilisation (the recursive call stack, queue size or dynamic programming array).
Algorithm implementation:
- Brute Force – this algorithm checks all the possible combinations of items by means of a simple for loop
(from 1 to 2^n - 1, where n is the number of items). It utilises the binary representation of every
combination/number in this range (computed recursively) to calculate the total profit and weight. At the
same time, the total weight is checked against the Weight Limit; if it exceeds the limit, the total profit for
that combination is treated as 0. Whenever a better combination is discovered, it is stored as the result.
- Backtracking – this algorithm utilises the idea of a virtual State Space tree. Every branch in this tree
corresponds to one combination. Each node is expanded only if it is promising (an additional function is
implemented for this purpose); otherwise the algorithm backtracks. It is a recursive algorithm – recursive
calls are executed only if the node is promising. The total weight and profit are computed accordingly, and
the optimal solution is stored as in the Brute Force algorithm.
- Best-First Branch and Bound – another State Space tree algorithm, although its implementation and ideas
are different. There is no recursion in my implementation of this algorithm. Instead, it uses a priority queue
to traverse the State Space tree (and to keep several states in memory – unlike the Brute Force and
Backtracking algorithms, which keep only one state at a time). Here the nodes are enqueued
only if their bound (computed by means of an additional function) is promising, with the highest bound
given the highest priority. The implementation utilises a simple while loop [while (Queue not empty)] to
obtain the result. The optimal solution is computed as in the previous algorithms.
- Dynamic Programming – here a two-dimensional dynamic programming array is used (of size number of
items * weight limit). It stores solutions to all sub-problems that may constitute an optimal solution. The
algorithm starts by filling the first row and column with zeros and then fills the rest of the array until it finds
the optimal solution, as in other dynamic programming algorithms. Filling the array is realised by means of
two nested loops and two if statements. The optimal profit is then the last (bottom right) entry in the array,
and the corresponding weight is extracted by a single while loop.
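The filling rule described for the dynamic programming approach is the standard 0/1 knapsack recurrence. As a minimal, self-contained sketch (simplified names, not the report's actual test class):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Item { int value; int weight; };

// P[i][w] = best profit using the first i items with capacity w:
//   P[i][w] = max(P[i-1][w], value_i + P[i-1][w - weight_i])  if weight_i <= w
//   P[i][w] = P[i-1][w]                                       otherwise
int knapsackDP(const std::vector<Item>& items, int weightLimit)
{
    std::vector<std::vector<int>> P(items.size() + 1,
                                    std::vector<int>(weightLimit + 1, 0));
    for (std::size_t i = 1; i <= items.size(); i++)
        for (int w = 0; w <= weightLimit; w++) {
            P[i][w] = P[i - 1][w];                      // skip item i
            if (items[i - 1].weight <= w)               // or take it, if it fits
                P[i][w] = std::max(P[i][w],
                    items[i - 1].value + P[i - 1][w - items[i - 1].weight]);
        }
    return P[items.size()][weightLimit];                // bottom-right entry
}
```

For example, three items with values 60, 100, 120 and weights 10, 20, 30 under a weight limit of 50 yield an optimal profit of 220 (the second and third items).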
The implementation of the main functions representing these algorithms is presented in the appendices. The
pseudo-code that served as a basis for the implementation of the Backtracking and Best-First Branch and Bound
algorithms can be found in [1].
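For reference, the promising test used by the backtracking routine can be sketched as follows. This is a hypothetical helper following the greedy fractional bound described in [1], not the program's actual function (names and signatures are assumptions):

```cpp
#include <cassert>
#include <vector>

struct Item { int value; int weight; };  // assumed sorted by value/weight, descending

// Greedy fractional bound: the profit reachable from this node if the
// remaining items could be taken fractionally. It never underestimates,
// so a node whose bound cannot beat the best profit so far is a dead end.
double bound(const std::vector<Item>& items, int weightLimit,
             int index, int profit, int weight)
{
    double b = profit;
    int w = weight;
    for (std::size_t j = index + 1; j < items.size(); j++) {
        if (w + items[j].weight <= weightLimit) {   // whole item still fits
            w += items[j].weight;
            b += items[j].value;
        } else {                                    // take a fraction, then stop
            b += items[j].value * double(weightLimit - w) / items[j].weight;
            break;
        }
    }
    return b;
}

// A node is promising if it fits and its bound can still beat the best profit.
bool promising(const std::vector<Item>& items, int weightLimit,
               int index, int profit, int weight, int bestSoFar)
{
    return weight <= weightLimit
        && bound(items, weightLimit, index, profit, weight) > bestSoFar;
}
```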
Objects
The program defines several objects:
- item – a structure for storing item information (value, weight, value/weight ratio). Items are further stored in
an Item_Array,
- node – a class defining a State Space tree node for the Backtracking and Best-First Branch and Bound
algorithms. It defines variables such as Profit, Weight, Level and Bound, together with functions to set and
extract these values. Additionally, it overloads the relational operators for queueing purposes,
- test – the main object containing the array of items and the implementation of all four algorithms, together
with their supporting functions and other auxiliary functions (loading data, sorting, etc.),
- generator – a class defining the data set generator,
- userInterface – a class defining the user interface (menus, timer).
Sorting
Algorithms other than Brute Force require their input (item) array to be sorted according to the value/weight
ratio of the items. I implemented the Quick Sort algorithm for this task. Sorting is done before the run-time
measurement starts, hence it does not influence the overall performance measurement.
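The ratio ordering can be expressed directly as a comparison predicate. A small sketch (using std::sort for brevity, whereas the report implements Quick Sort by hand):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Item { int value; int weight; };

// Sort by value/weight ratio, descending. Cross-multiplying avoids
// floating-point division: a.value/a.weight > b.value/b.weight is
// equivalent to a.value*b.weight > b.value*a.weight for positive weights.
void sortByRatio(std::vector<Item>& items)
{
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return a.value * b.weight > b.value * a.weight;
    });
}
```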
Data sets:
The generator mechanism included in the program produces a data file of the following format:
Number_Of_Items
Item_Number Item_Value Item_Weight (one per line)
Weight_Limit
For example, a generated file with 10 items and a weight limit of 145:
10
1 12 43
2 8 48
3 62 46
4 2 11
5 7 48
6 87 97
7 27 5
8 77 59
9 24 85
10 60 6
145
The generator allows specifying the size of the sample and the range of values. Additionally, the data can be
either strongly correlated or uncorrelated.
Measurements:
All algorithms were tested using the same sets of data (generated with parameters: number of items n, range of
values 100, type of data 1/3, seed 300). The number of items (5, 10, 15, 20, 30, 40, 80, 160) and the type of data
(uncorrelated/strongly correlated) were variable. I decided to expand the range of the number of items due to the
fact that the Brute Force algorithm was not working for inputs with n > 30 (my decimal-to-binary routine did not
work with data types larger than long).
All tests (excluding Brute Force for n>10) were repeated 100 000 times to obtain better accuracy.
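The n ≤ 30 limitation stems from converting the loop counter to an explicit binary representation. Testing bits of the counter directly sidesteps the conversion and, with a 64-bit type, works for up to 63 items. This is a hypothetical helper, not the program's routine:

```cpp
#include <cassert>
#include <cstdint>

struct Totals { int profit; int weight; };

// Evaluate one combination: bit i of the counter decides whether item i
// is included. No decimal-to-binary conversion is needed.
Totals evaluate(std::uint64_t mask, const int values[], const int weights[], int n)
{
    Totals t = {0, 0};
    for (int i = 0; i < n; i++)
        if (mask & (std::uint64_t(1) << i)) {   // bit i set -> take item i
            t.profit += values[i];
            t.weight += weights[i];
        }
    return t;
}
```

The brute-force loop then simply runs the counter from 1 to 2^n - 1 and calls this evaluator for each value.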
Metrics:
- Run-time – calculated for every algorithm; measured in seconds using a timer implemented in the class
userInterface (it stores the system time at the start and at the end of running the algorithm and then computes
the difference),
- Number of expanded nodes/combinations – a metric showing how many nodes/combinations an algorithm
has to check in order to find the solution. Used with the Backtracking, Best-First Branch and Bound and
Brute Force algorithms,
- Queue size – the maximum priority queue size reached during the execution of the Best-First Branch and
Bound algorithm. A memory utilisation metric,
- Number of recursive calls – the maximum number of recursive calls not yet finished during the execution of
the Backtracking algorithm. A memory utilisation metric,
- Dynamic Programming array size – a memory utilisation metric.
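The run-time metric can be sketched with the standard clock. This is an assumption about the mechanism (the report's userInterface timer stores the system time before and after the run), not its actual code:

```cpp
#include <chrono>

// Measure the wall-clock run-time of an algorithm in seconds:
// record the clock before and after the call, return the difference.
template <typename F>
double measureRunTime(F&& runAlgorithm)
{
    auto start = std::chrono::steady_clock::now();
    runAlgorithm();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Repeating the measured run many times (as done here, 100 000 repetitions) and dividing the total accordingly reduces the impact of timer resolution on short runs.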
Results
Performance results:
The performance results obtained after running tests for each algorithm with uncorrelated data are shown in
Table 1. Additionally, the solutions to the 0/1 Knapsack problem for the given uncorrelated data sets are
presented in Table 2.
Best run-time performance results are indicated in Table 1 with bold face.
Table 1. Performance comparison of various 0/1 Knapsack algorithms (uncorrelated data).
Table 2. Optimal profit/weight results for different numbers of items (uncorrelated data).
The graphical representation of the run-time results for uncorrelated data is shown in Figure 1. Since the
differences are significant, the graph is plotted using a semi-logarithmic scale.
[Figure 1. Run-time of all four algorithms for uncorrelated data (semi-logarithmic scale); y-axis: execution time [s]. Legend: --- Brute Force, - - Dynamic Programming, -.- Best-First Branch and Bound, ... Backtracking.]
The performance results obtained after running tests for each algorithm with strongly correlated data are
shown in Table 3. Additionally, the solutions to the 0/1 Knapsack problem for the given strongly correlated data
sets are presented in Table 4.
Best run-time performance results are indicated in Table 3 with bold face.
Table 3. Performance comparison of various 0/1 Knapsack algorithms (strongly correlated data).
Table 4. Optimal profit/weight results for different numbers of items (strongly correlated data).
The graphical representation of the run-time results for strongly correlated data is shown in Figure 2. Since the
differences are significant, the graph is plotted using a semi-logarithmic scale.
[Figure 2. Run-time of all four algorithms for strongly correlated data (semi-logarithmic scale); y-axis: execution time [s], x-axis: number of items (20-160).]
- Brute Force
[Figure 3. Uncorrelated data vs. correlated data run-time (Brute Force); normal and semi-logarithmic scale; y-axis: execution time [s] (solid - uncorrelated, dashed - correlated), x-axis: number of items (5-30).]
- Backtracking
[Figure 4. Uncorrelated data vs. correlated data run-time (Backtracking); normal and semi-logarithmic scale; y-axis: execution time [s] (solid - uncorrelated, dashed - correlated), x-axis: number of items (20-160).]
- Best-First Branch and Bound
[Plot: normal and semi-logarithmic scale; y-axis: execution time [s] (solid - uncorrelated, dashed - correlated), x-axis: number of items (20-160).]
Figure 5. Uncorrelated data vs. correlated data run-time (Best-First Branch and Bound).
- Dynamic Programming
[Figure 6. Uncorrelated data vs. correlated data run-time (Dynamic Programming); normal and semi-logarithmic scale; y-axis: execution time [s] (solid - uncorrelated, dashed - correlated).]
Memory utilisation:
The following graphs depict how all four algorithms behave in terms of memory utilisation (and with changing
data type). These graphs provide only a general, qualitative sense of the algorithms' memory demands, since the
metrics I used are very coarse (no actual precise measurement). All graphs are meant only to illustrate the
tendencies, which is enough for this analysis. Also, any quantitative comparison based on these data and graphs
would not be valid, since comparing a number of recursive calls with an array size (without going into details)
makes no sense here.
[Figure 7. Memory utilisation of all four algorithms vs. number of items (20-160), semi-logarithmic scale.]
[Figure 8. Memory utilisation of all four algorithms (semi-logarithmic scale). Legend: --- Brute Force, - - Dynamic Programming, -.- Best-First Branch and Bound, ... Backtracking.]
- Brute Force
[Plot: normal and semi-logarithmic scale; solid - uncorrelated, dashed - correlated; x-axis: number of items (5-30).]
Figure 9. Uncorrelated data vs. correlated data - Memory utilization (Brute Force).
[Plot: normal and semi-logarithmic scale; solid - uncorrelated, dashed - correlated.]
Figure 10. Uncorrelated data vs. correlated data - Memory utilization (Backtracking).
[Plot: normal and semi-logarithmic scale; solid - uncorrelated, dashed - correlated; x-axis: number of items (20-160).]
Figure 11. Uncorrelated data vs. correlated data - Memory utilization (Best-First Branch and Bound).
[Plot: normal and semi-logarithmic scale; solid - uncorrelated, dashed - correlated.]
Figure 12. Uncorrelated data vs. correlated data - Memory utilization (Dynamic Programming).
Discussion
The first and obvious conclusion from the data in Tables 2 and 4 is that all four algorithms indeed provide
exactly the same solution to the 0/1 Knapsack problem. The differences are in performance and resource
requirements.
Dynamic Programming – very good?
As stated before, in some circumstances the Dynamic Programming approach can become the best solution.
Analysing Figures 1 and 2 suggests that both the Backtracking and Best-First Branch and Bound algorithms can
be better than the Dynamic Programming algorithm. However, the Dynamic Programming approach still
provides decent run-time performance [O(number of items * Weight Limit)]. Additionally, the Dynamic
Programming approach has important advantages – similarly to the Brute Force algorithm, both its run-time
performance and memory requirements are independent of the data set (see Figures 6 and 12). Furthermore, its
memory requirements are predictable (meaning that we know the size of the array once we know the input),
which is not the case for the Backtracking and Best-First Branch and Bound approaches (only the worst-case
scenario can be estimated). Hence its performance and requirements can be estimated for any valid set of data.
Now, most real-world applications involve various sets of data – both uncorrelated and strongly correlated. This
is where Dynamic Programming will perform very well, but the cost will be a large amount of memory allocated
for the array.
Summary
The Brute Force algorithm is definitely the worst among all four. It can, however, find some limited application
(for small data sets). Both the Backtracking and Best-First Branch and Bound algorithms are comparable in
terms of performance; however, Backtracking's memory requirements are independent of the input type, which
is an advantage. Finally, the Dynamic Programming approach seems worse at first glance than Backtracking and
Best-First Branch and Bound, but on average (across various data types and ranges) it can be better (especially
than Best-First Branch and Bound). However, this comes at the expense of keeping a fixed, large array in
memory.
Acknowledgements
I would like to acknowledge Mr. David Pisinger for his generator code that I adapted and included in my
program.
References
1. R. E. Neapolitan, K. Naimipour, Foundations of Algorithms Using C++ Pseudocode, Jones and Bartlett
Publishers.
2. T. H. Cormen, C. E. Leiserson, R. L. Rivest, Introduction to Algorithms, McGraw-Hill, 1990.
Appendices
Algorithm implementation:
- Brute Force
void knapsackBruteForce()
//Main brute force 0/1 Knapsack routine
{
//Initialize variables
int Optimal_Profit=0;
int Optimal_Weight=0;
for(long i=1;i<knapsackBruteForce_TwoRaisedTo(Number_Of_Items);i++)
{
//Calculate Profit for a given combination of items
knapsackBruteForce_CalculateProfit(i,0,0,0);
//If obtained profit greater than the one so far...
if(Result_Array[0]>Optimal_Profit)
{
//Remember it
Optimal_Profit=Result_Array[0];
Optimal_Weight=Result_Array[1];
}
}
//Store the final result in the Result Array
Result_Array[0]=Optimal_Profit;
Result_Array[1]=Optimal_Weight;
return;
}
- Backtracking
void knapsackBackTracking(int Index,int Profit,int Weight)
//Backtracking algorithm for 0/1 Knapsack problem
{
//If the result matches requirements - store it
if(Weight<=Weight_Limit && Profit>Result_Array[0])
{
Result_Array[0]=Profit;
Result_Array[1]=Weight;
}
//If next Item is promising...
if(knapsackBackTracking_Promising(Index,Profit,Weight))
{
//...run recursive formulas for 1 (add)
knapsackBackTracking(Index+1,Profit+Item_Array[Index+1].Value,Weight+Item_Array[Index+1].Weight);
//...and 0 (don't add)
knapsackBackTracking(Index+1,Profit,Weight);
}
return;
}
- Best-First Branch and Bound
void knapsackBestFirstBranchAndBound()
//Best-First Branch and Bound algorithm for 0/1 Knapsack problem
//(the opening of this listing is truncated in the source; the lines below
//reconstruct it from the description in the text: initialise the root node,
//enqueue it, then process the queue until it is empty)
{
//Initialize variables
node Node_U,Node_V;
priority_queue<node> My_Queue;
//Initialize root node (level -1, profit 0, weight 0) and enqueue it
Node_V.setLevel(-1);
Node_V.setProfit(0);
Node_V.setWeight(0);
Node_V.setBound(knapsackBound(Node_V));
My_Queue.push(Node_V);
//Main loop - run while the queue is not empty
while(!My_Queue.empty())
{
//Dequeue first item and store it in node V
Node_V=My_Queue.top();
My_Queue.pop();
//If Bound is greater than the profit so far
if(Node_V.returnBound()>Result_Array[0])
{
//Create node U one level down from V (take the next item)
Node_U.setLevel(Node_V.returnLevel()+1);
Node_U.setWeight(Node_V.returnWeight()+Item_Array[Node_U.returnLevel()].Weight);
Node_U.setProfit(Node_V.returnProfit()+Item_Array[Node_U.returnLevel()].Value);
//If node U's weight and profit match requirements
if(Node_U.returnWeight()<=Weight_Limit &&
Node_U.returnProfit()>Result_Array[0])
{
//Store optimal (so far) values
Result_Array[0]=Node_U.returnProfit();
Result_Array[1]=Node_U.returnWeight();
}
//Calculate new bound
Node_U.setBound(knapsackBound(Node_U));
//If new U's bound greater than profit so far...
if(Node_U.returnBound()>Result_Array[0])
{
//...enqueue U
My_Queue.push(Node_U);
}
//Reset node U to V's values (don't take the next item)
Node_U.setWeight(Node_V.returnWeight());
Node_U.setProfit(Node_V.returnProfit());
Node_U.setBound(knapsackBound(Node_U));
//If new U's bound greater than profit so far...enqueue U
if(Node_U.returnBound()>Result_Array[0]) My_Queue.push(Node_U);
if(My_Queue.size()>Info[0]) Info[0]=My_Queue.size();
}
}
return;
}
- Dynamic Programming
void knapsackDynamicProgramming()
//0/1 Knapsack - Dynamic Programming algorithm
{
//Initialize variables
int w,i;
item Temp_Item;
int** DP_Array;
//Array size
Info[0]=Weight_Limit+1;
Info[1]=Number_Of_Items;
//Allocate the array and fill it with zeros
//(row 0 corresponds to "no items taken" and provides the zero first row)
DP_Array=new int*[Number_Of_Items+1];
for(i=0;i<Number_Of_Items+1;i++)
{
DP_Array[i]=new int[Weight_Limit+1];
for(w=0;w<Weight_Limit+1;w++) DP_Array[i][w]=0;
}
//Main loop - row i holds the best profits using the first i items
for(i=1;i<=Number_Of_Items;i++)
{
//Initialize Temp_Item
Temp_Item=Item_Array[i-1];
for(w=0;w<Weight_Limit+1;w++)
{
//Search for the optimal solution...
if(Temp_Item.Weight<=w)
{
//...by going through the array
if(Temp_Item.Value+DP_Array[i-1][w-Temp_Item.Weight]>DP_Array[i-1][w])
{
DP_Array[i][w]=Temp_Item.Value+DP_Array[i-1][w-Temp_Item.Weight];
}
else DP_Array[i][w]=DP_Array[i-1][w];
}
else DP_Array[i][w]=DP_Array[i-1][w];
}
}
//Store obtained optimal profit
Result_Array[0]=DP_Array[Number_Of_Items][Weight_Limit];
//Find the smallest weight at which the optimal profit is reached
w=Weight_Limit;
while(w>0 && DP_Array[Number_Of_Items][w-1]==Result_Array[0]) w--;
//Store optimal weight
Result_Array[1]=w;
//Delete 2D array
for(int j=0;j<Number_Of_Items+1;j++) delete [] DP_Array[j];
delete [] DP_Array;
return;
}