
Performance of Evolutionary Algorithms on NK Landscapes with Nearest Neighbor

Interactions and Tunable Overlap

Martin Pelikan, Kumara Sastry, David E. Goldberg, Martin V. Butz, and Mark Hauschild

MEDAL Report No. 2009002

January 2009

Abstract
This paper presents a class of NK landscapes with nearest-neighbor interactions and tunable overlap. The
considered class of NK landscapes is solvable in polynomial time using dynamic programming; this allows us to
generate a large number of random problem instances with known optima. Several variants of standard genetic
algorithms and estimation of distribution algorithms are then applied to the generated problem instances. The
results are analyzed and related to scalability theory for selectorecombinative genetic algorithms and estimation
of distribution algorithms.

Keywords
NK fitness landscape, hierarchical BOA, genetic algorithm, univariate marginal distribution algorithm, performance
analysis, scalability, crossover, hybridization.

Missouri Estimation of Distribution Algorithms Laboratory (MEDAL)


Department of Mathematics and Computer Science
University of Missouri–St. Louis
One University Blvd., St. Louis, MO 63121
E-mail: medal@cs.umsl.edu
WWW: http://medal.cs.umsl.edu/

1 Introduction
Testing on random instances of challenging problem classes is an important approach to evaluating
optimization techniques. NK fitness landscapes (Kauffman, 1989; Kauffman, 1993) were introduced
by Kauffman as tunable models of rugged fitness landscapes and are one of the most popular classes of
problems used in testing optimization techniques on random problem instances. An NK landscape is
a function defined on binary strings of fixed length and is characterized by two parameters: (1) n for
the overall number of bits and (2) k for the neighborhood size. For each bit, k neighbors are specified
and a function is given that determines the fitness contribution for any combination of values of the
bit and its neighbors. In general, neighbors for each bit can be chosen arbitrarily. NK landscapes
are NP-complete for k > 1, although some variants of NK landscapes are polynomially solvable
and there exist approximation algorithms for other cases (Wright, Thompson, & Zhang, 2000; Gao
& Culberson, 2002; Choi, Jung, & Kim, 2005). Nonetheless, NK landscapes remain a challenge for
any optimization algorithm and they are also interesting from the perspective of complexity theory
and computational biology (Kauffman, 1993; Altenberg, 1997; Wright, Thompson, & Zhang, 2000;
Gao & Culberson, 2002; Aguirre & Tanaka, 2003; Choi, Jung, & Kim, 2005). Despite that, only a few
in-depth empirical studies discuss the performance of selectorecombinative genetic algorithms
and estimation of distribution algorithms on NK landscapes and that relate empirical results to
existing scalability theory.
The purpose of this paper is to propose a class of NK landscapes with nearest-neighbor in-
teractions and tunable overlap, and present an in-depth empirical performance analysis of various
genetic and evolutionary algorithms on the proposed class of NK landscapes. The paper considers
the genetic algorithm with two-point and uniform crossover, the univariate marginal distribution
algorithm, and the hierarchical Bayesian optimization algorithm. All algorithms are improved by
incorporating a simple deterministic local search based on single-bit flips. To provide insight into
the effects of overlap between subproblems on algorithm performance, the number of bits that over-
lap between consequent subproblems is controlled by a user-specified parameter. All considered
types of NK landscape instances are challenging yet solvable in polynomial time using dynamic
programming; this allows us to consider a large number of random instances with known optima
in practical time. The results are related to existing scalability theory and interesting directions
for future research are outlined. The work presented in this paper combines and extends some
of the facets of two previous studies on similar problems, specifically, the analysis of evolutionary
algorithms on random additively decomposable problems (Pelikan, Sastry, Butz, & Goldberg, 2006;
Sastry, Pelikan, & Goldberg, 2007) and the analysis of evolutionary algorithms on standard NK
landscapes (Pelikan, Sastry, Butz, & Goldberg, 2008).
The paper starts by describing NK landscapes in section 2. Section 3 outlines the dynamic
programming algorithm which can find guaranteed global optima of the considered instances in
polynomial time. Section 4 outlines the compared algorithms. Section 5 presents experimental
results. Section 6 discusses future work. Finally, section 7 summarizes and concludes the paper.

2 NK Landscapes
This section describes NK landscapes. First, approaches to testing evolutionary algorithms are
briefly discussed. The general form of NK landscapes is then described. Next, the classes of NK
landscape instances considered in this paper are discussed and the method used to generate random
problem instances of these classes is outlined.

2.1 Testing Evolutionary Algorithms


There are three basic approaches to testing optimization techniques:

(1) Testing on the boundary of the design envelope using artificial, adversarial test problems. For
example, fully deceptive concatenated traps (Ackley, 1987; Deb & Goldberg, 1991) represent
a class of artificial test problems that can be used to test whether the optimization algorithm
can automatically decompose the problem and exploit the discovered decomposition effectively.
Testing on artificial problems on the boundary of the design envelope is also a common practice
outside standard optimization; for example, consider testing car safety using car crash tests or
testing durability of cell phones using drop tests.

(2) Testing on classes of random problems. For example, to test algorithms for solving maximum
satisfiability (MAXSAT) problems, large sets of random formulas in conjunctive normal form
can be generated and analyzed (Cheeseman, Kanefsky, & Taylor, 1991). Similar approaches to
testing are common outside standard optimization as well; for example, new software products
are often tested by groups of beta testers in order to discover all problems in situations that
were not expected during the testing on the boundary of the design envelope.

(3) Testing on real-world problems or their approximations. For example, the problem of designing
military antennas can be considered for testing (Santarelli, Goldberg, & Yu, 2004). As an
example outside optimization, when a new model of an airplane has been designed and thor-
oughly tested on the boundary of its design envelope, it is ready to take its real-world test—the
first actual flight.

In this paper, we focus on testing on random classes of problems. More specifically, we
consider random instances of a restricted class of NK landscapes with nearest-neighbor interactions
and tunable overlap. All considered instances are solvable in low-order polynomial time with a
dynamic programming algorithm. This allows us to generate a large number of random problem
instances with known optima in practical time. The main focus is on simple hybrids based on
standard selectorecombinative genetic algorithms, estimation of distribution algorithms, and the
deterministic hill climber based on single-bit flips.
The general form of NK landscapes is described next. Then, the considered class of NK land-
scapes and the procedure for generating random problem instances are outlined.

2.2 Problem Definition


An NK fitness landscape (Kauffman, 1989; Kauffman, 1993) is fully defined by the following com-
ponents:

• The number of bits, n.

• The number of neighbors per bit, k.

• A set of k neighbors Π(Xi ) for the i-th bit, Xi , for every i ∈ {0, . . . , n − 1}.

• A subfunction fi defining a real value for each combination of values of Xi and Π(Xi ) for
every i ∈ {0, . . . , n − 1}. Typically, each subfunction is defined as a lookup table with 2^{k+1}
values.

The objective function fnk to maximize is defined as

$$f_{nk}(X_0, X_1, \ldots, X_{n-1}) = \sum_{i=0}^{n-1} f_i(X_i, \Pi(X_i)).$$

The difficulty of optimizing NK landscapes depends on all of the four components defining an
NK problem instance. One useful approach to analyzing complexity of NK landscapes is to focus on
the influence of k on problem complexity. For k = 0, NK landscapes are simple unimodal functions
similar to onemax or binint, which can be solved in linear time and should be easy for practically
any genetic and evolutionary algorithm. The global optimum of NK landscapes can be obtained in
polynomial time (Wright, Thompson, & Zhang, 2000) even for k = 1; on the other hand, for k > 1,
the problem of finding the global optimum of unrestricted NK landscapes is NP-complete (Wright
et al., 2000). The problem becomes polynomially solvable with dynamic programming even for
k > 1 if the neighbors are restricted to only adjacent string positions (Wright et al., 2000) or
if the subfunctions are generated according to some distributions (Gao & Culberson, 2002). For
unrestricted NK landscapes with k > 1, a polynomial-time approximation algorithm exists with
the approximation threshold 1 − 1/2^{k+1} (Wright et al., 2000).

2.3 NK Instances with Nearest Neighbors and Tunable Overlap


In this paper we consider NK instances with the following two restrictions:

1. Neighbors of each bit are restricted to the k bits that immediately follow this bit. When there
are fewer than k bits left to the right of the considered bit, the neighborhood is restricted to
contain all the bits to the right of the considered bit.

2. Some subproblems may be excluded to provide a mechanism for tuning the size of the overlap
between consecutive subproblems. Specifically, the fitness is defined as

$$f_{nk}(X_0, X_1, \ldots, X_{n-1}) = \sum_{i=0}^{\lfloor (n-1)/\mathit{step} \rfloor} f_i(X_{i \cdot \mathit{step}}, \Pi(X_{i \cdot \mathit{step}})),$$

where step ∈ {1, 2, . . . , k + 1} is a parameter denoting the step with which the basis bits are
selected. For standard NK landscapes, step = 1. With larger values of step, the amount
of overlap between consecutive subproblems can be reduced. For step = k + 1, the problem
becomes separable (the subproblems are fully independent).

The reason for restricting neighborhoods to nearest neighbors was to ensure that the problem
instances can be solved in polynomial time even for k > 1 using a simple dynamic programming
algorithm. The main motivation for introducing the step parameter was to provide a mechanism for
tuning the strength of the overlap between different subproblems. The resulting class of problems is
a subset of standard, unrestricted NK landscapes (Kauffman, 1989; Kauffman, 1993). Furthermore,
the resulting instances are a superset of the polynomially solvable random additively decomposable
problems introduced in Pelikan, Sastry, Butz, and Goldberg (2006).
The subfunctions in the considered class of NK landscapes are encoded as look-up tables; thus,
the subfunctions can be defined arbitrarily.
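To make the definition concrete, the following Python sketch (ours, not the authors' code; all names are hypothetical) evaluates an instance of this restricted class, assuming the subfunctions are given as lookup tables indexed by the packed bits of each subproblem:

    def neighborhood(i, n, k, step):
        """Positions covered by the i-th subproblem: the basis bit i*step plus
        its k (or fewer, near the right end) immediately following neighbors."""
        basis = i * step
        return list(range(basis, min(basis + k + 1, n)))

    def evaluate(bits, tables, n, k, step):
        """Sum the lookup-table contributions of all subproblems."""
        total = 0.0
        for i, table in enumerate(tables):
            index = 0
            for pos in neighborhood(i, n, k, step):
                index = (index << 1) | bits[pos]  # pack the subproblem's bits
            total += table[index]
        return total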

2.4 Generating Random Instances


The overall number of subfunctions and the set of neighbors for each of these subfunctions are
fully specified by parameters n, k, and step. The only components that vary from instance to
instance for any valid combination of values of n, k, and step are the subfunctions themselves and
the encoding, as described below.
The lookup table for all possible instantiations of bits in each subfunction is generated randomly
using the same distribution for each entry in the table. Each of the values is generated using the
uniform distribution over interval [0, 1).
To make the instances more challenging, string positions in each instance are shuffled randomly.
This is done by reordering string positions according to a randomly generated permutation using
the uniform distribution over all permutations.
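A matching instance generator might look as follows (again an illustrative sketch, reusing the hypothetical neighborhood and evaluate helpers from the previous listing); a shuffled candidate is evaluated by first mapping its bits back to the original order through the stored permutation:

    import random

    def generate_instance(n, k, step, rng=random.Random()):
        """Random lookup tables (entries uniform over [0, 1)) plus a random
        permutation of string positions that hides the tight linkage."""
        m = (n - 1) // step + 1  # number of subproblems
        tables = []
        for i in range(m):
            size = len(neighborhood(i, n, k, step))
            tables.append([rng.random() for _ in range(2 ** size)])
        perm = list(range(n))
        rng.shuffle(perm)
        return tables, perm

    def evaluate_shuffled(candidate, tables, perm, n, k, step):
        """Position j of the unshuffled string is stored at position perm[j]."""
        bits = [candidate[perm[j]] for j in range(n)]
        return evaluate(bits, tables, n, k, step)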
The following section describes the dynamic programming approach which can be used to find
guaranteed optima for the aforementioned class of NK instances in polynomial time.

3 Dynamic Programming for Nearest-Neighbor NK Landscapes


The dynamic programming algorithm used to solve the described class of NK landscape instances
is based on Pelikan et al. (2006). It uses the knowledge of the location of subproblems and the
permutation imposed on the string positions, and considers subproblems in order from left to right
according to the original permutation of string positions before shuffling. For example, consider the
problem with n = 7, k = 2, and step = 2, which contains 4 subproblems defined in the following
subsets of positions (according to the original permutation of the string positions): {0, 1, 2} for the
subproblem f0 , {2, 3, 4} for f1 , {4, 5, 6} for f2 , and {6} for f3 . The dynamic programming algorithm
processes the subproblems in the following order: (f0 , f1 , f2 , f3 ). For each subproblem, the optimal
fitness contribution of this and the previous subproblems is computed for any combination of bits
that overlap with the next subproblem to the right. The global optimum is then given by the
computed fitness contribution of the last subproblem on the right.
Denoting by o = k − step + 1 the maximum number of bits in which the subproblems overlap and
by m the overall number of subproblems, the dynamic programming algorithm starts by creating a
matrix G = (gi,j) of size m × 2^o. The element gi,j for i ∈ {0, 1, . . . , m − 1} and j ∈ {0, 1, . . . , 2^o − 1}
encodes the maximum fitness contribution of the first (i + 1) subproblems where the o (or fewer)
bits that overlap with the next subproblem to the right are equal to j using integer representation
for these o bits. The last few subproblems may overlap with the next subproblem in fewer than o
bits; that is why some entries of the matrix G remain unused. For example, for the above
example problem with n = 7, k = 2, and step = 2, g1,0 represents the best fitness contribution of f0
and f1 (ignoring f2 and f3 ) under the assumption that the 5th bit is 0; analogously, g1,1 represents
the best fitness contribution of f0 and f1 under the assumption that the 5th bit is 1.
The algorithm starts by considering all 2^{k+1} instances of the k + 1 bits in the first subproblem,
and records the best found fitness for each combination of values of the o (or fewer) bits that
overlap with the second subproblem; the resulting values are stored in the first row of G (elements
g0,j ). Then, the algorithm goes through all the remaining subproblems from left to right. For
the subproblem fi, all 2^{k+1} instances of the k + 1 bits in this subproblem are examined; the only
exception may be the right-most subproblems, for which the neighborhood may be restricted due
to the fixed string length. For each instance, the algorithm first looks at the column j ′ of G that
corresponds to the o (or fewer) bits of the subproblem fi that overlap with the previous subproblem
fi−1 . The fitness contribution is computed as the sum of gi−1,j ′ and the fitness contribution of
the considered instance of fi . For each possible instantiation of bits that overlap with the next
subproblem, the optimum fitness contribution is recorded in matrix G, forming the next row (that
is, row i) of the matrix.
After processing all subproblems, the value of the global optimum is equal to the fitness contri-
bution stored in the first element of the last row of G. The values that lead to the optimum fitness
can be found by examining all choices made when choosing the best combination of bits in each
subproblem.
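The following sketch implements this procedure with the matrix G replaced by a dictionary keyed directly on the values of the overlap bits; it returns only the optimal fitness value for brevity, reuses the hypothetical neighborhood helper from section 2.3, and is our illustration rather than the authors' code:

    def dp_optimum(tables, n, k, step):
        """Optimal fitness of a nearest-neighbor instance by dynamic programming."""
        m = len(tables)
        nbs = [neighborhood(i, n, k, step) for i in range(m)]
        prev = {(): 0.0}  # best contribution so far, keyed by overlap bits
        for i in range(m):
            ov_prev = [p for p in nbs[i] if i > 0 and p in nbs[i - 1]]
            ov_next = [p for p in nbs[i] if i + 1 < m and p in nbs[i + 1]]
            best = {}
            for a in range(2 ** len(nbs[i])):  # all settings of this subproblem
                bits = {p: (a >> t) & 1 for t, p in enumerate(nbs[i])}
                key_prev = tuple(bits[p] for p in ov_prev)
                index = 0
                for p in nbs[i]:
                    index = (index << 1) | bits[p]
                value = prev[key_prev] + tables[i][index]
                key_next = tuple(bits[p] for p in ov_next)
                if value > best.get(key_next, float('-inf')):
                    best[key_next] = value
            prev = best
        return prev[()]  # the last subproblem has an empty overlap key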

4 Compared Algorithms
This section outlines the optimization algorithms discussed in this paper: (1) the genetic algo-
rithm (GA) (Holland, 1975; Goldberg, 1989), (2) the univariate marginal distribution algorithm
(UMDA) (Mühlenbein & Paaß, 1996), and (3) the hierarchical Bayesian optimization algorithm
(hBOA) (Pelikan & Goldberg, 2001; Pelikan, 2005). Additionally, the section describes the deter-
ministic hill climber (DHC) (Pelikan & Goldberg, 2003), which is incorporated into all compared
algorithms to improve their performance. In all compared algorithms, candidate solutions are
represented by binary strings of n bits.
The genetic algorithm (GA) (Holland, 1975; Goldberg, 1989) evolves a population of candidate
solutions with the first population generated at random according to the uniform distribution
over all binary strings. Each iteration starts by selecting promising solutions from the current
population; we use binary tournament selection without replacement. New solutions are created by
applying variation operators to the population of selected solutions. Specifically, crossover is used
to exchange bits and pieces between pairs of candidate solutions and mutation is used to perturb the
resulting solutions. Here we use uniform or two-point crossover, and bit-flip mutation (Goldberg,
1989). To maintain useful diversity in the population, the new candidate solutions are incorporated
into the original population using restricted tournament selection (RTS) (Harik, 1995). The run is
terminated when termination criteria are met. In this paper, each run is terminated either when
the global optimum has been found or when a maximum number of iterations has been reached.
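A sketch of how RTS incorporates a single offspring (our hedged reading of Harik (1995); names are hypothetical): the offspring competes against the most similar of w randomly sampled individuals and replaces it only if it is better.

    import random

    def hamming(a, b):
        """Number of positions in which two binary strings differ."""
        return sum(x != y for x, y in zip(a, b))

    def rts_insert(population, fitnesses, offspring, f_off, w,
                   rng=random.Random()):
        """Replace the closest of w randomly drawn individuals if it is worse."""
        sample = rng.sample(range(len(population)), w)
        closest = min(sample, key=lambda j: hamming(offspring, population[j]))
        if f_off > fitnesses[closest]:
            population[closest] = offspring
            fitnesses[closest] = f_off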
The univariate marginal distribution algorithm (UMDA) (Mühlenbein & Paaß, 1996) proceeds
similarly as GA. However, instead of using crossover and mutation to create new candidate solu-
tions, UMDA learns a probability vector (Juels, 1998; Baluja, 1994) for the selected solutions and
generates new candidate solutions from this probability vector. The probability vector stores the
proportion of 1s in each position of the selected population. Each bit of a new candidate solution
is set to 1 with the probability equal to the proportion of 1s in this position; otherwise, the bit is
set to 0. Consequently, the variation operator of UMDA preserves the proportions of 1s in each
position while decorrelating different string positions.
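A single UMDA model-and-sample step can be sketched in a few lines (an illustration of the description above, not the authors' code):

    import random

    def umda_sample(selected, count, rng=random.Random()):
        """Learn the probability vector from the selected solutions and
        sample 'count' new candidate solutions from it."""
        n = len(selected[0])
        p = [sum(s[i] for s in selected) / len(selected) for i in range(n)]
        return [[1 if rng.random() < p[i] else 0 for i in range(n)]
                for _ in range(count)]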
The hierarchical Bayesian optimization algorithm (hBOA) (Pelikan & Goldberg, 2001; Pelikan,
2005) proceeds similarly as UMDA. However, to model promising solutions and generate new
candidate solutions, Bayesian networks with local structures (Chickering, Heckerman, & Meek,
1997; Friedman & Goldszmidt, 1999) are used instead of the simple probability vector of UMDA.
The deterministic hill climber (DHC) is incorporated into GA, UMDA and hBOA to improve
their performance. DHC takes a candidate solution represented by an n-bit binary string as input.
Then, it performs one-bit changes on the solution that lead to the maximum improvement of
solution quality. DHC is terminated when no single-bit flip improves solution quality and the
solution is thus locally optimal. Here, DHC is used to improve every solution in the population
before the evaluation is performed.
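A minimal sketch of DHC consistent with this description (steepest-ascent single-bit flips; the fitness function f is assumed to be given):

    def dhc(bits, f):
        """Repeatedly apply the single-bit flip that improves fitness the
        most; stop once the solution is locally optimal."""
        current = f(bits)
        while True:
            best_pos, best_val = -1, current
            for i in range(len(bits)):
                bits[i] ^= 1              # try flipping bit i
                value = f(bits)
                bits[i] ^= 1              # undo the flip
                if value > best_val:
                    best_pos, best_val = i, value
            if best_pos < 0:              # no single-bit flip improves
                return bits
            bits[best_pos] ^= 1           # commit the best flip
            current = best_val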

5 Experiments
This section describes experiments and presents experimental results. First, problem instances and
the experimental setup are discussed. Next, the analysis of hBOA, UMDA and several GA variants
is presented. Finally, all algorithms are compared and the results of the comparisons are discussed.

5.1 Problem Instances


The parameters n, k, and step were set as follows: n ∈ {20, 30, 40, 50, 60, 70, 80, 90, 100, 120},
k ∈ {2, 3, 4, 5}, and step ∈ {1, 2, . . . , k + 1}. For each combination of n, k, and step, we generated
10,000 random problem instances. Then, we applied GA, UMDA and hBOA to each of these
instances and collected empirical results, which were subsequently analyzed. That means that
overall 180,000 unique problem instances were generated and all of them were tested with every
algorithm included in this study.
For UMDA, the largest instances were infeasible even with extremely large population sizes of more
than 10^6; that is why some problem sizes are excluded for this algorithm and the main focus is on
GA and hBOA.

5.2 Compared Algorithms


The following list summarizes the algorithms included in this study:

(i) Hierarchical BOA (hBOA).

(ii) Univariate marginal distribution algorithm (UMDA).

(iii) Genetic algorithm with uniform crossover and bit-flip mutation.

(iv) Genetic algorithm with two-point crossover and bit-flip mutation.

5.3 Experimental Setup


To select promising solutions, binary tournament selection without replacement is used. New
solutions (offspring) are incorporated into the old population using RTS with window size w =
min{n, N/5} as suggested in Pelikan (2005). In hBOA, Bayesian networks with decision trees (Chick-
ering et al., 1997; Friedman & Goldszmidt, 1999; Pelikan, 2005) are used and the models are
evaluated using the Bayesian-Dirichlet metric with likelihood equivalence (Heckerman et al., 1994;
Chickering et al., 1997) and a penalty for model complexity (Friedman & Goldszmidt, 1999; Pelikan,
2005). All GA variants use bit-flip mutation with the probability of flipping each bit pm = 1/n. Two
common crossover operators are considered in a GA: two-point and uniform crossover. For both
crossover operators, the probability of applying crossover is set to 0.6. A stochastic hill climber
with bit-flip mutation was also considered in the initial stage, but its performance was far inferior
to that of every other algorithm included in the comparison and most problem instances were
intractable for it; that is why the results for this algorithm are omitted.
For each problem instance and each algorithm, an adequate population size is approximated with
the bisection method (Sastry, 2001; Pelikan, 2005); here, the bisection method finds an adequate
population size to find the optimum in 10 out of 10 independent runs. Each run is terminated when
the global optimum has been found (success) or when the maximum number of generations n is
reached without finding the optimum (failure). The results for each problem instance comprise
the following statistics: (1) the population size, (2) the number of iterations (generations), (3) the
number of evaluations, and (4) the number of flips of DHC. The most important statistic relating to
the overall complexity of each algorithm is the number of DHC flips, since this statistic subsumes
the other statistics and can be compared consistently regardless of the algorithm used. That
is why we focus on presenting the results with respect to the overall number of DHC flips until the
optimum has been found.
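One common form of the bisection method is sketched below (our reading of the cited approach; the success predicate and the roughly 10% stopping tolerance are assumptions):

    def bisect_population_size(succeeds, start=10):
        """Near-minimal population size N with succeeds(N) True, where
        succeeds(N) runs the algorithm 10 times with population size N and
        reports whether all runs found the global optimum."""
        high = start
        while not succeeds(high):              # double until sufficient
            high *= 2
        low = high // 2
        while high - low > max(1, low // 10):  # narrow the bracket
            mid = (low + high) // 2
            if succeeds(mid):
                high = mid
            else:
                low = mid
        return high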
For each combination of values of n, k and step, all observed statistics were averaged over the
10,000 random instances. Since for each instance, 10 successful runs were performed, for each n, k
and step and each algorithm the results are averaged over the 100,000 successful runs. Overall, for
all algorithms except for UMDA, the results correspond to 1,800,000 successful runs on a total of
180,000 unique problem instances.

5.4 Initial Performance Analysis


The number of DHC flips until optimum is shown in figure 1 for hBOA, figure 2 for UMDA,
figure 3 for GA with uniform crossover, and figure 4 for GA with two-point crossover. The number
of evaluations for k = 2 and k = 5 with step = 1 is shown in figure 5; for brevity, we omit analogous
results for other values of k and step.
There are three main observations that can be made from these results. First of all, for hBOA,
the number of DHC flips and the number of evaluations both appear to be upper bounded by
a low-order polynomial for all values of k and step. However, for GA with both crossover operators,
for larger values of k the growth of the number of DHC flips and the number of evaluations appears
to be worse than polynomial with respect to n and it can be expected that the results will get
even worse for k > 5. The worst performance is obtained with UMDA, for which the growth of the
number of evaluations and the number of DHC flips appears to be faster than polynomial for all
values of k and step.

[Figure 1: Average number of flips for hBOA. Log-scale plots of the number of DHC flips versus problem size, one panel per k ∈ {2, 3, 4, 5}, one curve per step.]

To visualize the effects of k on performance of all compared algorithms, figure 6 shows the
growth of the number of DHC flips with k for hBOA and GA on problems of size n = 120; the
results for UMDA are not included, because UMDA was incapable of solving many instances of
this size in practical time. Two cases are considered: (1) step = 1, corresponding to standard
NK landscapes and (2) step = k + 1, corresponding to the separable problem with no interactions
between the different subproblems. For both cases, the vertical axis is shown in log-scale to support
the hypothesis that the time complexity of selectorecombinative genetic algorithms should grow
exponentially fast with the order of problem decomposition even when recombination is capable
of identifying and processing the subproblems in an adequate problem decomposition. The results
confirm this hypothesis—indeed, the number of flips for all algorithms appears to grow at least
exponentially fast with k, regardless of the value of the step parameter.

5.5 Comparison of All Algorithms


How do the different algorithms compare in terms of performance? While it is difficult to compare
the exact running times due to the variety of computer hardware used and the accuracy of time
measurements, we can easily compare other recorded statistics, such as the number of DHC flips
or the number of evaluations until optimum. The main focus is again on the number of DHC
flips because for each fitness evaluation at least one flip is typically performed and that is why the
number of flips is expected to be greater than or equal to both the number of evaluations and
the product of the population size and the number of generations.
One of the most straightforward approaches to quantify relative performance of two algorithms
is to compute the ratio of the number of DHC flips (or some other statistic) for each problem
instance. The mean and other moments of the empirical distribution of these ratios can then be
estimated for different problem sizes and problem types. The results can then be used to better

[Figure 2: Average number of flips for UMDA. Log-scale plots of the number of DHC flips versus problem size, one panel per k ∈ {2, 3, 4, 5}, one curve per step.]

understand how the differences between the compared algorithms change with problem size or other
problem-related parameters. This approach has been used for example in Pelikan et al. (2006) and
Pelikan et al. (2008).
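For example, given per-instance flip counts of two algorithms on the same instances, the ratio statistics could be computed as follows (illustrative only):

    def ratio_statistics(flips_a, flips_b):
        """Mean and sample variance of per-instance ratios a/b."""
        ratios = [a / b for a, b in zip(flips_a, flips_b)]
        mean = sum(ratios) / len(ratios)
        var = sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1)
        return mean, var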
The mean ratio of the number of DHC flips until optimum for GA with uniform crossover and
that for hBOA is shown in figure 7. The ratio encodes the multiplicative factor by which hBOA
outperforms GA with uniform crossover. When the ratio is greater than 1, hBOA outperforms GA
with uniform crossover by this factor; when the ratio is smaller than 1, GA with uniform crossover
outperforms hBOA.
The results show that when measuring performance by the number of DHC flips, for k = 2,
hBOA is outperformed by GA with uniform crossover on the entire range of tested instances.
However, even for k = 2 the ratio grows with problem size and the situation can thus be expected
to change for bigger problems. For larger values of k, hBOA clearly outperforms GA with uniform
crossover and for the largest values of n and k, that is, for n = 120 and k = 5, the ratio for the
number of flips required is more than 3 regardless of step; that means that for the largest values of n
and k, hBOA requires on average less than a third of the flips to solve the same problem. Even more
importantly, the ratio appears to grow faster than polynomially, indicating that the differences will
become much more substantial as the problem size increases.
The ratio between the number of DHC flips until optimum for GA with two-point crossover and
that for hBOA is shown in figure 8. The results show that for two-point crossover, the differences
between hBOA and GA are even more substantial than for uniform crossover. This is supported
by the results presented in figure 9, which compares the two crossover operators used in GA; uniform
crossover clearly outperforms two-point crossover for all tested problem instances.
In summary, the performance of hBOA is substantially better than that of other compared
algorithms, especially for the most difficult problem instances with large neighborhood sizes. These
results are highlighted in tables 1 and 2, which show the average number of DHC flips and evalua-

[Figure 3: Average number of flips for GA with uniform crossover. Log-scale plots of the number of DHC flips versus problem size, one panel per k ∈ {2, 3, 4, 5}, one curve per step.]

tions until optimum required by hBOA and GA for k = 5 and n = 120.


Another interesting question is whether the instances that are difficult for one algorithm are
also difficult for other algorithms included in the comparison. Figures 10 and 11 visualize the
relationship between the number of flips required for the different instances for n = 120 and k = 5
with step = 1 and step = 6, respectively. In both these figures, the number of flips for each instance
is normalized by dividing the actual number of flips by the mean number of flips over all instances
for the same n, k, and step. The results show that in terms of the number of flips the performance
of all compared algorithms is much more strongly correlated for the problem with high overlap
(step = 1) than for the separable problem (step = 6). The results also show that, as expected, the
correlation between the two variants of GA is much stronger than that between hBOA and GA.
The relationship between performance of the compared algorithms on different problem in-
stances is also visualized in figure 12, which shows the average number of DHC flips until optimum
for various percentages of problems that are easiest for hBOA. These results also support the cor-
relation between performance of the different algorithms included in the comparison, because the
instances that are simpler for hBOA are clearly also simpler for the two variants of GA.

5.6 Instance Difficulty


Assuming that the recombination operator processes partial solutions or building blocks effectively,
based on scalability theory for selectorecombinative genetic algorithms (Goldberg & Rudnick, 1991;
Thierens & Goldberg, 1994; Harik et al., 1997; Goldberg, 2002) and multivariate estimation of
distribution algorithms (Mühlenbein & Mahnig, 1998; Pelikan et al., 2002; Pelikan, 2005; Yu et al.,
2007), there are three main sources of problem difficulty of generated instances: (1) signal-to-
noise ratio, (2) scaling, and (3) overlap between subproblems. These factors and their effects
on problem difficulty of the considered instances of NK landscapes are the main focus of this

[Figure 4: Average number of flips for GA with two-point crossover. Log-scale plots of the number of DHC flips versus problem size, one panel per k ∈ {2, 3, 4, 5}, one curve per step.]

subsection. First, the effects of parameters n, k, and step are discussed. Then, the results are related
to existing scalability theory. Of course, performance of selectorecombinative genetic algorithms
also strongly depends on their ability to preserve and mix important partial solutions or building
blocks (Goldberg, 2002; Thierens, 1999) and the importance of effective recombination is also
expected to vary from instance to instance.
The results presented thus far in figures 1, 2, 3, 4, and 5 show that the larger the number of bits n,
the more difficult the problem becomes whether we measure algorithm performance by the number
of DHC flips or the number of evaluations until optimum. The main reason for this behavior is
that as n grows, the signal-to-noise ratio decreases and the complexity of selectorecombinative GAs
as well as hBOA is expected to grow with decreasing signal-to-noise ratio (Goldberg & Rudnick,
1991; Thierens & Goldberg, 1994; Harik et al., 1997; Goldberg, 2002; Pelikan et al., 2002); the
relationship between instance difficulty and the signal-to-noise ratio will be discussed also later
in this section. For hBOA, time complexity expressed in terms of the number of DHC flips or
the number of fitness evaluations appears to grow polynomially fast with n; for the remaining
algorithms, time complexity appears to grow slightly faster than polynomially. These results
are not surprising. It was argued elsewhere that if the problem can be decomposed into subproblems
of bounded order, then hBOA should be capable of discovering such a decomposition and solve
the problem in low-order polynomial time with respect to n (Pelikan, Sastry, & Goldberg, 2002;
Pelikan, 2005; Yu, Sastry, Goldberg, & Pelikan, 2007). Furthermore, it is known that fixed crossover
operators are often not capable of solving such problems in polynomial time because they often
break important partial solutions to the different subproblems or do not juxtapose these partial
solutions effectively enough (Thierens & Goldberg, 1993; Thierens, 1995). While it is possible
to create adversarial decomposable problems for which some model-building algorithms used in
multivariate estimation of distribution algorithms, such as hBOA, fail (Coffin & Smith, 2007), this
happens only in very specific cases and is unlikely to be the case with random problem instances

[Figure 5: Average number of evaluations for k = 2 and k = 5 with step = 1. Two panels: (a) k = 2, step = 1; (b) k = 5, step = 1; curves for UMDA, GA (uniform), GA (two-point), and hBOA.]

[Figure 6: Growth of the number of DHC flips with k for step = 1 (most overlap) and step = k + 1 (no overlap). All results are for n = 120; curves for GA (two-point), GA (uniform), and hBOA.]

or real-world decomposable problems.


The effects of k on performance were visualized in figure 6, which showed the growth of the
number of DHC flips until optimum with k for n = 120 and step ∈ {1, 6}; a similar relationship
can be observed for the number of evaluations (results omitted). The results show that for all
algorithms included in the comparison, performance grows at least exponentially with the value
of k. Furthermore, the results show that both variants of GA are much more sensitive to the
value of k than hBOA. This is not a surprising result because hBOA is capable of identifying the
subproblems and recombining solutions to respect the discovered decomposition whereas GA uses
a fixed recombination strategy regardless of the problem. The results for other values of n and step
are qualitatively similar and are thus omitted.
The effects of step on performance are somewhat more intricate. Intuitively, instances where
all subproblems are independent should be easiest and this is also supported with all experimental
results presented thus far. The results also indicate that the effects of overlap vary between the
compared algorithms as is shown in figure 13. More specifically, for both variants of GA, the most
difficult percentage of overlap (relative to the size of the subproblems) is about 0.5, whereas for
hBOA it is about 0.7.
As was mentioned earlier, time complexity of selectorecombinative GAs as well as hBOA is
directly related to signal-to-noise ratio where the signal is the difference between the fitness con-

[Figure 7: Ratio of the number of flips for GA with uniform crossover and hBOA. One panel per k ∈ {2, 3, 4, 5}, ratio versus problem size, one curve per step.]

                    Number of DHC flips until optimum

 n    k   step      hBOA    GA (uniform)   GA (twopoint)
120   5    1       37,155      141,108        220,318
120   5    2       40,151      212,635        353,748
120   5    3       37,480      249,217        443,570
120   5    4       27,411      195,673        310,894
120   5    5       15,589      100,378        145,406
120   5    6        9,607       35,101         47,576

Table 1: Comparison of the number of DHC flips until optimum for hBOA and GA. For all settings,
the superiority of the results obtained by hBOA was verified with paired t-test with 99% confidence.

tributions of the best and the second-best instances of a subproblem, and the noise comes from the
fitness contributions of the other subproblems (Goldberg & Rudnick, 1991; Goldberg, Deb, & Clark, 1992).
The smaller the signal-to-noise ratio, the larger the expected population size as well as the overall
complexity of an algorithm. As was discussed above, the signal-to-noise ratio is influenced primarily
by the value of n; however, the signal-to-noise ratio also depends on the subproblems themselves.
The influence of the signal-to-noise ratio on algorithm performance should be strongest for sepa-
rable problems with uniform scaling where all subproblems have approximately the same signal;
for problems with overlap and nonuniform scaling, other factors contribute to instance difficulty as
well. Another important factor influencing problem difficulty of decomposable problems is the scal-
ing of the signal coming from different subproblems (Thierens, Goldberg, & Pereira, 1998). Next
we examine the influence of the signal-to-noise ratio and scaling on performance of the compared
algorithms in more detail.
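One plausible way to extract such quantities from an instance is sketched below (our construction; the paper does not spell out its exact estimator): take the signal of a subproblem as the gap between its two best table entries and model the noise by the variance contributed by the remaining subproblems.

    def signal_to_noise(tables):
        """Smallest per-subproblem signal-to-noise ratio of an instance,
        with the noise taken as the square root of the summed variance of
        the table entries of all other subproblems (an assumption)."""
        signals, variances = [], []
        for table in tables:
            best, second = sorted(table, reverse=True)[:2]
            signals.append(best - second)
            mean = sum(table) / len(table)
            variances.append(sum((v - mean) ** 2 for v in table) / len(table))
        total = sum(variances)
        return min(s / (total - v) ** 0.5
                   for s, v in zip(signals, variances))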
Figure 14 visualizes the effects of signal-to-noise ratio on the number of flips until optimum

[Figure 8: Ratio of the number of flips for GA with two-point crossover and hBOA. One panel per k ∈ {2, 3, 4, 5}, ratio versus problem size, one curve per step.]

for n = 120, k = 5, and step ∈ {1, 6}; since UMDA was not capable of solving many of these
problem instances in practical time, the results for UMDA are not included. The figure shows the
average number of DHC flips until optimum for different percentages of instances with smallest
signal-to-noise ratios. To make the visualization more effective, the number of flips is normalized
by dividing the values by the mean number of flips over the entire set of instances. The results
clearly show that for the separable problems (that is, step = 6), the smaller the signal-to-noise
ratio, the greater the number of flips. However, for problem instances with strong overlap (that is,
step = 1), problem difficulty does not appear to be directly related to the signal-to-noise ratio and
the primary source of problem difficulty appears to be elsewhere.
Figure 15 visualizes the influence of scaling on the number of flips until optimum. The figure
shows the average number of flips until optimum for different percentages of instances with smallest
signal variance. The larger the variance of the signal, the less uniformly the signal is distributed be-
tween the different subproblems. For the separable problem (that is, step = 6), the more uniformly
scaled instances appear to be more difficult for all compared algorithms than the less uniformly
scaled ones. For instances with strong overlap (that is, step = 1), the effects of scaling on algorithm
performance are negligible; again, the source of problem difficulty appears to be elsewhere.
Two observations related to the signal-to-noise ratio and scaling are somewhat surprising:
(1) Although scalability of selectorecombinative GAs gets worse with nonuniform scaling of sub-
problems, the results indicate that the actual performance is better on more nonuniformly scaled
problems. (2) Performance of the compared algorithms on problems with strong overlap does not
appear to be directly affected by signal-to-noise ratio or signal variance. How could these results
be explained?
We believe that the primary reason why more uniformly scaled problems are more difficult for
all tested algorithms is related to effectiveness of recombination. More specifically, practically any
recombination operator becomes more effective when the scaling is highly nonuniform; on the other

[Figure 9: Ratio of the number of flips for GA with two-point and uniform crossover. One panel per k ∈ {2, 3, 4, 5}, ratio versus problem size, one curve per step.]

hand, for uniformly scaled subproblems, fixed recombination operators are often expected to suffer
from inefficient juxtaposition and frequent disruption of important partial solutions contained in
the optimum or building blocks (Goldberg, 2002; Thierens, 1999; Harik & Goldberg, 1996; Thierens
& Goldberg, 1993).
For problems with strong overlap, the influence of the overlap appears to overshadow both
the signal-to-noise ratio and scaling. We believe that a likely reason for this is that the order of
interactions that must be covered by the probabilistic model may increase due to the effects of the
overlap, which leads to a larger order of important building blocks that must be reproduced and
juxtaposed to form the optimum. According to the existing population sizing theory (Goldberg
& Rudnick, 1991; Harik et al., 1997; Pelikan et al., 2002), this should lead to an increase in
the population size required to solve the problem (Harik et al., 1997). We are currently exploring
possible approaches to quantifying the effects of overlap in order to confirm or refute this hypothesis.

6 Future Work
Probably the most important topic for future work in this area is to develop tools that could be
used to gain a better understanding of the behavior of selectorecombinative genetic algorithms on
additively decomposable problems with overlap between subproblems. This is a challenging topic
but also a crucial one, because most difficult decomposable problems contain substantial overlap
between the different subproblems. A good starting point for this research is the theoretical work
on the factorized distribution algorithm (FDA) (Mühlenbein & Mahnig, 1998) and the scalability
theory for multivariate estimation of distribution algorithms (Pelikan, Sastry, & Goldberg, 2002;
Pelikan, 2005; Yu, Sastry, Goldberg, & Pelikan, 2007; Mühlenbein, 2008).
It may also be useful to look at other random distributions for generating the subproblems.
This avenue of research may split into at least two main directions. One may either bias the

                   Number of evaluations until optimum

 n    k   step      hBOA    GA (uniform)   GA (twopoint)
120   5    1        7,414       16,519         34,696
120   5    2        9,011       25,032         56,059
120   5    3        9,988       30,285         72,359
120   5    4        8,606       24,016         51,521
120   5    5        7,307       13,749         26,807
120   5    6        7,328        6,004         10,949

Table 2: Comparison of the number of evaluations until optimum for hBOA and GA. For all settings
except for step = 6, the superiority of the results obtained by hBOA was verified with paired t-test
with 99% confidence.
[Figure 10: Correlation between the number of flips for hBOA and GA for n = 120, k = 5 and step = 1. Scatter plots; correlation coefficients 0.493 and 0.330 for hBOA versus the two GA variants, and 0.675 for GA (uniform) versus GA (two-point).]

distribution used to generate subproblems in order to generate instances that are especially hard
for the algorithm under consideration with the goal of addressing weaknesses of this algorithm. On
the other hand, one may want to generate problem instances that resemble important classes of
real-world problems.
It may also be interesting to study instances of restricted and unrestricted NK landscapes
from the perspective of the theory of elementary landscapes (Barnes, Dimova, Dokov, & Solomon,
2003). Finally, the instances provided by methods described in this paper can be used to test other
optimization algorithms and hybrids.

7 Summary and Conclusions


This paper described a class of nearest-neighbor NK landscapes with tunable strength of overlap
between consecutive subproblems. Shuffling was introduced to eliminate tight linkage and make
problem instances more challenging for algorithms with fixed variation operators. A dynamic
programming approach was described that can be used to solve the described instances to opti-
mality in low-order polynomial time. A large number of random instances of the described class
of NK landscapes were generated. Several evolutionary algorithms were then applied to the gener-
ated instances; more specifically, the paper considered the genetic algorithm (GA) with two-point
and uniform crossover and two estimation of distribution algorithms, specifically, the hierarchi-
cal Bayesian optimization algorithm (hBOA) and the univariate marginal distribution algorithm

[Figure 11: Correlation between the number of flips for hBOA and GA for n = 120, k = 5 and step = 6. Scatter plots; correlation coefficients 0.221 and 0.170 for hBOA versus the two GA variants, and 0.723 for GA (uniform) versus GA (two-point).]

[Figure 12: Performance of GA and hBOA as a function of instance difficulty for n = 120 and k = 5. Two panels: (a) step = 1, (b) step = 6; average number of flips (divided by the mean) versus the percentage of easiest hBOA instances, curves for GA (two-point), GA (uniform), and hBOA.]

(UMDA). All algorithms were combined with a simple deterministic local searcher and niching was
used to maintain useful diversity. The results were analyzed and related to existing scalability
theory for selectorecombinative genetic algorithms.
hBOA was shown to outperform other algorithms included in the comparison on instances with
large neighborhood sizes. The factor by which hBOA outperforms other algorithms for the largest
neighborhoods appears to grow faster than polynomially with problem size, indicating that the
differences will become even more substantial for problems with larger neighborhoods and larger
problem sizes. This suggests that linkage learning is advantageous when solving the considered
class of NK landscapes. The second best performance was achieved by GA with uniform crossover,
whereas the worst performance was achieved by UMDA.
The complexity of all algorithms was shown to grow exponentially fast with the size of the
neighborhood. The correlations between the times required to solve the same problem instances
were shown to be strongest for the two variants of GA; however, correlations were also observed
between the other compared algorithms.
For problems with no overlap, the signal-to-noise ratio and the scaling of the signal in different
subproblems were shown to be significant factors affecting problem difficulty. More specifically,
the smaller the signal-to-noise ratio and the signal variance, the more difficult instances become.
However, for problems with a substantial amount of overlap between consecutive subproblems, the effects
of overlap were shown to overshadow the effects of signal-to-noise ratio and scaling.

[Figure 13: Influence of overlap for n = 120 and k = 5 (step varies with overlap). Three panels: (a) hBOA, (b) GA (uniform), (c) GA (two-point); number of flips versus percentage of overlap, one curve per k ∈ {2, 3, 4, 5}.]

[Figure 14: Influence of signal-to-noise ratio on the number of flips for n = 120 and k = 5. Two panels: (a) step = 1, (b) step = 6; average number of flips (divided by the mean) versus the signal-to-noise percentile (% smallest), curves for GA (two-point), GA (uniform), and hBOA.]

Acknowledgments
This project was sponsored by the National Science Foundation under CAREER grant ECS-
0547013, by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF,
under grant FA9550-06-1-0096, and by the University of Missouri in St. Louis through the High
Performance Computing Collaboratory sponsored by Information Technology Services, and the
Research Award and Research Board programs.
The U.S. Government is authorized to reproduce and distribute reprints for government pur-
poses notwithstanding any copyright notation thereon. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the authors and do not necessarily reflect
the views of the National Science Foundation, the Air Force Office of Scientific Research, or the
U.S. Government. Some experiments were done using the hBOA software developed by Martin
Pelikan and David E. Goldberg at the University of Illinois at Urbana-Champaign and most exper-
iments were performed on the Beowulf cluster maintained by ITS at the University of Missouri in
St. Louis.

[Figure 15: Influence of signal variance on the number of flips for n = 120 and k = 5. Two panels: (a) step = 1, (b) step = 6; average number of flips (divided by the mean) versus the signal variance percentile (% smallest), curves for GA (two-point), GA (uniform), and hBOA.]

References
Ackley, D. H. (1987). An empirical study of bit vector function optimization. Genetic Algorithms
and Simulated Annealing, 170–204.
Aguirre, H. E., & Tanaka, K. (2003). Genetic algorithms on nk-landscapes: Effects of selection,
drift, mutation, and recombination. In Raidl, G. R., et al. (Eds.), Applications of Evolutionary
Computing: EvoWorkshops 2003 (pp. 131–142).
Altenberg, L. (1997). NK landscapes. In Bäck, T., Fogel, D. B., & Michalewicz, Z. (Eds.), Hand-
book of Evolutionary Computation (pp. B2.7:5–10). Bristol, New York: Institute of Physics
Publishing and Oxford University Press.
Baluja, S. (1994). Population-based incremental learning: A method for integrating genetic search
based function optimization and competitive learning (Tech. Rep. No. CMU-CS-94-163). Pitts-
burgh, PA: Carnegie Mellon University.
Barnes, J. W., Dimova, B., Dokov, S. P., & Solomon, A. (2003). The theory of elementary
landscapes. Appl. Math. Lett., 16 (3), 337–343.
Cheeseman, P., Kanefsky, B., & Taylor, W. M. (1991). Where the really hard problems are.
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-91), 331–
337.
Chickering, D. M., Heckerman, D., & Meek, C. (1997). A Bayesian approach to learning Bayesian
networks with local structure (Technical Report MSR-TR-97-07). Redmond, WA: Microsoft
Research.
Choi, S.-S., Jung, K., & Kim, J. H. (2005). Phase transition in a random NK landscape model.
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2005), 1241–1248.
Coffin, D. J., & Smith, R. E. (2007). Why is parity hard for estimation of distribution algorithms?
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2007), 624–
624.
Deb, K., & Goldberg, D. E. (1991). Analyzing deception in trap functions (IlliGAL Report No.
91009). Urbana, IL: University of Illinois at Urbana-Champaign, Illinois Genetic Algorithms
Laboratory.
Friedman, N., & Goldszmidt, M. (1999). Learning Bayesian networks with local structure. In
Jordan, M. I. (Ed.), Graphical models (pp. 421–459). Cambridge, MA: MIT Press.
Gao, Y., & Culberson, J. C. (2002). An analysis of phase transition in NK landscapes. Journal
of Artificial Intelligence Research (JAIR), 17 , 309–332.
Goldberg, D. E. (1989). Genetic algorithms in search, optimization, and machine learning. Read-
ing, MA: Addison-Wesley.
Goldberg, D. E. (2002). The design of innovation: Lessons from and for competent genetic
algorithms, Volume 7 of Genetic Algorithms and Evolutionary Computation. Kluwer Academic
Publishers.
Goldberg, D. E., Deb, K., & Clark, J. H. (1992). Genetic algorithms, noise, and the sizing of
populations. Complex Systems, 6 , 333–362.
Goldberg, D. E., & Rudnick, M. (1991). Genetic algorithms and the variance of fitness. Complex
Systems, 5 (3), 265–278. Also IlliGAL Report No. 91001.
Harik, G. R. (1995). Finding multimodal solutions using restricted tournament selection. Pro-
ceedings of the International Conference on Genetic Algorithms (ICGA-95), 24–31.
Harik, G. R., Cantú-Paz, E., Goldberg, D. E., & Miller, B. L. (1997). The gambler’s ruin problem,
genetic algorithms, and the sizing of populations. Proceedings of the International Conference
on Evolutionary Computation (ICEC-97), 7–12. Also IlliGAL Report No. 96004.
Harik, G. R., & Goldberg, D. E. (1996). Learning linkage. Foundations of Genetic Algorithms, 4 ,
247–262.
Heckerman, D., Geiger, D., & Chickering, D. M. (1994). Learning Bayesian networks: The
combination of knowledge and statistical data (Technical Report MSR-TR-94-09). Redmond,
WA: Microsoft Research.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Ann Arbor, MI: University
of Michigan Press.
Juels, A. (1998). The equilibrium genetic algorithm. Submitted for publication.
Kauffman, S. (1989). Adaptation on rugged fitness landscapes. In Stein, D. L. (Ed.), Lecture
Notes in the Sciences of Complexity (pp. 527–618). Addison Wesley.
Kauffman, S. (1993). The origins of order: Self-organization and selection in evolution. Oxford
University Press.
Mühlenbein, H. (2008). Convergence of estimation of distribution algorithms for finite samples
(Technical Report). Sankt Augustin, Germany: Fraunhofer Institut Autonomous intelligent
Systems.
Mühlenbein, H., & Mahnig, T. (1998). Convergence theory and applications of the factorized
distribution algorithm. Journal of Computing and Information Technology, 7 (1), 19–32.
Mühlenbein, H., & Paaß, G. (1996). From recombination of genes to the estimation of distribu-
tions I. Binary parameters. Parallel Problem Solving from Nature, 178–187.
Pelikan, M. (2005). Hierarchical Bayesian optimization algorithm: Toward a new generation of
evolutionary algorithms. Springer.
Pelikan, M., & Goldberg, D. E. (2001). Escaping hierarchical traps with competent genetic
algorithms. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-
2001), 511–518. Also IlliGAL Report No. 2000020.
Pelikan, M., & Goldberg, D. E. (2003). Hierarchical BOA solves Ising spin glasses and maxsat.
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2003), II ,
1275–1286. Also IlliGAL Report No. 2003001.
Pelikan, M., Sastry, K., Butz, M. V., & Goldberg, D. E. (2006). Performance of evolutionary
algorithms on random decomposable problems. Parallel Problem Solving from Nature, 788–
797.
Pelikan, M., Sastry, K., Butz, M. V., & Goldberg, D. E. (2008). Analysis of estimation of
distribution algorithms and genetic algorithms on NK landscapes. Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO-2008), 1033–1040.
Pelikan, M., Sastry, K., & Goldberg, D. E. (2002). Scalability of the Bayesian optimization
algorithm. International Journal of Approximate Reasoning, 31 (3), 221–258. Also IlliGAL
Report No. 2001029.
Santarelli, S., Goldberg, D. E., & Yu, T.-L. (2004). Optimization of a constrained feed network
for an antenna array using simple and competent genetic algorithm techniques. Proceedings
of the Workshop Military and Security Application of Evolutionary Computation (MSAEC-
2004).
Sastry, K. (2001). Evaluation-relaxation schemes for genetic and evolutionary algorithms. Mas-
ter’s thesis, University of Illinois at Urbana-Champaign, Department of General Engineering,
Urbana, IL. Also IlliGAL Report No. 2002004.
Sastry, K., Pelikan, M., & Goldberg, D. E. (2007). Empirical analysis of ideal recombination on
random decomposable problems. Proceedings of the Genetic and Evolutionary Computation
Conference (GECCO-2007), 1388–1395.
Thierens, D. (1995). Analysis and design of genetic algorithms. Doctoral dissertation, Katholieke
Universiteit Leuven, Leuven, Belgium.
Thierens, D. (1999). Scalability problems of simple genetic algorithms. Evolutionary Computa-
tion, 7 (4), 331–352.
Thierens, D., & Goldberg, D. (1994). Convergence models of genetic algorithm selection schemes.
Parallel Problem Solving from Nature, 116–121.
Thierens, D., & Goldberg, D. E. (1993). Mixing in genetic algorithms. Proceedings of the Inter-
national Conference on Genetic Algorithms (ICGA-93), 38–45.
Thierens, D., Goldberg, D. E., & Pereira, A. G. (1998). Domino convergence, drift, and the
temporal-salience structure of problems. Proceedings of the International Conference on Evo-
lutionary Computation (ICEC-98), 535–540.
Wright, A. H., Thompson, R. K., & Zhang, J. (2000). The computational complexity of N-K
fitness functions. IEEE Transactions on Evolutionary Computation, 4 (4), 373–379.
Yu, T.-L., Sastry, K., Goldberg, D. E., & Pelikan, M. (2007). Population sizing for entropy-
based model building in estimation of distribution algorithms. Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO-2007), 601–608.
