
NASA Contractor Report 201616
ICASE Report No. 96-62

NORMAL-BOUNDARY INTERSECTION: AN ALTERNATE METHOD FOR GENERATING PARETO OPTIMAL POINTS IN MULTICRITERIA OPTIMIZATION PROBLEMS

Indraneel Das
John Dennis

NASA Contract No. NAS1-19480
November 1996

Institute for Computer Applications in Science and Engineering
NASA Langley Research Center
Hampton, VA 23681-0001
Operated by Universities Space Research Association

National Aeronautics and Space Administration
Langley Research Center
Hampton, Virginia 23681-0001

NORMAL-BOUNDARY INTERSECTION: AN ALTERNATE METHOD FOR GENERATING PARETO OPTIMAL POINTS IN MULTICRITERIA OPTIMIZATION PROBLEMS$^{1}$

Indraneel Das and John Dennis
Dept. of Computational & Applied Mathematics
Rice University
Houston, TX 77251-1892

Abstract

This paper proposes an alternate method for finding several Pareto optimal points for a general nonlinear multicriteria optimization problem, aimed at capturing the tradeoff among the various conflicting objectives. It can be rigorously proved that this method is completely independent of the relative scales of the functions and is quite successful in producing an evenly distributed set of points in the Pareto set given an evenly distributed set of 'weights', a property which the popular method of linear combinations lacks. Further, this method can be easily extended in case of more than two objectives while retaining the computational efficiency of continuation-type algorithms, which is an improvement over homotopy techniques for tracing the tradeoff curve.

$^{1}$ This research was partially supported by the Dept. of Energy, DOE Grant DE-FG03-95ER25257 and by the National Aeronautics and Space Administration under NASA Contract No. NAS1-19480 while the first author was in residence at the Institute for Computer Applications in Science and Engineering (ICASE), NASA Langley Research Center, Hampton, VA 23681-0001.

1 Introduction

A wide variety of problems arising in design optimization of engineering systems are essentially multicriteria in nature (see, for example, Eschenauer, Koski and Osyczka [1] and Statnikov and Matusov [2]). For example, a typical bridge-construction design might involve simultaneously minimizing the total mass of the structure and maximizing its stiffness. However, it is highly improbable that these conflicting objectives would both be 'extremized' by the same design, hence some tradeoff between the objectives is desired to ensure an efficient design. Mathematically such a multicriteria optimization problem can be written as:

$$\min_{x \in C} F(x) = \begin{bmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_n(x) \end{bmatrix}, \qquad n \ge 2, \qquad \ldots \text{(MOP)}$$

where

$$C = \{x : h(x) = 0,\; g(x) \le 0,\; a \le x \le b\},$$

$F : \mathbb{R}^N \to \mathbb{R}^n$, $h : \mathbb{R}^N \to \mathbb{R}^{ne}$ and $g : \mathbb{R}^N \to \mathbb{R}^{ni}$ are twice continuously differentiable mappings, and $a \in (\mathbb{R} \cup \{-\infty\})^N$, $b \in (\mathbb{R} \cup \{\infty\})^N$, $N$ being the number of variables, $n$ the number of objectives, $ne$ and $ni$ the number of equality and inequality constraints.

Since no single $x^*$ would generally minimize every $f_i$ simultaneously, a concept of optimality which is useful in the multiobjective framework is that of Pareto optimality, as defined below:

Definition: A point $x^* \in C$ is said to be (globally) Pareto optimal or a (globally) efficient point or a non-dominated or a non-inferior point for (MOP) if and only if there does not exist $x \in C$ such that $F(x) \le F(x^*)$ with at least one strict inequality (the $\le$ implies term-by-term inequality).

The shadow minimum, $F^*$, is defined as the vector containing the individual global minima, $f_i^*$, of the objectives, i.e.,

$$F^* = \begin{bmatrix} f_1^* \\ f_2^* \\ \vdots \\ f_n^* \end{bmatrix}.$$
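The definitions of Pareto optimality and of the shadow minimum can be illustrated on a finite sample of objective vectors. The following is only a minimal sketch: the helper names (`dominates`, `pareto_filter`, `shadow_minimum`) and the sample data are hypothetical and chosen for illustration, and a finite sample stands in for the attainable set of (MOP).

```python
from typing import List, Sequence


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """True if fa <= fb term by term with at least one strict inequality."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def pareto_filter(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points of the sample that no other sample point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def shadow_minimum(points: List[Sequence[float]]) -> List[float]:
    """Componentwise minimum over the sample (the F* of this finite sample)."""
    return [min(column) for column in zip(*points)]


# Hypothetical objective vectors (f1, f2), for illustration only.
sample = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.5)]
front = pareto_filter(sample)
f_star = shadow_minimum(sample)
print(front)   # non-dominated points of the sample
print(f_star)  # shadow minimum of the sample
```

In this sample the shadow minimum is not itself one of the sample points, which mirrors the remark that it is attained only when a single point minimizes every objective.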

(We assume here and henceforth the existence of a minimum for each of our objectives.) The shadow minimum could thus be attained only in the rare case when a single $x$ minimizes all the objective functions. However, in practical situations, the best we can hope for is to get close to the shadow minimum and assure that there is an agreeable trade-off among the multiple objectives. Very often in engineering applications the desired solution is a whole collection of Pareto optimal points, representative of the entire spectrum of efficient solutions.

Thus ideally, the desired solution is the entire Pareto optimal set, which can be obtained for some small problems which allow themselves to be treated parametrically, resulting in closed-form expressions for the Pareto set (see Lin [3]). More recently, attempts have been made to approximate the entire curve of Pareto optimal solutions in bi-objective problems using techniques which trace the curve of parametrized optima (see Rakowska, Haftka and Watson [4], Rao and Papalambros [5], Lundberg and Poore [6]). The next best solution, which is very acceptable in most applications, is a set of Pareto optimal points obtained by combining the multiple objectives into a single objective function and minimizing the single objective over various values of the parameters used to combine the objectives. For example, it is possible to generate a set of Pareto optimal points by minimizing a convex combination of the objectives, $\alpha^T F(x)$, over $x \in C$, where $\alpha \ge 0$ (component-wise) and $\sum_{i=1}^{n} \alpha_i = 1$, and performing the minimization for different choices of $\alpha$ (see, among many others, Koski [7]). In this article, we propose a new method for generating Pareto optimal points which is at least as efficient as these methods and, unlike the techniques for tracing the curve of Pareto optimal solutions, can be applied to problems with more than two objectives.

2 Preliminaries

First let us introduce some terminology:

Convex Hull of Individual Minima (CHIM): Let $x_i^*$ be the respective global minimizers of $f_i(x)$, $i = 1, \ldots, n$ over $x \in C$. Let $F_i^* = F(x_i^*)$, $i = 1, \ldots, n$. Let $\Phi$ be the $n \times n$ matrix whose $i$th column is $F_i^* - F^*$. Then the set of points in $\mathbb{R}^n$ that are convex combinations of $F_i^* - F^*$, i.e., $\{\Phi w : w \in \mathbb{R}^n, \sum_{i=1}^{n} w_i = 1, w_i \ge 0\}$, is referred to as the Convex Hull of Individual Minima.

The set of attainable objective vectors, $\{F(x) : x \in C\}$, is denoted by $\mathcal{F}$, so $F : C \to \mathcal{F}$, i.e., $C$ is mapped by $F$ onto $\mathcal{F}$. The space $\mathbb{R}^n$ which contains $\mathcal{F}$ is usually referred to as the objective space. The map of $C$ under $F$ in the objective space is often called the multi-loss map$^{2}$ (bi-loss map, if $n = 2$). We shall denote the boundary of $\mathcal{F}$ by $\partial\mathcal{F}$. The set of all Pareto optimal points is usually denoted by $\mathcal{P}$. The complete curve/surface of Pareto minima (continuous or not) is often referred to as the trade-off function (see p9, Haimes, Hall and Freedman [8]).

CHIM$_+$: Let CHIM$_\infty$ be the affine subspace of lowest dimension that contains the CHIM. Then CHIM$_+$ is defined as the smallest simply connected set that contains every point in the intersection of $\partial\mathcal{F}$ and CHIM$_\infty$. More informally, consider extending (or withdrawing) the boundary of the CHIM simplex to touch $\partial\mathcal{F}$; the 'extension' of CHIM thus obtained is defined as CHIM$_+$.

Henceforth, it shall be assumed that the objective functions have been defined with the shadow minimum shifted to the origin, so that all the objective functions are non-negative, i.e., $F(x)$ is redefined as:

$$F(x) \leftarrow F(x) - F^*.$$

We observe that in Fig.1, which shows the set $\mathcal{F}$ in the objective space, the point A is $F_1^*$, B is $F_2^*$, O is the shadow minimum (and the origin), the broken line segment AB is the CHIM, while the 'arc' ACB is the set of all Pareto minima in the objective space; alternately, the trade-off curve. In this (and any) problem with $n = 2$ (i.e., bi-objective), CHIM = CHIM$_+$ and the matrix $\Phi$ is anti-diagonal.
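To make the construction of $\Phi$ and of a CHIM point concrete, here is a small sketch in plain Python. The function names (`build_phi`, `chim_point`) and the two objective vectors are hypothetical, chosen only for illustration; the sketch assumes the objective vectors at the individual minimizers are already known.

```python
from typing import List


def build_phi(F_at_minimizers: List[List[float]]) -> List[List[float]]:
    """Return Phi as nested rows; column i is F(x_i*) - F*.

    F_at_minimizers[i] is the objective vector F(x_i*) evaluated at the
    minimizer of f_i, so the shadow minimum has components f_i(x_i*).
    """
    n = len(F_at_minimizers)
    f_star = [F_at_minimizers[i][i] for i in range(n)]
    # Row j, column i holds f_j(x_i*) - f_j*.
    return [[F_at_minimizers[i][j] - f_star[j] for i in range(n)]
            for j in range(n)]


def chim_point(phi: List[List[float]], w: List[float]) -> List[float]:
    """Return Phi w for a convex weighting w (w_i >= 0, sum of w_i = 1)."""
    assert all(wi >= 0 for wi in w) and abs(sum(w) - 1.0) < 1e-12
    return [sum(row[i] * w[i] for i in range(len(w))) for row in phi]


# Hypothetical values: F(x_1*) = (1, 7) and F(x_2*) = (4, 2).
phi = build_phi([[1.0, 7.0], [4.0, 2.0]])
point = chim_point(phi, [0.25, 0.75])
print(phi)    # [[0.0, 3.0], [5.0, 0.0]], anti-diagonal for n = 2
print(point)  # [2.25, 1.25]
```

The zero diagonal appears because $f_i(x_i^*) = f_i^*$, which is why $\Phi$ is anti-diagonal in the bi-objective case.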

3 Central Idea

The pivotal idea behind our approach will be introduced by means of a simple observation: the intersection point between the normal emanating from any point in the CHIM and the boundary $\partial\mathcal{F}$ is probably a Pareto optimal point; the point of intersection closest to the origin is a Pareto minimal point (while the one furthest is a Pareto maximal point). We say 'probably' because this may not always be true, e.g., when the boundary is 'folded' (see Fig.2). But it is true when the trade-off surface in the objective space is convex, which happens in almost every application found in the literature (see for example the problems in Refs. 1, 2 and 7).

$^{2}$ This terminology is widely used in game theory.

[Figure 1: A typical bi-loss map (axes $f_1(x)$ and $f_2(x)$).]

Given a convex weighting $w$, $\Phi w$ represents a point in the CHIM. Let $\hat{n}$ denote the unit normal to the CHIM simplex pointing towards the origin; then $\Phi w + t\hat{n}$, $t \in \mathbb{R}$, represents the set of points on that normal. Then the point of intersection between the normal and the boundary of $\mathcal{F}$ closest to the origin is identical to the solution of the following subproblem:

$$\begin{aligned} \max_{x,t}\quad & t \\ \text{s.t.}\quad & \Phi w + t\hat{n} = F(x) \\ & h(x) = 0 \\ & g(x) \le 0 \\ & a \le x \le b. \end{aligned} \qquad (\mathrm{NBI}_w)$$

The constraints $\Phi w + t\hat{n} = F(x)$ ensure that the point $x$ is actually mapped by $F$ to a point on the normal, while the remaining constraints ensure feasibility of $x$ with respect to the constrained set in the original problem (MOP).
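The subproblem can be worked by hand on a toy case. The sketch below is an assumption-laden illustration, not the solver used in the paper: it takes the attainable set to be the unit disc centred at (1, 1) in the (shifted) objective space, with $F$ the identity map on that disc, so that $F_1^* = (0, 1)$, $F_2^* = (1, 0)$ and the largest feasible $t$ along the normal follows from a quadratic. The function name `nbi_point_disc` is hypothetical. A general problem would need a nonlinear programming solver instead of the closed form.

```python
import math
from typing import Tuple


def nbi_point_disc(w: Tuple[float, float]) -> Tuple[float, Tuple[float, float]]:
    """Solve NBI_w for the toy set {f : |f - (1, 1)| <= 1} in closed form.

    Returns (t, f) where f = Phi w + t * n_hat lies on the boundary and
    t is the largest value for which the point stays in the disc.
    """
    center, radius = (1.0, 1.0), 1.0
    phi = [[0.0, 1.0], [1.0, 0.0]]            # columns F_1* and F_2*
    p = (phi[0][0] * w[0] + phi[0][1] * w[1],
         phi[1][0] * w[0] + phi[1][1] * w[1])  # CHIM point Phi w
    n_hat = (-1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0))  # unit normal
    d = (p[0] - center[0], p[1] - center[1])
    dn = d[0] * n_hat[0] + d[1] * n_hat[1]
    # |d + t n_hat|^2 = radius^2 is a quadratic in t; take its larger root.
    disc = dn * dn - (d[0] ** 2 + d[1] ** 2 - radius ** 2)
    t = -dn + math.sqrt(disc)
    return t, (p[0] + t * n_hat[0], p[1] + t * n_hat[1])


t_mid, f_mid = nbi_point_disc((0.5, 0.5))
t_end, f_end = nbi_point_disc((1.0, 0.0))
print(t_mid, f_mid)  # interior weight: the point moves off the chord to the arc
print(t_end, f_end)  # vertex of the CHIM: t is (numerically) zero
```

At a vertex of the CHIM the point is already on the boundary, so $t = 0$ there, while interior weights give $t > 0$.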

[Figure 2: NBI started at Q converges to P (locally Pareto optimal), whereas the corresponding globally efficient point would have been P*.]

The subproblem above shall be referred to as the NBI subproblem, often written as NBI$_w$ (since $w$ is the characterizing parameter of the subproblem), and solutions of these subproblems will be referred to as NBI points. The idea is to solve NBI$_w$ for various $w$ and find several points on the boundary of $\mathcal{F}$, effectively constructing a pointwise approximation to the part of the boundary containing the Pareto minimal set.

As indicated earlier, not all NBI points are Pareto optimal points. For biobjective problems, for every Pareto optimal point there exists a corresponding NBI subproblem of which it is the solution. The same is true for $n \ge 3$, with one difference: the components of the weight $w$ for NBI$_w$ may not add up to 1. As a simple example, suppose $\mathcal{F}$ is a sphere in $\mathbb{R}^3$ touching the coordinate axes, for simplicity. Then the CHIM simplex is the triangle formed by joining the three points where the sphere touches the axes. Quite clearly, CHIM $\ne$ CHIM$_+$ and there exist points in CHIM$_+$ $\setminus$ CHIM underneath which there are Pareto optimal points on the sphere. However since these points are not in CHIM, they do not satisfy $\sum_i w_i = 1$. Thus, by solving NBI$_w$ for $\sum_i w_i = 1$, a portion of the Pareto set might be overlooked for problems with $n > 2$. However, these overlooked points are likely to be 'extremal' Pareto points which are not interesting from the tradeoff standpoint, which is our primary goal.

3.1 Some details

3.1.1 Structure of $\Phi$

The $i$th column of $\Phi$ is described by

$$\Phi(:, i) = F(x_i^*) - F^*.$$

Since $f_i(x_i^*) = f_i^*$, clearly,

$$\Phi(i, i) = 0.$$

Furthermore, if $x_j^*$ is the global minimizer of $f_j(x)$, then

$$\Phi(j, i) \ge 0, \quad j \ne i.$$

Thus, a negative element in position $(j, k)$ of $\Phi$ signifies that $x_j^*$ is not the global minimizer of $f_j(x)$, and $f_j(x_k^*) < f_j(x_j^*)$, i.e., $x_k^*$ improves on the current local minimum of $f_j(x)$. This very fortunate occurrence can help refine the local minimum of an objective by a simple examination of $\Phi$. Even a zero element of $\Phi$ in an off-diagonal position, say, $(j, k)$, would signify that $x_k^*$ is a minimizer of both $f_j(x)$ and $f_k(x)$, which could make $x_k^*$ or its nearby points very desirable choices.

3.1.2 Quasi-normal instead of normal direction

The idea of a family of normals intersecting the boundary is valid even if we do not have the exact normal direction to the CHIM simplex, but some quasi-normal direction $\hat{n}$ which points towards the origin. 'Shooting' a family of quasi-normal rays towards the boundary also gets us our desired boundary points. In practice we choose our quasi-normal direction to be an equally-weighted linear combination of the columns of $\Phi$, multiplied by $-1$ to ensure that it points towards the origin. Explicitly,

$$\hat{n} = -\Phi e,$$

where $e$ is the column vector of all ones.

The quasi-normal component defined as above has the property that the NBI point found for a certain $w$ is completely independent of the scales of the objective functions. In other words, if NBI$_w$ is re-solved with the objective functions rescaled by arbitrary factors, the NBI point found remains unchanged. This fact will be proved later. Given that $\Phi$ has nonnegative components as discussed in the previous subsection, it is clear that all components of $\Phi e$ are nonnegative.

Even though a quasi-normal direction will be used in our computations, we prefer to retain the name 'NBI', rather than change it to something like 'QNBI'. The authors hope that this misnomer would not be considered too harshly.
3.1.3 Further insight: NBI and goal programming

Since $t$ is being maximized in the NBI subproblem and $\Phi w + t\hat{n} = F(x)$, $x \in C$, this maximization subproblem attempts to find a feasible point $x$ as far from a 'target' point $\Phi w$ as possible, with $\hat{n} \le 0$ (componentwise) guaranteeing nonincrease in the components of $F(x)$ relative to the components of $\Phi w$.

This is similar to goal programming. If we take the Pareto set to be convex in the objective space, 'equality goal programming'$^{3}$ can be thought of as NBI where the direction $\hat{n}$ is one of the canonical basis vectors $e_i$ (i.e. with 1 in the $i$th position and 0 in the rest). To be precise, the NBI$_w$ subproblem with $\hat{n} = e_i$ has the same solution as the following goal programming problem:

$$\begin{aligned} \min_{x}\quad & f_i(x) \\ \text{s.t.}\quad & f_j(x) = (\Phi w)(j), \quad j = 1, \ldots, n, \; j \ne i, \\ & x \in C, \end{aligned}$$

where $(\Phi w)(j)$ denotes the $j$th component of the vector $\Phi w$. Though posing the goals as equalities is untraditional, this kind of subproblem for obtaining a Pareto optimal point is discussed in Lin [3] and [9].

$^{3}$ Referring to goal programming where the goal constraints are equalities instead of inequalities.

3.1.4 Efficiently solving the NBI subproblems

The following simple observation plays a key role in lowering the computational expense involved in solving the subproblems:

Consider weight vectors $w$ and $\tilde{w}$ such that $w$ is 'close to' $\tilde{w}$, i.e., $\|w - \tilde{w}\|$ is 'small' in some norm. Then it is reasonable to expect that the solution $(x^*, t^*)$ of NBI$_w$ and the solution $(\tilde{x}^*, \tilde{t}^*)$ of NBI$_{\tilde{w}}$ are 'close to each other'. Assume we have already solved NBI$_{\tilde{w}}$ and have the solution $(\tilde{x}^*, \tilde{t}^*)$. Then, with $(\tilde{x}^*, \tilde{t}^*)$ as the starting point for solving NBI$_w$, the solver can be expected to converge in a few iterations at a fast rate$^{4}$. It is this aspect of our algorithm that gives it the local flavor and the convergence rate of a continuation-type method.

Since we already have the individual minima of the functions, i.e., the vertices of the CHIM simplex, we start at $x_i^*$ and solve a 'nearby subproblem', and then one close to the subproblem just solved, and so on.

Let us illustrate the above strategy for a biobjective problem. The weights $w$ for only two objectives can be expressed as $[\beta, 1 - \beta]$, $\beta \in [0, 1]$. We can take $\beta$ to assume the values:

$$[0, \delta, 2\delta, \ldots, k\delta],$$

where $\delta < 1$ is the (uniform) spacing between two consecutive $w_1$ values and $k = I[\frac{1}{\delta}]$, i.e. the greatest integer $\le \frac{1}{\delta}$. Then the set of 'uniformly distributed' weights is given by $[\beta, 1 - \beta]$, where $\beta$ ranges over the values above.

Now, assuming $\delta \ll 1$ (say $\delta = 0.05$), the minimizer of $f_2(x)$, i.e., $x_2^*$, is expected to be a small perturbation from the solution to the NBI subproblem with $w = [\delta, 1 - \delta]$. Thus the NBI subproblem with this $w$ is solved starting from $x_2^*$, and its solution is used as the starting point for solving the NBI subproblem with $w = [2\delta, 1 - 2\delta]$, and so on, until the last weight is reached.

Of course, the subproblems' 'ordering' may not be so obvious for problems with more than two objective functions, but can still be achieved, as described in the next section.

$^{4}$ Q-quadratic if exact second derivatives are used, superlinear if a secant approximation like BFGS is used.
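The benefit of starting each subproblem from the previous solution can be seen on a toy case. The sketch below makes strong simplifying assumptions that are not part of the paper: the attainable set is the unit disc centred at (1, 1), $F$ is the identity on it, and the direction is the quasi-normal $(-1, -1)$, so that NBI$_w$ with $w = [\beta, 1 - \beta]$ reduces to the scalar equation $t^2 + t - \beta(1 - \beta) = 0$. Newton's method stands in for the subproblem solver, and the function name `newton_sweep` is hypothetical.

```python
def newton_sweep(steps: int = 20, warm: bool = True, tol: float = 1e-10) -> int:
    """Count Newton iterations over the sweep beta = 0, 1/steps, ..., 1.

    Toy residual (assumed, see lead-in): r(t) = t^2 + t - beta * (1 - beta).
    With warm=True each solve starts from the previous solution; otherwise
    every solve restarts from t = 0.
    """
    total = 0
    t = 0.0  # w = [0, 1] is the vertex F(x_2*), where t = 0
    for k in range(steps + 1):
        beta = k / steps
        if not warm:
            t = 0.0
        for _ in range(50):
            r = t * t + t - beta * (1.0 - beta)
            if abs(r) <= tol:
                break
            t -= r / (2.0 * t + 1.0)  # Newton step; derivative is 2t + 1
            total += 1
    return total


warm_total = newton_sweep(warm=True)
cold_total = newton_sweep(warm=False)
print("warm-started sweep:", warm_total, "Newton iterations")
print("cold-started sweep:", cold_total, "Newton iterations")
```

On this toy problem the warm-started sweep needs fewer iterations in total, which is the continuation-type behaviour described above; the actual savings on a real problem depend on the solver and on the spacing $\delta$.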

4 Generating $w$ and ordering the subproblems for more than two objectives

In this section, we shall describe a (data) structure which simultaneously enables the generation of weights $w$ and ordering the subproblems in a manner amenable not only to efficient solution but also to parallelization.

4.1 Generating $w$ for an $n$-objective problem

Let us assume that $\delta_j > 0$ is the uniform spacing between two consecutive $w_j$ values (i.e., the 'stepsize' on the $j$th component of $w$) for $j = 1, \ldots, n-1$. For simplicity, let us also assume that $\frac{1}{\delta_1}$ is an integer. The possible values that can be assumed by $w_1$ are

$$[0, \delta_1, 2\delta_1, \ldots, 1].$$

Define $m_1 = \frac{w_1}{\delta_1}$. Then the possible values of $w_2$ corresponding to $w_1 = m_1\delta_1$ (all the $w_i$'s must add up to 1) are

$$[0, \delta_2, 2\delta_2, \ldots, k_2\delta_2]$$

where $k_2 = I\left[\frac{1 - w_1}{\delta_2}\right] = I\left[\frac{1 - m_1\delta_1}{\delta_2}\right]$.

Now define $m_2 = \frac{w_2}{\delta_2}$. Then the possible values of $w_3$ corresponding to $w_1 = m_1\delta_1$ and $w_2 = m_2\delta_2$ are

$$[0, \delta_3, 2\delta_3, \ldots, k_3\delta_3]$$

where

$$k_3 = I\left[\frac{1 - w_1 - w_2}{\delta_3}\right] = I\left[\frac{1 - m_1\delta_1 - m_2\delta_2}{\delta_3}\right].$$

Thus, corresponding to $w_i = m_i\delta_i$, $i = 1, \ldots, j-1$, the possible values of $w_j$ for $j = 2, \ldots, n-1$ are

$$[0, \delta_j, 2\delta_j, \ldots, k_j\delta_j], \quad \text{where } k_j = I\left[\frac{1 - \sum_{i=1}^{j-1} m_i\delta_i}{\delta_j}\right].$$

Finally the last component of $w$ is defined as

$$w_n = 1 - \sum_{i=1}^{n-1} w_i.$$

Clearly, the entire data structure above can be thought of as a tree where the number of children varies with the node and generation. However, a tree structure is clearly unnecessary for implementation; all that requires storage are the numbers $\delta_j$. However the tree is useful as a conceptual aid. Of the subproblems generated by the weights in the above tree, $n$ (with $w = e_i$) are already solved while finding $F^*$. Also note that since $\frac{\delta_i}{\delta_j}$ is not necessarily an integer $\forall i < j$, the spacings between 'the last two' values of $w_n$ may not be uniform.

Special case: Equal stepsizes on all $w_i$

Let $\delta_i = \delta$, $i = 1, \ldots, n-1$. Also assume that $\frac{1}{\delta} = p$ is an integer. As before, the possible values of $w_1$ are

$$[0, \delta, 2\delta, \ldots, 1].$$

Then the possible values of $w_j$ corresponding to $w_i = m_i\delta$, $i = 1, \ldots, j-1$, for $j = 2, \ldots, n-1$ are

$$\left[0, \delta, 2\delta, \ldots, \left(p - \sum_{i=1}^{j-1} m_i\right)\delta\right].$$

As before, $w_n = 1 - \sum_{i=1}^{n-1} w_i$, and now all the $w_n$ values are uniformly spaced.
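For the equal-stepsize special case, the weights can be enumerated with a short recursion that mirrors the tree described above. This is a minimal sketch: the function name `generate_weights` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is a design choice made here so that every weight vector sums to exactly 1.

```python
from fractions import Fraction
from typing import Iterator, List, Tuple


def generate_weights(n: int, p: int) -> Iterator[Tuple[Fraction, ...]]:
    """Yield every w with w_j = m_j * delta, delta = 1/p, w_j >= 0, sum = 1.

    Components w_1..w_{n-1} are chosen generation by generation; the last
    component w_n takes whatever is left so that the weights add up to 1.
    """
    delta = Fraction(1, p)

    def recurse(prefix: List[int], remaining: int):
        if len(prefix) == n - 1:
            yield tuple(m * delta for m in prefix) + (remaining * delta,)
            return
        # w_j may take the values 0, delta, ..., remaining * delta.
        for m in range(remaining + 1):
            yield from recurse(prefix + [m], remaining - m)

    yield from recurse([], p)


weights = list(generate_weights(n=3, p=4))
print(len(weights))  # number of subproblems for n = 3, delta = 1/4
print(weights[0])    # leftmost leaf: w = e_3
```

The traversal is depth-first and left-to-right, so the first weight produced is $e_n$, the leftmost leaf of the tree.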

4.2 Ordering the subproblems

Each path from the root of the tree (the topmost node) to a leaf (a member in the bottommost generation) represents a unique weight $w$. It should also be observed that the $w$ vectors are already ordered on the basis of 'nearness' as one traverses the tree breadthwise. Thus a strategy for picking the order of the subproblems could be to start with the leftmost one (which has $w = e_n$ and is already solved) and solve the next one in the $w_{n-1}$ generation (which is $w_{n-1} = \delta_{n-1}$, $w_n = 1 - \delta_{n-1}$), then the next one in the $w_{n-1}$ generation ($w_{n-1} = 2\delta_{n-1}$, $w_n = 1 - 2\delta_{n-1}$), and so on until all the subproblems with $w_i = 0$, $i = 1, \ldots, n-2$ have been solved. Then we move to the next node in the $w_{n-2}$ generation (i.e., with $w_i = 0$, $i = 1, \ldots, n-3$, $w_{n-2} = \delta_{n-2}$) and visit all the children of this node, with the starting points of the NBI subproblems chosen as the corresponding NBI solutions at the previous node.

This is where the scope for parallelization comes in. The solution of the first subproblem at the second node in the $w_{n-2}$ generation didn't have to wait until all the subproblems in the first node were solved. The first subproblem in the second node of the $w_{n-2}$ generation with $w_{n-2} = \delta_{n-2}$, $w_{n-1} = \delta_{n-1}$, $w_n = 1 - \delta_{n-2} - \delta_{n-1}$, could be solved immediately after solving the first subproblem in the first node with $w_{n-2} = 0$, $w_{n-1} = \delta_{n-1}$, $w_n = 1 - \delta_{n-1}$. Thus the first subproblem in the second node can be solved in parallel with the second subproblem in the first node, ..., and the $k$th subproblem in the second node can be solved in parallel with the $(k+1)$th subproblem of the first node. Further, the $k$th subproblem in the third node can be solved in parallel with the $(k+1)$th subproblem of the second node, with the solution of the $k$th subproblem of the second node as the starting point, and so on. This entire process of efficient parallelization is one of the topics of our future research.
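One way to read the ordering above, for three objectives with equal stepsizes $\delta = 1/p$, is as a 'wavefront' schedule. The sketch below is an interpretation rather than the authors' implementation: subproblems are indexed by hypothetical integer pairs `(m1, m2)` with $w = (m_1\delta, m_2\delta, 1 - (m_1 + m_2)\delta)$, each subproblem is warm-started from the matching subproblem in the previous node (or from its left neighbour within the first node), and all subproblems with the same `m1 + m2` are treated as solvable in parallel.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Node = Tuple[int, int]  # (m1, m2), see lead-in


def warm_start_parent(m1: int, m2: int) -> Optional[Node]:
    """Subproblem whose solution is used as the starting point."""
    if m1 == 0:
        # First node: sweep left to right; (0, 0) is w = e_3, already solved.
        return (0, m2 - 1) if m2 > 0 else None
    # Later nodes: start from the corresponding subproblem one node earlier.
    return (m1 - 1, m2)


def wavefront_schedule(p: int) -> Dict[int, List[Node]]:
    """Group subproblems by the earliest step at which they can start."""
    waves: Dict[int, List[Node]] = defaultdict(list)
    for m1 in range(p + 1):
        for m2 in range(p - m1 + 1):
            waves[m1 + m2].append((m1, m2))
    return dict(waves)


schedule = wavefront_schedule(3)
for step in sorted(schedule):
    print(step, schedule[step])
```

Every subproblem's warm-start parent lies in the preceding wave, so each wave can be dispatched as soon as the previous one finishes.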

5 Relationship between the NBI subproblem and minimizing a linear combination of the objectives

In this section we illustrate how the NBI subproblem is related to the popular method of minimizing a convex combination of the objectives. For ease of notation, we shall assume that the problem only has equality constraints, which can be assumed without loss of generality$^{5}$. Let $\alpha \in (\mathbb{R}_+ \cup \{0\})^n$, $\sum_{i=1}^{n} \alpha_i = 1$, denote a positive, convex weighting of the objectives. The weighted linear combination problem for obtaining a Pareto optimal point is then written as

$$\begin{aligned} \min_{x}\quad & \alpha^T F(x) \\ \text{s.t.}\quad & h(x) = 0. \end{aligned} \qquad (1)$$

$^{5}$ $h(x)$ can be thought of as the set of equality constraints augmented by the active inequality constraints and bounds.

The solution of a problem like above will often be referred to as an LC point, and the problem denoted by LC$_\alpha$. The 'first part' of the KKT conditions$^{6}$ for optimality of $(x^*, \lambda^*)$ for problem (1) states that the gradient of the Lagrangian with respect to $x$ should vanish at $(x^*, \lambda^*)$, i.e.,

$$\nabla_x F(x^*)\alpha + \nabla_x h(x^*)\lambda^* = 0. \qquad (2)$$

Similarly, if $w$ denotes the vector of weights in NBI$_w$ (which has a very different meaning from the weights $\alpha$ in the linear combinations subproblem), the NBI subproblem can be written as

$$\begin{aligned} \min_{x,t}\quad & -t \\ \text{s.t.}\quad & F(x) - \Phi w - t\hat{n} = 0 \\ & h(x) = 0. \end{aligned}$$

Then the first part of the KKT conditions states that the gradient of the Lagrangian with respect to $(x, t)$ should vanish at $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$, i.e.

$$\begin{aligned} \nabla_x F(x^*)\lambda^{(1)*} + \nabla_x h(x^*)\lambda^{(2)*} &= 0 \\ -1 + \hat{n}^T\lambda^{(1)*} &= 0, \end{aligned} \qquad (3)$$

where $\lambda^{(1)} \in \mathbb{R}^n$ represents the vector of multipliers corresponding to the constraints $F(x) - \Phi w - t\hat{n} = 0$, and $\lambda^{(2)} \in \mathbb{R}^{ne}$ denotes the multipliers of the equality constraints $h(x) = 0$.

$^{6}$ Karush-Kuhn-Tucker conditions, or alternately the first order necessary conditions for optimality.

Claim: Suppose $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is the solution of NBI$_w$. Now define the components of the vector $\alpha$ as

$$\alpha_i = \frac{\lambda_i^{(1)*}}{\sum_{j=1}^{n} \lambda_j^{(1)*}}.$$

Then, problem (1) with the above convex weighting vector $\alpha$ has the solution $\left[x^*, \lambda^* = \lambda^{(2)*} / \sum_{j=1}^{n} \lambda_j^{(1)*}\right]$.

Proof: Dividing both sides of (3) by the scalar $\sum_i \lambda_i^{(1)*}$ and observing that $h(x^*) = 0$, the equivalence between (2) and (3) becomes quite obvious. However, clearly, if for some $i$, the sign of $\lambda_i^{(1)*}$ is opposite to that of $\sum_i \lambda_i^{(1)*}$, then the vector $\alpha$ has a negative component and does not qualify as a weight for problem (1). In such a case, either the Pareto optimality of the NBI point $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is questionable, or the Pareto point lies in a nonconvex part of the Pareto set$^{7}$. Also observe the tacit assumption above that $\sum_i \lambda_i^{(1)*} \ne 0$.

Just as the analysis above suggests a method for obtaining a solution for problem LC$_\alpha$ given the corresponding solution of NBI$_w$, one can also obtain the NBI point corresponding to a given solution of problem LC$_\alpha$ with very little effort.

Claim: Suppose $(x^*, \lambda^*)$ solves problem LC$_\alpha$. Let $(\tilde{w}, t^*)$ be the solution of the $(n + 1) \times (n + 1)$ linear system

$$\begin{aligned} \Phi w + t\hat{n} &= F(x^*) \\ \sum_{i=1}^{n} w_i &= 1. \end{aligned}$$

Then $(x^*, \lambda^*)$ corresponds to the solution of NBI$_w$ with $w = \tilde{w}$, i.e., the solution of NBI$_{\tilde{w}}$ is

$$\left(x^*,\; t^*,\; \lambda^{(1)*} = \frac{\alpha}{\alpha^T\hat{n}},\; \lambda^{(2)*} = \frac{\lambda^*}{\alpha^T\hat{n}}\right).$$

Proof: Dividing (2) on both sides by $\alpha^T\hat{n}$ (assumed nonzero$^{8}$) and observing that $\lambda^{(1)*}$ defined above satisfies $\hat{n}^T\lambda^{(1)*} = 1$, it can be seen that the first part of the KKT conditions for NBI$_{\tilde{w}}$ holds. Further observing that $h(x^*) = 0$ and $\Phi\tilde{w} + t^*\hat{n} = F(x^*)$, the required equivalence between LC$_\alpha$ and NBI$_{\tilde{w}}$ follows.

$^{7}$ Pareto points in nonconvex parts of the Pareto set cannot be obtained by minimizing a linear combination of the objectives, a proof of which will appear in a future article.
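The second claim amounts to solving a small linear system. The sketch below illustrates it on an assumed toy problem (attainable set = unit disc centred at (1, 1), $F$ the identity, quasi-normal $\hat{n} = -\Phi e = (-1, -1)$), for which the LC point has the closed form $F(x^*) = (1, 1) - \alpha / \|\alpha\|$. The helper names `solve_linear` and `lc_to_nbi` are hypothetical, and the plain Gaussian elimination is only meant for tiny dense systems.

```python
import math
from typing import List, Sequence, Tuple


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def lc_to_nbi(phi: List[List[float]], n_hat: Sequence[float],
              F_x: Sequence[float], alpha: Sequence[float]
              ) -> Tuple[List[float], float, List[float]]:
    """Solve Phi w + t n_hat = F(x*), sum(w) = 1; return (w, t, lambda1)."""
    n = len(n_hat)
    A = [phi[j][:] + [n_hat[j]] for j in range(n)] + [[1.0] * n + [0.0]]
    sol = solve_linear(A, list(F_x) + [1.0])
    a_dot_n = sum(a * nj for a, nj in zip(alpha, n_hat))
    return sol[:n], sol[n], [a / a_dot_n for a in alpha]


phi = [[0.0, 1.0], [1.0, 0.0]]
n_hat = [-1.0, -1.0]                      # quasi-normal -Phi e
alpha = [0.2, 0.8]
norm = math.hypot(alpha[0], alpha[1])
F_x = [1.0 - alpha[0] / norm, 1.0 - alpha[1] / norm]  # LC point of the toy
w, t, lam1 = lc_to_nbi(phi, n_hat, F_x, alpha)
print(w, t, lam1)
```

The recovered $w$ differs from $\alpha$, which illustrates that the NBI weights and the linear-combination weights carry different meanings even when they lead to the same point.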

6 Proof of independence with respect to function scales using the quasi-normal

In this section we shall prove that the NBI point found using the quasi-normal $\hat{n}$ and a particular $w$ is independent of how the individual functions are scaled. Let the objective functions $f_i(x)$ be scaled by positive scalars $s_i$ as

$$f_i(x) \leftarrow s_i f_i(x), \quad i = 1, \ldots, n.$$

In other words, if $s$ is the vector with components $s_i$ and $S = \mathrm{diag}(s)$, then

$$F(x) \leftarrow SF(x).$$

Consequently

$$\nabla_x F(x) \leftarrow \nabla_x F(x) S, \qquad \Phi \leftarrow S\Phi.$$

The quasi-normal direction $\hat{n} = -\Phi e$ after scaling becomes $\hat{n} = -S\Phi e$.

Claim: If $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the unscaled NBI$_w$ (i.e. with $S = I_n$), then $(x^*, t^*, S^{-1}\lambda^{(1)*}, \lambda^{(2)*})$ solves$^{9}$ NBI$_w$ with the functions scaled as above.

Proof: Since $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the unscaled NBI$_w$ (still with only equality constraints as in the previous section),

$$\begin{aligned} \nabla_x F(x^*)\lambda^{(1)*} + \nabla_x h(x^*)\lambda^{(2)*} &= 0 \\ \hat{n}^T\lambda^{(1)*} &= 1 \\ \Phi w + t^*\hat{n} &= F(x^*) \\ h(x^*) &= 0. \end{aligned}$$

The first equation can be rewritten to state that the following holds:

$$(\nabla_x F(x^*) S)(S^{-1}\lambda^{(1)*}) + \nabla_x h(x^*)\lambda^{(2)*} = 0. \qquad (4)$$

The second equation, with $\hat{n} = -\Phi e$, implies

$$e^T\Phi^T\lambda^{(1)*} = -1 \;\equiv\; e^T\Phi^T S S^{-1}\lambda^{(1)*} = -1.$$

Since $S = S^T$, the above is the same as

$$(e^T (S\Phi)^T)(S^{-1}\lambda^{(1)*}) = -1. \qquad (5)$$

The third equation can be rewritten as

$$\Phi w - t^*\Phi e = F(x^*) \;\equiv\; S\Phi w - t^* S\Phi e = SF(x^*). \qquad (6)$$

Clearly, equations (4), (5) & (6) imply that $(x^*, t^*, S^{-1}\lambda^{(1)*}, \lambda^{(2)*})$ solves NBI$_w$ with the functions scaled by $S$. (QED)

The above result does not depend on $e$ being the vector of all ones and consequently holds if $\hat{n}$ is scaled by a factor, say, a normalization constant. The above result suggests that no matter how disparately the different functions might be scaled, NBI with the quasi-normal finds a set of points as if the functions were all scaled to the same order of magnitude.

$^{8}$ Since $\alpha$ has nonnegative components (not all zero) and $\hat{n}$ has negative components, the assumption holds.

$^{9}$ Here 'solves' means 'finds a stationary point of the nonlinear programming problem'.
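The scale-independence result can be checked numerically on a toy problem. The sketch below rests on assumptions that are not part of the paper: the attainable set is the unit disc centred at (1, 1) (with $F$ the identity on it), membership in the set is available as an oracle, and the largest feasible $t$ along the quasi-normal ray is found by bisection rather than by a nonlinear programming solver. The names `nbi_quasi_normal`, `in_disc` and `in_scaled_disc` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def nbi_quasi_normal(phi: List[List[float]], w: Sequence[float],
                     member: Callable[[List[float]], bool],
                     t_hi: float = 10.0, iters: int = 200
                     ) -> Tuple[float, List[float]]:
    """Largest t with Phi w + t * (-Phi e) attainable, found by bisection.

    Assumes the attainable points on the ray form an interval [0, t*],
    which holds for a convex set containing the CHIM point.
    """
    n = len(w)
    base = [sum(phi[j][i] * w[i] for i in range(n)) for j in range(n)]
    n_hat = [-sum(phi[j]) for j in range(n)]  # quasi-normal -Phi e

    def point(t: float) -> List[float]:
        return [base[j] + t * n_hat[j] for j in range(n)]

    lo, hi = 0.0, t_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if member(point(mid)):
            lo = mid
        else:
            hi = mid
    return lo, point(lo)


def in_disc(f: List[float]) -> bool:
    return (f[0] - 1.0) ** 2 + (f[1] - 1.0) ** 2 <= 1.0 + 1e-12


s = [5.0, 1.0]                                   # scale f1 by 5
phi = [[0.0, 1.0], [1.0, 0.0]]
phi_scaled = [[s[j] * phi[j][i] for i in range(2)] for j in range(2)]


def in_scaled_disc(f: List[float]) -> bool:
    return in_disc([f[0] / s[0], f[1] / s[1]])


w = [0.3, 0.7]
t_plain, f_plain = nbi_quasi_normal(phi, w, in_disc)
t_scaled, f_scaled = nbi_quasi_normal(phi_scaled, w, in_scaled_disc)
f_back = [f_scaled[0] / s[0], f_scaled[1] / s[1]]  # convert to original scale
print(t_plain, f_plain)
print(t_scaled, f_back)
```

Both runs return the same $t$, and the scaled NBI point maps back to the unscaled one, in line with equations (4) to (6).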

7 Advantages of using NBI

Finds a uniform spread of Pareto points: Consider any method which combines all the objective functions into a single objective and finds efficient points by minimizing the single objective parametrically for various values of the parameters. Then, in general, the mapping from the set of parameters to the set of Pareto optimal points is not one-to-one. Thus it might so happen that minimizations over several different parameters produce the very same point each time, resulting in fruitless computational expense; this is never the case with NBI. Moreover, in the absence of convexity, "Pareto-optimal solutions obtained by this method are often found to be so few, or the corresponding indexes so extreme, that there seems to be no middle 'ground' for any compromise, although such 'ground' may actually exist" - Lin [9]. For examples, refer to Lin [9], Katopis and Lin [10], Lin [11].

The interrelationship between the linear combinations subproblem and the NBI subproblem provides more insight into why the linear combinations technique fails to give a uniformly distributed set of Pareto optima. By fixing the weights $\alpha$ in subproblem LC$_\alpha$ we are in effect fixing the multipliers of the corresponding NBI subproblem, thus partly restricting the solution of the resultant subproblem. Even if the Pareto optima are uniformly distributed in the Pareto set, there is no reason why the corresponding multipliers have to be uniformly distributed. However, the weights in the linear combinations approach are often very desirable because they give an idea of the relative importance of the objectives. Thus obtaining the NBI points, which are uniformly distributed, and then finding the corresponding weights $\alpha$ for the NBI points can be very useful.
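The difference in spread can be illustrated on a toy problem where both methods have closed-form solutions. The sketch below assumes, purely for illustration, that the attainable set is the unit disc centred at (1, 1) with $F$ the identity on it; the function names (`lc_point`, `nbi_point`, `gap_ratio`) are hypothetical. It compares the ratio of the largest to the smallest gap between consecutive points for 11 evenly spaced weights.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def lc_point(a: float) -> Point:
    """Minimizer of a*f1 + (1-a)*f2 over the toy disc (closed form)."""
    norm = math.hypot(a, 1.0 - a)
    return (1.0 - a / norm, 1.0 - (1.0 - a) / norm)


def nbi_point(beta: float) -> Point:
    """NBI point for w = (beta, 1-beta) along the direction (-1, -1).

    Phi w = (1-beta, beta); the boundary is hit where
    t^2 + t - beta*(1-beta) = 0, whose nonnegative root is used.
    """
    t = (-1.0 + math.sqrt(1.0 + 4.0 * beta * (1.0 - beta))) / 2.0
    return (1.0 - beta - t, beta - t)


def gap_ratio(points: List[Point]) -> float:
    """Largest gap between consecutive points divided by the smallest."""
    gaps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    return max(gaps) / min(gaps)


grid = [k / 10 for k in range(11)]
lc_ratio = gap_ratio([lc_point(a) for a in grid])
nbi_ratio = gap_ratio([nbi_point(b) for b in grid])
print("linear combinations: max/min gap =", round(lc_ratio, 3))
print("NBI:                 max/min gap =", round(nbi_ratio, 3))
```

On this toy set the NBI points are noticeably more evenly spaced than the linear-combination points, although neither spread is perfectly uniform because the trade-off curve is an arc.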

Advantages over homotopy techniques: NBI improves over homotopy/continuation techniques for tracing the curve of Pareto optimal solutions, like the one discussed in Rakowska, Haftka & Watson [4], in the following respects:

It is applicable for more than two objectives. For a multiobjective problem with more than two objectives the homotopy parameter is not a scalar and the associated differential equations turn out to be a system of nonlinear partial differential equations with not readily available boundary conditions, rather than an ordinary initial value problem, as in the case of two objectives. Thus extending homotopy techniques to handle $n > 2$ is very difficult. On the other hand, NBI can be extended to handle more than two objectives quite easily.

It does not require exact Hessian. Even for a biobjective problem, solving the homotopy boundary value problem requires exact second derivative information (i.e., the Hessian of the Lagrangian), whereas the NBI subproblem solver requires only a secant approximation of the Hessian like BFGS.

It can bypass tracking active sets. For problems with inequality constraints or explicit bounds on variables, homotopy techniques need to keep track of the changes in active sets of the inequality constraints or bounds meticulously in course of the Initial Value Problem integration, which can present difficulties if the number of inequalities or bounds is large. On the other hand an interior point NLP solver used as the NBI subproblem solver would handle this situation quite efficiently, and would not have a problem with frequent changes in the active set.

NBI improves on other traditional methods like goal programming in the sense that it never requires any prior knowledge of 'feasible goals'. It improves on multilevel optimization techniques from the tradeoff standpoint, since multilevel techniques usually can improve only a few of the 'most important' objectives, leaving no compromise for the rest.

8 A note on local versus global

It is worth observing here that unless the individual minima of the objectives obtained at the outset are guaranteed to be global minima there is no guarantee that NBI produces solutions that are globally Pareto optimal. In fact, as pointed out earlier, there is no guarantee that every solution produced by NBI is even locally Pareto optimal. All we can conjecture is that if the individual minima of the functions happen to be global minima and if we start NBI from every point on CHIM$_+$ $\cup$ CHIM, the set of points thus obtained would contain all the globally Pareto optimal points, provided the boundary of $\mathcal{F}$ is not 'folded'. However, even when 'folded', the point obtained could be locally Pareto optimal (see Fig.2).

Not being able to find globally Pareto optimal points is a drawback inherent in every method that finds a large number of efficient points of MOP. In homotopy methods, it would involve finding the global minimum of one of the two objectives in the very beginning. In methods which find efficient points by minimizing a single objective, only a global minimum of the scalarized objective would correspond to a globally efficient point. Even though a local minimum would still correspond to a locally efficient point, there is no guarantee that minimizing a single objective produces a local minimum since most single objective optimization algorithms only converge to a KKT point of the problem, i.e. one which only satisfies necessary conditions for being a minimum (and thus could well be a saddle-point and not even a local minimum!).

[Figure 3: The normal from N intersects the boundary at E, but values of the objectives at P are each less than the corresponding values at E, hence E is not Pareto optimal.]

Given the shortcomings of global optimization applied to nonconvex problems, we choose to remain satisfied with the Pareto optimal points obtained by NBI, in spite of the fact that they may not be globally efficient.

9 A Numerical Example

Below is a brief account of employing NBI techniques on a small biobjective problem, stated below:

$$\begin{aligned} \min_{x}\quad & \begin{bmatrix} f_1(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \\ f_2(x) = 3x_1 + 2x_2 - \frac{x_3}{3} + 0.01(x_4 - x_5)^3 \end{bmatrix} \\ \text{s.t.}\quad & x_1 + 2x_2 - x_3 - 0.5x_4 + x_5 = 2 \\ & 4x_1 - 2x_2 + 0.8x_3 + 0.6x_4 + 0.5x_5^2 = 0 \\ & x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \le 10. \end{aligned}$$

NBI using the actual normal to the CHIM simplex (a line segment in this case) was run three times on this problem for 21 different weight vectors $w$: first on the original problem, then on the problem with $f_1$ scaled by a factor of 5 (to increase the disparity between the scales of the objective functions) and then with $f_1$ scaled by a factor of 10. The results in the following table show that NBI successfully produces a uniformly distributed set of Pareto optimal points even if the objective functions are scaled disparately. (Note that the tabulated Pareto optimal objective function values have all been converted back to their original scales.)

Weights        Objective values      Objective values      Objective values
(w1, w2)       (original scale)      (f1 scaled by 5)      (f1 scaled by 10)
0.00, 1.00     10.0000, -4.0111      10.0000, -4.0111      10.0000, -4.0111
0.05, 0.95      9.4717, -3.7902       9.5249, -3.8126       9.5270, -3.8135
0.10, 0.90      8.9453, -3.5665       9.0499, -3.6113       9.0541, -3.6131
0.15, 0.85      8.4208, -3.3398       8.5750, -3.4069       8.5812, -3.4095
0.20, 0.80      7.8985, -3.1097       8.1002, -3.1991       8.1083, -3.2027
0.25, 0.75      7.3785, -2.8759       7.6255, -2.9876       7.6354, -2.9921
0.30, 0.70      6.8612, -2.6381       7.1508, -2.7720       7.1626, -2.7773
0.35, 0.65      6.3469, -2.3958       6.6763, -2.5517       6.6897, -2.5580
0.40, 0.60      5.8359, -2.1483       6.2020, -2.3263       6.2170, -2.3335
0.45, 0.55      5.3286, -1.8951       5.7277, -2.0950       5.7442, -2.1032
0.50, 0.50      4.8256, -1.6353       5.2537, -1.8570       5.2715, -1.8661
0.55, 0.45      4.3275, -1.3679       4.7799, -1.6112       4.7989, -1.6213
0.60, 0.40      3.8353, -1.0916       4.3063, -1.3562       4.3263, -1.3672
0.65, 0.35      3.3499, -0.8046       3.8329, -1.0903       3.8538, -1.1022
0.70, 0.30      2.8730, -0.5047       3.3600, -0.8107       3.3813, -0.8237
0.75, 0.25      2.4067, -0.1885       2.8875, -0.5141       2.9090, -0.5281
0.80, 0.20      1.9542,  0.1490       2.4155, -0.1947       2.4368, -0.2097
0.85, 0.15      1.5209,  0.5159       1.9444,  0.1567       1.9649,  0.1406
0.90, 0.10      1.1164,  0.9272       1.4747,  0.5586       1.4932,  0.5413
0.95, 0.05      0.7635,  1.4178       1.0074,  1.0583       1.0222,  1.0398
1.00, 0.00      0.5551,  2.1306       0.5551,  2.1306       0.5551,  2.1306

The plots of Pareto optimal objective vectors as tabulated above, as shown in Fig.4 and Fig.5, reveal for the original and scaled problems a very slight difference: with the first objective scaled, one point on the $F(x_1^*)$ end moves a little further away. However, using the quasi-normal $\hat{n}$, even this slight nonuniformity of distribution of Pareto points is eliminated (see Fig. 6). The Pareto points obtained using the quasi-normal, independent of the scale on $f_1$, are tabulated below:
Weights 0.00, 1.00 0.05, 0.95 0.10, 0.90 0.15, 0.85 0.20, 0.80 0.25, 0.75 0.30, 0.70 0.35, 0.65 0.40, 0.60 0.45, 0.55 0.50, 0.50 0.55, 0.45 0.60, 0.40 0.65, 0.35 O.70, 0.3O 0.75, 0.25 0.80, 0.20 0.85, 0.15 0.90, 0.10 0.95, 0.05 1.00, 0.00 Objective values 10.0000 , -4.0111 9.4254, 8.8546 8.2882 7.7264 7.1698 6.0743 5.5368 5.0072 4.4866 3.9764 3.4781 2.9939 2.5266 -3.7706 , -3.5276 ,-3.2818 ,-3.0329 ,-2.7807 ,-2.2647 ,-2.0000 ,-1.7302 ,-1.4546 ,-1.1722 ,-0.8820 ,-0.5827 ,-0.2724

6.6189,-2.5247

2.0801 , 0.0514 1.6597,0.3922 1.2740,0.7556 0.9370, 0.6754 0.5551 1.1506 , 1.5947 .2.1306

The method of linear combinations was run thrice on the same problem, with the weight vectors $\alpha$ assuming the same 21 uniformly spread values as the $w$ vector above.$^{10}$ When run on the original problem, the minimizer of $f_2(x)$ was found six times for six different values of $\alpha$, and there was a considerable gap 'in the middle' of the Pareto set [see fig.(7)]. With $f_1$ scaled by 5, the point found six times earlier was found only twice,$^{11}$ but the Pareto optimal vectors obtained were concentrated at the $F(x_1^*)$ end and no 'middle ground for compromise' was captured [see fig.(8)]. With $f_1$ scaled by 10, the point repeated earlier was found only once, though the clustering at the $F(x_1^*)$ end increased [see fig.(9)]. The Pareto optimal vectors obtained using linear combinations are tabulated below:

$^{10}$ The efficient solution scheme, i.e., starting the solution of a subproblem from the optimal point of a 'nearby subproblem', was used here too.
$^{11}$ Heavily weighting the first objective made the minimizer move away from $x_2^*$.

Weights        Objective values      Objective values      Objective values
(α1, α2)       (original scale)      (f1 scaled by 5)      (f1 scaled by 10)
0.00, 1.00     10.0000, -4.0111      10.0000, -4.0111      10.0000, -4.0111
0.05, 0.95     10.0000, -4.0111      10.0000, -4.0111       4.8211, -1.6330
0.10, 0.90     10.0000, -4.0111       4.1857, -1.2896       1.1634,  0.8741
0.15, 0.85     10.0000, -4.0111       1.6131,  0.4330       0.7689,  1.4083
0.20, 0.80     10.0000, -4.0111       1.0180,  1.0451       0.6559,  1.6416
0.25, 0.75     10.0000, -4.0111       0.7975,  1.3592       0.6100,  1.7724
0.30, 0.70      8.9403, -3.5644       0.6953,  1.5506       0.5876,  1.8563
0.35, 0.65      4.5379, -1.4822       0.6412,  1.6796       0.5754,  1.9146
0.40, 0.60      2.7307, -0.4109       0.6100,  1.7725       0.5682,  1.9576
0.45, 0.55      1.8319,  0.2473       0.5909,  1.8425       0.5637,  1.9905
0.50, 0.50      1.3357,  0.6928       0.5788,  1.8973       0.5608,  2.0165
0.55, 0.45      1.0425,  1.0147       0.5707,  1.9413       0.5589,  2.0376
0.60, 0.40      0.8615,  1.2583       0.5654,  1.9773       0.5576,  2.0551
0.65, 0.35      0.7463,  1.4492       0.5618,  2.0075       0.5567,  2.0698
0.70, 0.30      0.6719,  1.6029       0.5593,  2.0331       0.5561,  2.0823
0.75, 0.25      0.6236,  1.7295       0.5576,  2.0551       0.5557,  2.0931
0.80, 0.20      0.5926,  1.8356       0.5565,  2.0741       0.5554,  2.1025
0.85, 0.15      0.5734,  1.9258       0.5558,  2.0909       0.5553,  2.1108
0.90, 0.10      0.5622,  2.0035       0.5554,  2.1057       0.5552,  2.1181
0.95, 0.05      0.5567,  2.0711       0.5551,  2.1188       0.5551,  2.1247
1.00, 0.00      0.5551,  2.1306       0.5551,  2.1306       0.5551,  2.1306
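One way to read the effect of scaling on the linear combination results is through the effective weight ratio. Minimizing $\alpha_1 (s f_1) + \alpha_2 f_2$ has the same minimizers as minimizing $\rho f_1 + f_2$ with $\rho = s\alpha_1/\alpha_2$, so runs with equal $\rho$ should return the same point. The Python sketch below is an illustration only and not part of the original study; the function names `effective_ratio` and `equivalent_alpha1` are assumptions introduced here, and $\alpha_2 = 1 - \alpha_1$ is assumed throughout.

```python
from fractions import Fraction


def effective_ratio(scale, alpha1):
    """rho = scale * alpha1 / alpha2, with alpha2 = 1 - alpha1 (needs alpha1 < 1)."""
    return scale * alpha1 / (1 - alpha1)


def equivalent_alpha1(ratio, scale=1):
    """Weight alpha1 that yields the same rho when f1 is multiplied by `scale`."""
    return ratio / (scale + ratio)


# Two tabulated runs that share one effective ratio.
rho_5 = effective_ratio(5, Fraction(3, 4))    # f1 scaled by 5, weights (0.75, 0.25)
rho_10 = effective_ratio(10, Fraction(3, 5))  # f1 scaled by 10, weights (0.60, 0.40)
print("rho with scale 5:", rho_5, "| rho with scale 10:", rho_10)

# How the first few uniform weights behave once f1 is scaled by 10.
for k in range(1, 5):
    alpha1 = Fraction(k, 20)
    unscaled = equivalent_alpha1(effective_ratio(10, alpha1))
    print(f"alpha1 = {float(alpha1):.2f} (f1 scaled by 10) acts like "
          f"alpha1 = {float(unscaled):.3f} on the original problem")
```

Both of the runs singled out above appear in the table with the same objective vector (0.5576, 2.0551). The loop suggests why the scaled runs crowd toward the $F(x_1^*)$ end: with $f_1$ scaled by 10, the very first nonzero weight already behaves like a weight of about 0.345 on the original problem.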

Clearly, the inability of the method of linear combinations to sufficiently capture the 'middle ground' of the Pareto set renders it fairly useless as a means of studying the tradeoff between the conflicting objectives.
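The difference in spread can be checked directly from the tabulated values. The following Python sketch is not from the original report; it copies the objective vectors listed for NBI with the quasi-normal and for linear combinations on the original problem, and measures the Euclidean gap between consecutive points. The names `consecutive_gaps`, `nbi_points` and `lincomb_points` are assumptions made for this illustration.

```python
import math

# Objective vectors (f1, f2) copied from the tables, in original scales.
nbi_points = [
    (10.0000, -4.0111), (9.4254, -3.7706), (8.8546, -3.5276),
    (8.2882, -3.2818), (7.7264, -3.0329), (7.1698, -2.7807),
    (6.6189, -2.5247), (6.0743, -2.2647), (5.5368, -2.0000),
    (5.0072, -1.7302), (4.4866, -1.4546), (3.9764, -1.1722),
    (3.4781, -0.8820), (2.9939, -0.5827), (2.5266, -0.2724),
    (2.0801, 0.0514), (1.6597, 0.3922), (1.2740, 0.7556),
    (0.9370, 1.1506), (0.6754, 1.5947), (0.5551, 2.1306),
]

# Linear combinations on the original problem: the first point repeats six times.
lincomb_points = [(10.0000, -4.0111)] * 6 + [
    (8.9403, -3.5644), (4.5379, -1.4822), (2.7307, -0.4109),
    (1.8319, 0.2473), (1.3357, 0.6928), (1.0425, 1.0147),
    (0.8615, 1.2583), (0.7463, 1.4492), (0.6719, 1.6029),
    (0.6236, 1.7295), (0.5926, 1.8356), (0.5734, 1.9258),
    (0.5622, 2.0035), (0.5567, 2.0711), (0.5551, 2.1306),
]


def consecutive_gaps(points):
    """Euclidean distances between successive points in objective space."""
    return [math.hypot(p[0] - q[0], p[1] - q[1])
            for p, q in zip(points, points[1:])]


nbi_gaps = consecutive_gaps(nbi_points)
lin_gaps = consecutive_gaps(lincomb_points)

print(f"NBI (quasi-normal):  min gap {min(nbi_gaps):.3f}, max gap {max(nbi_gaps):.3f}")
print(f"Linear combinations: min gap {min(lin_gaps):.3f}, max gap {max(lin_gaps):.3f}")
```

On these numbers the NBI gaps all stay within a narrow band, whereas the linear combination gaps range from zero (the repeated minimizer of $f_2$) to a jump of several units across the middle of the Pareto set.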


9.1 Function scaling implicit in NBI

Even though NBI using the quasi-normal component is unaffected by the function scales, this property comes with a price. As the functions get more disparately scaled, the Pareto set gets more 'stretched', and consequently the NBI points get further apart from each other. As a result, solving an NBI subproblem starting from the solution of the same nearby subproblem takes more iterations to converge. This was observed in the numerical example above and motivates the need to scale the functions properly to remove this disparity in scales.

Geometrically, it can be perceived that if the vertices of the CHIM simplex are almost equidistant from the origin, i.e. the quantities

$$
\|F(x_j^*) - F^*\|, \quad j = 1, \ldots, n
$$

are almost equal, then the quasi-normal direction $\hat{n}$ is almost normal to the CHIM simplex. This would achieve the 'minimally stretched' Pareto set we want and could also be a good scaling for the problem in the sense that all the functions would be about the same order of magnitude, and thus reduce possible ill-conditioning.

For the biobjective problem, $\Phi$ is antidiagonal; thus a scaling that would achieve the above is obvious:

$$
f_1 \to \frac{f_1}{\ell_1}, \qquad f_2 \to \frac{f_2}{\ell_2},
$$

where $\ell_1 = f_1(x_2^*) - f_1^*$ and $\ell_2 = f_2(x_1^*) - f_2^*$ are the two nonzero entries of $\Phi$, which gets each vertex of CHIM to be unit distance from the origin.

However, for more than two objectives, the solution may not be so transparent and it may not be possible to get all the vertices exactly equidistant from the origin. So now we shall attempt to find scalings $d_i > 0$ such that the functions scaled as $f_i \to \sqrt{d_i}\, f_i$ will have the property that the variance among the scaled distances of the vertices from the origin, i.e.

$$
\|\sqrt{D}\,(F(x_j^*) - F^*)\|^2, \quad j = 1, \ldots, n
$$

will be minimized ($D = \mathrm{diag}(d)$, where $d$ represents the vector with components $d_i$).

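Let $v_j = \|\sqrt{D}\,(F(x_j^*) - F^*)\|^2$, i.e.,

$$
v_j = \sum_{i=1}^{n} d_i \phi_{i,j}^2,
$$

where $\phi_{i,j}$ is the $i$th row, $j$th column entry of the matrix $\Phi$. The mean square distance of the vertices is defined as

$$
\bar{v} = \frac{1}{n} \sum_{j=1}^{n} v_j = \frac{1}{n} \sum_{j=1}^{n} \sum_{i=1}^{n} d_i \phi_{i,j}^2.
$$

The variance quantity to be minimized is given by

$$
V(d) = \sum_{j=1}^{n} (v_j - \bar{v})^2,
$$

i.e.,

$$
V(d) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} d_i \phi_{i,j}^2 - \frac{1}{n} \sum_{k=1}^{n} \sum_{i=1}^{n} d_i \phi_{i,k}^2 \Big)^2
     = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} d_i \Big( \phi_{i,j}^2 - \frac{1}{n} \sum_{k=1}^{n} \phi_{i,k}^2 \Big) \Big)^2.
$$

Let $A$ be the matrix with components $a_{i,j}$ given by

$$
a_{i,j} = \phi_{i,j}^2 - \frac{1}{n} \sum_{k=1}^{n} \phi_{i,k}^2.
$$

Then

$$
V(d) = \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} d_i a_{i,j} \Big)^2,
$$

i.e.,

$$
V(d) = d^T A A^T d = \|A^T d\|^2.
$$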

This quadratic function is convex in $d$, and has an unconstrained minimizer at $d = 0$. Thus we shall demand a specific value of $\bar{v}$, which represents an average distance of the CHIM simplex from the origin$^{12}$ and is roughly the same order of magnitude as a typical function value of any objective encountered in the computation. Say we want a typical objective value to be $r$, which could be something like 10. Then we would enforce

$$
\bar{v} = \frac{1}{n} \sum_{i=1}^{n} d_i \Big( \sum_{j=1}^{n} \phi_{i,j}^2 \Big) = r
$$

along with a small lower bound on $d_i$. Thus the optimization problem to be solved to obtain our 'optimal' scales is

$$
\min_{d} \; V(d) = d^T A A^T d
$$

$$
\text{s.t.} \quad \sum_{i=1}^{n} d_i \Big( \sum_{j=1}^{n} \phi_{i,j}^2 \Big) = n r,
$$

$$
d_i \ge 10^{-8}, \quad i = 1, \ldots, n.
$$

Thus we can see how the matrix $\Phi$ suggests an 'improved scaling' of the objective functions, which is a bonus in the NBI approach.

$^{12}$ Using mean distance instead of the mean square distance for this constraint would result in the loss of convexity.
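As a hedged illustration of the scaling problem above, the Python sketch below builds the matrix $A$ from a shifted pay-off matrix $\Phi$ and evaluates $V(d)$. It is not the authors' code. The matrix `phi` is assembled from the two endpoint rows of the tables in the numerical example (so that $F^* = (0.5551, -4.0111)$), the target value `r = 10.0` follows the "something like 10" suggestion, and all function names are assumptions. No optimizer is called: for two objectives the antidiagonal structure gives the minimizer in closed form.

```python
def centered_squares(phi):
    """Matrix A with a[i][j] = phi[i][j]**2 - (1/n) * sum_k phi[i][k]**2."""
    n = len(phi)
    a = []
    for row in phi:
        squares = [p * p for p in row]
        row_mean = sum(squares) / n
        a.append([s - row_mean for s in squares])
    return a


def vertex_sq_distances(phi, d):
    """v_j = sum_i d_i * phi[i][j]**2 for each CHIM vertex j."""
    n = len(phi)
    return [sum(d[i] * phi[i][j] ** 2 for i in range(n)) for j in range(n)]


def variance(phi, d):
    """V(d) = ||A^T d||^2."""
    a = centered_squares(phi)
    n = len(phi)
    return sum(sum(d[i] * a[i][j] for i in range(n)) ** 2 for j in range(n))


# Column j of phi is F(x_j*) - F*, read off the endpoint rows of the tables.
phi = [[0.0, 10.0 - 0.5551],
       [2.1306 - (-4.0111), 0.0]]
r = 10.0  # assumed target for the mean square distance

d_unscaled = [1.0, 1.0]
# Biobjective closed form: put both vertices at squared distance r.
d_scaled = [r / phi[0][1] ** 2, r / phi[1][0] ** 2]

v_scaled = vertex_sq_distances(phi, d_scaled)
mean_sq = sum(v_scaled) / len(v_scaled)

print("V(d) without scaling:", variance(phi, d_unscaled))
print("V(d) with scaling:   ", variance(phi, d_scaled))
print("scaled vertex squared distances:", v_scaled, "mean:", mean_sq)
```

For more than two objectives the same `variance` function could serve as the objective of a general-purpose constrained quadratic programming solver, with the mean square distance constraint and the lower bounds on $d_i$ supplied separately.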

10 Conclusion

An algorithm was presented for finding Pareto optimal points of any smooth, constrained multiobjective problem with essentially any number of objectives. One question that is left open is how the user would select the final design point from the Pareto set generated by NBI (or any other algorithm which generates the Pareto set). For two or three objectives, the generated Pareto curve/surface can be visualized with standard 2-D or 3-D plots, which may be all the user needs to arrive at a final design point. However, the visualization process may be complicated for more than three objectives, and how helpful it will be in guiding the user towards a better choice may depend on factors like the psychological aspects of the visualization. One procedure that could perhaps be useful is to have the user specify another 'cost' or 'utility' function, whose value could be reported at each of the Pareto optimal points generated by NBI, and the user could make his/her final choice based on this 'cost'. Also, if there are more than three objectives and if it is possible to set up a hierarchical order of preference in blocks of two or three (e.g. $f_2, f_4, f_5$ are more important than $f_1, f_3$), the Pareto points for the combined problem could be visualized for each of the blocks, and the user could narrow down his/her preferences starting at the most important, down the blocks.

Further research is in progress regarding the above issue and also regarding the development of efficient nonlinear programming techniques for solving the NBI subproblems and parallelizing the entire algorithm.

11 Acknowledgements

The authors would like to thank Paul Uhlig, Dept. of Mathematics, Rice University for several insightful discussions, Dr. Jagannatha Rao, Dept. of Mechanical Eng., University of Houston, for providing motivation and helpful comments, Dr. Natalia Alexandrov and Dr. Edward Dean, MDO division, NASA-Langley Research Center for their helpful comments on user preferences, and Jeffrey Hittinger, Aerospace Engineering, University of Michigan, Ann Arbor for a helpful discussion on data structures.

References

[1] H. Eschenauer, J. Koski and A. Osyczka. Multicriteria Design Optimization. Berlin, Springer-Verlag, 1990.

[2] Roman B. Statnikov and Joseph B. Matusov. Multicriteria Optimization and Engineering. New York, Chapman & Hall, 1995.

[3] J. G. Lin. Three Methods for Determining Pareto-Optimal Solutions of Multiple-Objective Problems. Directions in Large-Scale Systems, pp. 117-138. Edited by Y. C. Ho and S. K. Mitter. New York, Plenum Press, 1975.

[4] J. Rakowska, R. T. Haftka and L. T. Watson. Tracing the Efficient Curve for Multi-Objective Control-Structure Optimization. Computing Systems in Engineering. Vol. 2, No. 6, pp. 461-471, 1991.

[5] J. R. Rao and P. Y. Papalambros. A Non-linear Programming Continuation Strategy for One Parameter Design Optimization Problems. Proceedings of ASME Design Automation Conference, Montreal, Quebec, Canada, Sept. 17-20, 1989, pp. 77-89.

[6] B. N. Lundberg and A. B. Poore. Bifurcations and Sensitivity in Parametric Programming. Proceedings of Third Air Force/NASA Symposium on Recent Advances in Multidisciplinary Analysis and Optimization, Sept. 24-26, 1990, San Francisco, CA, pp. 50-55.

[7] J. Koski. Multicriteria Truss Optimization. Multicriteria Optimization in Engineering and in the Sciences. Edited by W. Stadler. New York, Plenum Press, 1988.

[8] Y. Haimes, W. Hall and H. Freedman. Multiobjective Optimization in Water Resources Systems. Amsterdam, Elsevier Scientific Publishing Co, 1975.

[9] J. G. Lin. Multiple-Objective Problems: Pareto-Optimal Solutions by Method of Proper Equality Constraints. IEEE Transactions on Automatic Control. vol. AC-21, no. 5, October 1976, pp. 641-650.

[10] G. A. Katopis and J. G. Lin. Non-inferiority of Controls under Double Performance Objectives: Minimal Time and Minimal Energy. Proceedings of 7th Hawaii Int. Conf. Syst. Sci., Honolulu, Hawaii, Jan 1974, pp. 129-131.

[11] J. G. Lin. Circuit Design under Multiple Performance Objectives. Proc. 1974 IEEE Int. Symp. Circuits & Systems, San Francisco, CA, pp. 549-552, Apr. 1974.

Figure 4: Pareto optimal vectors in the objective space using NBI with actual normal on the original problem. [Plot title: "Pareto points obtained using NBIgeneral3"; horizontal axis: F(1).]

Figure 5: Pareto optimal vectors in the objective space using NBI with actual normal on the problem with $f_1$ scaled by 10. [Plot title: "Pareto points obtained using NBIgeneral3"; horizontal axis: F(1).]

Figure 6: Pareto optimal vectors in the objective space using NBI with quasi-normal on the problem with $f_1$ scaled by 10. [Horizontal axis: F(1).]

Figure 7: Pareto optimal vectors in the objective space using the method of linear combinations on the original problem. [Plot title: "Efficient points obtained by minimizing convex combinations of objectives"; horizontal axis: F(1).]

Figure 8: Pareto optimal vectors in the objective space using the method of linear combinations on the problem with $f_1$ scaled by 5. [Plot title: "Efficient points obtained by minimizing convex combinations of objectives"; horizontal axis: F(1).]

Figure 9: Pareto optimal vectors in the objective space using the method of linear combinations on the problem with $f_1$ scaled by 10. [Plot title: "Efficient points obtained by minimizing convex combinations of objectives"; horizontal axis: F(1).]

REPORT DOCUMENTATION PAGE (Standard Form 298, Form Approved OMB No. 0704-0188)

Report date: November 1996
Report type: Contractor Report
Title and subtitle: Normal-Boundary Intersection: An Alternate Method for Generating Pareto Optimal Points in Multicriteria Optimization Problems
Authors: Indraneel Das, John Dennis
Funding numbers: C NAS1-19480; WU 505-90-52-01
Performing organization: Institute for Computer Applications in Science and Engineering, Mail Stop 403, NASA Langley Research Center, Hampton, VA 23681-0001
Performing organization report number: ICASE Report No. 96-62
Sponsoring/monitoring agency: National Aeronautics and Space Administration, Langley Research Center, Hampton, VA 23681-0001
Sponsoring/monitoring agency report number: NASA CR-201616; ICASE Report No. 96-62
Supplementary notes: Langley Technical Monitor: Dennis M. Bushnell. Final Report. Submitted to the SIAM Journal on Optimization.
Distribution/availability statement: Unclassified-Unlimited; Subject Category 64
Subject terms: Multicriteria Optimization; Multiobjective Optimization; Pareto Optimality; Pareto Set; Tradeoff Design
Number of pages: 34
Price code: A03
Security classification of report: Unclassified
Security classification of this page: Unclassified
