9. Case studies


9.1. Robust linear optimization

In most linear optimization examples discussed in this manual it is implicitly assumed that the problem data, such as c and A, is known with certainty. However, in practice this is seldom the case, e.g. the data may be roughly estimated, affected by measurement errors, or affected by random events.

In this section a robust linear optimization methodology is presented which removes the assumption that the problem data is known exactly. Rather, it is assumed that the data belongs to some set, e.g. a box or an ellipsoid.

The computations are performed using the MOSEK optimization toolbox for MATLAB but could equally well have been implemented using the MOSEK API.

This section is co-authored with A. Ben-Tal and A. Nemirovski.

9.1.1. Introductory example

Consider the following toy-sized linear optimization problem: A company produces two kinds of drugs, DrugI and DrugII, containing a specific active agent A, which is extracted from raw materials purchased on the market. The drug production data are as follows:

Drug      Selling price,     Content of agent A,   Production expenses per 1000 packs
          $ per 1000 packs   gm per 1000 packs     manpower,   equipment,   operational
                                                   hours       hours        costs, $
DrugI     6200               0.500                 90.0        40.0         700
DrugII    6900               0.600                 100.0       50.0         800

There are two kinds of raw materials, RawI and RawII, which can be used as sources of the active agent. The related data is as follows:

Raw material   Purchasing price,   Content of agent A,
               $ per kg            gm per kg
RawI           100.00              0.01
RawII          199.90              0.02

Finally, the monthly resources dedicated to producing the drugs are as follows:

Budget, $   Manpower, hours   Equipment, hours   Capacity of raw materials storage, kg
100000      2000              800                1000

The problem is to find the production plan which maximizes the profit of the company, i.e. minimizes the purchasing and operational costs

\begin{displaymath}\nonumber{}100\cdot \mathtt{RawI}+199.90\cdot \mathtt{RawII}+700\cdot \mathtt{DrugI}+800\cdot \mathtt{DrugII}\end{displaymath}

and maximizes the income

\begin{displaymath}\nonumber{}6200\cdot \mathtt{DrugI}+6900\cdot \mathtt{DrugII}\end{displaymath}

The problem can be stated as the following linear optimization problem:

Maximize

\begin{math}\nonumber{}-\left(100\cdot \mathtt{RawI}+199.90\cdot \mathtt{RawII}+700\cdot \mathtt{DrugI}+800\cdot \mathtt{DrugII}\right)+\left(6200\cdot \mathtt{DrugI}+6900\cdot \mathtt{DrugII}\right)\end{math} (9.1.1)

subject to

\begin{math}\nonumber{}\begin{array}{rcll}\nonumber{}0.01\cdot \mathtt{RawI}+0.02\cdot \mathtt{RawII}-0.500\cdot \mathtt{DrugI}-0.600\cdot \mathtt{DrugII} & \geq{} & 0 & (a)\\\nonumber{}\mathtt{RawI}+\mathtt{RawII} & \leq{} & 1000 & (b)\\\nonumber{}90.0\cdot \mathtt{DrugI}+100.0\cdot \mathtt{DrugII} & \leq{} & 2000 & (c)\\\nonumber{}40.0\cdot \mathtt{DrugI}+50.0\cdot \mathtt{DrugII} & \leq{} & 800 & (d)\\\nonumber{}100.0\cdot \mathtt{RawI}+199.90\cdot \mathtt{RawII}+700\cdot \mathtt{DrugI}+800\cdot \mathtt{DrugII} & \leq{} & 100000 & (e)\\\nonumber{}\mathtt{RawI},\mathtt{RawII},\mathtt{DrugI},\mathtt{DrugII} & \geq{} & 0 & (f)\end{array}\end{math} (9.1.2)

where the variables are the amounts RawI, RawII (in kg) of raw materials to be purchased and the amounts DrugI, DrugII (in 1000s of packs) of drugs to be produced. The objective (9.1.1) denotes the profit to be maximized, and the inequalities can be interpreted as follows:

  • Balance of the active agent.
  • Storage restriction.
  • Manpower restriction.
  • Equipment restriction.
  • Budget restriction.

Here is the MATLAB script which specifies the problem and solves it using the MOSEK optimization toolbox:

% rlo1.m

clear prob;

% Objective: profit contribution of RawI, RawII, DrugI, DrugII.
prob.c   = [-100;-199.9;6200-700;6900-800];
% Constraint rows: balance, storage, manpower, equipment, budget.
prob.a   = sparse([0.01,0.02,-0.500,-0.600;1,1,0,0;
                   0,0,90.0,100.0;0,0,40.0,50.0;100.0,199.9,700,800]);
prob.blc = [0;-inf;-inf;-inf;-inf];
prob.buc = [inf;1000;2000;800;100000];
prob.blx = [0;0;0;0];
prob.bux = [inf;inf;inf;inf];
[r,res]  = mosekopt('maximize',prob);
xx       = res.sol.itr.xx;
RawI     = xx(1);
RawII    = xx(2);
DrugI    = xx(3);
DrugII   = xx(4);

disp(sprintf('*** Optimal value: %8.3f',prob.c'*xx));
disp('*** Optimal solution:');
disp(sprintf('RawI:   %8.3f',RawI));
disp(sprintf('RawII:  %8.3f',RawII));
disp(sprintf('DrugI:  %8.3f',DrugI));
disp(sprintf('DrugII: %8.3f',DrugII));

When executing this script, the following is displayed:

*** Optimal value: 8819.658
*** Optimal solution:
RawI:     0.000
RawII:  438.789
DrugI:   17.552
DrugII:   0.000

We see that the optimal solution promises the company a modest but quite respectable profit of 8.8%. Please note that at the optimal solution the balance constraint is active: the production process utilizes the full amount of the active agent contained in the raw materials.

9.1.2. Data uncertainty and its consequences.

Please note that not all problem data can be regarded as “absolutely reliable”; e.g. one can hardly believe that the contents of the active agent in the raw materials are exactly the “nominal data” 0.01 gm/kg for RawI and 0.02 gm/kg for RawII. In reality, these contents definitely vary around the indicated values. A natural assumption here is that the actual contents of the active agent $a_{\mathrm{I}}$ in RawI and $a_{\mathrm{II}}$ in RawII are realizations of random variables somehow distributed around the “nominal contents” $a_{\mathrm{I}}^{\mathrm{n}}=0.01$ and $a_{\mathrm{II}}^{\mathrm{n}}=0.02$. To be more specific, assume that $a_{\mathrm{I}}$ drifts in the 0.5% margin of $a_{\mathrm{I}}^{\mathrm{n}}$, i.e. it takes, with probabilities 0.5, the values $0.01\cdot (1\pm{}0.005)$. Similarly, assume that $a_{\mathrm{II}}$ drifts in the 2% margin of $a_{\mathrm{II}}^{\mathrm{n}}$, taking, with probabilities 0.5, the values $0.02\cdot (1\pm{}0.02)$. How do the perturbations of the contents of the active agent affect the production process?

The optimal solution prescribes to purchase 438.8 kg of RawII and to produce 17,552 packs of DrugI. With the above random fluctuations in the content of the active agent in RawII, this production plan, with probability 0.5, will be infeasible: with this probability, the actual content of the active agent in the raw materials will be less than required to produce the planned amount of DrugI. For the sake of simplicity, assume that this difficulty is resolved in the simplest way: when the actual content of the active agent in the raw materials is insufficient, the output of the drug is reduced accordingly. With this policy, the actual production of DrugI becomes a random variable which takes, with probabilities 0.5, the nominal value of 17,552 packs and the 2% smaller value of 17,201 packs. These 2% fluctuations in the production affect the profit as well; the latter becomes a random variable taking, with probabilities 0.5, the nominal value 8,820 and the 21% smaller value 6,929. The expected profit is 7,843, which is 11% less than the nominal profit 8,820 promised by the optimal solution of the problem.

We see that in our toy example small (and in reality unavoidable) perturbations of the data may make the optimal solution infeasible, and a straightforward adjustment to the actual values of the data may heavily affect the solution quality.

It turns out that the outlined phenomenon is found in many linear programs of practical origin. Usually, in these programs at least part of the data is not known exactly and can vary around its nominal values, and these data perturbations can make the nominal optimal solution – the one corresponding to the nominal data – infeasible. It turns out that the consequences of data uncertainty can be much more severe than in our toy example. The analysis of linear optimization problems from the NETLIB collection reported in [4] demonstrates that for 13 of 94 NETLIB problems, already 0.01% perturbations of “clearly uncertain” data can make the nominal optimal solution severely infeasible: with these perturbations, the solution, with a non-negligible probability, violates some of the constraints by 50% and more. It should be added that in the general case, in contrast to the toy example we have considered, there is no evident way to adjust the optimal solution by a small modification to the actual values of the data. Moreover there are cases when such an adjustment is impossible — in order to become feasible for the perturbed data, the nominal optimal solution should be “completely reshaped”.

9.1.3. Robust linear optimization methodology

A natural approach to handling data uncertainty in optimization is offered by the Robust Optimization Methodology which, as applied to linear optimization, is as follows.

9.1.3.1. Uncertain linear programs and their robust counterparts.

Consider a linear optimization problem

\begin{math}\nonumber{}\begin{array}{lccccc}\nonumber{}\mbox{minimize} &  &  & c^{T}x &  & \\\nonumber{}\mbox{subject to} & l_{c} & \leq{} & Ax & \leq{} & u_{c},\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x},\end{array}\end{math} (9.1.3)

with the data $(c,A,l_{c},u_{c},l_{x},u_{x})$, and assume that this data is not known exactly; all we know is that the data varies in a given uncertainty set $\mathcal{U}$. The simplest example is the one of interval uncertainty, where every data entry can run through a given interval:

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}\mathcal{U} & = & \bigg\lbrace{}(c,A,l_{c},u_{c},l_{x},u_{x}):\\\nonumber{} &  & (c^{\mathrm{n}}-dc,A^{\mathrm{n}}-dA,l_{c}^{\mathrm{n}}-dl_{c},u_{c}^{\mathrm{n}}-du_{c},l_{x}^{\mathrm{n}}-dl_{x},u_{x}^{\mathrm{n}}-du_{x})\leq{}(c,A,l_{c},u_{c},l_{x},u_{x})\\\nonumber{} &  & \leq{}(c^{\mathrm{n}}+dc,A^{\mathrm{n}}+dA,l_{c}^{\mathrm{n}}+dl_{c},u_{c}^{\mathrm{n}}+du_{c},l_{x}^{\mathrm{n}}+dl_{x},u_{x}^{\mathrm{n}}+du_{x})\bigg\rbrace{}.\end{array}\end{math} (9.1.4)

Here

\begin{displaymath}\nonumber{}(c^{\mathrm{n}},A^{\mathrm{n}},l_{c}^{\mathrm{n}},u_{c}^{\mathrm{n}},l_{x}^{\mathrm{n}},u_{x}^{\mathrm{n}})\end{displaymath}

is the nominal data,

\begin{displaymath}\nonumber{}dc,dA,dl_{c},du_{c},dl_{x},du_{x}\geq{}0\end{displaymath}

are the data perturbation bounds. Please note that some of the entries in the data perturbation bounds can be zero, meaning that the corresponding data entries are certain (their nominal values coincide with their actual values).

  • The family of instances (9.1.3) with data running through a given uncertainty set $\mathcal{U}$ is called an uncertain linear optimization problem.
  • A vector x is called a robust feasible solution to an uncertain linear optimization problem, if it remains feasible for all realizations of the data from the uncertainty set, i.e. if

    \begin{math}\nonumber{}l_{c}\leq{}Ax\leq{}u_{c},\,l_{x}\leq{}x\leq{}u_{x}\hbox{ for all }(c,A,l_{c},u_{c},l_{x},u_{x})\in{}\mathcal{U}.\end{math} (9.1.5)
  • If for some value t we have $c^{T}x\leq{}t$ for all realizations of the objective from the uncertainty set, we say that the robust value of the objective at x does not exceed t.

The Robust Optimization methodology proposes to associate with an uncertain linear program its robust counterpart (RC), which is the problem of minimizing the robust value of the objective over the set of all robust feasible solutions, i.e. the problem

\begin{math}\nonumber{}\min \limits _{{t,x}}\left\lbrace{}t:c^{T}x\leq{}t,\,l_{c}\leq{}Ax\leq{}u_{c},\,l_{x}\leq{}x\leq{}u_{x}\hbox{ for all }(c,A,l_{c},u_{c},l_{x},u_{x})\in{}\mathcal{U}\right\rbrace{}.\end{math} (9.1.6)

The optimal solution to (9.1.6) is treated as the “uncertainty-immuned” solution to the original uncertain linear optimization problem.

9.1.3.2. Robust counterpart of an uncertain linear optimization problem with interval uncertainty

In general, the RC (9.1.6) of an uncertain linear optimization problem is not a linear optimization problem, since (9.1.6) has infinitely many linear constraints. There are, however, cases when (9.1.6) can be rewritten equivalently as a linear optimization problem; in particular, this is the case for the interval uncertainty (9.1.4). Specifically, in the case of (9.1.4), the robust counterpart of the uncertain linear program is equivalent to the following linear program in the variables x, y, t:

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{\mathrm{n}})^{T}x+(dc)^{T}y-t & \leq{} & 0, & (a)\\\nonumber{} & l_{c}^{\mathrm{n}}+dl_{c} & \leq{} & (A^{\mathrm{n}})x-(dA)y, &  &  & (b)\\\nonumber{} &  &  & (A^{\mathrm{n}})x+(dA)y & \leq{} & u_{c}^{\mathrm{n}}-du_{c}, & (c)\\\nonumber{} & 0 & \leq{} & x+y, &  &  & (d)\\\nonumber{} & 0 & \leq{} & -x+y, &  &  & (e)\\\nonumber{} & l_{x}^{\mathrm{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{\mathrm{n}}-du_{x}, & (f)\end{array}\end{math} (9.1.7)

The origin of (9.1.7) is quite transparent: the constraints (9.1.7.d-e) linking x and y merely say that $y_{i}\geq{}|x_{i}|$ for all i. With this in mind, it is evident that at every feasible solution to (9.1.7) the entries in the vector

\begin{displaymath}\nonumber{}(A^{\mathrm{n}})x-(dA)y\end{displaymath}

are lower bounds on the entries of Ax with A from the uncertainty set (9.1.4), so that (9.1.7.b) ensures that $l_{c}\leq{}Ax$ for all data from the uncertainty set. Similarly, (9.1.7.c) and (9.1.7.a), (9.1.7.f) ensure, for all data from the uncertainty set, that $Ax\leq{}u_{c}$, $c^{T}x\leq{}t$, and that the entries in x satisfy the required lower and upper bounds, respectively.

Please note that at the optimal solution to (9.1.7), one clearly has $y_{i}=|x_{i}|$. It follows that when the bounds on the entries of x impose nonnegativity (nonpositivity) of an entry $x_{i}$, then there is no need to introduce the corresponding additional variable $y_{i}$: from the very beginning it can be replaced with $x_{i}$, if $x_{i}$ is nonnegative, or with $-x_{i}$, if $x_{i}$ is nonpositive.

Another possible formulation of problem (9.1.7) is the following. Let

\begin{displaymath}\nonumber{}l_{c}^{\mathrm{n}}+dl_{c}=(A^{\mathrm{n}})x-(dA)y-f,\quad{}f\geq{}0\end{displaymath}

then this equation is equivalent to (9.1.7.b). If $(l_{c}^{\mathrm{n}}+dl_{c})_{i}=-\infty{}$, then the i-th equation should be dropped from the computations. Similarly,

\begin{displaymath}\nonumber{}-x+y=g\geq{}0\end{displaymath}

is equivalent to (9.1.7.e), while (9.1.7.d) becomes $0\leq{}2x+g$. This implies that

\begin{displaymath}\nonumber{}l_{c}^{\mathrm{n}}+dl_{c}-(A^{\mathrm{n}})x+f=-(dA)y\end{displaymath}

and that

\begin{displaymath}\nonumber{}y=g+x\end{displaymath}

Substituting these values into (9.1.7) gives

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{\mathrm{n}})^{T}x+(dc)^{T}(g+x)-t & \leq{} & 0, & \\\nonumber{} & 0 & \leq{} & f, &  &  & \\\nonumber{} &  &  & 2(A^{\mathrm{n}})x+(dA)(g+x)+f+l_{c}^{\mathrm{n}}+dl_{c} & \leq{} & u_{c}^{\mathrm{n}}-du_{c}, & \\\nonumber{} & 0 & \leq{} & g, &  &  & \\\nonumber{} & 0 & \leq{} & 2x+g, &  &  & \\\nonumber{} & l_{x}^{\mathrm{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{\mathrm{n}}-du_{x}, &\end{array}\end{math} (9.1.8)

which after some simplifications leads to

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{\mathrm{n}}+dc)^{T}x+(dc)^{T}g-t & \leq{} & 0, & (a)\\\nonumber{} & 0 & \leq{} & f, &  &  & (b)\\\nonumber{} &  &  & 2(A^{\mathrm{n}}+dA)x+(dA)g+f-(l_{c}^{\mathrm{n}}+dl_{c}) & \leq{} & u_{c}^{\mathrm{n}}-du_{c}, & (c)\\\nonumber{} & 0 & \leq{} & g, &  &  & (d)\\\nonumber{} & 0 & \leq{} & 2x+g, &  &  & (e)\\\nonumber{} & l_{x}^{\mathrm{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{\mathrm{n}}-du_{x}, & (f)\end{array}\end{math} (9.1.9)

and

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{\mathrm{n}}+dc)^{T}x+(dc)^{T}g-t & \leq{} & 0, & (a)\\\nonumber{} &  &  & 2(A^{\mathrm{n}}+dA)x+(dA)g+f & \leq{} & u_{c}^{\mathrm{n}}-du_{c}+l_{c}^{\mathrm{n}}+dl_{c}, & (b)\\\nonumber{} & 0 & \leq{} & 2x+g, &  &  & (c)\\\nonumber{} & 0 & \leq{} & f, &  &  & (d)\\\nonumber{} & 0 & \leq{} & g, &  &  & (e)\\\nonumber{} & l_{x}^{\mathrm{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{\mathrm{n}}-du_{x}. & (f)\end{array}\end{math} (9.1.10)

Please note that this problem has more variables but far fewer constraints than (9.1.7). Therefore, (9.1.10) is likely to be solved faster than (9.1.7). Note too that the i-th constraint in (9.1.10.b) is trivially redundant if $(l_{c}^{\mathrm{n}}+dl_{c})_{i}=-\infty{}$.

9.1.3.3. Introductory example (continued)

Let us apply the Robust Optimization methodology to our drug production example presented in Section 9.1.1, assuming that the only uncertain data is the contents of the active agent in the raw materials, and that these contents vary in 0.5% and 2% neighborhoods of the respective nominal values 0.01 and 0.02. With this assumption, the problem becomes an uncertain LP affected by interval uncertainty; the robust counterpart (9.1.7) of this uncertain LP is the linear program

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}\mbox{(Drug\_RC)}: &  & \\\nonumber{}\mbox{maximize} &  & \\\nonumber{}t &  & \\\nonumber{}\mbox{subject to} &  & \\\nonumber{}t\leq{}-100\cdot \mbox{\texttt{RawI}}-199.9\cdot \mbox{\texttt{RawII}}+5500\cdot \mbox{\texttt{DrugI}}+6100\cdot \mbox{\texttt{DrugII}} &  & \\\nonumber{}0.01\cdot 0.995\cdot \mbox{\texttt{RawI}}+0.02\cdot 0.98\cdot \mbox{\texttt{RawII}}-0.500\cdot \mbox{\texttt{DrugI}}-0.600\cdot \mbox{\texttt{DrugII}} & \geq{} & 0\\\nonumber{}\mbox{\texttt{RawI}}+\mbox{\texttt{RawII}} & \leq{} & 1000\\\nonumber{}90.0\cdot \mbox{\texttt{DrugI}}+100.0\cdot \mbox{\texttt{DrugII}} & \leq{} & 2000\\\nonumber{}40.0\cdot \mbox{\texttt{DrugI}}+50.0\cdot \mbox{\texttt{DrugII}} & \leq{} & 800\\\nonumber{}100.0\cdot \mbox{\texttt{RawI}}+199.90\cdot \mbox{\texttt{RawII}}+700\cdot \mbox{\texttt{DrugI}}+800\cdot \mbox{\texttt{DrugII}} & \leq{} & 100000\\\nonumber{}\mbox{\texttt{RawI}},\mbox{\texttt{RawII}},\mbox{\texttt{DrugI}},\mbox{\texttt{DrugII}} & \geq{} & 0\end{array}\end{displaymath}
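
Specifying and solving (Drug_RC) with the toolbox requires only a small change to rlo1.m: since the objective coefficients are certain, the auxiliary variable t can be eliminated, and only the balance row has to be replaced by its worst-case version. The following sketch (hypothetical file name rlo1rc.m, not one of the distributed examples) does exactly that:

% rlo1rc.m (hypothetical): robust counterpart of rlo1.m.

clear prob;

% Same data as rlo1.m, except that the balance row uses the
% worst-case contents of the active agent in the raw materials.
prob.c   = [-100;-199.9;6200-700;6900-800];
prob.a   = sparse([0.01*0.995,0.02*0.98,-0.500,-0.600;1,1,0,0;
                   0,0,90.0,100.0;0,0,40.0,50.0;100.0,199.9,700,800]);
prob.blc = [0;-inf;-inf;-inf;-inf];
prob.buc = [inf;1000;2000;800;100000];
prob.blx = [0;0;0;0];
prob.bux = [inf;inf;inf;inf];
[r,res]  = mosekopt('maximize',prob);

disp(sprintf('*** Optimal value: %8.3f',prob.c'*res.sol.itr.xx));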

Solving this problem with MOSEK we get the following output:

*** Optimal value: 8294.567
*** Optimal solution:
RawI:    877.732
RawII:     0.000
DrugI:    17.467
DrugII:    0.000

We see that the robust optimal solution we have built “costs money”: it promises a profit of just $ 8,295 (cf. the profit of $ 8,820 promised by the nominal optimal solution). Please note, however, that the robust optimal solution remains feasible whatever the realization of the uncertain data from the uncertainty set in question, while the nominal optimal solution requires adjustment to this data and, with this adjustment, results in an average profit of $ 7,843, which is 5.4% less than the profit of $ 8,295 guaranteed by the robust optimal solution. Note too that the robust optimal solution is significantly different from the nominal one: both solutions prescribe producing the same drug, DrugI (in the amounts 17,467 and 17,552 packs, respectively), but from different raw materials: RawI in the case of the robust solution and RawII in the case of the nominal solution. The reason is that although the price per unit of the active agent is slightly lower for RawII than for RawI, the content of the agent in RawI is more stable, so when the possible fluctuations of the contents are taken into account, RawI turns out to be more profitable than RawII.

9.1.4. Random uncertainty and ellipsoidal robust counterpart

In some cases, it is natural to assume that the perturbations affecting different uncertain data entries are random and independent of each other. In these cases, the robust counterpart based on the interval model of uncertainty seems too conservative: why should we expect all the data to be driven simultaneously to its most unfavorable values, and why should we immunize the solution against this highly unlikely situation? A less conservative approach is offered by the ellipsoidal model of uncertainty. To motivate this model, let us see what happens with a particular linear constraint

\begin{math}\nonumber{}a^{T}x\leq{}b\end{math} (9.1.11)

at a given candidate solution x in the case when the vector a of coefficients of the constraint is affected by random perturbations:

\begin{math}\nonumber{}a=a^{\mathrm{n}}+\zeta ,\end{math} (9.1.12)

where $a^{\mathrm{n}}$ is the vector of nominal coefficients and $\zeta$ is a random perturbation vector with zero mean and covariance matrix $V_{a}$. In this case the value of the left-hand side of (9.1.11), evaluated at a given x, becomes a random variable with expected value $(a^{\mathrm{n}})^{T}x$ and standard deviation $\sqrt{x^{T}V_{a}x}$. Now let us act as an engineer who believes that the value of a random variable never exceeds its mean plus 3 times the standard deviation; we do not intend to be that specific and replace “3” in the above rule by a safety parameter $\Omega$ which will be in our control. Believing that the value of a random variable “never” exceeds its mean plus $\Omega$ times the standard deviation, we conclude that a “safe” version of (9.1.11) is the inequality

\begin{math}\nonumber{}(a^{\mathrm{n}})^{T}x+\Omega \sqrt{x^{T}V_{a}x}\leq{}b.\end{math} (9.1.13)

The word “safe” above admits a quantitative interpretation: if x satisfies (9.1.13), one can bound from above the probability of the event that the random perturbations (9.1.12) result in violating the constraint (9.1.11) evaluated at x. The bound in question depends on what we know about the distribution of $\zeta$, e.g.

  1. We always have the bound given by the Tschebyshev inequality:

    \begin{math}\nonumber{}x\hbox{ satisfies (9.1.13) }\Rightarrow \mbox{Prob}\left\lbrace{}a^{T}x>b\right\rbrace{}\leq{}\frac{1}{\Omega ^{2}}.\end{math} (9.1.14)
  2. When $\zeta$ is Gaussian, the Tschebyshev bound can be improved to

    \begin{math}\nonumber{}x\hbox{ satisfies (9.1.13) }\Rightarrow \mbox{Prob}\left\lbrace{}a^{T}x>b\right\rbrace{}\leq{}\frac{1}{\sqrt{2\pi }}\int \limits _{\Omega }^{\infty }\exp{}\lbrace{}-t^{2}/2\rbrace{}dt\leq{}0.5\exp{}\lbrace{}-\Omega ^{2}/2\rbrace{}.\end{math} (9.1.15)
  3. Assume that $\zeta =Q\eta$, where $Q$ is a certain deterministic matrix, and $\eta$ is a random vector with independent coordinates $\eta _{j}$ symmetrically distributed in the segment [-1,1]. Setting $V_{a}=QQ^{T}$ ($V_{a}$ is a natural “upper bound” on the covariance matrix of $\zeta$), one has

    \begin{math}\nonumber{}x\hbox{ satisfies (9.1.13) }\Rightarrow \mbox{Prob}\left\lbrace{}a^{T}x>b\right\rbrace{}\leq{}0.5\exp{}\lbrace{}-\Omega ^{2}/2\rbrace{}.\end{math} (9.1.16)

Please note that in order to ensure the bounds in (9.1.15) and (9.1.16) to be $\leq{}\epsilon$, it suffices to set $\Omega =\sqrt{2\ln (1/\epsilon )}$.
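
For instance, requiring a violation probability of at most $\epsilon =10^{{-9}}$ leads to a quite moderate safety parameter, as the following snippet (a sketch, not one of the distributed examples) shows:

% Safety parameter ensuring 0.5*exp(-Omega^2/2) <= epsilon,
% cf. the bounds (9.1.15) and (9.1.16).
epsilon = 1e-9;
Omega   = sqrt(2*log(1/epsilon));
disp(sprintf('Omega = %5.2f',Omega));   % prints Omega =  6.44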

Now, assume that we are given a linear program affected by random perturbations:

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & [c^{\mathrm{n}}+dc]^{T}x &  & \\\nonumber{}\mbox{subject to} & (l_{c})_{i} & \leq{} & [a_{i}^{\mathrm{n}}+da_{i}]^{T}x & \leq{} & (u_{c})_{i},\,\,i=1,...,m,\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x},\end{array}\end{math} (9.1.17)

where $c^{\mathrm{n}}$ and $a_{i}^{\mathrm{n}}$ are the nominal data, and $dc$, $da_{i}$ are random perturbations with zero means. Assume, for the sake of definiteness, that each of the random perturbations $dc,da_{1},\ldots ,da_{m}$ satisfies either the assumption of item 2 or the assumption of item 3, and let $V_{c},V_{{a_{1}}},\ldots ,V_{{a_{m}}}$ be the corresponding (upper bounds on the) covariance matrices of the perturbations. Choosing a safety parameter $\Omega$ and replacing the objective and the bodies of all the constraints by their safe bounds as explained above, we arrive at the following optimization problem:

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & t &  & \\\nonumber{}\mbox{subject to} &  &  & [c^{\mathrm{n}}]^{T}x+\Omega \sqrt{x^{T}V_{c}x} & \leq{} & t,\\\nonumber{} & (l_{c})_{i} & \leq{} & [a_{i}^{\mathrm{n}}]^{T}x-\Omega \sqrt{x^{T}V_{{a_{i}}}x}, &  & \\\nonumber{} &  &  & [a_{i}^{\mathrm{n}}]^{T}x+\Omega \sqrt{x^{T}V_{{a_{i}}}x} & \leq{} & (u_{c})_{i},\,\,i=1,...,m,\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x}.\end{array}\end{math} (9.1.18)

The relation between problems (9.1.18) and (9.1.17) is as follows:

If (x,t) is a feasible solution of (9.1.18), then with probability at least

\begin{displaymath}\nonumber{}p=1-(m+1)\exp{}\lbrace{}-\Omega ^{2}/2\rbrace{}\end{displaymath}

x is feasible for randomly perturbed problem (9.1.17), and t is an upper bound on the objective of (9.1.17) evaluated at x.

We see that if $\Omega$ is not too small, (9.1.18) can be treated as a “safe version” of (9.1.17).

On the other hand, it is easily seen that (9.1.18) is nothing but the robust counterpart of the uncertain linear optimization problem with the nominal data $(c^{\mathrm{n}},A^{\mathrm{n}},l_{c},u_{c},l_{x},u_{x})$ and the row-wise ellipsoidal uncertainty given by the matrices $V_{c},V_{{a_{1}}},\ldots ,V_{{a_{m}}}$. In the corresponding uncertainty set, the uncertainty affects the coefficients of the objective and the constraint matrix only, and the perturbation vectors affecting the objective and the vectors of coefficients of the linear constraints run, independently of each other, through the respective ellipsoids

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}E_{c} & = & \left\lbrace{}dc=\Omega V_{c}^{{1/2}}u:u^{T}u\leq{}1\right\rbrace{},\\\nonumber{}E_{{a_{i}}} & = & \left\lbrace{}da_{i}=\Omega V_{{a_{i}}}^{{1/2}}u:u^{T}u\leq{}1\right\rbrace{},\,i=1,...,m.\end{array}\end{displaymath}

It turns out that in many cases the ellipsoidal model of uncertainty is significantly less conservative, and thus better suited for practice, than the interval model of uncertainty.

Last but not least, it should be mentioned that problem (9.1.18) is equivalent to a conic quadratic program, specifically to the program

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & t &  & \\\nonumber{}\mbox{subject to} &  &  & [c^{\mathrm{n}}]^{T}x+\Omega z & \leq{} & t,\\\nonumber{} & (l_{c})_{i} & \leq{} & [a_{i}^{\mathrm{n}}]^{T}x-\Omega z_{i}, &  & \\\nonumber{} &  &  & [a_{i}^{\mathrm{n}}]^{T}x+\Omega z_{i} & \leq{} & (u_{c})_{i},\,\,i=1,...,m,\\\nonumber{} & 0 & = & w-D_{c}x &  & \\\nonumber{} & 0 & = & w^{i}-D_{{a_{i}}}x, &  & i=1,...,m,\\\nonumber{} & 0 & \leq{} & z-\sqrt{w^{T}w}, &  & \\\nonumber{} & 0 & \leq{} & z_{i}-\sqrt{(w^{i})^{T}w^{i}}, &  & i=1,...,m,\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x}.\end{array}\end{math} (9.1.19)

where $D_{c}$ and $D_{{a_{i}}}$ are matrices satisfying the relations

\begin{displaymath}\nonumber{}V_{c}=D_{c}^{T}D_{c},\,V_{{a_{i}}}=D_{{a_{i}}}^{T}D_{{a_{i}}},\,i=1,...,m.\end{displaymath}

9.1.4.1. Example: Interval and Ellipsoidal robust counterparts of an uncertain linear constraint with independent random perturbations of the coefficients

Consider a linear constraint

\begin{math}\nonumber{}l\leq{}\sum \limits _{{j=1}}^{n}a_{j}x_{j}\leq{}u\end{math} (9.1.20)

and assume that the coefficients $a_{j}$ of the body of the constraint are uncertain and vary in the intervals $[a_{j}^{\mathrm{n}}-\sigma _{j},a_{j}^{\mathrm{n}}+\sigma _{j}]$. The worst-case-oriented model of uncertainty here is the interval one, and the corresponding robust counterpart of the constraint is given by the system of linear inequalities

\begin{math}\nonumber{}\begin{array}{rcccl}\nonumber{}l & \leq{} & \sum \limits _{{j=1}}^{n}a^{\mathrm{n}}_{j}x_{j}-\sum \limits _{{j=1}}^{n}\sigma _{j}y_{j}, &  & \\\nonumber{} &  & \sum \limits _{{j=1}}^{n}a^{\mathrm{n}}_{j}x_{j}+\sum \limits _{{j=1}}^{n}\sigma _{j}y_{j} & \leq{} & u,\\\nonumber{}0 & \leq{} & x_{j}+y_{j}, &  & \\\nonumber{}0 & \leq{} & -x_{j}+y_{j}, &  & j=1,...,n.\end{array}\end{math} (9.1.21)

Now, assume that we have reasons to believe that the true values of the coefficients $a_{j}$ are obtained from their nominal values $a_{j}^{\mathrm{n}}$ by random perturbations, independent for different j and symmetrically distributed in the segments $[-\sigma _{j},\sigma _{j}]$. With this assumption, we are in the situation of item 3 and can replace the uncertain constraint (9.1.20) with its ellipsoidal robust counterpart

\begin{math}\nonumber{}\begin{array}{rcccl}\nonumber{}l & \leq{} & \sum \limits _{{j=1}}^{n}a^{\mathrm{n}}_{j}x_{j}-\Omega z, &  & \\\nonumber{} &  & \sum \limits _{{j=1}}^{n}a^{\mathrm{n}}_{j}x_{j}+\Omega z & \leq{} & u,\\\nonumber{}0 & \leq{} & z-\sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}x_{j}^{2}}. &  &\end{array}\end{math} (9.1.22)
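
In toolbox terms, (9.1.22) is two linear constraints plus one conic quadratic constraint. The following sketch (all names are illustrative; the column vectors an and sigma and the scalars l, u and Omega are assumed given, and for simplicity only feasibility is checked) shows one way to pose it for mosekopt, using the variable ordering [x; z; y] with the auxiliary equations $y_{j}=\sigma _{j}x_{j}$; the rlo2.m script below uses the same pattern:

% Sketch: the robust constraint (9.1.22) in mosekopt form.
n = length(an);
A = [an', -Omega, zeros(1,n);             % a'*x - Omega*z >= l
     an',  Omega, zeros(1,n);             % a'*x + Omega*z <= u
     -diag(sigma), zeros(n,1), eye(n)];   % y(j) - sigma(j)*x(j) = 0
prob.a   = sparse(A);
prob.blc = [l; -inf; zeros(n,1)];
prob.buc = [inf; u; zeros(n,1)];
prob.blx = -inf*ones(2*n+1,1);
prob.bux = inf*ones(2*n+1,1);
prob.c   = zeros(2*n+1,1);                % feasibility check only
prob.cones{1}.type = 'MSK_CT_QUAD';
prob.cones{1}.sub  = [n+1;[n+2:2*n+1]'];  % z >= norm(y)
[r,res] = mosekopt('minimize',prob);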

Please note that with the model of random perturbations, a vector x satisfying (9.1.22) satisfies a realization of (9.1.20) with probability at least $1-\exp \lbrace{}-\Omega ^{2}/2\rbrace{}$ (by item 3, each of the two one-sided inequalities is violated with probability at most $0.5\exp \lbrace{}-\Omega ^{2}/2\rbrace{}$). For $\Omega =6$ this probability is at least $1-1.6\cdot 10^{{-8}}$, which for all practical purposes is the same as saying that x satisfies all realizations of (9.1.20). On the other hand, the uncertainty set associated with (9.1.21) is the box

\begin{displaymath}\nonumber{}B=\left\lbrace{}a=(a_{1},...,a_{n})^{T}:a_{j}^{\mathrm{n}}-\sigma _{j}\leq{}a_{j}\leq{}a_{j}^{\mathrm{n}}+\sigma _{j},\,j=1,...,n\right\rbrace{},\end{displaymath}

while the uncertainty set associated with (9.1.22) is the ellipsoid

\begin{displaymath}\nonumber{}E(\Omega )=\left\lbrace{}a=(a_{1},...,a_{n})^{T}:\sum \limits _{{j=1}}^{n}\frac{(a_{j}-a_{j}^{\mathrm{n}})^{2}}{\sigma _{j}^{2}}\leq{}\Omega ^{2}\right\rbrace{}.\end{displaymath}

For a moderate value of $\Omega$, say $\Omega =6$, and $n\geq{}40$, the ellipsoid $E(\Omega )$ in its diameter, typical linear sizes, volume, etc. is incomparably smaller than the box B, the difference becoming more dramatic the larger the dimension n of the box and the ellipsoid. It follows that the ellipsoidal robust counterpart (9.1.22) of the randomly perturbed uncertain constraint (9.1.20) is much less conservative than the interval robust counterpart (9.1.21), while ensuring basically the same “robustness guarantees”. To illustrate this important point, consider the following numerical example:

There are n different assets on the market. The return on $ 1 invested in asset j is a random variable distributed symmetrically in the segment $[\delta _{j}-\sigma _{j},\delta _{j}+\sigma _{j}]$, and the returns on different assets are independent of each other. The problem is to distribute $ 1 among the assets in order to get the largest possible total return on the resulting portfolio.

A natural model of the problem is an uncertain linear optimization problem

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{maximize} &  &  & \sum \limits _{{j=1}}^{n}a_{j}x_{j} &  & \\\nonumber{}\mbox{subject to} &  &  & \sum \limits _{{j=1}}^{n}x_{j} & = & 1,\\\nonumber{} & 0 & \leq{} & x_{j}, &  & j=1,...,n.\end{array}\end{math} (9.1.23)

where $a_{j}$ are the uncertain returns of the assets. Both the nominal optimal solution (set all returns $a_{j}$ equal to their nominal values $\delta _{j}$) and the risk-neutral Stochastic Programming approach (maximize the expected total return) result in the same solution: our $ 1 should be invested in the most promising asset(s) – the one(s) with the maximal nominal return. This solution, however, can be very unreliable if, as is typically the case in reality, the most promising asset has the largest volatility $\sigma _{j}$ and is in this sense the most risky. To reduce the risk, one can use the Robust Counterpart approach, which results in the following optimization problems.

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{\textbf{The Interval Model of Uncertainty:}} &  &  &  &  & \\\nonumber{}\mbox{maximize} &  & t &  &  & \\\nonumber{}\mbox{subject to} & 0 & \leq{} & -t+\sum \limits _{{j=1}}^{n}(\delta _{j}-\sigma _{j})x_{j}, &  & \\\nonumber{} &  &  & \sum \limits _{{j=1}}^{n}x_{j} & = & 1,\\\nonumber{} & 0 & \leq{} & x_{j}, &  & j=1,...,n\end{array}\end{math} (9.1.24)

and

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{\textbf{The Ellipsoidal Model of Uncertainty:}} &  &  &  &  & \\\nonumber{}\mbox{maximize} &  &  & t &  & \\\nonumber{}\mbox{subject to} & 0 & \leq{} & -t+\sum \limits _{{j=1}}^{n}\delta _{j}x_{j}-\Omega z, &  & \\\nonumber{} & 0 & \leq{} & z-\sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}x_{j}^{2}}, &  & \\\nonumber{} &  &  & \sum \limits _{{j=1}}^{n}x_{j} & = & 1,\\\nonumber{} & 0 & \leq{} & x_{j}, &  & j=1,...,n.\end{array}\end{math} (9.1.25)

Note that the problem (9.1.25) is essentially the risk-averse portfolio model proposed in the mid-1950s by Markowitz.

The solution of (9.1.24) is evident: our $ 1 should be invested in the asset(s) with the largest possible guaranteed return $\delta _{j}-\sigma _{j}$. In contrast to this very conservative policy (which in reality prescribes keeping the initial capital in a bank or in the most reliable, and thus low-profit, assets), the optimal solution to (9.1.25) prescribes a quite reasonable diversification of investments which allows one to get a much better total return than (9.1.24) with essentially zero risk. To illustrate this, assume that there are n=300 assets with nominal returns (per year) varying from 1.04 (bank savings) to 2.00:

\begin{displaymath}\nonumber{}\delta _{j}=1.04+0.96\frac{j-1}{n-1},\,j=1,2,...,n=300\end{displaymath}

and volatilities varying from 0 for the bank savings to 1.152 for the most promising asset:

\begin{displaymath}\nonumber{}\sigma _{j}=1.152\frac{j-1}{n-1},\,j=1,...,n=300.\end{displaymath}

Here is a MATLAB script which builds the associated problem (9.1.25), solves it via the MOSEK optimization toolbox, displays the resulting robust optimal value of the total return and the distribution of investments, and finally runs 10,000 simulations to get the distribution of the total return on the resulting portfolio (in these simulations, the returns on all assets are uniformly distributed in the corresponding intervals):

% File: rlo2.m

% Problem:
%
% Maximize t subject to
% t <= sum(delta(j)*x(j)) -Omega*z,
% y(j) = sigma(j)*x(j), j=1,...,n,
% sum(x(j)) = 1,
% norm(y) <= z,
% 0 <= x.

clear prob;
n     = 300;
Omega = 6;

% Set nominal returns and volatilities
delta = (0.96/(n-1))*[0:1:n-1]+1.04;
sigma = (1.152/(n-1))*[0:1:n-1];

% Set mosekopt description of the problem
prob.c = -[1;zeros(2*n+1,1)];
% Row 1 encodes -t + sum(delta(j)*x(j)) - Omega*z >= 0.
A      = [-1,delta,-Omega,zeros(1,n);zeros(n+1,2*n+2)];
for j=1:n,
    % Body of the constraint y(j) - sigma(j)*x(j) = 0:
    A(j+1,j+1)   = -sigma(j);
    A(j+1,2+n+j) = 1;
end;
A(n+2,2:n+1)        = ones(1,n);
prob.a             = sparse(A);
prob.blc           = [zeros(n+1,1);1];
prob.buc           = [inf;zeros(n,1);1];
prob.blx           = [-inf;zeros(n,1);0;zeros(n,1)];
prob.bux           = inf*ones(2*n+2,1);
prob.cones         = cell(1,1);
prob.cones{1}.type = 'MSK_CT_QUAD';
prob.cones{1}.sub  = [n+2;[n+3:1:2*n+2]'];

% Run mosekopt
[r,res]=mosekopt('minimize echo(1)',prob);

% Display the solution
xx = res.sol.itr.xx;
t  = xx(1);

disp(sprintf('Robust optimal value: %5.4f',t));
x = max(xx(2:1+n),zeros(n,1));
plot([1:1:n],x,'-m');
grid on;

disp('Press <Enter> to run simulations');
pause

% Run simulations

Nsim = 10000;
out  = zeros(Nsim,1);
for i=1:Nsim,
    returns  = delta+(2*rand(1,n)-1).*sigma;
    out(i)   = returns*x;
end;
disp(sprintf('Actual returns over %d simulations:',Nsim));
disp(sprintf('Min=%5.4f Mean=%5.4f Max=%5.4f StD=%5.2f',...
    min(out),mean(out),max(out),std(out)));
hist(out);

Here are the results displayed by the script:

Robust optimal value: 1.3428
Actual returns over 10000 simulations:
Min=1.5724 Mean=1.6965 Max=1.8245 StD= 0.03
Figure 9.1: Distribution of investments among the assets in the optimal solution of (9.1.25).

Please note that with our set-up there is exactly one asset with a guaranteed return greater than 1 – asset # 1 (bank savings, return 1.04, zero volatility). Consequently, the interval robust counterpart (9.1.24) prescribes putting our $ 1 in the bank, thus getting a 4% profit. In contrast to this, the diversified portfolio given by the optimal solution of (9.1.25) never yields a profit less than 57.2%, and yields on average a 69.65% profit with a pretty low (0.03) standard deviation. We see that in favorable circumstances the ellipsoidal robust counterpart of an uncertain linear program is indeed less conservative than, although basically as reliable as, the interval robust counterpart.

Finally, let us compare our results with those given by the nominal optimal solution. The latter prescribes investing everything we have in the most promising asset (in our example this is asset # 300, with a nominal return of 2.00 and a volatility of 1.152). Assuming that the actual return is uniformly distributed in the corresponding interval and running 10,000 simulations, we get the following results:

Nominal optimal value: 2.0000
Actual returns over 10000 simulations:
Min=0.8483 Mean=1.9918 Max=3.1519 StD= 0.66
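
These numbers can be reproduced with a few lines of MATLAB (a sketch; it assumes that rlo2.m has just been run, so that n, delta and sigma are still in the workspace):

% Simulate the nominal policy: invest everything in asset n.
Nsim = 10000;
out  = delta(n)+(2*rand(Nsim,1)-1)*sigma(n);
disp(sprintf('Nominal optimal value: %5.4f',delta(n)));
disp(sprintf('Actual returns over %d simulations:',Nsim));
disp(sprintf('Min=%5.4f Mean=%5.4f Max=%5.4f StD=%5.2f',...
    min(out),mean(out),max(out),std(out)));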

We see that the nominal solution results in a portfolio which is much riskier, although better on average, than the portfolio given by the robust solution.

9.1.4.2. Combined Interval-Ellipsoidal Robust Counterpart

We have considered the case when the coefficients $a_{j}$ of the uncertain linear constraint (9.1.20) are affected by uncorrelated random perturbations symmetrically distributed in given intervals $[-\sigma _{j},\sigma _{j}]$, and we have discussed two ways to model the uncertainty:

  • The interval uncertainty model (the uncertainty set $\mathcal{U}$ is the box B), where we ignore the stochastic nature of the perturbations and their independence. This model yields the Interval Robust Counterpart (9.1.21);
  • The ellipsoidal uncertainty model ($\mathcal{U}$ is the ellipsoid $E(\Omega )$), which takes into account the stochastic nature of the data perturbations and yields the Ellipsoidal Robust Counterpart (9.1.22).

Please note that although for large n the ellipsoid $E(\Omega )$ in its diameter, volume and average linear sizes is incomparably smaller than the box B, in the case of $\Omega >1$ the ellipsoid $E(\Omega )$ in certain directions goes beyond the box. E.g. the ellipsoid E(6), although much narrower than B in most directions, is 6 times wider than B in the directions of the coordinate axes. Intuition says that it hardly makes sense to keep in the uncertainty set realizations of the data which are outside of B and thus forbidden by our model of perturbations, so in the situation under consideration the intersection of $E(\Omega )$ and B is a better model of the uncertainty set than the ellipsoid $E(\Omega )$ itself. What happens when the model of the uncertainty set is the “combined interval-ellipsoidal” uncertainty $\mathcal{U}=E(\Omega )\cap{}B$?

First, it turns out that the RC of (9.1.20) corresponding to the uncertainty set $\mathcal{U}=E(\Omega )\cap{}B$ is still given by a system of linear and conic quadratic inequalities, specifically the system

\begin{math}\nonumber{}\begin{array}{rcccl}\nonumber{}l & \leq{} & \sum \limits _{{j=1}}^{n}a_{j}^{\mathrm{n}}x_{j}-\sum \limits _{{j=1}}^{n}\sigma _{j}y_{j}-\Omega \sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}u_{j}^{2}}, &  & \\\nonumber{} &  & \sum \limits _{{j=1}}^{n}a_{j}^{\mathrm{n}}x_{j}+\sum \limits _{{j=1}}^{n}\sigma _{j}z_{j}+\Omega \sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}v_{j}^{2}} & \leq{} & u,\\\nonumber{}-y_{j} & \leq{} & x_{j}-u_{j} & \leq{} & y_{j},\,j=1,...,n,\\\nonumber{}-z_{j} & \leq{} & x_{j}-v_{j} & \leq{} & z_{j},\,j=1,...,n.\end{array}\end{math} (9.1.26)

Second, it turns out that our intuition is correct: as a model of uncertainty, $E(\Omega )\cap{}B$ is as reliable as the ellipsoid $E(\Omega )$. Specifically, if x can be extended to a feasible solution of (9.1.26), then the probability for x to satisfy a realization of (9.1.20) is at least $1-\exp \lbrace{}-\Omega ^{2}/2\rbrace{}$.

The conclusion is that if we have reasons to assume that the perturbations of the uncertain coefficients in a constraint of an uncertain linear optimization problem are (a) random, (b) independent of each other, and (c) symmetrically distributed in given intervals, then it makes sense to associate with this constraint a combined interval-ellipsoidal model of uncertainty and to use the system of linear and conic quadratic inequalities (9.1.26) as its robust counterpart. Please note that when building the robust counterpart of an uncertain linear optimization problem, one can use different models of the uncertainty (e.g., interval, ellipsoidal, combined interval-ellipsoidal) for different uncertain constraints within the same problem.

9.1.5. Further references

For further information about robust linear optimization consult [4, 5].

9.2. Geometric (posynomial) optimization

9.2.1. The problem

A geometric optimization problem can be stated as follows

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{k\in{}J_{0}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}} &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{k\in{}J_{i}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}} & \leq{} & 1, & i=1,\ldots ,m,\\\nonumber{} & t>0, &  &  &\end{array}\end{math} (9.2.1)

where it is assumed that

\begin{displaymath}\nonumber{}\cup _{{k=0}}^{m}J_{k}=\lbrace{}1,\ldots ,T\rbrace{}\end{displaymath}

and if $i\neq{}j$, then

\begin{displaymath}\nonumber{}J_{i}\cap J_{j}=\emptyset .\end{displaymath}

Hence, A is a $T\times{}n$ matrix and c is a vector of length T. Given $c_{k}>0$, the expression

\begin{displaymath}\nonumber{}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}}\end{displaymath}

is called a monomial. A sum of monomials, i.e.

\begin{displaymath}\nonumber{}\sum \limits _{{k\in{}J_{i}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}}\end{displaymath}

is called a posynomial. In general, the problem (9.2.1) is very hard to solve. However, the posynomial case, where it is required that

\begin{displaymath}\nonumber{}c>0\end{displaymath}

is relatively easy. The reason is that a simple variable transformation yields an equivalent convex optimization problem. Indeed, using the variable transformation

\begin{math}\nonumber{}t_{j}=e^{{x_{j}}}\end{math} (9.2.2)

we obtain the problem

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{k\in{}J_{0}}}c_{k}e^{{\sum \limits _{{j=0}}^{{n-1}}a_{{kj}}x_{j}}} &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{k\in{}J_{i}}}c_{k}e^{{\sum \limits _{{j=0}}^{{n-1}}a_{{kj}}x_{j}}} & \leq{} & 1, & i=1,\ldots ,m,\end{array}\end{math} (9.2.3)

which is a convex optimization problem that can be solved using MOSEK. We will call

\begin{displaymath}\nonumber{}c_{t}e^{{\left(\sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}\right)}}=e^{{\left(\log (c_{t})+\sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}\right)}}\end{displaymath}

a term and hence the number of terms is T.

As stated, the problem (9.2.3) is non-separable. However, using

\begin{displaymath}\nonumber{}v_{t}=\log (c_{t})+\sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}\end{displaymath}

we obtain the separable problem

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{t\in{}J_{0}}}e^{{v_{t}}} &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{t\in{}J_{i}}}e^{{v_{t}}} & \leq{} & 1, & i=1,\ldots ,m,\\\nonumber{} & \sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}-v_{t} & = & -\log (c_{t}), & t=1,\ldots ,T,\end{array}\end{math} (9.2.4)

which is a separable convex optimization problem.

A warning about this approach: the exponential function $e^{x}$ is only numerically well-defined for values of x in a small interval around 0, since $e^{x}$ grows very rapidly as x becomes larger. Therefore numerical problems may arise when solving the problem in this form.
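
For instance, in MATLAB's double precision arithmetic:

% exp() overflows to Inf just above x = 709.
disp(exp(700));   % approximately 1.0142e+304
disp(exp(710));   % Inf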

9.2.2. Applications

A large number of practical applications, particularly in electrical circuit design, can be cast as a geometric optimization problem. We will not review these applications here but rather refer the reader to [13] and the references therein.

9.2.3. Modeling tricks

A lot of tricks that can be used for modeling posynomial optimization problems are described in [13]. Therefore, in this section we cover only one important case.

9.2.3.1. Equalities

In general, equalities are not allowed in (9.2.1), i.e.

\begin{displaymath}\nonumber{}\sum \limits _{{k\in{}J_{i}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}}=1\end{displaymath}

is not allowed. However, a monomial equality is not a problem. Indeed consider the example

\begin{displaymath}\nonumber{}xyz^{{-1}}=1\end{displaymath}

of a monomial equality. The equality is identical to

\begin{displaymath}\nonumber{}1\leq{}xyz^{{-1}}\leq{}1\end{displaymath}

which in turn is identical to the two inequalities

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}xyz^{{-1}} &  &  & \leq{} & 1,\\\nonumber{}\frac{1}{xyz^{{-1}}} & = & x^{{-1}}y^{{-1}}z & \leq{} & 1.\end{array}\end{displaymath}

Hence, it is possible to model a monomial equality using two inequalities.

9.2.4. Problematic formulations

Certain formulations of geometric optimization problems may cause problems for the algorithms implemented in MOSEK. Basically there are two kinds of problems that may occur:

  • The solution vector is finite, but the optimal objective value can only be approximated.
  • The optimal objective value is finite but implies that a variable in the solution is infinite.

9.2.4.1. Finite unattainable solution

The following problem illustrates an unattainable solution:

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{2}y &  & \\\nonumber{}\mbox{subject to} & xy & \leq{} & 1,\\\nonumber{} & x,y>0. &  &\end{array}\end{displaymath}

Clearly, the optimal objective value is 0, but because of the constraint x,y>0 this value can never be attained. To see why this is a problem, remember that MOSEK substitutes $x=e^{{t_{x}}}$ and $y=e^{{t_{y}}}$ and solves the problem as

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & e^{{2t_{x}}}e^{{t_{y}}} &  & \\\nonumber{}\mbox{subject to} & e^{{t_{x}}}e^{{t_{y}}} & \leq{} & 1,\\\nonumber{} & t_{x},t_{y}\in{}\mathbb{R}. &  &\end{array}\end{displaymath}

The optimal solution implies that $t_{x}=-\infty{}$ or $t_{y}=-\infty{}$, and thus it is unattainable.

Now the issue should be clear: if a variable x appears only with positive exponents, then driving x towards 0 reduces all terms in which it appears, but the limiting solution x=0 can never be attained.
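
For reference, here is the unattainable example in the (c,a,map) format accepted by mskgpopt (see Section 9.2.6 below); one should expect the solver to report difficulties or an inaccurate solution rather than a clean optimum:

% minimize x^2*y subject to x*y <= 1 (finite but unattainable optimum).
c   = [1 1]';
a   = sparse([2 1;    % objective term: x^2 * y
              1 1]);  % constraint term: x * y
map = [0 1]';
res = mskgpopt(c,a,map);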

9.2.4.2. Infinite solution

A similar problem will occur if a finite optimal objective value requires a variable to be infinite. This can be illustrated by the following example:

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{{-2}} &  & \\\nonumber{}\mbox{subject to} & x^{{-1}} & \leq{} & 1,\\\nonumber{} & x>0, &  &\end{array}\end{displaymath}

which is a valid geometric programming problem. In this case the optimal objective value is 0, but this requires $x=\infty{}$, which is unattainable.

Again, this specific case will appear if a variable x appears only with negative exponents in the problem, implying that each term in which it appears is minimized for $x\rightarrow{}\infty{}$.

9.2.5. An example

Consider the example

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{{-1}}y &  & \\\nonumber{}\mbox{subject to} & x^{2}y^{{-\frac{1}{2}}}+3y^{{\frac{1}{2}}}z^{{-1}} & \leq{} & 1,\\\nonumber{} & xy^{{-1}} & = & z^{2},\\\nonumber{} & -x & \leq{} & -\frac{1}{10},\\\nonumber{} & x & \leq{} & 3,\\\nonumber{} & x,y,z>0, &  &\end{array}\end{displaymath}

which is not a geometric optimization problem. However, using the obvious transformations we obtain the problem

\begin{math}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{{-1}}y &  & \\\nonumber{}\mbox{subject to} & x^{2}y^{{-\frac{1}{2}}}+3y^{{\frac{1}{2}}}z^{{-1}} & \leq{} & 1,\\\nonumber{} & xy^{{-1}}z^{{-2}} & \leq{} & 1,\\\nonumber{} & x^{{-1}}yz^{2} & \leq{} & 1,\\\nonumber{} & \frac{1}{10}x^{{-1}} & \leq{} & 1,\\\nonumber{} & \frac{1}{3}x & \leq{} & 1,\\\nonumber{} & x,y,z>0, &  &\end{array}\end{math} (9.2.5)

which is a geometric optimization problem.

9.2.6. Solving the example

The problem (9.2.5) can be defined and solved in the MOSEK toolbox as shown below.

% go2.m

c     = [1 1 3 1 1 0.1 1/3]';
a     = sparse([[-1  1  0];
                [2 -0.5 0];
                [0 0.5 -1];
                [1 -1 -2]; 
                [-1 1 2]; 
                [-1 0 0]; 
                [1 0 0]]);

map   = [0 1 1 2 3 4 5]';
[res] = mskgpopt(c,a,map);

fprintf('\nPrimal optimal solution to original gp:');
fprintf(' %e',exp(res.sol.itr.xx));
fprintf('\n\n');

% Compute the optimal objective value and
% the constraint activities.
v = c.*exp(a*res.sol.itr.xx);

% Add appropriate terms together.
f = sparse(map+1,1:7,ones(size(map)))*v;

% First objective value. Then constraint values.
fprintf('Objective value: %e\n',log(f(1)));
fprintf('Constraint values:');
fprintf(' %e',log(f(2:end)));
fprintf('\n\n');

% Dual multipliers (should be negative)
fprintf('Dual variables (should be negative):');
fprintf(' %e',res.sol.itr.y);
fprintf('\n\n');

9.2.7. Exporting to a file

It's possible to write a geometric optimization problem to a file with the command:

 mskgpwri(c,a,map,filename)

The file format is compatible with the mskexpopt command line tool; see the MOSEK Tools User's manual for details on mskexpopt. The format can be useful for sending debug information to MOSEK or for testing. It is also possible to read the above format with the command:

 [c,a,map] = mskgpread(filename)
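
For example, the data from go2.m can be written to a file and read back as follows (the file name go2.gp is arbitrary):

% Write the go2.m problem to a file, read it back and re-solve it.
mskgpwri(c,a,map,'go2.gp');
[c2,a2,map2] = mskgpread('go2.gp');
[res]        = mskgpopt(c2,a2,map2);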

9.2.8. Further information

More information about geometric optimization problems is located in [17, 12, 13].
