7. Modelling


In this chapter we discuss how to formulate optimization models: linear optimization (Section 7.1), quadratic and quadratically constrained optimization (Section 7.2), and conic optimization (Section 7.3), followed by general modelling recommendations (Section 7.4) and further examples (Section 7.5).

7.1. Linear optimization

A linear optimization problem can be written as

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & c^{T}x+c^{f} &  & \\\nonumber{}\mbox{subject to} & l^{c} & \leq{} & Ax & \leq{} & u^{c},\\\nonumber{} & l^{x} & \leq{} & x & \leq{} & u^{x},\end{array}\end{math} (7.1.1)

where

  • m is the number of constraints and n is the number of variables,
  • $x\in{}\mathbb{R}^{n}$ is the vector of decision variables,
  • $c\in{}\mathbb{R}^{n}$ and $c^{f}\in{}\mathbb{R}$ are the linear objective coefficients and the fixed objective term,
  • $A\in{}\mathbb{R}^{{m\times n}}$ is the constraint matrix,
  • $l^{c}$ and $u^{c}$ are the lower and upper bounds on the constraint activities $Ax$, and
  • $l^{x}$ and $u^{x}$ are the lower and upper bounds on the variables.

A primal solution (x) is (primal) feasible if it satisfies all constraints in (7.1.1). If (7.1.1) has at least one primal feasible solution, then (7.1.1) is said to be (primal) feasible.

In case (7.1.1) does not have a feasible solution, the problem is said to be (primal) infeasible.

7.1.1. Duality for linear optimization

Corresponding to the primal problem (7.1.1), there is a dual problem

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{maximize} & (l^{c})^{T}s_{l}^{c}-(u^{c})^{T}s_{u}^{c} &  & \\\nonumber{} & +(l^{x})^{T}s_{l}^{x}-(u^{x})^{T}s_{u}^{x}+c^{f} &  & \\\nonumber{}\mbox{subject to} & A^{T}y+s_{l}^{x}-s_{u}^{x} & = & c,\\\nonumber{} & -y+s_{l}^{c}-s_{u}^{c} & = & 0,\\\nonumber{} & s_{l}^{c},s_{u}^{c},s_{l}^{x},s_{u}^{x}\geq{}0. &  &\end{array}\end{math} (7.1.2)

If a bound in the primal problem is plus or minus infinity, the corresponding dual variable is fixed at 0, and we use the convention that the product of the bound value and the corresponding dual variable is 0. E.g.

\begin{displaymath}\nonumber{}l_{j}^{x}=-\infty ~\Rightarrow ~(s_{l}^{x})_{j}=0\mbox{ and }l_{j}^{x}\cdot (s_{l}^{x})_{j}=0.\end{displaymath}

This is equivalent to removing the variable $(s_{l}^{x})_{j}$ from the dual problem.

A solution

\begin{displaymath}\nonumber{}(y,s_{l}^{c},s_{u}^{c},s_{l}^{x},s_{u}^{x})\end{displaymath}

to the dual problem is feasible if it satisfies all the constraints in (7.1.2). If (7.1.2) has at least one feasible solution, then (7.1.2) is (dual) feasible, otherwise the problem is (dual) infeasible .

We will denote a solution

\begin{displaymath}\nonumber{}(x,y,s_{l}^{c},s_{u}^{c},s_{l}^{x},s_{u}^{x})\end{displaymath}

such that x is a solution to the primal problem (7.1.1), and

\begin{displaymath}\nonumber{}(y,s_{l}^{c},s_{u}^{c},s_{l}^{x},s_{u}^{x})\end{displaymath}

is a solution to the corresponding dual problem (7.1.2). A solution which is both primal and dual feasible is called a primal-dual feasible solution.

7.1.1.1. A primal-dual feasible solution

Let

\begin{displaymath}\nonumber{}(x^{*},y^{*},(s_{l}^{c})^{*},(s_{u}^{c})^{*},(s_{l}^{x})^{*},(s_{u}^{x})^{*})\end{displaymath}

be a primal-dual feasible solution, and let

\begin{displaymath}\nonumber{}(x^{c})^{*}:=Ax^{*}.\end{displaymath}

For a primal-dual feasible solution the difference between the primal and the dual objective value satisfies

\begin{displaymath}\nonumber{}\begin{array}{c}\nonumber{}c^{T}x^{*}+c^{f}-((l^{c})^{T}(s_{l}^{c})^{*}-(u^{c})^{T}(s_{u}^{c})^{*}+(l^{x})^{T}(s_{l}^{x})^{*}-(u^{x})^{T}(s_{u}^{x})^{*}+c^{f})\\\nonumber{}=\sum \limits _{{i=1}}^{m}\left((s_{l}^{c})_{i}^{*}((x_{i}^{c})^{*}-l_{i}^{c})+(s_{u}^{c})_{i}^{*}(u_{i}^{c}-(x_{i}^{c})^{*})\right)+\sum \limits _{{j=1}}^{n}\left((s_{l}^{x})_{j}^{*}(x_{j}^{*}-l_{j}^{x})+(s_{u}^{x})_{j}^{*}(u_{j}^{x}-x_{j}^{*})\right)\\\nonumber{}\geq{}0,\end{array}\end{displaymath}

where the first relation can be obtained by multiplying the dual constraints (7.1.2) by $x$ and $x^{c}$, respectively, and the second relation comes from the fact that each term in each sum is nonnegative. It follows that the primal objective value is always greater than or equal to the dual objective value.

We then define the duality gap as the difference between the primal objective value and the dual objective value, i.e.

\begin{displaymath}\nonumber{}c^{T}x^{*}+c^{f}-((l^{c})^{T}s_{l}^{c}-(u^{c})^{T}s_{u}^{c}+(l^{x})^{T}s_{l}^{x}-(u^{x})^{T}s_{u}^{x}+c^{f})\end{displaymath}

Please note that the duality gap will always be nonnegative.
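As an illustration, the duality gap is easy to evaluate numerically. The following sketch (plain Python with NumPy; the data and solution vectors are assumed given, and infinite bounds are assumed to be paired with zero dual variables as described above) computes both objective values and their gap:

    import numpy as np

    def duality_gap(c, cf, lc, uc, lx, ux, x, slc, suc, slx, sux):
        """Primal objective, dual objective and their (nonnegative) gap.

        Convention: if a bound is infinite, the corresponding dual
        variable is 0 and the product bound * dual counts as 0."""
        def bdot(bound, s):
            return np.sum(np.where(s == 0.0, 0.0, bound * s))
        primal = c @ x + cf
        dual = bdot(lc, slc) - bdot(uc, suc) + bdot(lx, slx) - bdot(ux, sux) + cf
        return primal, dual, primal - dual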

7.1.1.2. An optimal solution

It is well-known that a linear optimization problem has an optimal solution if and only if there exist feasible primal and dual solutions so that the duality gap is zero, or, equivalently, that the complementarity conditions

\begin{displaymath}\nonumber{}\begin{array}{rcll}\nonumber{}(s_{l}^{c})_{i}^{*}((x_{i}^{c})^{*}-l_{i}^{c}) & = & 0, & i=1,\ldots ,m,\\\nonumber{}(s_{u}^{c})_{i}^{*}(u_{i}^{c}-(x_{i}^{c})^{*}) & = & 0, & i=1,\ldots ,m,\\\nonumber{}(s_{l}^{x})_{j}^{*}(x_{j}^{*}-l_{j}^{x}) & = & 0, & j=1,\ldots ,n,\\\nonumber{}(s_{u}^{x})_{j}^{*}(u_{j}^{x}-x_{j}^{*}) & = & 0, & j=1,\ldots ,n\end{array}\end{displaymath}

are satisfied.

If (7.1.1) has an optimal solution and MOSEK solves the problem successfully, both the primal and dual solution are reported, including a status indicating the exact state of the solution.

7.1.1.3. Primal infeasible problems

If the problem (7.1.1) is infeasible (has no feasible solution), MOSEK will report a certificate of primal infeasibility: The dual solution reported is a certificate of infeasibility, and the primal solution is undefined.

A certificate of primal infeasibility is a feasible solution to the modified dual problem

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{maximize} & (l^{c})^{T}s_{l}^{c}-(u^{c})^{T}s_{u}^{c}+(l^{x})^{T}s_{l}^{x}-(u^{x})^{T}s_{u}^{x} &  & \\\nonumber{}\mbox{subject to} & A^{T}y+s_{l}^{x}-s_{u}^{x} & = & 0,\\\nonumber{} & -y+s_{l}^{c}-s_{u}^{c} & = & 0,\\\nonumber{} & s_{l}^{c},s_{u}^{c},s_{l}^{x},s_{u}^{x}\geq{}0. &  &\end{array}\end{math} (7.1.3)

so that the objective is strictly positive, i.e. a solution

\begin{displaymath}\nonumber{}(y^{*},(s_{l}^{c})^{*},(s_{u}^{c})^{*},(s_{l}^{x})^{*},(s_{u}^{x})^{*})\end{displaymath}

to (7.1.3) so that

\begin{displaymath}\nonumber{}(l^{c})^{T}(s_{l}^{c})^{*}-(u^{c})^{T}(s_{u}^{c})^{*}+(l^{x})^{T}(s_{l}^{x})^{*}-(u^{x})^{T}(s_{u}^{x})^{*}>0.\end{displaymath}

Such a solution implies that (7.1.3) is unbounded, and that its dual is infeasible.

We note that the dual of (7.1.3) is a problem whose constraints are identical to the constraints of the original primal problem (7.1.1): if the dual of (7.1.3) is infeasible, so is the original primal problem.
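Such a certificate can be verified mechanically. A small sketch (Python/NumPy, hypothetical data; the tolerance is chosen arbitrarily) checking the conditions of (7.1.3):

    import numpy as np

    def checks_primal_infeasibility(A, lc, uc, lx, ux,
                                    y, slc, suc, slx, sux, tol=1e-8):
        """True if (y, s) is feasible for (7.1.3) with strictly positive
        objective, i.e. it certifies that (7.1.1) is primal infeasible."""
        def bdot(bound, s):          # same 0 * infinity convention as before
            return np.sum(np.where(s == 0.0, 0.0, bound * s))
        feas = (np.linalg.norm(A.T @ y + slx - sux) <= tol and
                np.linalg.norm(-y + slc - suc) <= tol and
                all((s >= -tol).all() for s in (slc, suc, slx, sux)))
        obj = bdot(lc, slc) - bdot(uc, suc) + bdot(lx, slx) - bdot(ux, sux)
        return feas and obj > tol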

7.1.1.4. Dual infeasible problems

If the problem (7.1.2) is infeasible (has no feasible solution), MOSEK will report a certificate of dual infeasibility: The primal solution reported is a certificate of infeasibility, and the dual solution is undefined.

A certificate of dual infeasibility is a feasible solution to the problem

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & c^{T}x &  & \\\nonumber{}\mbox{subject to} &  &  & Ax-x^{c} & = & 0,\\\nonumber{} & \bar{l}^{c} & \leq{} & x^{c} & \leq{} & \bar{u}^{c},\\\nonumber{} & \bar{l}^{x} & \leq{} & x & \leq{} & \bar{u}^{x}\end{array}\end{math} (7.1.4)

where

\begin{displaymath}\nonumber{}\bar{l}_{i}^{c}:=\left\lbrace{}\begin{array}{ll}\nonumber{}0, & \mbox{if }l_{i}^{c}>-\infty ,\\\nonumber{}-\infty , & \mbox{otherwise}\end{array}\right.\quad{}\mbox{and}\quad{}\bar{u}_{i}^{c}:=\left\lbrace{}\begin{array}{ll}\nonumber{}0, & \mbox{if }u_{i}^{c}<\infty ,\\\nonumber{}\infty , & \mbox{otherwise}\end{array}\right.\end{displaymath}

and

\begin{displaymath}\nonumber{}\bar{l}_{j}^{x}:=\left\lbrace{}\begin{array}{ll}\nonumber{}0, & \mbox{if }l_{j}^{x}>-\infty ,\\\nonumber{}-\infty , & \mbox{otherwise}\end{array}\right.\quad{}\mbox{and}\quad{}\bar{u}_{j}^{x}:=\left\lbrace{}\begin{array}{ll}\nonumber{}0, & \mbox{if }u_{j}^{x}<\infty ,\\\nonumber{}\infty , & \mbox{otherwise}\end{array}\right.\end{displaymath}

so that the objective value $c^{T}x$ is negative. Such a solution implies that (7.1.4) is unbounded, and that the dual of (7.1.4) is infeasible.

We note that the dual of (7.1.4) is a problem whose constraints are identical to the constraints of the original dual problem (7.1.2): if the dual of (7.1.4) is infeasible, so is the original dual problem.

7.1.2. Primal and dual infeasible case

In case that both the primal problem (7.1.1) and the dual problem (7.1.2) are infeasible, MOSEK will report only one of the two possible certificates — which one is not defined (MOSEK returns the first certificate found).

7.2. Quadratic and quadratically constrained optimization

A convex quadratic optimization problem is an optimization problem of the form

\begin{math}\nonumber{}\begin{array}{lrcccll}\nonumber{}\mbox{minimize} &  &  & \frac{1}{2}x^{T}Q^{o}x+c^{T}x+c^{f} &  &  & \\\nonumber{}\mbox{subject to} & l_{k}^{c} & \leq{} & \frac{1}{2}x^{T}Q^{k}x+\sum \limits _{{j=0}}^{{n-1}}a_{{k,j}}x_{j} & \leq{} & u_{k}^{c}, & k=0,\ldots ,m-1,\\\nonumber{} & l_{j}^{x} & \leq{} & x_{j} & \leq{} & u_{j}^{x}, & j=0,\ldots ,n-1,\end{array}\end{math} (7.2.1)

where the convexity requirement implies that $Q^{o}$ must be positive semi-definite and that each quadratic constraint must define a convex set; e.g. $Q^{k}$ must be positive semi-definite if $l_{k}^{c}=-\infty$ and $u_{k}^{c}$ is finite.

The convexity requirement is very important, and it is strongly recommended that MOSEK be applied to convex problems only.

7.2.1. A general recommendation

Any convex quadratic optimization problem can be reformulated as a conic optimization problem. It is our experience that for the majority of practical applications it is better to cast them as conic problems because

  • the resulting problem is convex by construction, and
  • the conic optimizer is more efficient than the optimizer for general quadratic problems.

See Section 7.3.3.1 for further details.

7.2.2. Reformulating as a separable quadratic problem

The simplest quadratic optimization problem is

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & 1/2x^{T}Qx+c^{T}x &  & \\\nonumber{}\mbox{subject to} & Ax & = & b,\\\nonumber{} & x\geq{}0. &  &\end{array}\end{math} (7.2.2)

The problem (7.2.2) is said to be a separable problem if Q is a diagonal matrix or, in other words, if the quadratic terms in the objective all have this form

\begin{displaymath}\nonumber{}x_{j}^{2}\end{displaymath}

instead of this form

\begin{displaymath}\nonumber{}x_{j}x_{i}.\end{displaymath}

The separable form has the following advantages:

  • It is very easy to check the convexity assumption, and
  • the simpler structure in a separable problem usually makes it easier to solve.

It is well-known that a positive semi-definite matrix Q can always be factorized, i.e. a matrix F exists so that

\begin{math}\nonumber{}Q=F^{T}F.\end{math} (7.2.3)

In many practical applications of quadratic optimization F is known explicitly; e.g. if Q is a covariance matrix, F may be the matrix of observations from which it is computed.

Using (7.2.3), the problem (7.2.2) can be reformulated as

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & 1/2y^{T}Iy+c^{T}x &  & \\\nonumber{}\mbox{subject to} & Ax & = & b,\\\nonumber{} & Fx-y & = & 0,\\\nonumber{} & x\geq{}0. &  &\end{array}\end{math} (7.2.4)

The problem (7.2.4) is also a quadratic optimization problem and has more constraints and variables than (7.2.2). However, it is separable. Normally, if F has fewer rows than columns, it is worthwhile to reformulate as a separable problem, as sketched below. Indeed, consider the extreme case where F has a single dense row: then Q is a completely dense matrix, whereas the separable formulation stores only F.
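The following sketch (Python/NumPy; A, b, c and the factor F are assumed given) assembles the data of the separable problem (7.2.4); comparing the non-zeros of F plus the extra identity block against the non-zeros of Q = F^T F gives a quick indication of whether the reformulation pays off:

    import numpy as np

    def separable_reformulation(A, b, c, F):
        """Data of (7.2.4), with variables ordered (x, y) and Q = F^T F."""
        m, n = A.shape
        k = F.shape[0]                       # number of rows (factors) in F
        A_new = np.block([[A, np.zeros((m, k))],
                          [F, -np.eye(k)]])  # Ax = b and Fx - y = 0
        b_new = np.concatenate([b, np.zeros(k)])
        c_new = np.concatenate([c, np.zeros(k)])
        Q_diag = np.concatenate([np.zeros(n), np.ones(k)])  # separable objective
        return A_new, b_new, c_new, Q_diag

    # rule of thumb: reformulate when nnz(F) + k is much smaller than nnz(F.T @ F)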

The idea presented above is applicable to quadratic constraints too. Now, consider the constraint

\begin{math}\nonumber{}1/2x^{T}(F^{T}F)x\leq{}b\end{math} (7.2.5)

where F is a matrix and b is a scalar. (7.2.5) can be reformulated as

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}1/2y^{T}Iy & \leq{} & b,\\\nonumber{}Fx-y & = & 0.\end{array}\end{displaymath}

It should be obvious how to generalize this idea to make any convex quadratic problem separable.

Next, consider the constraint

\begin{displaymath}\nonumber{}1/2x^{T}(D+F^{T}F)x\leq{}b\end{displaymath}

where D is a positive semi-definite matrix, F is a matrix, and b is a scalar. We assume that D has a simple structure, e.g. that D is a diagonal or a block diagonal matrix. If this is the case, it may be worthwhile performing the reformulation

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}1/2((x^{T}Dx)+y^{T}Iy) & \leq{} & b,\\\nonumber{}Fx-y & = & 0.\end{array}\end{displaymath}

Now, the question may arise: When should a quadratic problem be reformulated to make it separable or near separable? The simplest rule of thumb is that it should be reformulated if the number of non-zeros used to represent the problem decreases when reformulating the problem.

7.3. Conic optimization

Conic optimization can be seen as a generalization of linear optimization. Indeed a conic optimization problem is a linear optimization problem plus a constraint of the form

\begin{displaymath}\nonumber{}x\in{}\mathcal{C}\end{displaymath}

where $\mathcal{C}$ is a convex cone. A complete conic problem has the form

\begin{math}\nonumber{}\begin{array}{lccccc}\nonumber{}\mbox{minimize} &  &  & c^{T}x+c^{f} &  & \\\nonumber{}\mbox{subject to} & l^{c} & \leq{} & Ax & \leq{} & u^{c},\\\nonumber{} & l^{x} & \leq{} & x & \leq{} & u^{x},\\\nonumber{} &  &  & x\in{}\mathcal{C}. &  &\end{array}\end{math} (7.3.1)

The cone $\mathcal{C}$ can be a Cartesian product of p convex cones, i.e.

\begin{displaymath}\nonumber{}\mathcal{C}=\mathcal{C}_{1}\times \cdots \times \mathcal{C}_{p}\end{displaymath}

in which case $x$ can be written as

\begin{displaymath}\nonumber{}x=(x_{1},\ldots ,x_{p}),~x_{1}\in{}\mathcal{C}_{1},\ldots ,x_{p}\in{}\mathcal{C}_{p}\end{displaymath}

where each $x_{t}\in{}\mathbb{R}^{{n^{t}}}$. Please note that the n-dimensional Euclidean space $\mathbb{R}^{n}$ is a cone itself, so simple linear variables are still allowed.

MOSEK supports only a limited number of cones, specifically

\begin{displaymath}\nonumber{}\mathcal{C}=\mathcal{C}_{1}\times \cdots \times \mathcal{C}_{p}\end{displaymath}

where each $\mathcal{C}_{t}$ has one of three forms: the whole space $\mathbb{R}^{{n^{t}}}$, a quadratic cone, or a rotated quadratic cone, as defined in Section 7.3.1 below.

Although these cones may seem to provide only limited expressive power they can be used to model a large range of problems as demonstrated in Section 7.3.3.

7.3.1. Duality for conic optimization

The dual problem corresponding to the conic optimization problem (7.3.1) is given by

\begin{math}\nonumber{}\begin{array}{lccccc}\nonumber{}\mbox{maximize} & (l^{c})^{T}s_{l}^{c}-(u^{c})^{T}s_{u}^{c} &  & \\\nonumber{} & +(l^{x})^{T}s_{l}^{x}-(u^{x})^{T}s_{u}^{x}+c^{f} &  & \\\nonumber{}\mbox{subject to} & A^{T}y+s_{l}^{x}-s_{u}^{x}+s_{n}^{x} & = & c,\\\nonumber{} & -y+s_{l}^{c}-s_{u}^{c} & = & 0,\\\nonumber{} & s_{l}^{c},s_{u}^{c},s_{l}^{x},s_{u}^{x} & \geq{} & 0,\\\nonumber{} & s_{n}^{x}\in{}\mathcal{C}^{*} &  &\end{array}\end{math} (7.3.2)

where the dual cone $\mathcal{C}^{*}$ is a product of the cones

\begin{displaymath}\nonumber{}\mathcal{C}^{*}=\mathcal{C}^{*}_{1}\times \cdots \times \mathcal{C}^{*}_{p}\end{displaymath}

where each $\mathcal{C}^{*}_{t}$ is the dual cone of $\mathcal{C}_{t}$. For the cone types MOSEK can handle, the relation between the primal and dual cone is given as follows:

  • $\mathbb{R}^{{n^{t}}}$ set:

    \begin{displaymath}\nonumber{}\mathcal{C}_{t}=\left\lbrace{}x\in{}\mathbb{R}^{{n^{t}}}\right\rbrace{}\quad{}\Leftrightarrow \quad{}\mathcal{C}^{*}_{t}:=\left\lbrace{}s\in{}\mathbb{R}^{{n^{t}}}:~s=0\right\rbrace{}.\end{displaymath}
  • Quadratic cone:

    \begin{displaymath}\nonumber{}\mathcal{C}_{t}:=\left\lbrace{}x\in{}\mathbb{R}^{{n^{t}}}:x_{1}\geq{}\sqrt{\sum \limits _{{j=2}}^{{n^{t}}}x_{j}^{2}}\right\rbrace{}\quad{}\Leftrightarrow \quad{}\mathcal{C}^{*}_{t}=\mathcal{C}_{t}.\end{displaymath}
  • Rotated quadratic cone:

    \begin{displaymath}\nonumber{}\mathcal{C}_{t}:=\left\lbrace{}x\in{}\mathbb{R}^{{n^{t}}}:2x_{1}x_{2}\geq{}\sum \limits _{{j=3}}^{{n^{t}}}x_{j}^{2},~x_{1},x_{2}\geq{}0\right\rbrace{}\quad{}\Leftrightarrow \quad{}\mathcal{C}^{*}_{t}=\mathcal{C}_{t}.\end{displaymath}

Please note that the dual problem of the dual problem is identical to the original primal problem.

7.3.2. Infeasibility

In case MOSEK finds a problem to be infeasible it reports a certificate of the infeasibility. This works exactly as for linear problems (see Sections 7.1.1.3 and 7.1.1.4).

7.3.3. Examples

This section contains several examples of inequalities and problems that can be cast as conic optimization problems.

7.3.3.1. Quadratic objective and constraints

From Section 7.2.2 we know that any convex quadratic problem can be stated in the form

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & 0.5\left\|Fx\right\|^{2}+c^{T}x, &  & \\\nonumber{}\mbox{subject to} & 0.5\left\|Gx\right\|^{2}+a^{T}x & \leq{} & b,\end{array}\end{math} (7.3.3)

where F and G are matrices and c and a are vectors. For simplicity we assume that there is only one constraint, but it should be obvious how to generalize the methods to an arbitrary number of constraints.

Problem (7.3.3) can be reformulated as

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & 0.5\left\|t\right\|^{2}+c^{T}x, &  & \\\nonumber{}\mbox{subject to} & 0.5\left\|z\right\|^{2}+a^{T}x & \leq{} & b,\\\nonumber{} & Fx-t & = & 0,\\\nonumber{} & Gx-z & = & 0\end{array}\end{math} (7.3.4)

after the introduction of the new variables t and z. It is easy to convert this problem to a conic quadratic optimization problem, i.e.

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & v+c^{T}x, &  &  & \\\nonumber{}\mbox{subject to} & p+a^{T}x & = & b, & \\\nonumber{} & Fx-t & = & 0, & \\\nonumber{} & Gx-z & = & 0, & \\\nonumber{} & w & = & 1, & \\\nonumber{} & q & = & 1, & \\\nonumber{} & \left\|t\right\|^{2} & \leq{} & 2vw, & v,w\geq{}0,\\\nonumber{} & \left\|z\right\|^{2} & \leq{} & 2pq, & p,q\geq{}0.\end{array}\end{math} (7.3.5)

In this case we can model the last two inequalities using rotated quadratic cones.
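As a sanity check of the construction (a sketch, not MOSEK input; F, G, a, b are hypothetical data), the auxiliary variables of (7.3.5) can be assembled from any x that is feasible for (7.3.3), and the two rotated-cone memberships can be tested directly:

    import numpy as np

    def in_rotated_cone(v, w, t, tol=1e-9):
        """Membership test for ||t||^2 <= 2 v w with v, w >= 0."""
        return v >= 0 and w >= 0 and t @ t <= 2.0 * v * w + tol

    def lift_to_conic(F, G, a, b, x):
        """Auxiliary variables of (7.3.5) for a point x feasible in (7.3.3)."""
        t, z = F @ x, G @ x
        v, w = 0.5 * (t @ t), 1.0            # v reproduces the objective term
        p, q = b - a @ x, 1.0                # enforces p + a^T x = b
        assert in_rotated_cone(v, w, t)      # holds by construction
        assert in_rotated_cone(p, q, z)      # holds iff 0.5||Gx||^2 + a^T x <= b
        return t, z, v, w, p, q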

If we assume that F is a non-singular matrix — e.g. a diagonal matrix — then

\begin{displaymath}\nonumber{}x=F^{{-1}}t\end{displaymath}

and hence we can eliminate x from the problem to obtain:

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & v+c^{T}F^{{-1}}t, &  &  & \\\nonumber{}\mbox{subject to} & p+a^{T}F^{{-1}}t & = & b, & \\\nonumber{} & GF^{{-1}}t-z & = & 0, & \\\nonumber{} & w & = & 1, & \\\nonumber{} & q & = & 1, & \\\nonumber{} & \left\|t\right\|^{2} & \leq{} & 2vw, & v,w\geq{}0,\\\nonumber{} & \left\|z\right\|^{2} & \leq{} & 2pq, & p,q\geq{}0.\end{array}\end{math} (7.3.6)

In most cases MOSEK performs this reduction automatically during the presolve phase before the optimization is performed.

7.3.3.2. Minimizing a sum of norms

The next example is the problem of minimizing a sum of norms, i.e. the problem

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & \sum \limits _{{i=1}}^{{k}}\left\|x^{i}\right\| &  & \\\nonumber{}\mbox{subject to} & Ax & = & b,\end{array}\end{math} (7.3.7)

where

\begin{displaymath}\nonumber{}x:=\left[\begin{array}{c}\nonumber{}x^{1}\\\nonumber{}\vdots \\\nonumber{}x^{k}\end{array}\right].\end{displaymath}

This problem is equivalent to

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{i=1}}^{{k}}z_{i} &  &  & \\\nonumber{}\mbox{subject to} & Ax & = & b, & \\\nonumber{} & \left\|x^{i}\right\| & \leq{} & z_{i}, & i=1,\ldots ,k,\end{array}\end{math} (7.3.8)

which in turn is equivalent to

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & \sum \limits _{{i=1}}^{{k}}z_{i} &  & \\\nonumber{}\mbox{subject to} & Ax & = & b,\\\nonumber{} & (z_{i},x^{i})\in{}\mathcal{C}_{i}, &  & i=1,\ldots ,k\end{array}\end{math} (7.3.9)

where all $\mathcal{C}_{i}$ are of the quadratic type, i.e.

\begin{displaymath}\nonumber{}\mathcal{C}_{i}:=\left\lbrace{}(z_{i},x^{i}):~z_{i}\geq{}\left\|x^{i}\right\|\right\rbrace{}.\end{displaymath}

The dual problem corresponding to (7.3.9) is

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{maximize} & b^{T}y &  &  & \\\nonumber{}\mbox{subject to} & A^{T}y+s & = & c, & \\\nonumber{} & t_{i} & = & 1, & i=1,\ldots ,k,\\\nonumber{} & (t_{i},s^{i})\in{}\mathcal{C}_{i}, &  &  & i=1,\ldots ,k\end{array}\end{math} (7.3.10)

where

\begin{displaymath}\nonumber{}s:=\left[\begin{array}{c}\nonumber{}s^{1}\\\nonumber{}\vdots \\\nonumber{}s^{k}\end{array}\right].\end{displaymath}

This problem is equivalent to

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{maximize} & b^{T}y &  &  & \\\nonumber{}\mbox{subject to} & A^{T}y+s & = & c, & \\\nonumber{} & \left\|s^{i}\right\|_{2}^{2} & \leq{} & 1, & i=1,\ldots ,k.\end{array}\end{math} (7.3.11)

Please note that in this case the dual problem can be reduced to an “ordinary” convex quadratically constrained optimization problem due to the special structure of the primal problem. In some cases it turns out that it is much better to solve the dual problem (7.3.10) rather than the primal problem (7.3.9).
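To make this concrete, here is a sketch of the primal formulation (7.3.8) using the cvxpy modelling package as a stand-in for any conic-capable interface (the data A, b and the block structure are invented toy values; cvxpy converts each norm bound into a quadratic cone of the type above):

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((3, 6)), rng.standard_normal(3)
    blocks = [slice(0, 2), slice(2, 4), slice(4, 6)]   # x = (x^1, x^2, x^3)

    x, z = cp.Variable(6), cp.Variable(3)
    constraints = [A @ x == b]
    constraints += [cp.norm(x[s]) <= z[i] for i, s in enumerate(blocks)]
    cp.Problem(cp.Minimize(cp.sum(z)), constraints).solve()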

7.3.3.3. Modelling polynomial terms using conic optimization

Generally an arbitrary polynomial term of the form

\begin{displaymath}\nonumber{}fx^{g}\end{displaymath}

cannot be represented by conic quadratic constraints; however, in the following we demonstrate some special cases where it is possible.

A particularly simple polynomial term is the reciprocal, i.e.

\begin{displaymath}\nonumber{}\frac{1}{x}.\end{displaymath}

Now, a constraint of the form

\begin{displaymath}\nonumber{}\frac{1}{x}\leq{}y\end{displaymath}

where x>0 is required, is equivalent to

\begin{displaymath}\nonumber{}1\leq{}xy\mbox{ and }x>0\end{displaymath}

which in turn is equivalent to

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}z & = & \sqrt{2},\\\nonumber{}z^{2} & \leq{} & 2xy.\end{array}\end{displaymath}

The last formulation is a conic constraint plus a simple linear equality.
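A quick numerical check of this encoding (a sketch in plain Python; the tolerance is arbitrary):

    import math

    def reciprocal_ok(x, y, tol=1e-12):
        """1/x <= y with x > 0, encoded as z = sqrt(2), z^2 <= 2 x y."""
        z = math.sqrt(2.0)
        return x > 0 and z * z <= 2.0 * x * y + tol

    assert reciprocal_ok(2.0, 0.5)        # 1/2 <= 1/2 holds
    assert not reciprocal_ok(2.0, 0.4)    # 1/2 <= 0.4 fails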

For example, consider the problem

\begin{displaymath}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & c^{T}x &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{j=1}}^{n}\frac{f_{j}}{x_{j}} & \leq{} & b,\\\nonumber{} & x\geq{}0, &  &\end{array}\end{displaymath}

where it is assumed that $f_{j}\geq{}0$ and b>0. This problem is equivalent to

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & c^{T}x &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{j=1}}^{n}f_{j}z_{j} & = & b, & \\\nonumber{} & v_{j} & = & \sqrt{2}, & j=1,\ldots ,n,\\\nonumber{} & v_{j}^{2} & \leq{} & 2z_{j}x_{j}, & j=1,\ldots ,n,\\\nonumber{} & x,z\geq{}0, &  &  &\end{array}\end{math} (7.3.12)

because

\begin{displaymath}\nonumber{}v_{j}^{2}=2\leq{}2z_{j}x_{j}\end{displaymath}

implies that

\begin{displaymath}\nonumber{}\frac{1}{x_{j}}\leq{}z_{j}\mbox{ and }\sum \limits _{{j=1}}^{n}\frac{f_{j}}{x_{j}}\leq{}\sum \limits _{{j=1}}^{n}f_{j}z_{j}=b.\end{displaymath}

The problem (7.3.12) is a conic quadratic optimization problem having n 3-dimensional rotated quadratic cones.

The next example is the constraint

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}\sqrt{x} & \geq{} & |t|,\\\nonumber{}x & \geq{} & 0,\end{array}\end{displaymath}

where both t and x are variables. This set is identical to the set

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}t^{2} & \leq{} & 2xz,\\\nonumber{}z & = & 0.5,\\\nonumber{}x,z & \geq{} & 0.\end{array}\end{math} (7.3.13)

Occasionally, when modeling the market impact term in portfolio optimization, the polynomial term $x^{{1.5}}$ occurs. Therefore, consider the set defined by the inequalities

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}x^{{1.5}} & \leq{} & t,\\\nonumber{}0 & \leq{} & x.\end{array}\end{math} (7.3.14)

We will exploit that $x^{{1.5}}=\frac{x^{2}}{\sqrt{x}}$ for $x>0$. First define the set

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}x^{2} & \leq{} & 2st,\\\nonumber{}s,t & \geq{} & 0.\end{array}\end{math} (7.3.15)

Now, if we can make sure that

\begin{displaymath}\nonumber{}2s\leq{}\sqrt{x},\end{displaymath}

then we have the desired result since this implies that

\begin{displaymath}\nonumber{}x^{{1.5}}=\frac{x^{2}}{\sqrt{x}}\leq{}\frac{x^{2}}{2 s}\leq{}t.\end{displaymath}

Please note that s can be chosen freely and that $2s=\sqrt{x}$ is a valid choice.

Let

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}x^{2} & \leq{} & 2st,\\\nonumber{}w^{2} & \leq{} & 2vr,\\\nonumber{}x & = & v,\\\nonumber{}s & = & w,\\\nonumber{}r & = & \frac{1}{8},\\\nonumber{}s,t,v,r & \geq{} & 0,\end{array}\end{math} (7.3.16)

then

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}s^{2} & = & w^{2}\\\nonumber{} & \leq{} & 2vr\\\nonumber{} & = & \frac{v}{4}\\\nonumber{} & = & \frac{x}{4}.\end{array}\end{displaymath}

Moreover,

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}x^{2} & \leq{} & 2st,\\\nonumber{} & \leq{} & 2\sqrt{\frac{x}{4}}t\end{array}\end{displaymath}

leading to the conclusion that

\begin{displaymath}\nonumber{}x^{{1.5}}\leq{}t.\end{displaymath}

(7.3.16) is a conic reformulation which is equivalent to (7.3.14). Please note that the constraint $x\geq{}0$ does not appear explicitly in (7.3.15) and (7.3.16), but is implied because $x=v\geq{}0$.
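The derivation can be replayed numerically. The sketch below (plain Python; tolerances arbitrary) chooses the auxiliary variables of (7.3.16) for a given x with the choice $2s=\sqrt{x}$ and confirms that t = x^{1.5} is then feasible:

    import math

    def witness_7316(x):
        """Feasible point of (7.3.16) attaining the smallest t for given x."""
        s = math.sqrt(x) / 2.0           # the valid choice 2s = sqrt(x)
        v, w, r = x, s, 1.0 / 8.0        # x = v, s = w, r = 1/8
        t = x ** 1.5
        assert w * w <= 2.0 * v * r + 1e-12   # w^2 <= 2vr
        assert x * x <= 2.0 * s * t + 1e-9    # x^2 <= 2st, with equality here
        return s, t, v, w, r

    witness_7316(3.7)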

As we shall see next, any polynomial term of the form $x^{g}$, where g is a positive rational number, can be represented using conic quadratic constraints [11, pp. 12-13], [4].

7.3.3.4. Optimization with rational polynomials

We next demonstrate how to model convex polynomial constraints of the form $x^{{p/q}}\leq{}t$ (where p and q are both positive integers) as a set of rotated quadratic cone constraints.

Following Ben-Tal et al. [4, p. 105] we use an intermediate result, namely that the set

\begin{displaymath}\nonumber{}\lbrace{}s\in{}\mathbb{R},\,y\in{}\mathbb{R}^{{2^{l}}}_{+}\ |\ s\leq{}(2^{{l2^{{l-1}}}}y_{1}y_{2}\cdots y_{{2^{l}}})^{{1/2^{l}}}\rbrace{}\end{displaymath}

is convex and can be represented as a set of rotated quadratic cone constraints. To see this, we rewrite the condition (exemplified for l=3),

\begin{math}\nonumber{}s\leq{}\left(2^{{12}}\cdot y_{1}\cdot y_{2}\cdot y_{3}\cdot y_{4}\cdot y_{5}\cdot y_{6}\cdot y_{7}\cdot y_{8}\right)^{{1/8}}\end{math} (7.3.17)

as

\begin{math}\nonumber{}s^{8}\leq{}\left(2^{{12}}\cdot y_{1}\cdot y_{2}\cdot y_{3}\cdot y_{4}\cdot y_{5}\cdot y_{6}\cdot y_{7}\cdot y_{8}\right)\end{math} (7.3.18)

since all $y_{i}\geq{}0$. We next introduce l levels of auxiliary variables and (rotated cone) constraints

\begin{math}\nonumber{}y_{{11}}^{2}\leq{}2y_{1}y_{2},\quad{}y_{{12}}^{2}\leq{}2y_{3}y_{4},\quad{}y_{{13}}^{2}\leq{}2y_{5}y_{6},\quad{}y_{{14}}^{2}\leq{}2y_{7}y_{8},\end{math} (7.3.19)
\begin{math}\nonumber{}y_{{21}}^{2}\leq{}2y_{{11}}y_{{12}},\quad{}y_{{22}}^{2}\leq{}2y_{{13}}y_{{14}},\end{math} (7.3.20)

and finally

\begin{math}\nonumber{}s^{2}\leq{}2y_{{21}}y_{{22}}.\end{math} (7.3.21)

By simple substitution we see that (7.3.19)-(7.3.21) together are equivalent to (7.3.18), and since they involve only a set of simple rotated conic constraints, the original constraint (7.3.17) can be represented using rotated conic constraints only.
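The cascade is just a binary tree over the y variables. The sketch below (Python/NumPy, for nonnegative inputs) computes the largest s permitted by the levels (7.3.19)-(7.3.21) and confirms that it matches the closed form in (7.3.17):

    import numpy as np

    def max_s(y):
        """Largest s satisfying the rotated-cone cascade over y (length 2^l)."""
        level = np.asarray(y, dtype=float)
        while len(level) > 1:
            # a pair (a, b) allows a parent value of at most sqrt(2ab)
            level = np.sqrt(2.0 * level[0::2] * level[1::2])
        return level[0]

    y = np.random.rand(8) + 0.1
    closed_form = (2.0 ** 12 * np.prod(y)) ** (1.0 / 8)   # l = 3
    assert abs(max_s(y) - closed_form) < 1e-9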

7.3.3.5. Convex increasing power functions

Using the intermediate result in section 7.3.3.4 we can include convex power functions with positive rational powers, i.e., constraints of the form

\begin{displaymath}\nonumber{}x^{{p/q}}\leq{}t,\quad{}x\geq{}0\end{displaymath}

where p and q are positive integers and $p/q\geq{}1$. For example, consider the constraints

\begin{displaymath}\nonumber{}x^{{5/3}}\leq{}t,\quad{}x\geq{}0.\end{displaymath}

We rewrite it as

\begin{displaymath}\nonumber{}x^{8}\leq{}x^{3}t^{3},\quad{}x\geq{}0\end{displaymath}

which in turn is equivalent to

\begin{displaymath}\nonumber{}x^{8}\leq{}2^{{12}}y_{1}y_{2}\cdots y_{8},\quad{}x=y_{1}=y_{2}=y_{3},\quad{}y_{4}=y_{5}=y_{6}=t,\quad{}y_{7}=1,\quad{}y_{8}=2^{{-12}},\quad{}x,y_{i}\geq{}0,\end{displaymath}

i.e., it can be represented as a set of rotated conic and linear constraints using the reformulation above.

For general p and q we choose l as the smallest integer such that $2^{l}\geq{}p$, and we construct the problem as

\begin{displaymath}\nonumber{}x^{{2^{l}}}\leq{}2^{{l2^{{l-1}}}}y_{1}y_{2}\cdots y_{{2^{l}}},\quad{}x,y_{i}\geq{}0,\end{displaymath}

with the first $2^{l}-p$ elements of y set to x, the next q elements set to t, and the product of the remaining elements fixed to $2^{{-l2^{{l-1}}}}$, i.e.,

\begin{displaymath}\nonumber{}x^{{2^{l}}}\leq{}x^{{2^{l}-p}}t^{q},\quad{}x\geq{}0\qquad{}\Longleftrightarrow \qquad{}x^{{p/q}}\leq{}t,\quad{}x\geq{}0.\end{displaymath}
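The bookkeeping above is easy to automate. A sketch (plain Python; the helper name is invented) that produces l and the y assignment for $x^{{p/q}}\leq{}t$ with p > q, and checks the product identity at the boundary t = x^{p/q}:

    import math, random

    def y_assignment(p, q, x, t):
        """y for x^(p/q) <= t with p > q >= 1: 2^l - p entries x, q entries
        t, and p - q entries whose product is 2^(-l 2^(l-1))."""
        l = math.ceil(math.log2(p))               # smallest l with 2^l >= p
        rest = [1.0] * (p - q - 1) + [2.0 ** (-l * 2 ** (l - 1))]
        return l, [x] * (2 ** l - p) + [t] * q + rest

    x = random.uniform(0.1, 2.0)
    p, q = 5, 3
    l, y = y_assignment(p, q, x, x ** (p / q))    # boundary case t = x^(p/q)
    lhs = x ** 2 ** l
    rhs = 2.0 ** (l * 2 ** (l - 1)) * math.prod(y)
    assert abs(lhs - rhs) <= 1e-9 * rhs           # x^(2^l) = 2^(l 2^(l-1)) prod(y)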

7.3.3.6. Decreasing power functions

We can also include decreasing power functions with positive rational powers

\begin{displaymath}\nonumber{}x^{{-p/q}}\leq{}t,\quad{}x\geq{}0\end{displaymath}

where p and q are positive integers. For example, consider

\begin{displaymath}\nonumber{}x^{{-5/2}}\leq{}t,\quad{}x\geq{}0,\end{displaymath}

or equivalently

\begin{displaymath}\nonumber{}1\leq{}x^{5}t^{2},\quad{}x\geq{}0,\end{displaymath}

which, in turn, can be rewritten as

\begin{displaymath}\nonumber{}s^{8}\leq{}2^{{12}}y_{1}y_{2}\cdots y_{8},\quad{}s=2^{{3/2}},\quad{}y_{1}=\cdots =y_{5}=x,\quad{}y_{6}=y_{7}=t,\quad{}y_{8}=1,\quad{}x,y_{i}\geq{}0.\end{displaymath}

For general p and q we choose l as the smallest integer such that $2^{l}\geq{}p+q$, and we construct the problem as

\begin{displaymath}\nonumber{}s^{{2^{l}}}\leq{}2^{{l2^{{l-1}}}}y_{1}y_{2}\cdots y_{{2^{l}}},\quad{}y_{i}\geq{}0,\end{displaymath}

with $s=2^{{l/2}}$ and the first p elements of y set to x, the next q elements set to t, and the remaining elements set to 1, i.e.,

\begin{displaymath}\nonumber{}1\leq{}x^{p}t^{q},\quad{}x\geq{}0\qquad{}\Longleftrightarrow \qquad{}x^{{-p/q}}\leq{}t,\quad{}x\geq{}0.\end{displaymath}

7.3.3.7. Minimizing general polynomials

Using the formulations in section 7.3.3.5 and section 7.3.3.6 it is straightforward to minimize sums of the power terms treated above. For example, we can minimize

\begin{displaymath}\nonumber{}f(x)=x^{2}+x^{{-2}}\end{displaymath}

which is used in statistical matching. We first formulate the problem

\begin{displaymath}\nonumber{}\begin{array}{ll}\nonumber{}\mbox{minimize} & u+v\\\nonumber{}\mbox{subject to} & x^{2}\leq{}u\\\nonumber{} & x^{{-2}}\leq{}v,\end{array}\end{displaymath}

which is equivalent to the conic quadratic optimization problem

\begin{displaymath}\nonumber{}\begin{array}{ll}\nonumber{}\mbox{minimize} & u+v\\\nonumber{}\mbox{subject to} & x^{2}\leq{}2uw\\\nonumber{} & s^{2}\leq{}2y_{{21}}y_{{22}}\\\nonumber{} & y_{{21}}^{2}\leq{}2y_{1}y_{2}\\\nonumber{} & y_{{22}}^{2}\leq{}2y_{3}y_{4}\\\nonumber{} & w=\frac{1}{2}\\\nonumber{} & s=2\\\nonumber{} & y_{1}=y_{2}=x\\\nonumber{} & y_{3}=v\\\nonumber{} & y_{4}=1\end{array}\end{displaymath}

in the variables $(x,u,v,w,s,y_{1},\ldots ,y_{4},y_{{21}},y_{{22}})$.

7.3.3.8. Further reading

If you want to learn more about what can be modeled as a conic optimization problem we recommend the references [11, 4, 18].

7.3.4. Potential pitfalls in conic optimization

While a linear optimization problem that is feasible and bounded always attains its optimal value, the conic case is not as simple as that, as the following examples show.

7.3.4.1. Non-attainment in the primal problem

Consider the example

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & z &  & \\\nonumber{}\mbox{subject to} & 2yz & \geq{} & x^{2},\\\nonumber{} & x & = & \sqrt{2},\\\nonumber{} & y,z & \geq{} & 0,\end{array}\end{math} (7.3.22)

which corresponds to the problem

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & \frac{1}{y} &  & \\\nonumber{}\mbox{subject to} & y & \geq{} & 0.\end{array}\end{math} (7.3.23)

Clearly, the infimum of the objective is zero, but it is not attained by any finite y.

7.3.4.2. Non-attainment in the dual problem

Next, consider the example

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & x_{4} &  & \\\nonumber{}\mbox{subject to} & x_{3}+x_{4} & = & 1,\\\nonumber{} & x_{1} & = & 0,\\\nonumber{} & x_{2} & = & 1,\\\nonumber{} & 2x_{1}x_{2} & \geq{} & x_{3}^{2},\\\nonumber{} & x_{1},x_{2} & \geq{} & 0,\end{array}\end{math} (7.3.24)

which has the optimal solution

\begin{displaymath}\nonumber{}x_{1}^{*}=0,~x_{2}^{*}=1,~x_{3}^{*}=0\mbox{ and }x_{4}^{*}=1\end{displaymath}

implying that the optimal primal objective value is 1.

Now, the dual problem corresponding to (7.3.24) is

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{maximize} & y_{1}+y_{3} &  & \\\nonumber{}\mbox{subject to} & y_{2}+s_{1} & = & 0,\\\nonumber{} & y_{3}+s_{2} & = & 0,\\\nonumber{} & y_{1}+s_{3} & = & 0,\\\nonumber{} & y_{1} & = & 1,\\\nonumber{} & 2s_{1}s_{2} & \geq{} & s_{3}^{2},\\\nonumber{} & s_{1},s_{2} & \geq{} & 0.\end{array}\end{math} (7.3.25)

Therefore,

\begin{displaymath}\nonumber{}y_{1}^{*}=1\end{displaymath}

and

\begin{displaymath}\nonumber{}s_{3}^{*}=-1.\end{displaymath}

This implies that

\begin{displaymath}\nonumber{}2s_{1}^{*}s_{2}^{*}\geq{}(s_{3}^{*})^{2}=1\end{displaymath}

and hence $s_{1}^{*},s_{2}^{*}>0$. Given this fact we can conclude that

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}y_{1}^{*}+y_{3}^{*} & = & 1-s_{2}^{*}\\\nonumber{} & < & 1\end{array}\end{displaymath}

implying that the supremum of the dual objective value is 1; however, it is never attained. Hence, no primal-dual optimal solution with zero duality gap exists. Of course it is possible to find a primal-dual feasible solution such that the duality gap is close to zero, but then $s_{1}^{*}$ will be correspondingly large. This is likely to make the problem (7.3.24) hard to solve.

An inspection of the problem (7.3.24) reveals the constraint $x_{1}=0$, which implies that $x_{3}=0$. If we either add the redundant constraint

\begin{displaymath}\nonumber{}x_{3}=0\end{displaymath}

to the problem (7.3.24) or eliminate the variables $x_{1}$ and $x_{3}$ from the problem, it becomes easy to solve.

7.4. Recommendations

Often an optimization problem can be formulated in several different ways, and the exact formulation used may have a significant impact on the solution time and the quality of the solution. In some cases the difference between a “good” and a “bad” formulation can make the difference between being able to solve the problem and not.

Below is a list of several issues that you should be aware of when developing a good formulation.

  1. Sparsity is very important. The constraint matrix A is assumed to be a sparse matrix, where sparse means that it contains many zeros (typically less than 10% non-zeros). Normally, when A is sparser, less memory is required to store the problem and it can be solved faster.
  2. Avoid large bounds as these can introduce all sorts of numerical problems. Assume that a variable $x_{j}$ has the bounds

    \begin{displaymath}\nonumber{}0.0\leq{}x_{j}\leq{}1.0e16.\end{displaymath}

    The number 1.0e16 is large and it is very likely that the constraint $x_{j}\leq{}1.0e16$ is non-binding at optimum, and therefore that the bound 1.0e16 will not cause problems. Unfortunately, this is a naïve assumption because the bound 1.0e16 may actually affect the presolve, the scaling, the computation of the dual objective value, etc. In this case the constraint $0.0\leq{}x_{j}$ is likely to be sufficient, i.e. 1.0e16 is just a way of representing infinity.

  3. Avoid large penalty terms in the objective, i.e. do not have large terms in the linear part of the objective function. They will most likely cause numerical problems.
  4. On a computer all computations are performed in finite precision, which implies that

    \begin{displaymath}\nonumber{}1=1+\varepsilon\end{displaymath}

    where $\varepsilon$ is about $10^{{-16}}$. This means that the results of all computations are truncated, which causes rounding errors. The upshot is that very small and very large numbers should be avoided; e.g. it is recommended that all elements in A either are zero or belong to the interval $[10^{{-6}},10^{6}]$ in absolute value. The same holds for the bounds and the linear objective.

  5. Decreasing the number of variables or constraints does not necessarily make it easier to solve a problem. In certain cases, i.e. in nonlinear optimization, it may be a good idea to introduce more constraints and variables if it makes the model separable. Furthermore, a big but sparse problem may be advantageous compared to a smaller but denser problem.
  6. Try to avoid linearly dependent rows among the linear constraints. Network flow problems and multi-commodity network flow problems, for example, often contain one or more linearly dependent rows.
  7. Finally, it is recommended to consult some of the papers about preprocessing to get some ideas about efficient formulations. See e.g. [10, 5, 24, 22].

7.4.1. Avoid near infeasible models

Consider the linear optimization problem

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & 0 &  & \\\nonumber{}\mbox{subject to} & x+y & \leq{} & 10^{{-10}}+\alpha ,\\\nonumber{} & 1.0e4x+2.0e4y & \geq{} & 10^{{-6}},\\\nonumber{} & x,y\geq{}0. &  &\end{array}\end{math} (7.4.1)

Clearly, the problem is feasible for $\alpha =0$. However, for $\alpha =-10^{{-10}}$ the problem is infeasible. This implies that an insignificant change in the right-hand side of the constraints makes the problem status switch from feasible to infeasible. Such a model should be avoided.

7.5. Examples continued

7.5.1. The absolute value

Assume that we have a constraint of the form

\begin{math}\nonumber{}|f^{T}x+g|\leq{}b\end{math} (7.5.1)

where $x\in{}\mathbb{R}^{n}$ is a vector of variables, and $f\in{}\mathbb{R}^{n}$ and $g\in{}\mathbb{R}$ are constants.

It is easy to verify that the constraint (7.5.1) is equivalent to

\begin{math}\nonumber{}-b\leq{}f^{T}x+g\leq{}b\end{math} (7.5.2)

which is a set of ordinary linear inequality constraints.

Please note that equalities involving an absolute value such as

\begin{displaymath}\nonumber{}|x|=1\end{displaymath}

cannot be formulated as a linear or even as a convex nonlinear optimization problem. It requires integer constraints.

7.5.2. The Markowitz portfolio model

In this section we will show how to model several versions of the Markowitz portfolio model using conic optimization.

The Markowitz portfolio model deals with the problem of selecting a portfolio of assets, i.e. stocks, bonds, etc. The goal is to find a portfolio such that for a given return the risk is minimized. The assumptions are:

  • A portfolio can consist of n traded assets numbered $1,\ldots ,n$ held over a period of time.
  • $w_{j}^{0}$ is the initial holding of asset j, where $j=1,\ldots ,n$.
  • $r_{j}$ is the return on asset j and is assumed to be a random variable. r has a known mean $\bar{r}$ and covariance $\Sigma$.

The variable $x_{j}$ denotes the amount of asset j traded in the given period of time and has the following meaning:

  • If $x_{j}>0$, then the amount of asset j is increased (by purchasing).
  • If $x_{j}<0$, then the amount of asset j is decreased (by selling).

The model deals with two central quantities:

  • Expected return:

    \begin{displaymath}\nonumber{}E[r^{T}(w^{0}+x)]=\bar{r}^{T}(w^{0}+x).\end{displaymath}
  • Variance (Risk):

    \begin{displaymath}\nonumber{}V[r^{T}(w^{0}+x)]=(w^{0}+x)^{T}\Sigma (w^{0}+x).\end{displaymath}

By definition $\Sigma$ is positive semi-definite and

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}\mbox{Std. dev.} & = & \left\|\Sigma ^{{\frac{1}{2}}}(w^{0}+x)\right\|\\\nonumber{} & = & \left\|L^{T}(w^{0}+x)\right\|\end{array}\end{displaymath}

where L is any matrix such that

\begin{displaymath}\nonumber{}\Sigma =LL^{T}\end{displaymath}

A low rank of $\Sigma$ is advantageous from a computational point of view. A valid L can always be computed as the Cholesky factorization of $\Sigma$, as sketched below.
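For instance, using NumPy (a sketch; the small diagonal shift is an assumed workaround for the case where $\Sigma$ is positive semi-definite but not definite):

    import numpy as np

    def factor(Sigma, eps=0.0):
        """Lower-triangular L with L @ L.T == Sigma via Cholesky.

        Plain Cholesky requires positive definiteness; for a merely
        positive semi-definite Sigma, a tiny shift eps > 0 is one
        pragmatic workaround (an eigenvalue factorization is another).
        If Sigma = V @ V.T comes from observations V, use V directly."""
        n = Sigma.shape[0]
        return np.linalg.cholesky(Sigma + eps * np.eye(n))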

7.5.2.1. Minimizing variance for a given return

In our first model we want to minimize the variance while selecting a portfolio with a specified expected target return t. Additionally, the portfolio must satisfy the budget (self-financing) constraint asserting that the total amount of assets sold must equal the total amount of assets purchased. This is expressed in the model

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & V[r^{T}(w^{0}+x)] &  & \\\nonumber{}\mbox{subject to} & E[r^{T}(w^{0}+x)] & = & t,\\\nonumber{} & e^{T}x & = & 0,\end{array}\end{math} (7.5.3)

where $e:=(1,\ldots ,1)^{T}$. Using the definitions above this may be formulated as a quadratic optimization problem:

\begin{math}\nonumber{}\begin{array}{lccl}\nonumber{}\mbox{minimize} & (w^{0}+x)^{T}\Sigma (w^{0}+x) &  & \\\nonumber{}\mbox{subject to} & \bar{r}^{T}(w^{0}+x) & = & t,\\\nonumber{} & e^{T}x & = & 0.\end{array}\end{math} (7.5.4)

7.5.2.2. Conic quadratic reformulation

An equivalent conic quadratic reformulation is given by:

\begin{math}\nonumber{}\begin{array}{lcccl}\nonumber{}\mbox{minimize} & f &  & \\\nonumber{}\mbox{subject to} & \Sigma ^{{\frac{1}{2}}}(w^{0}+x)-g & = & 0,\\\nonumber{} & \bar{r}^{T}(w^{0}+x) & = & t,\\\nonumber{} & e^{T}x & = & 0,\\\nonumber{} & f\geq{}\left\|g\right\|. &  &\end{array}\end{math} (7.5.5)

Here we minimize the standard deviation instead of the variance. Please note that $\Sigma ^{{\frac{1}{2}}}$ can be replaced by $L^{T}$ for any matrix L with $LL^{T}=\Sigma$. A low rank L is computationally advantageous.
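A sketch of (7.5.5) in the cvxpy modelling package (illustrative only; all data below are invented toy values, and cvxpy turns the norm objective into a quadratic cone constraint of the form f >= ||g|| internally):

    import cvxpy as cp
    import numpy as np

    n = 4
    rng = np.random.default_rng(0)
    V = rng.standard_normal((6, n))                          # hypothetical observations
    L = np.linalg.cholesky(V.T @ V / 6 + 1e-6 * np.eye(n))   # Sigma = L L^T
    rbar = rng.uniform(1.0, 1.1, n)                          # expected returns
    w0 = np.full(n, 1.0 / n)                                 # initial holdings
    t = 1.05                                                 # target return

    x = cp.Variable(n)
    risk = cp.norm(L.T @ (w0 + x))                           # std. dev. of the portfolio
    cp.Problem(cp.Minimize(risk),
               [rbar @ (w0 + x) == t, cp.sum(x) == 0]).solve()
    print(x.value)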

7.5.2.3. Transaction costs with market impact term

We will now expand our model to include transaction costs as a fraction of the traded volume. [15, pp. 445-475] argues that transaction costs can be modeled as follows

\begin{math}\nonumber{}\mbox{commission}+\mbox{bid/ask spread}+\theta \sqrt{\frac{\mbox{trade volume}}{\mbox{daily volume}}},\end{math} (7.5.6)

and that it is important to incorporate these into the model.

In the following we deal with the last of these terms denoted the market impact term. If you sell (buy) a lot of assets the price is likely to go down (up). This can be captured in the market impact term

\begin{displaymath}\nonumber{}\theta \sqrt{\frac{\mbox{trade volume}}{\mbox{daily volume}}}\approx{}m_{j}\sqrt{|x_{j}|}.\end{displaymath}

The $\theta$ and “daily volume” have to be estimated in some way, i.e.

\begin{displaymath}\nonumber{}m_{j}=\frac{\theta }{\sqrt{\mbox{daily volume}}}\end{displaymath}

has to be estimated. The market impact term $m_{j}\sqrt{|x_{j}|}$ gives the cost as a fraction of the traded volume. Therefore, the total cost when trading an amount $x_{j}$ of asset j is given by

\begin{displaymath}\nonumber{}|x_{j}|(m_{j}|x_{j}|^{{\frac{1}{2}}}).\end{displaymath}

This leads us to the model:

\begin{math}\nonumber{}\begin{array}{lcccl}\nonumber{}\mbox{minimize} & f &  & \\\nonumber{}\mbox{subject to} & \Sigma ^{{\frac{1}{2}}}(w^{0}+x)-g & = & 0,\\\nonumber{} & \bar{r}^{T}(w^{0}+x) & = & t,\\\nonumber{} & e^{T}x+e^{T}y & = & 0,\\\nonumber{} & |x_{j}|(m_{j}|x_{j}|^{{\frac{1}{2}}}) & \leq{} & y_{j},\\\nonumber{} & f\geq{}\left\|g\right\|. &  &\end{array}\end{math} (7.5.7)

Now, defining the variable transformation

\begin{displaymath}\nonumber{}y_{j}=m_{j}\bar{y}_{j}\end{displaymath}

we obtain

\begin{math}\nonumber{}\begin{array}{lcccl}\nonumber{}\mbox{minimize} & f &  & \\\nonumber{}\mbox{subject to} & \Sigma ^{{\frac{1}{2}}}(w^{0}+x)-g & = & 0,\\\nonumber{} & \bar{r}^{T}(w^{0}+x) & = & t,\\\nonumber{} & e^{T}x+m^{T}\bar{y} & = & 0,\\\nonumber{} & |x_{j}|^{{3/2}} & \leq{} & \bar{y}_{j},\\\nonumber{} & f\geq{}\left\|g\right\|. &  &\end{array}\end{math} (7.5.8)

As shown in Section 7.3.3.3 the set

\begin{displaymath}\nonumber{}|x_{j}|^{{3/2}}\leq{}\bar{y}_{j}\end{displaymath}

can be modeled by

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}x_{j} & \leq{} & z_{j},\\\nonumber{}-x_{j} & \leq{} & z_{j},\\\nonumber{}z_{j}^{2} & \leq{} & 2s_{j}\bar{y}_{j},\\\nonumber{}u_{j}^{2} & \leq{} & 2v_{j}q_{j},\\\nonumber{}z_{j} & = & v_{j},\\\nonumber{}s_{j} & = & u_{j},\\\nonumber{}q_{j} & = & \frac{1}{8},\\\nonumber{}s_{j},\bar{y}_{j},v_{j},q_{j} & \geq{} & 0.\end{array}\end{math} (7.5.9)

7.5.2.4. Further reading

For further reading please see [13] in particular, and [3] and [15], which also contain relevant material.
