Non-linear constrained minimization

I have to minimize a target function
T(x) = T(x1 ... xn)
under vectorial constraints of equality:
∀ i Ei(x) = 0
and inequality:
∀ i Ii(x) > 0
I'm looking for only one minimum, not necessarily the entire set of minima. Also, approximations are fine (I'm not looking for utter precision, or a complete formula for the answer: just the numerical value of each xi). The target function and constraints are all expressed in terms of sums, products, powers and other continuous, differentiable functions (sin, cos, tan, exp, log...). I have access to a full computer algebra system for differentiating, simplifying and extracting various properties from any mathematical expression.

If no constraints are present, minimization can be done with least squares (linear if applicable, non-linear otherwise). Conversely, if the target function is constant, there are no inequality constraints, and the equality constraints are linear, I can solve the problem as a system of linear equations. The same goes for a few polynomial cases. In fact, even if the equality or inequality constraints are arbitrary, I can use a numerical method to approximate a solution. But as soon as the problem gets more complex, I'm lost.

What general method can I use when the target function is not constant and there are both equality and inequality constraints (assumed non-trivial)?
This is a variational problem. Depending on the nature of T, you'll want to use either a Lagrange Multiplier or the unadulterated Euler-Lagrange equation.
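For the equality-only case, the multiplier construction is the standard one. A sketch, in the notation of the opening post:

```latex
% Lagrangian for  min T(x)  subject to  E_i(x) = 0  for all i:
%   L(x, \lambda) = T(x) - \textstyle\sum_i \lambda_i E_i(x)
% Constrained extrema are stationary points of L:
\nabla T(x) = \sum_i \lambda_i \nabla E_i(x), \qquad E_i(x) = 0 \;\; \forall i
```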

If an optimal solution exists, it will satisfy the Euler-Lagrange equation. I only have direct experience solving it analytically, but I imagine there is a wealth of information regarding numerical (iterative) solution of the Euler-Lagrange equation under a host of variational circumstances.
You say T is composed of trig and exponential functions. I assume you mean finitely many, else the function could be anything that's 'differentiable almost-everywhere'. Is that as specific as you can go?

Regards
Admiral
Ring3 Circus - Diary of a programmer, journal of a hacker.
T is a finite expression, which can be composed of a finite number of trigonometric and exponential functions. In more specific cases, it can be a linear combination of variables (→ simplex), a sum of squares, or even a constant.

However, I am always able to compute its derivative w.r.t. any variable.

The problem I encountered with the Lagrange multiplier is twofold: it only handles equality constraints (although I believe inequalities could be expressed as such, even if I don't know how to apply Kuhn & Tucker), and it cannot distinguish, AFAIK, sets of minima from sets of maxima. Other than that, it suffices to convert the optimization problem into a root-finding problem (finding the zeros of a system of equations), which I can do using nonlinear least squares or simply Newton-Raphson.
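For reference, the standard way to express an inequality as an equality is the slack-variable trick; in outline:

```latex
% Each inequality  I_i(x) > 0  becomes an equality in an extra unknown s_i:
I_i(x) - s_i^2 = 0
% Any real s_i forces I_i(x) = s_i^2 >= 0 (strictness is lost; numerically
% one works with >=), and ordinary Lagrange multipliers then apply to the
% enlarged system in (x, s).
```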
I'm not sure what to suggest in reference to the inequality issue. As the constraint is on the parameters rather than the result, you may get away with just dropping it altogether. I appreciate that this isn't optimal. Perhaps Farkas's lemma would be easier to apply (though I suspect that the implication points the wrong way).

As for differentiating maxima from minima, I'm sure an elegant solution exists, but couldn't you just compare the value at the extremal against the value at a nearby feasible point and deduce the nature of the extremal from the result?

Regards
Admiral
Ring3 Circus - Diary of a programmer, journal of a hacker.
I've thought about this some more, and came to the following conclusions:

I can eliminate optimization altogether. In essence, finding a local minimum of F reduces to solving ∇F = 0 with ∇²F positive definite (I can differentiate F several times, since it's made of infinitely differentiable functions). This is a little more complex when other constraints are present, but Kuhn & Tucker make the conversion possible by introducing the λ and μ multipliers for equalities and inequalities.

Therefore, my problem becomes a simple constraint satisfaction problem, where I have to achieve Ai(x) = 0, Bj(x) > 0. With only equalities, multidimensional Newton-Raphson converges fine. And, again, I'm stuck on the inequalities.
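For the equality part, a minimal sketch of the multidimensional Newton-Raphson iteration I have in mind (the system A and its Jacobian below are placeholders; in my setup the Jacobian would come from the formal differentiation system):

```python
# Minimal multidimensional Newton-Raphson sketch for solving A(x) = 0.
import numpy as np

def A(x):
    # Placeholder system: two equations in two unknowns.
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0,   # unit circle
                     x[0] - x[1]])                   # diagonal

def jacobian(x):
    # dA_i/dx_j for the placeholder system above.
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [1.0, -1.0]])

def newton_raphson(x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = A(x)
        if np.linalg.norm(r) < tol:
            break
        # Solve J * dx = -r rather than inverting J explicitly.
        dx = np.linalg.solve(jacobian(x), -r)
        x = x + dx
    return x

print(newton_raphson([1.0, 0.5]))   # converges to (1/sqrt(2), 1/sqrt(2))
```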
As far as I understand, your problem is not a variational problem but rather a Nonlinear Program (NLP). Basically, this is a very general formulation and, without further assumptions, rather difficult to solve, at least if you need more than a local minimum.

Maybe take a look at this FAQ about Nonlinear Programming.

http://www-unix.mcs.anl.gov/otc/Guide/faq/nonlinear-programming-faq.html

For many special cases, like linear programs (LP), there are strong theories and algorithms. I would suggest checking whether you really need to solve an NLP or whether you have a less general formulation of your problem.
Quote: Original post by Moeller
As far as I understand, your problem is not a variational problem but rather a Nonlinear Program (NLP). Basically, this is a very general formulation and, without further assumptions, rather difficult to solve, at least if you need more than a local minimum.

For many special cases, like linear programs (LP), there are strong theories and algorithms. I would suggest checking whether you really need to solve an NLP or whether you have a less general formulation of your problem.


One specialization I can allow is that I'm only looking for a local minimum, not necessarily a global minimum (or a set of minima). However, the expressions can all be non-linear (they involve powers and various standard functions).

I already have an infrastructure to detect cases where the problem is linear; those branch out to the simplex method and similar approaches. I have also managed to reduce the optimization problem to a constraint-solving problem (as described in my post above).

So, I guess I can also allow an additional specialization: that the target function is constant.
At any extremum, each inequality is either binding (the solution is butted up against it, trying to get past it) or non-binding (the solution does not care about it). In other words, any minimum of f(x) such that all g_n(x) = 0 and all h_n(x) > 0 is expressible as a minimum of f(x) such that all g_n(x) = 0 and a subset of h_n(x) = 0.

This means you can express a system with n inequalities as 2^n systems with equalities only. For a small number of inequalities, this works great. For large numbers of inequalities, more sophisticated methods are necessary to determine when to bind/unbind inequality constraints.
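A brute-force sketch of that enumeration; solve_equalities is a stand-in for whatever equality-only minimizer is already available (e.g. Lagrange multipliers plus Newton), and everything else is illustrative:

```python
# Active-set enumeration sketch: try every subset of the inequalities as
# binding (treated as equalities), solve the resulting equality-only
# problem, and keep the best candidate that satisfies the rest strictly.
from itertools import combinations

def enumerate_active_sets(f, eqs, ineqs, solve_equalities):
    best_x, best_f = None, float("inf")
    n = len(ineqs)
    for k in range(n + 1):
        for active in combinations(range(n), k):
            # Bind the chosen inequalities: h_i(x) = 0 for i in `active`.
            bound = eqs + [ineqs[i] for i in active]
            x = solve_equalities(f, bound)   # returns a point or None
            if x is None:
                continue
            # The non-binding inequalities must still hold strictly.
            if all(ineqs[i](x) > 0 for i in range(n) if i not in active):
                if f(x) < best_f:
                    best_x, best_f = x, f(x)
    return best_x
```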

(Note that this is somewhat tangential to the Kuhn-Tucker stuff, which you need for constraint walking. It's just by way of background, in case you don't have this piece of the puzzle yet.)

EDIT: Oh, and if you can bring in MATLAB, the fmincon function from its Optimization Toolbox is great at this stuff.
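If MATLAB isn't an option, SciPy's general-purpose minimizer handles the same problem shape. A minimal sketch, with placeholder functions standing in for the actual T, E and I (note that SLSQP treats inequalities as non-strict, fun(x) >= 0):

```python
# Minimal sketch: posing  min T(x)  s.t.  E(x) = 0, I(x) > 0  to a solver.
# The concrete T, E, I below are placeholders, not the poster's problem.
import numpy as np
from scipy.optimize import minimize

def T(x):                       # target function (placeholder)
    return np.exp(x[0]) + x[1] ** 2

def E(x):                       # equality constraint, E(x) = 0 (placeholder)
    return x[0] + x[1] - 1.0

def I(x):                       # inequality constraint, I(x) >= 0 (placeholder)
    return x[0] - 0.1

result = minimize(
    T,
    x0=np.array([0.5, 0.5]),    # starting guess
    method="SLSQP",             # handles equality and inequality constraints
    constraints=[{"type": "eq", "fun": E},
                 {"type": "ineq", "fun": I}],
)
print(result.x, result.fun)
```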
The Karush-Kuhn-Tucker (KKT) necessary conditions for a Nonlinear Program (NLP) yield a Nonlinear Complementarity Problem (NCP), as described for example here:

http://www.csc.fi/cschelp/sovellukset/stat/sas/sasdoc/sashtml/ormp/chap5/sect26.htm
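In outline, the standard conditions look like this (a sketch, in the notation of the opening post):

```latex
% KKT conditions for  min T(x)  s.t.  E_i(x) = 0,  I_j(x) >= 0:
\nabla T(x) = \sum_i \lambda_i \nabla E_i(x) + \sum_j \mu_j \nabla I_j(x)
E_i(x) = 0, \qquad I_j(x) \ge 0
\mu_j \ge 0, \qquad \mu_j \, I_j(x) = 0 \quad \text{(complementarity)}
```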
Solving a Nonlinear Complementarity Problem (NCP) is in general again a very difficult task. If you have some special properties like linearity or convexity, you can simplify your problem. One very useful property would be a low dimension of your problem. Then you can solve the NCP using an enumerative approach: a set of n complementarity conditions can be transformed into 2^n different sets of equality constraints (plus inequality constraints determining whether the solution of the i-th subproblem is a solution of the original problem).

So to solve one NCP with n complementarity constraints you have to solve 2^n systems of nonlinear equations. For large n this is of course a bad idea.
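Spelled out, the branching comes directly from the complementarity condition; schematically:

```latex
% A condition  \mu_i \ge 0,\; I_i(x) \ge 0,\; \mu_i I_i(x) = 0
% splits into two branches:
\mu_i I_i(x) = 0 \;\Longrightarrow\;
  \underbrace{\mu_i = 0,\; I_i(x) \ge 0}_{\text{constraint inactive}}
  \quad\text{or}\quad
  \underbrace{I_i(x) = 0,\; \mu_i \ge 0}_{\text{constraint active}}
% One branch choice per condition gives 2^n equality systems in total, each
% with side inequalities that accept or reject its candidate solution.
```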
I would rather avoid the 2^n factor.

I'm thinking of a different approach: for each constraint G(x) > 0, define an equality constraint H(x) = 0, where H(x) = max { ε − G(x), 0 }.

Then, I could use Newton-Raphson to find a zero of my (F, H) constraints, which would yield a position where H(x) = 0 and thus G(x) ≥ ε > 0.
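For concreteness, a small sketch of the reformulated residual, with a placeholder G:

```python
# Clamp reformulation sketch: each inequality G(x) > 0 contributes a
# residual H(x) = max(eps - G(x), 0), which is zero exactly when
# G(x) >= eps. G is a placeholder; note H is piecewise-defined, with a
# kink where G(x) = eps.
import numpy as np

EPS = 1e-3

def G(x):                          # placeholder inequality constraint
    return x[0] + 2.0 * x[1]

def H(x):
    # Zero iff G(x) >= EPS; positive and growing as G(x) drops below EPS.
    return max(EPS - G(x), 0.0)

x = np.array([0.5, 0.25])
print(G(x), H(x))                  # G(x) = 1.0 >= EPS, so H(x) = 0.0
```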

What is the problem with this approach, and why isn't it used?
