Quadratic optimization problem - need help

1 comment, last by Halo Vortex 17 years, 6 months ago
After applying Lagrange constraints to my problem, I get the following task. I have to maximize the simple quadratic polynomial

F(L) = Sum(over i,j = 1..N) Li*Lj*A[i,j]

where A is a known matrix of something like 10k x 10k elements (still small enough to fit in memory as a whole object). Maximization is performed over L = (L1, L2, ..., LN) subject to the following constraints:

0 <= Li <= Const1
Sum of Bi*Li = 0 (the Bi are known coefficients)
Sum of Li = Const2

If there weren't any constraints, I would have just used gradient optimization. But I don't know what to do with the constraints, especially the 0 <= Li <= Const1 one.

It doesn't have to be realtime, but it would still be nice if the solution were found in a reasonable time (5-10 secs for 10k elements, for example). I faced this problem while trying to implement the Support Vector Machine data-classification method. Any help or links would be appreciated.
For short notation: You want to maximize the quadratic form F(L) = Dot(L,A*L), where A is an N-by-N symmetric matrix and L is an N-by-1 vector. The constraints are 0 <= L <= m (a constant), Dot(b,L) = 0 (b is a constant N-by-1 vector), and Dot(w,L) = s (a constant) where w is the vector whose components are all 1.
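In code, the setup reads roughly as follows (a minimal NumPy sketch; the values of A, b, m, and s are placeholders standing in for the actual problem data):

```python
import numpy as np

N = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N))
A = 0.5 * (A + A.T)          # symmetrize, as assumed above
b = rng.standard_normal(N)   # the constraint coefficients Bi
m = 1.0                      # the box bound Const1
s = 1.5                      # the sum target Const2

def F(L):
    """The quadratic form F(L) = Dot(L, A*L)."""
    return L @ A @ L

def feasible(L, tol=1e-9):
    """Check the box constraint and the two linear equality constraints."""
    return (np.all(L >= -tol) and np.all(L <= m + tol)
            and abs(b @ L) <= tol and abs(np.sum(L) - s) <= tol)
```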

For motivation, look at the problem when N = 2. The constraints 0 <= L <= m define a solid square in the plane with vertices (0,0), (m,0), (m,m), and (0,m). The constraint Dot(b,L) = 0 defines a line through the origin. The constraint Dot(w,L) = s defines another line. For there to be a solution, the line Dot(w,L) = s must intersect the square, so you need 0 <= s <= 2*m. The line Dot(b,L) = 0 must intersect the line Dot(w,L) = s at a point _inside_ the square; call that point P. The maximum of the quadratic form is F(P). Not a very interesting case, but it offers some insight.
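Numerically, the N = 2 case is just a 2-by-2 linear solve followed by a box test (a sketch with made-up example values for b, m, and s):

```python
import numpy as np

b2 = np.array([1.0, -2.0])   # example coefficients for Dot(b,L) = 0
w2 = np.array([1.0, 1.0])    # Dot(w,L) = s with w = (1,1)
m, s = 1.0, 1.2

# Solve the 2x2 system [b2; w2] * P = [0; s] for the intersection point P.
P = np.linalg.solve(np.vstack([b2, w2]), np.array([0.0, s]))
inside = np.all(P >= 0.0) and np.all(P <= m)
print(P, inside)   # the feasible set is just {P} when inside is True
```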

Now look at the problem when N = 3. The constraints 0 <= L <= m define a solid cube in space. The constraint Dot(b,L) = 0 defines a plane through the origin. The constraint Dot(w,L) = s defines another plane. For there to be a solution, the plane Dot(w,L) = s must intersect the cube, so you need 0 <= s <= 3*m. The plane Dot(b,L) = 0 must intersect the plane Dot(w,L) = s. The intersection is a line, say of the form L(t) = P+t*D, where P is a point on the line and D is a unit-length vector. This line must intersect the cube for there to be a solution. Clip the line against the cube to produce the segment L(t) = P+t*D for t0 <= t <= t1; the t0 and t1 values are what you compute in the clipping process.
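The clipping can be done one axis at a time, Liang-Barsky style. A sketch (clip_line_to_cube is an illustrative name, not a standard routine):

```python
import numpy as np

def clip_line_to_cube(P, D, m, eps=1e-12):
    """Clip the line L(t) = P + t*D against the cube 0 <= L <= m.
    Returns the parameter interval (t0, t1), or None if the line
    misses the cube."""
    t0, t1 = -np.inf, np.inf
    for p, d in zip(P, D):
        if abs(d) < eps:
            if p < 0.0 or p > m:      # parallel to this slab and outside it
                return None
        else:
            ta, tb = -p / d, (m - p) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
    return (t0, t1) if t0 <= t1 else None
```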

Substitute the line equation into the quadratic form to produce the quadratic function Q(t) = F(L(t)) = Dot(P+t*D,A*(P+t*D)), with t0 <= t <= t1. This is now in a form where you can apply the standard maximization methods from calculus. Compute the T for which Q'(T) = 0. The maximum is the largest of Q(t0), Q(T), and Q(t1). If your A is positive definite, then Q is convex, so the maximum you are computing necessarily occurs at t0 or t1, and you do not have to compute/analyze the derivative of Q. This makes the problem easier in higher dimensions.
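Putting that together, here is a sketch of the segment maximization, assuming A is symmetric so that the two cross terms of the expansion merge into 2*Dot(P,A*D)*t:

```python
import numpy as np

def maximize_on_segment(A, P, D, t0, t1):
    """Maximize Q(t) = Dot(P+t*D, A*(P+t*D)) on [t0, t1].
    With symmetric A, Q(t) = c2*t^2 + 2*c1*t + c0, so Q'(T) = 0
    at T = -c1/c2; compare that interior point against the endpoints."""
    c2 = D @ A @ D
    c1 = P @ A @ D
    Q = lambda t: (P + t * D) @ A @ (P + t * D)
    candidates = [t0, t1]
    if abs(c2) > 1e-12:
        T = -c1 / c2
        if t0 <= T <= t1:
            candidates.append(T)
    t_best = max(candidates, key=Q)
    return t_best, Q(t_best)
```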

In general dimension N, the constraints 0 <= L <= m define a solid hypercube. The constraint Dot(b,L) = 0 defines a hyperplane through the origin. The constraint Dot(w,L) = s defines another hyperplane. For there to be a solution, the two hyperplanes must intersect. Assuming they do, the intersection is an affine space of dimension N-2 (an "(N-2)-flat"). This space must intersect the hypercube for there to be a solution.

Let P be a point in the affine space and construct an orthonormal basis for the linear space that goes with it. The basis has N-2 vectors V[i] for 1 <= i <= N-2. The affine space is parameterized by L(t[1],...,t[N-2]) = P + sum_{i=1}^{N-2} t[i]*V[i] for scalars t[i]. Substitute this into F to obtain a quadratic function Q(t[1],...,t[N-2]), just like in the case N = 3. The domain of this function is determined by clipping the affine space against the hypercube. This gives you a convex hyper-polyhedron on which to maximize Q. Once again, a problem that can be solved with standard calculus methods. The most difficult part of the implementation is computing the convex hyper-polyhedron (the data structures for high-dimensional objects like this get complicated).
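A sketch of that setup step, using least squares for P and the SVD for the basis (it assumes b is not a multiple of w, so the 2-by-N constraint matrix has full rank; flat_point_and_basis is an illustrative name):

```python
import numpy as np

def flat_point_and_basis(b, s):
    """Return a point P on the (N-2)-flat and an orthonormal basis
    V (rows) for its direction space."""
    N = len(b)
    w = np.ones(N)
    C = np.vstack([b, w])                     # 2 x N constraint matrix
    # Minimum-norm solution of C*P = [0, s], a point on the flat.
    P = np.linalg.lstsq(C, np.array([0.0, s]), rcond=None)[0]
    # The right singular vectors beyond the first two span the null
    # space of C; these are the N-2 basis vectors V[i].
    _, _, Vt = np.linalg.svd(C)
    V = Vt[2:]                                # (N-2) x N, orthonormal rows
    return P, V
```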

If your A is positive definite, then the maximum of Q must necessarily occur at one of the vertices of the convex hyper-polyhedron. A gradient ascent should work nicely to solve this one (you get to follow a path restricted to vertices, edges, faces, hyperfaces,...).
Unfortunately, A isn't positive definite - it's a general matrix.
I understand what the search volume looks like, but since A isn't positive definite, constructing it and walking along it won't give a result.

I guess what I need is a nice iterative method - pretty much like Simplex, but one that works with Dot(L, A*L) and supports simple boundary constraints for all variables.
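For illustration, here is a minimal sketch of one such iterative approach: hand the problem to a general-purpose constrained optimizer (SciPy's SLSQP accepts box bounds plus equality constraints, and maximizing F is minimizing -F). This is only practical for modest N; a 10k-variable problem would need something more specialized:

```python
import numpy as np
from scipy.optimize import minimize

def solve_qp(A, b, m, s):
    """Maximize Dot(L, A*L) subject to 0 <= L <= m,
    Dot(b,L) = 0, and Sum(L) = s."""
    N = len(b)
    x0 = np.full(N, s / N)                     # satisfies the sum constraint
    res = minimize(
        lambda L: -(L @ A @ L),                # minimize -F to maximize F
        x0,
        jac=lambda L: -(A + A.T) @ L,          # gradient of -F
        method="SLSQP",
        bounds=[(0.0, m)] * N,
        constraints=[
            {"type": "eq", "fun": lambda L: b @ L},
            {"type": "eq", "fun": lambda L: np.sum(L) - s},
        ],
    )
    return res.x, -res.fun
```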

