Quadratic optimization problem - need help
After applying Lagrange-constraints to my problem, I get the following task:
I have to maximize:
F(L) = Sum(for each i,j=1..N) Li*Lj*A[i,j] - a simple quadratic polynomial
Where A is a known matrix of something like 10k x 10k elements (still small enough to fit in memory as a whole object)
Maximization is performed over L=(L1,L2,...,LN)
using the following constraints:
0<=Li<=Const1
Sum of Bi*Li = 0 (Bi - known coefficients)
Sum of Li=Const2
So if there weren't any constraints, I would have just used gradient optimization.
But I don't know what to do with the constraints, especially the 0<=Li<=Const1 one.
Also, it doesn't have to be real time, but it would still be nice if a solution were found in a reasonable time (5-10 secs for 10k elements, for example).
I faced this problem while trying to implement Support Vector Machine data-classification method.
Any help or links would be appreciated
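For reference, the objective and the three constraints above can be written down directly. Below is a minimal sketch (the function names `objective` and `feasible` and the toy values are my own, not from the original post), assuming A is stored as a dense numpy array:

```python
import numpy as np

def objective(L, A):
    """F(L) = Sum over i,j of L_i * L_j * A[i,j], i.e. L^T A L."""
    return L @ A @ L

def feasible(L, B, const1, const2, tol=1e-9):
    """Check 0 <= Li <= Const1, Sum(Bi*Li) = 0, and Sum(Li) = Const2."""
    return (np.all(L >= -tol) and np.all(L <= const1 + tol)
            and abs(B @ L) <= tol and abs(L.sum() - const2) <= tol)

# toy example with hypothetical values
A = np.array([[2.0, 1.0], [1.0, 2.0]])
L = np.array([0.5, 0.5])
B = np.array([1.0, -1.0])
print(objective(L, A))           # 1.5
print(feasible(L, B, 1.0, 1.0))  # True
```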
In shorter notation: You want to maximize the quadratic form F(L) = Dot(L,A*L), where A is an N-by-N symmetric matrix and L is an N-by-1 vector. The constraints are 0 <= L <= m (a constant), Dot(b,L) = 0 (b is a constant N-by-1 vector), and Dot(w,L) = s (a constant) where w is the vector whose components are all 1.
For motivation, look at the problem when N = 2. The constraints 0 <= L <= m define a solid square in the plane with vertices (0,0), (m,0), (m,m), and (0,m). The constraint Dot(b,L) = 0 defines a line through the origin. The constraint Dot(w,L) = s defines another line. For there to be a solution, the line Dot(w,L) = s must intersect the square, so you need 0 <= s <= 2*m. The line Dot(b,L) = 0 must intersect the line Dot(w,L) = s at a point _inside_ the square, call that point P. The maximum of the quadratic form is F(P). Not a very interesting case, but it has some insight to it.
Now look at the problem when N = 3. The constraints 0 <= L <= m define a solid cube in space. The constraint Dot(b,L) = 0 defines a plane through the origin. The constraint Dot(w,L) = s defines another plane. For there to be a solution, the plane Dot(w,L) = s must intersect the cube, so you need 0 <= s <= 3*m. The plane Dot(b,L) = 0 must intersect the plane Dot(w,L) = s. The intersection is a line, say of the form L(t) = P+t*D, where P is a point on the line and D is a unit-length vector. This line must intersect the cube for there to be a solution. Clip the line against the cube to produce the segment L(t) = P+t*D for t0 <= t <= t1; the t0 and t1 values are what you compute in the clipping process.
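The line-against-cube clipping step can be done slab by slab, Liang-Barsky style. A minimal sketch (the name `clip_line_to_box` is mine; it works for any dimension N, not just 3):

```python
import numpy as np

def clip_line_to_box(P, D, m):
    """Clip the line L(t) = P + t*D against the box 0 <= L_i <= m.
    Returns (t0, t1), or None if the line misses the box."""
    t0, t1 = -np.inf, np.inf
    for p, d in zip(P, D):
        if abs(d) < 1e-15:
            # line is parallel to this pair of faces; reject if outside the slab
            if p < 0 or p > m:
                return None
        else:
            ta, tb = (0.0 - p) / d, (m - p) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
    return (t0, t1) if t0 <= t1 else None

# line through the center of the unit cube along the x axis
print(clip_line_to_box(np.array([0.5, 0.5, 0.5]),
                       np.array([1.0, 0.0, 0.0]), 1.0))  # (-0.5, 0.5)
```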
Substitute the line equation into the quadratic form to produce the quadratic function Q(t) = F(L(t)) = Dot(P+t*D,A*(P+t*D)), with t0 <= t <= t1. This is now in the form that you can apply the standard maximization methods from calculus. Compute T for which Q'(T) = 0. The maximum is the largest value of Q(t0), Q(T), and Q(t1). If your A is positive definite, since you are computing the _maximum_ of Q, this necessarily occurs at t0 or t1, so you do not have to compute/analyze the derivative of Q. This makes the problem easier in higher dimensions.
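The endpoint/stationary-point comparison can be sketched as follows (the helper name is mine; this assumes A is symmetric, so Q(t) = (D.A.D)*t^2 + 2*(D.A.P)*t + P.A.P):

```python
import numpy as np

def maximize_quadratic_on_segment(P, D, A, t0, t1):
    """Maximize Q(t) = Dot(P+t*D, A*(P+t*D)) over [t0, t1].
    With A symmetric, Q(t) = c2*t^2 + c1*t + c0 where
    c2 = D^T A D and c1 = 2 * D^T A P. Returns the maximizing t."""
    c2 = D @ A @ D
    c1 = 2.0 * (D @ A @ P)
    candidates = [t0, t1]
    if abs(c2) > 1e-15:
        T = -c1 / (2.0 * c2)   # stationary point, Q'(T) = 0
        if t0 <= T <= t1:
            candidates.append(T)
    Q = lambda t: (P + t * D) @ A @ (P + t * D)
    return max(candidates, key=Q)
```

With a positive definite A (c2 > 0), Q is convex, so the maximum always lands on t0 or t1 and the stationary point can be skipped, as noted above.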
In general dimension N, the constraints 0 <= L <= m define a solid hypercube. The constraint Dot(b,L) = 0 defines a hyperplane through the origin. The constraint Dot(w,L) = s defines another hyperplane. For there to be a solution, the two hyperplanes must intersect. Assuming they do, the intersection is an affine space of dimension N-2 (an "(N-2)-flat"). This space must intersect the hypercube for there to be a solution.
Let P be a point in the affine space and construct an orthonormal basis for the linear space that goes with it. The basis has N-2 vectors V[i] for 1 <= i <= N-2. The affine space is parameterized by L(t[1],...,t[N-2]) = P + sum_{i=1}^{N-2} t[i]*V[i] for scalars t[i]. Substitute this into F to obtain a quadratic function Q(t[1],...,t[N-2]), just like in the case N = 3. The domain of this function is determined by clipping the affine space against the hypercube. This gives you a convex hyper-polyhedron on which to maximize Q. Once again, a problem that can be solved with standard calculus methods. The most difficult part of the implementation is computing the convex hyper-polyhedron (the data structures for high-dimensional objects like this get complicated).
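A point P and an orthonormal basis for the (N-2)-flat can be obtained numerically: a minimum-norm solution of the two linear constraints gives P, and the null space of the 2-by-N constraint matrix gives the basis vectors. A minimal sketch (the name `parameterize_flat` is mine):

```python
import numpy as np

def parameterize_flat(b, s):
    """Parameterize {L : Dot(b,L) = 0 and Sum(L) = s} as L = P + V @ t.
    Returns a particular point P and an N x (N-2) matrix V whose columns
    are an orthonormal basis of the flat's direction space."""
    N = len(b)
    w = np.ones(N)
    C = np.vstack([b, w])                       # 2 x N constraint matrix
    # minimum-norm solution of C @ P = (0, s)
    P, *_ = np.linalg.lstsq(C, np.array([0.0, s]), rcond=None)
    # null space of C: right singular vectors past the rank
    _, sv, Vt = np.linalg.svd(C)
    rank = int(np.sum(sv > 1e-12))
    V = Vt[rank:].T                             # N x (N - rank)
    return P, V

b = np.array([1.0, -1.0, 0.0])
P, V = parameterize_flat(b, 1.0)
# P satisfies both constraints; the columns of V are orthonormal directions
```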
If your A is positive definite, then the maximum of Q must necessarily occur at one of the vertices of the convex hyper-polyhedron. A gradient ascent should work nicely to solve this one (you get to follow a path restricted to vertices, edges, faces, hyperfaces,...).
Unfortunately, A isn't positive definite - it's a general matrix.
I understand what the search volume looks like, but since A isn't positive definite, constructing it and walking along it will give no result.
I guess what I need is a nice iterative method. Pretty much like Simplex, but one that works with Dot(L, A*L) and supports simple bound constraints on all variables.