need help with differentiation

This topic is 4049 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Is it possible to compute the derivative of y^{T}(K+C)^{-1}y with respect to either K or C? Can the answer even be computed simply? y is a vector, and K and C are square matrices. Thanks, Shaobo

If it can be computed simply, something like a TI-92 will either spit out an answer or choke.
At least that's what I used to use when I had a question like yours. Even better, use Mathematica or some other symbolic differentiation software.

Quote:
Original post by daviangel
If it can be computed simply, something like a TI-92 will either spit out an answer or choke.
At least that's what I used to use when I had a question like yours. Even better, use Mathematica or some other symbolic differentiation software.


What is a TI-92?

I don't have access to Mathematica. Is it more advanced than MATLAB's symbolic toolbox? That does not appear to have direct support for linear algebra, which makes the differentiation somewhat difficult.

Any suggestions for good symbolic differentiation software that can handle linear algebra?

A TI-92 is a calculator made by Texas Instruments (TI). I have a TI-83 Plus, but I doubt it could compute that; I don't know how to differentiate a vector or take a vector to a power of a matrix or whatever that equation says. Anyway, a TI-92 is a very good graphing calculator that is really great for derivatives, integrals and whatnot.

Quote:
Original post by DaBookshah
Partial differentiation with respect to a component I can understand, but with respect to a whole matrix, how is that defined? Is it defined?

Maybe if we had more info on how this original problem relates to game programming we would have a better idea?

Quote:
Original post by daviangel
Quote:
Original post by DaBookshah
Partial differentiation with respect to a component I can understand, but with respect to a whole matrix, how is that defined? Is it defined?

Maybe if we had more info on how this original problem relates to game programming we would have a better idea?


Well, it may be slightly off topic, but what I am trying to compute is fairly general. I am basically trying to compute the derivative of the Mahalanobis distance (http://en.wikipedia.org/wiki/Mahalanobis_distance). Since the result is a scalar, the derivative with respect to the matrix K should be another matrix.

I use it as part of an algorithm to cluster some animation data.
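
For readers who want something concrete, here is a minimal sketch of the scalar being discussed, s = y^{T}(K+C)^{-1}y, which has the form of a squared Mahalanobis distance. The function name `quadratic_form` and the numbers are made up for illustration and are not from the thread.

```python
import numpy as np

def quadratic_form(K, C, y):
    """Return s = y^T (K + C)^{-1} y without forming the inverse explicitly."""
    # Solve (K + C) x = y, then s = y^T x.
    x = np.linalg.solve(K + C, y)
    return float(y @ x)

# Illustrative data (hypothetical values, not from the thread).
K = np.array([[2.0, 0.5], [0.5, 1.0]])
C = np.diag([0.1, 0.2])
y = np.array([1.0, -1.0])

s = quadratic_form(K, C, y)
print(s)
```

Using a linear solve rather than `np.linalg.inv` is a common numerical choice; either gives the same value up to rounding.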

Computing the derivative of y^{T}C^{-1}y:

(C+E)^{-1} = (C(I+C^{-1}E))^{-1} = (I+C^{-1}E)^{-1}C^{-1} = (I - C^{-1}E + C^{-1}EC^{-1}E - ...)C^{-1} = C^{-1} - C^{-1}EC^{-1} + O(||E||^{2}),

so

y^{T}(C+E)^{-1}y = y^{T}C^{-1}y - y^{T}C^{-1}EC^{-1}y + O(||E||^{2})

and there you have your linear functional E → -(y^{T}C^{-1})E(C^{-1}y). If you want it in the standard form E → tr(EX), take X = -(C^{-1}y)(y^{T}C^{-1}), which is a rank-one matrix; but for raw calculations the form above is just as good.

disclaimer: I assume no responsibility for mangled signs and inappropriate transposes.

thanks very much, it looks promising, from what little I know of matrix inversion identities. I was trying to work it out from the matrix inversion lemma but I wasn't quite clever enough.

Anyway, I'll let you know how I get on with it.

Quote:
Original post by Darkstrike
Computing the derivative of y^{T}C^{-1}y:

(C+E)^{-1} = (C(I+C^{-1}E))^{-1} = (I+C^{-1}E)^{-1}C^{-1} = (I - C^{-1}E + C^{-1}EC^{-1}E - ...)C^{-1} = C^{-1} - C^{-1}EC^{-1} + O(||E||^{2}),



Actually, if you can give some sort of short explanation of what is happening here, that would be great. It looks like some sort of Taylor expansion is going on. And what does this O(||E||^{2}) mean?

thanks

[Edited by - shaobohou on November 11, 2006 9:28:43 AM]
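
A rough numerical illustration of the question above, under my own reading of the quoted expansion: the bracketed series is the geometric (Neumann) series (I+X)^{-1} = I - X + X^{2} - ..., which is valid when ||X|| < 1, and O(||E||^{2}) means the leftover error is bounded by a constant times ||E||^{2} as E shrinks. The matrices below are random test data, and `first_order_error` is a name made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
C = A @ A.T + n * np.eye(n)        # symmetric positive definite, hence invertible
D = rng.standard_normal((n, n))    # fixed perturbation direction
C_inv = np.linalg.inv(C)

def first_order_error(t):
    """Norm of (C+E)^{-1} - (C^{-1} - C^{-1} E C^{-1}) for E = t * D."""
    E = t * D
    exact = np.linalg.inv(C + E)
    approx = C_inv - C_inv @ E @ C_inv
    return np.linalg.norm(exact - approx)

# Halve the size of E twice and compare the leftover errors.
errors = [first_order_error(t) for t in (1e-2, 5e-3, 2.5e-3)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(ratios)
```

If the remainder really is second order, halving E should cut the error by roughly a factor of 4, so the printed ratios should sit near 4.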

So I am guessing that there is no easy way to compute an exact derivative of y^{T}(K+C)^{-1}y which does not involve (K+C)^{-1} in its formulation.

Also, how do you actually compute the derivative of y^{T}C^{-1}EC^{-1}y with respect to C? Sorry if this is actually trivial; I am trying to brush up on my linear algebra with the help of the Matrix Cookbook.

thanks

Shaobo

Quote:
Original post by shaobohou
So I am guessing that there is no easy way to compute an exact derivative of y^{T}(K+C)^{-1}y which does not involve (K+C)^{-1} in its formulation.

Since knowing the derivative essentially means knowing (K+C)^{-1}y, you can't get around inverting K+C (or at least solving a linear system with it).
Quote:
Original post by shaobohou
Also, how do you actually compute the derivative of y^{T}C^{-1}EC^{-1}y with respect to C?

There's no need for that, unless you want a second partial derivative. The mapping E → -y^{T}(K+C)^{-1}E(K+C)^{-1}y is the derivative itself (as a mapping).
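
To make "the derivative as a mapping" concrete, here is a small sketch that applies the linear map E → -y^{T}(K+C)^{-1}E(K+C)^{-1}y to a direction E and compares it with a central finite difference of s. The helper names `s` and `directional_derivative` and the random test data are assumptions made for illustration.

```python
import numpy as np

def s(K, C, y):
    """s = y^T (K + C)^{-1} y, computed with a linear solve."""
    return float(y @ np.linalg.solve(K + C, y))

def directional_derivative(K, C, y, E):
    """Apply the linear map E -> -y^T (K+C)^{-1} E (K+C)^{-1} y."""
    A = K + C
    right = np.linalg.solve(A, y)      # (K+C)^{-1} y
    left = np.linalg.solve(A.T, y)     # (K+C)^{-T} y, so left^T = y^T (K+C)^{-1}
    return float(-(left @ E @ right))

rng = np.random.default_rng(1)
n = 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)                   # symmetric positive definite
C = np.diag(rng.uniform(0.1, 0.5, size=n))    # diagonal C
y = rng.standard_normal(n)
E = rng.standard_normal((n, n))               # arbitrary direction

# Central finite difference of s along E, perturbing K.
h = 1e-6
fd = (s(K + h * E, C, y) - s(K - h * E, C, y)) / (2 * h)
exact = directional_derivative(K, C, y, E)
print(fd, exact)
```

The two numbers should agree to several digits. Note that the map only needs two linear solves with K+C, not an explicit inverse.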

Quote:
Original post by Darkstrike
Quote:
Original post by shaobohou
So I am guessing that there is no easy way to compute an exact derivative of y^{T}(K+C)^{-1}y which does not involve (K+C)^{-1} in its formulation.

Since knowing the derivative essentially means knowing (K+C)^{-1}y, you can't get around inverting K+C (or at least solving a linear system with it).

Well, actually computing the inverse is not really a problem for me. It is more to do with the fact that K and C will be involved in additional algebraic manipulation later on, and having (K+C)^{-1} will probably make the manipulation more difficult, i.e. assuming C is known, setting the derivative to 0 and finding the least-squares estimate for K.
What would be the exact derivative if (K+C)^{-1} is allowed in the formulation?

Quote:
Original post by Darkstrike
Quote:
Original post by shaobohou
Also, how do you actually compute the derivative of y^{T}C^{-1}EC^{-1}y with respect to C?

There's no need for that, unless you want a second partial derivative. The mapping E → -y^{T}(K+C)^{-1}E(K+C)^{-1}y is the derivative itself (as a mapping).


There might be some slight confusion here over what E and C denote, so let me try again. According to the approximation you outlined and my understanding,
y^{T}(K+C)^{-1}y = y^{T}K^{-1}y - y^{T}K^{-1}CK^{-1}y + ...,

the derivative of y^{T}K^{-1}y with respect to K is -K^{-1}yy^{T}K^{-1}, which approximates the derivative that I want to calculate if C is relatively small compared to K.

If I also computed the derivative of y^{T}C^{-1}EC^{-1}y with respect to K and added it to the above, would this not give a more accurate answer?
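
A quick numerical experiment along these lines, assuming symmetric K, a small diagonal C, and reading the extra term as the second term of the expansion, -y^{T}K^{-1}CK^{-1}y (that reading is my assumption). With a = K^{-1}y and b = K^{-1}CK^{-1}y, the gradient of that term with respect to K works out to ab^{T} + ba^{T}. The test data is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)                    # symmetric positive definite
C = 1e-2 * np.diag(rng.uniform(0.5, 1.5, n))   # small diagonal C
y = rng.standard_normal(n)

a = np.linalg.solve(K, y)          # K^{-1} y
b = np.linalg.solve(K, C @ a)      # K^{-1} C K^{-1} y
c = np.linalg.solve(K + C, y)      # (K+C)^{-1} y

exact = -np.outer(c, c)            # -(K+C)^{-1} y y^T (K+C)^{-1}
first = -np.outer(a, a)            # gradient of y^T K^{-1} y alone
# Add the gradient of the next term, -y^T K^{-1} C K^{-1} y.
second = first + np.outer(a, b) + np.outer(b, a)

err_first = np.linalg.norm(exact - first)
err_second = np.linalg.norm(exact - second)
print(err_first, err_second)
```

In this setup the corrected approximation should be noticeably closer to the exact gradient, which suggests that adding the derivative of the next term does help when C is small relative to K.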

E is my auxiliary variable from the definition of the derivative, which will be going to the 0 matrix. Could you please clarify what exactly you want, since we seem to be talking about different things?

Quote:
Original post by Darkstrike
E is my auxiliary variable from the definition of the derivative, which will be going to the 0 matrix. Could you please clarify what exactly you want, since we seem to be talking about different things?


Given two square symmetric matrices K and C and a vector y, a scalar distance can be computed as
s = y^{T}(K+C)^{-1}y. I want to be able to calculate the derivative of s with respect to K, i.e. ds/dK, which is another matrix. Ideally, the derivative should not include the term (K+C)^{-1}, as it makes it difficult to rearrange the derivatives to calculate the least-squares estimate of K.

If it helps, you can assume that C only has values on the diagonal and quite often it is relatively small compared to K.

thanks

Shaobo

[Edited by - shaobohou on November 13, 2006 8:52:02 AM]

ds/dK (= ds/dC) is not a matrix; it's a linear functional on the space of matrices. It has the form M → -(y^{T}(C+K)^{-1})M((C+K)^{-1}y). It can be identified with a matrix in a multitude of ways; my favourite is to assign to a matrix X the functional M → tr(MX). If you choose that identification, then the derivative is X = -(C+K)^{-1}yy^{T}(C+K)^{-1}. The structure of K and C can ease the inversion, but that's it (of course, you can approximate, but I'm not an expert on that).
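
As a sanity check on the trace identification, the sketch below builds the matrix X = -(C+K)^{-1}yy^{T}(C+K)^{-1} and compares it with entrywise finite differences of s with respect to K. It assumes symmetric K and C, as stated earlier in the thread, and treats every entry of K as an independent variable (the symmetry constraint is ignored when perturbing). Names and data are invented for the example.

```python
import numpy as np

def s(K, C, y):
    """s = y^T (K + C)^{-1} y, computed with a linear solve."""
    return float(y @ np.linalg.solve(K + C, y))

rng = np.random.default_rng(3)
n = 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)                   # symmetric positive definite
C = np.diag(rng.uniform(0.1, 0.5, size=n))    # diagonal C
y = rng.standard_normal(n)

v = np.linalg.solve(K + C, y)   # (K+C)^{-1} y; K + C is symmetric here
X = -np.outer(v, v)             # -(K+C)^{-1} y y^T (K+C)^{-1}

# Entrywise central differences: G[i, j] approximates ds/dK_ij.
h = 1e-6
G = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        M = np.zeros((n, n))
        M[i, j] = 1.0
        G[i, j] = (s(K + h * M, C, y) - s(K - h * M, C, y)) / (2 * h)

# The functional M -> tr(M X) agrees with -y^T (K+C)^{-1} M (K+C)^{-1} y.
M = rng.standard_normal((n, n))
trace_form = np.trace(M @ X)
direct_form = -(v @ M @ v)
print(np.max(np.abs(G - X)), trace_form, direct_form)
```

Because s depends on K and C only through the sum K+C, perturbing C instead of K gives the same matrix, which matches the remark that ds/dK = ds/dC.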

thanks very much for your help,

I think the exact derivative is not useful to me for reasons I have explained before. The approximation is probably sufficient if the values in C are relatively small.
